Differences between revisions 25 and 38 (spanning 13 versions)

GEBVtest Software

The GEBV test is a validation procedure described in the Interbull Code of Practice, Appendix VIII.

The software consists of a program (gebvtest.py) and a utility module used by that program (ibutils.py). The gebvtest.py program performs the GEBV validation tests for all traits for one breed and population, and then creates a zip file with the input and output files, ready for submission to the Interbull Centre.

Installation and testing

The program has been tested under Python 3 (minimum version 3.8). At a minimum, you will need to have one extra Python module installed on your system: NumPy.

Please note that Python 2, in all versions, is no longer supported by developers.
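
Both requirements can be checked before running the program. The following is only a minimal sketch and is not part of the gebvtest distribution; the function name check_prerequisites is hypothetical.

```python
import importlib.util
import sys


def check_prerequisites(min_version=(3, 8), modules=("numpy",)):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if sys.version_info < min_version:
        found = ".".join(str(part) for part in sys.version_info[:3])
        wanted = ".".join(str(part) for part in min_version)
        problems.append(f"Python {wanted} or newer is required, found {found}")
    for name in modules:
        # find_spec returns None when a top-level module cannot be located
        if importlib.util.find_spec(name) is None:
            problems.append(f"module '{name}' is not installed")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

If nothing is printed, the interpreter is recent enough and NumPy can be located.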

Download the attached gebvtest202505.zip file.

Create a working directory and unzip the zip file in that directory. Three subdirectories will be created: programs, sample_data and ITBC_results. Typing

python gebvtest.py --help

from a command line prompt, from within the programs directory, should print a brief help message if the installation has been successful.

Some sample data for breed HOL and population ABC are available in the sample_data directory. The program can be run from the programs directory as follows:

python gebvtest.py -v -m hol abc ../sample_data --outdir ../My_results

In this example, input data are in the sample_data directory, while results will be created in a dedicated new directory called "My_results". You will then be able to compare your results with the ones produced by the Interbull Centre using the same sample data.
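
The comparison can be done by hand, or with a short script such as the sketch below. It assumes that the reference results from the Interbull Centre are the files in the ITBC_results directory and that files to be compared share the same name in both directories; neither assumption is confirmed by the program documentation. The function name compare_result_dirs is hypothetical.

```python
import filecmp
from pathlib import Path


def compare_result_dirs(mine, reference):
    """Compare same-named files in two directories, byte by byte."""
    mine, reference = Path(mine), Path(reference)
    mine_names = {p.name for p in mine.iterdir() if p.is_file()}
    ref_names = {p.name for p in reference.iterdir() if p.is_file()}
    common = sorted(mine_names & ref_names)
    # shallow=False forces a content comparison instead of comparing os.stat data
    same, differing, errors = filecmp.cmpfiles(mine, reference, common, shallow=False)
    return {
        "identical": sorted(same),
        "differing": sorted(differing + errors),
        "only_mine": sorted(mine_names - ref_names),
        "only_reference": sorted(ref_names - mine_names),
    }


# Example call, using the directory names from the command above:
# report = compare_result_dirs("../My_results", "../ITBC_results")
```

A file listed under "differing" is not necessarily a problem: files that record run times or paths may differ between two otherwise identical runs.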

Files can be read from other locations and output written to other locations as well. Please see the following sections for further information.

Clarification on GEBVtest Criteria to Use for Interbull Validation

The official Interbull GEBV test, whose results are used for deciding on inclusion/exclusion of national data into the international genomic evaluation (GMACE), must be based on the following criteria, which are the default values used by the 2025 gebvtest software program:

TARGET = DGEBV

WEIGHT = Interbull weighted regression test

FOREIGN BULL = N

BASE ADJUSTMENT = EBV

The software does allow the user to choose different validation targets (based on DEBV, EBV, GEBV) or regression weights (based on the Legarra-Reverter unweighted regression), among other things. However, these new features and flexibilities, added to the program in 2025, WERE NOT meant to be used for creating a test different from the official one, which uses only the default values. The new features were added to further internal research on improving the national evaluation models, and they are not to be used as a way of deviating from the official test for qualifying for GMACE participation.
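
To guard against submitting a run made with research options by mistake, a command line can be screened before use. The sketch below is a hypothetical helper, not part of gebvtest.py. It only re-reads the four options named above, using the option names and defaults from the program's help text. It does not inspect the traits file, so a foreign-bull setting of Y given there would go unnoticed.

```python
import argparse

# Official settings listed above; option names are taken from the help text.
OFFICIAL = {"target": "DGEBV", "weight": "ITB", "fb": "N", "baseadj": "EBV"}


def deviations_from_official(argv):
    """Return {option: chosen value} for each option departing from the official test."""
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument("--target", default="DGEBV")
    parser.add_argument("--weight", default="ITB")
    parser.add_argument("--fb", default=None)  # None: the traits file decides
    parser.add_argument("--baseadj", default="EBV")
    # All other arguments (brd, pop, datadir, -v, ...) are ignored here.
    known, _ignored = parser.parse_known_args(argv)
    chosen = vars(known)
    return {
        name: chosen[name]
        for name, official in OFFICIAL.items()
        if chosen[name] is not None and chosen[name] != official
    }
```

For example, deviations_from_official(["--weight", "LR", "hol", "abc", "../sample_data"]) reports that the weight option differs from the official setting, while the sample command shown in the installation section yields an empty result.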

EXCEPTIONS related to the input data:

REDUCED DATA SETS: The reduced data sets should be prepared by truncating the phenotypes used as input for both the conventional and the genomic evaluations. The NGEC must exclude phenotypic information from the most recent 4 years. There are some exceptions allowed, however, for newer traits, and for smaller populations:

For newer traits, if the size of the genomic reference population in the truncated file is reduced by too much, then the accuracy of GEBV calculated from truncated data becomes significantly lower than with full data. In that case, the country can use n < 4 years as the time difference between full and reduced data sets.
For smaller populations, if the number of test bulls is too small (<50), then the country may choose to also include foreign bulls that have been used locally (type of proof = 21 or 22) with EDCf ≥ 20 local progeny and EDCr = 0 as part of the validation group, to increase the number of test bulls.

In both cases above, the criteria used to define test bulls must be communicated to the Interbull Centre.
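
The small-population exception can be written down as a simple rule. The sketch below is illustrative only: the function and argument names are hypothetical, and it assumes that EDCf and EDCr denote a bull's EDC in the full and reduced data sets respectively. It is not how gebvtest.py selects test bulls.

```python
MIN_TEST_BULLS = 50            # fewer test bulls than this allows the exception
FOREIGN_PROOF_TYPES = {21, 22}  # foreign bulls that have been used locally
MIN_EDC_FULL = 20               # minimum EDCf


def may_add_foreign_bulls(n_test_bulls):
    """True when the validation group is small enough to add foreign bulls."""
    return n_test_bulls < MIN_TEST_BULLS


def foreign_bull_qualifies(type_of_proof, edc_full, edc_reduced):
    """True when a foreign bull meets the criteria stated above."""
    return (
        type_of_proof in FOREIGN_PROOF_TYPES
        and edc_full >= MIN_EDC_FULL
        and edc_reduced == 0
    )
```

A bull with type of proof 21, EDCf = 35 and EDCr = 0 would qualify; the same bull with EDCr = 5 would not.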

Further detailed information on how to create the necessary files, and on the test description, is available in the Code of Practice, Appendix VIII.

Program gebvtest.py - User Manual

Information about the program

The program gebvtest.py performs the GEBV validation tests for one breed-population combination, for all traits. At the end of the run, a zip file is created with the input files and the result file, ready for submission to the Interbull Centre (ITBC). The ITBC will perform some additional data checks and re-run the program to verify the results. The result file is written in the file735 format.

Input files:

traits - GEBV_test options file (Format: traits)
file300Cf - national official genetic evaluations, written in trait-independent format (Format: File300)
file300Cr - reduced conventional genetic evaluation file, written in trait-independent format
file300Gr - reduced genomic evaluation file, written in trait-independent format
file300Gf - national official genomic evaluation file, written in trait-independent format
file736 - file with birth dates (Format: File736)
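
Before a run, it can be useful to confirm that the data directory holds all six inputs. The sketch below assumes that each input file name on disk begins with the name listed above (for example, a file whose name starts with "file300Cf"). This naming is an assumption and should be checked against the format descriptions. The function name missing_inputs is hypothetical.

```python
from pathlib import Path

# Names from the list above, treated here as file-name prefixes (an assumption).
EXPECTED_PREFIXES = (
    "traits", "file300Cf", "file300Cr", "file300Gr", "file300Gf", "file736",
)


def missing_inputs(datadir, prefixes=EXPECTED_PREFIXES):
    """Return the expected prefixes for which no file was found in datadir."""
    names = [p.name for p in Path(datadir).iterdir() if p.is_file()]
    return [
        prefix for prefix in prefixes
        if not any(name.startswith(prefix) for name in names)
    ]
```

An empty list means that every expected input was found under this naming assumption.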

Running the program

The program should be run from within the programs directory. Typing

python gebvtest.py --help

will give a summary of the program usage:

usage: gebvtest.py [-h] [--version] [-v] [-Z] [-C] [--target {DGEBV,VFEBV,GEBV,EBV,DEBV}] [--weight {ITB,LR}] [--min_byear MIN_BYEAR] [--fb {Y,N}] [--baseadj {GEBV,EBV,NONE}] [--power POWER]
                         [--baseincl BASEINCL] [--traitsincl TRAITSINCL] [--outdir OUTDIR] [-m] [-M MERGEDIR] [-s SAMPLES] [--accept_bias ACCEPT_BIAS]
                         brd pop datadir

positional arguments:
  brd                   evaluation breed code (BSW/GUE/JER/HOL/RDC/SIM)
  pop                   population code (same as country code except for CHR/DEA/DFS/FRR/FRM)
  datadir               absolute or relative path to data files

optional arguments:
  -h, --help            show this help message and exit
  --version             show version of this program and exit
  -v, --verbose         increase output verbosity
  -Z, --no-zip          do not create a zip file (eg. for preliminary testing or usage at ITBC)
  -C, --cleanup         delete all files successfully added to the zip file
  --target {DGEBV,VFEBV,GEBV,EBV,DEBV}
                        validation target options are: [ DGEBV, VFEBV, GEBV, EBV, DEBV ] (default=DGEBV)
  --weight {ITB,LR}     Options are: [ ITB or LR ], for the Interbull weighted-regression test or Legarra-Reverter un-weighted regression, respectively (default=ITB)
  --min_byear MIN_BYEAR
                        specify a minimum birth year to use instead of using the value specified in the traits file
  --fb {Y,N}            specify Y or N to include foreign bulls in the validation group, instead of Y/N from the traits file
  --baseadj {GEBV,EBV,NONE}
                        evaluation variable to use for base adjustments, options are: [ EBV, GEBV, NONE ] (default=EBV)
  --power POWER         specify a base for the power function weighting records in base adjustments, instead of optimizing the base from the data
  --baseincl BASEINCL   comma-separated lists of restrictions on bulls to include for base adjustment estimates, [ min,max byr : proof type list : proof status list : official Y/N ]
  --traitsincl TRAITSINCL
                        comma-separated list of traits to process
  --outdir OUTDIR       absolute or relative path to write output files (default=.)
  -m, --mergefiles      write merged data files (for independent data checks)
  -M MERGEDIR, --mergedir MERGEDIR
                        absolute or relative path for merged data files (default=DATADIR/merged)
  -s SAMPLES, --samples SAMPLES
                        number of bootstrap samples for R-squared test (default=1000)
  --accept_bias ACCEPT_BIAS
                        standardized ignorable bias accepted in practice (default=0.25)

See detailed instructions at: https://interbull.org/ib/gebvtest_software

More detail on the -m/--mergefiles option is available at https://interbull.org/ib/gebvtest_merge_files.

Output files

file735 - results from the GEBV test for all traits tested (Format: File735). Additional results are provided in the following extra files, named here for population ABC and breed HOL: file735_ABCHOL.csv (comma-separated format) and file735adj_ABCHOL (specific results pertaining to the base adjustment)
gebvtest_log - summary of the calculations
Submission zip file - gebvtest.py generates a zip file including all input and output files which should be sent to the Interbull Centre as the official data submission for the GEBV test. The zip file will be named gtYYMM_POPBRD.zip, where YY and MM are year and month of test date, POP is the population code and BRD is the breed code.
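
As an illustration of the naming pattern only, the expected zip file name can be derived as in the sketch below. The function name is hypothetical, and the upper-casing of the codes is an assumption based on the file735_ABCHOL example above. The program itself creates the file, so this is useful mainly for locating it afterwards.

```python
from datetime import date


def submission_zip_name(test_date, pop, brd):
    """Build gtYYMM_POPBRD.zip from the test date, population and breed codes."""
    # %y%m gives the two-digit year followed by the two-digit month
    return f"gt{test_date:%y%m}_{pop.upper()}{brd.upper()}.zip"


print(submission_zip_name(date(2025, 5, 15), "abc", "hol"))  # gt2505_ABCHOL.zip
```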

GEBV test data submission

Interbull customers wishing to participate in the GEBV test must send the following files to the Interbull Centre at interbull@slu.se:

Submission zip file - generated by the gebvtest.py program.
Form GENO - one form for each trait group validated.

Troubleshooting/FAQ

Double-check your data files and make sure the file formats are correct.
- In some cases, special characters in bull names make the program crash. A workaround is to leave out the bull names, as they are not used anyway.
- All your files should contain a “country sending this information” field, and the code should be consistent across all files. Leaving this field blank instead of providing a code means that the file is not read.
Make sure to use the -v flag and check the log files carefully (look for files with 0 records, for example).
If Python crashes with an error message:
- If an "import" statement causes the error, Python or one of the required modules is not correctly installed.
If bulls seem to be missing or in excess in the candidate or test groups, use the -v and -m options, but not -C, and check the merged files.
If the program crashes or otherwise misbehaves, double-check that you are running the latest version of the program. You can check the version by typing the following command:
- python gebvtest.py --version
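
Two of the data checks above can be partly automated. The sketch below is a hypothetical helper, not part of the gebvtest distribution. It treats "special characters" as any byte outside the ASCII range, which is an assumption, and it reports files of zero size as a rough proxy for files with 0 records.

```python
from pathlib import Path


def non_ascii_lines(path):
    """Return the 1-based numbers of lines containing non-ASCII bytes."""
    flagged = []
    with open(path, "rb") as handle:
        for number, raw in enumerate(handle, start=1):
            if any(byte > 127 for byte in raw):
                flagged.append(number)
    return flagged


def empty_files(directory):
    """Return the names of zero-size files found directly in directory."""
    return sorted(
        p.name for p in Path(directory).iterdir()
        if p.is_file() and p.stat().st_size == 0
    )
```

The line numbers returned by non_ascii_lines point to records worth inspecting, for instance bull names containing accented letters.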

If you need assistance, please do not hesitate to contact us at interbull@slu.se.

-  ⇤ ← Revision 25 as of 2024-08-20 14:37:08 → 
  Size: 9263
  Editor: Valentina
  Comment:
+   ← Revision 38 as of 2025-06-04 09:23:15 → ⇥
  Size: 11054
  Editor: Valentina
  Comment:
-Deletions are marked like this.
+Additions are marked like this.
 Line 5:
-The GEBV test is a validation procedure described in the Interbull Code of Practice, Appendix VIII.
+The GEBV test is a validation procedure described in the Interbull Code of Practice, [[https://interbull.org/ib/cop_appendix8|Appendix VIII]].
 Line 11:
-The  program has been tested under Python  3 (minimum 3.8) . As a minimum,  you will need to have this extra python module installed on your  system: !NumPy.
+The program has been tested under Python 3 (minimum 3.8). As a minimum, you will need to have this extra python module installed on your  system: !NumPy.
 Line 13:
-Please note that Python2 is no longer supported by developers.
+Please note that Python2, all versions, is no longer supported by developers.
 Line 15:
-Download the attached '''[[attachment:gebvtest20240812.zip|gebvtest2024.zip|&do=get]]''' file.
+Download the attached ''' [[attachment:gebvtest20250515.zip|gebvtest202505.zip|&do=get]] ''' file.
 Line 17:
-Create a working directory and unzip the zip file in that directory. Three subdirectories will be created, ''programs, sample data ''and'' Results''. Typing
+Create a working directory and unzip the zip file in that directory. Three subdirectories will be created, ''programs, sample data ''and'' ITBC_results''. Typing
 Line 19:
- . `python gebvtest_2024A.py --help`
+ . `python gebvtest.py --help`
 Line 26:
-python gebvtest_2024A.py -v -m hol abc ../sample_data --outdir ../Results
+python gebvtest.py -v -m hol abc ../sample_data --outdir ../My_results
 Line 28:
-In this example, input data are in the sample_data directory while results will be created in a dedicated new directory called "''Results"''. Files can be read from other  locations and output written to other locations as well. Please see the  following sections for further information.
+In this example, input data are in the sample_data directory while results will be created in a dedicated new directory called "''My_results"''. You will then be able to compared your results with the ones produced by Interbull Centre using the same sample data.
 Line 30:
-To better explain the additional tests that can be performed with the current enhanced software, an additional folder has been created within the "''Results''" one, called "''users_options_combinations''" and containing the results of possible additional tests. The folder includes output from 8 different variations of user option combinations, with all options used in each case being captured at the top of the respective log files.
+Files can be read from other  locations and output written to other locations as well. Please see the  following sections for further information.
 Line 32:
-The outputs should match those in the source zip file.--( )--
+----
== Clarification on GEBVtest Criteria to Use for Interbull Validation ==
The official Interbull GEBV test, whose results are used for deciding on inclusion/exclusion of national data into the international genomic evaluation (GMACE), must be based on the following criteria, which are the default values used by the 2025 gebvtest software program:

TARGET = DGEBV

WEIGHT= Interbull weighted regression test

FOREIGN BULL = N

BASE ADJUSTMENT= EBV

The software does allow the user to choose different validation targets (based on dEBV, EBV, GEBV) or regression weights (based on Legarra-Reverter un-weighted regression), among other things, but these new features and flexibilities added to the program in 2025 WERE NOT meant to be used for creating a different test than the official one, which is based on only using the default values.  The new features were added for the purpose of furthering internal research on improving the national evaluation models, and they are not to be used as a way of deviating from the official test for qualifying on GMACE participation.

''__EXCEPTIONS related to the input data:__''

REDUCED DATA SETS: The reduced data sets should be prepared by truncating the phenotypes used as input for both the conventional and the genomic evaluations. The NGEC must exclude phenotypic information from the most recent '''''4 years.''''' There are some exceptions allowed, however, for newer traits, and for smaller populations:

 * For newer traits, if the size of genomic reference population in the truncated file is reduced by too much, then the accuracy of GEBV calculated from truncated data becomes significantly lower than with full data. In that case, the country can use n<4 years as the time difference between full and reduced data sets.

 * For smaller populations, if the number of test bulls is too small (<50), then the country may choose to also include foreign bulls that have been used locally (type of proof = 21 or 22) with EDCf ≥ 20 local progeny and EDCr = 0 as part of the validation group, to increase the number of test bulls.

In both cases above, the criteria used to define test bulls must be communicated to the Interbull Centre.

Further, detailed information on how to create the necessary files and on the test description is available in the Code of Practice, [[https://interbull.org/ib/whole_cop|Appendix VIII]]
-Line 40:
+Line 64:
- * '''''traits''''' - GEBV_test options file(Format: [[https://wiki.interbull.org/public/GEBVtest_traits?action=print&rev=5|traits]])
+ * '''''traits''''' - GEBV_test options file(Format: [[https://interbull.org/ib/gebvtest_traits_file|traits]])
-Line 55:
+Line 79:
-usage: gebvtest_2024A.py [-h] [--version] [-v] [-Z] [-C] [--target {DGEBV,VFEBV,GEBV,EBV,DEBV}] [--weight {ITB,LR}] [--min_byear MIN_BYEAR] [--fb {Y,N}] [--baseadj {GEBV,EBV,NONE}] [--power POWER]
+usage: gebvtest.py [-h] [--version] [-v] [-Z] [-C] [--target {DGEBV,VFEBV,GEBV,EBV,DEBV}] [--weight {ITB,LR}] [--min_byear MIN_BYEAR] [--fb {Y,N}] [--baseadj {GEBV,EBV,NONE}] [--power POWER]
-Line 66:
+Line 90:
-  --version             show version of this program
+  --version             show version of this program and exit
-Line 91:
+Line 115:
-See detailed instructions at: https://interbull.org/ib/gebvtest_software_2024
+See detailed instructions at: https://interbull.org/ib/gebvtest_software
-Line 93:
+Line 117:
-More detail on the ''-m --mergefiles'' options is available [[https://wiki.interbull.org/public/gebvtest_mergefiles?action=print&rev=7|here]].
+More detail on the ''-m --mergefiles'' options is available [[https://interbull.org/ib/gebvtest_merge_files|here]].
-Line 96:
+Line 120:
- * '''''file735''''' - results from the GEBV test for all traits tested ([[https://interbull.org/ib/file735_gebvtest_results|Format: File735]] , [[https://wiki.interbull.org/public/File735_example?action=print&rev=6|Examples]]) . Additional results are provided in the following extra files: file735_ABCHOL.''csv'' (comma separated format), file735''adj''_ABCHOL (specific results pertaining to the base adjustment)
 * '''''gebvtest_log''''' - summary of the calculations ([[https://wiki.interbull.org/public/gebvtest_log?action=print&rev=6|Example]])
+ * '''''file735''''' - results from the GEBV test for all traits tested ([[https://interbull.org/ib/file735_gebvtest_results|Format: File735]] ) . Additional results are provided in the following extra files: file735_ABCHOL.''csv'' (comma separated format), file735''adj''_ABCHOL (specific results pertaining to the base adjustment)
 * '''''gebvtest_log''''' - summary of the calculations
-Line 115:
+Line 139:
- * If bulls seem to be missing or in excess in the candidate or test groups, use -v -m options, but not -C, and check the link '''[[https://wiki.interbull.org/public/gebvtest_mergefiles?action=print&rev=7|merged files]]'''.
+ * If bulls seem to be missing or in excess in the candidate or test groups, use -v -m options, but not -C, and check the link '''[[http://interbull.org/ib/gebvtest_merge_files|merged files]]'''.
-Line 117:
+Line 141:
-  * python gebvtest_2024A.py --version and exit
+  * python gebvtest.py --version
-Line 120:
+Line 144:
-If you need assistance, please do not hesitate to contact us at [[mailto:interbull-hgen@slu.se|interbull@slu.se]] .
+If you need assistance, please do not hesitate to contact us at interbull@slu.se .