GEBVtest Software
The GEBV test is a validation procedure described in the Interbull Code of Practice, Appendix VIII.
The software consists of a program (gebvtest.py) and a utility module used by that program (ibutils.py). The gebvtest.py program performs the GEBV validation tests for all traits for one breed and population, and then creates a zip file with the input and output files, ready for submission to the Interbull Centre.
Installation and testing
The program has been tested under Python 3 (minimum 3.6). As a minimum, you will need to have this extra Python module installed on your system: NumPy.
Please note that Python 2 is no longer supported by its developers.
Download the attached gebvtest2024.zip file.
Create a working directory and unzip the zip file in that directory. Three subdirectories will be created: programs, sample_data and Results. Typing
python gebvtest.py --help
from a command line prompt, from within the programs directory, should print a brief help message if the installation has been successful.
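If the help message does not appear, the two stated prerequisites (Python 3.6 or later, and NumPy) can be checked separately. The following is a minimal sketch and not part of the distributed software; the function name check_environment is hypothetical.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)  # minimum version stated in this section


def check_environment(version_info=sys.version_info, module="numpy"):
    """Return a list of problems; an empty list means the basics are in place.

    Hypothetical helper: it only checks the interpreter version and whether
    the named module can be located, without importing it.
    """
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append("Python %d.%d or later is required" % MIN_PYTHON)
    if importlib.util.find_spec(module) is None:
        problems.append("module %r is not installed" % module)
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("PROBLEM:", problem)
```

An empty result only means that the interpreter is recent enough and NumPy can be found; it does not validate the gebvtest.py installation itself.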
Some sample data for breed HOL and population ABC are available in the sample_data directory. The program can be run from the programs directory as follows:
python gebvtest.py -v -m hol abc ../sample_data --outdir ../Results
In this example, the data and parameters are in the sample_data directory, while the results are written to a dedicated directory called Results. Files can be read from other locations and output written to other locations as well. Please see the following sections for further information.
The outputs should match those in the source zip file.
Program gebvtest.py - User Manual
Information about the program
The program gebvtest.py performs the GEBV validation tests for one breed-population combination, for all traits. At the end of the run a zip file is created with the input files and the result file, ready for submission to the Interbull Centre (ITBC). The ITBC will perform some additional data checks and re-run the program to check the results. The result file uses the new file735 format, which is a modification and extension of the previous file731 format.
Input files:
traits - GEBV_test options file (Format: traits)
file300Cf - national official genetic evaluations, written in trait-independent format (Format: File300)
file300Cr - reduced conventional genetic evaluation file, written in trait-independent format
file300Gr - reduced genomic evaluation file, written in trait-independent format
file300Gf - national official genomic evaluation file, written in trait-independent format
file736 - file with birth dates (Format: File736)
Running the program
The program should be run from within the programs directory. Typing
python gebvtest.py --help
will give a summary of the program usage:
usage: gebvtest_2023C.py [-h] [-v] [-Z] [-C]
                         [--target {DGEBV,DGPA,VFEBV,GEBV,EBV,DEBV}]
                         [--weight {ITB,LR}] [--min_byear MIN_BYEAR]
                         [--fb {Y,N}] [--baseadj {GEBV,EBV,NONE}]
                         [--power POWER] [--baseincl BASEINCL]
                         [--traitsincl TRAITSINCL] [--outdir OUTDIR] [-m]
                         [-M MERGEDIR] [-s SAMPLES]
                         [--accept_bias ACCEPT_BIAS]
                         brd pop datadir

positional arguments:
  brd                   evaluation breed code (BSW/GUE/JER/HOL/RDC/SIM)
  pop                   population code (same as country code except for
                        CHR/DEA/DFS/FRR/FRM)
  datadir               absolute or relative path to data files

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         increase output verbosity
  -Z, --no-zip          do not create a zip file (eg. for preliminary
                        testing or usage at ITBC)
  -C, --cleanup         delete all files successfully added to the zip file
  --target {DGEBV,DGPA,VFEBV,GEBV,EBV,DEBV}
                        validation target options are: [ DGEBV, DGPA, VFEBV,
                        GEBV, EBV, DEBV ] (default=DGEBV)
  --weight {ITB,LR}     Options are: [ ITB or LR ], for the Interbull
                        weighted-regression test or Legarra-Reverter
                        un-weighted regression, respectively (default=ITB)
  --min_byear MIN_BYEAR
                        specify a minimum birth year to use instead of using
                        the value specified in the traits file
  --fb {Y,N}            specify Y or N to include foreign bulls in the
                        validation group, instead of Y/N from the traits file
  --baseadj {GEBV,EBV,NONE}
                        evaluation variable to use for base adjustments,
                        options are: [ NONE, EBV, GEBV ] (default=EBV)
  --power POWER         specify a base for the power function weighting
                        records in base adjustments, instead of optimizing
                        the base from the data
  --baseincl BASEINCL   comma-separated lists of restrictions on bulls to
                        include for base adjustment estimates, [ min,max byr
                        : proof type list : proof status list : official Y/N ]
  --traitsincl TRAITSINCL
                        comma-separated list of traits to process
  --outdir OUTDIR       absolute or relative path to write output files
                        (default=Results)
  -m, --mergefiles      write merged data files (for independent data checks)
  -M MERGEDIR, --mergedir MERGEDIR
                        absolute or relative path for merged data files
                        (default=DATADIR/merged)
  -s SAMPLES, --samples SAMPLES
                        number of bootstrap samples for R-squared test
                        (default=1000)
  --accept_bias ACCEPT_BIAS
                        standardized ignorable bias accepted in practice
                        (default=0.25)
More detail on the -m --mergefiles options is available here.
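When the test has to be run for several breed-population combinations, it may help to assemble the command line in a small script. The sketch below is an illustration only: the helper names build_gebvtest_command and run_gebvtest are hypothetical, and only flags listed in the help text above (-v, -m, -Z, --outdir) are used.

```python
import subprocess
import sys


def build_gebvtest_command(brd, pop, datadir, outdir=None,
                           verbose=False, mergefiles=False, no_zip=False):
    """Assemble the argument list for gebvtest.py (hypothetical helper).

    Only options documented in the --help output are emitted.
    """
    cmd = [sys.executable, "gebvtest.py"]
    if verbose:
        cmd.append("-v")
    if mergefiles:
        cmd.append("-m")
    if no_zip:
        cmd.append("-Z")
    if outdir is not None:
        cmd += ["--outdir", outdir]
    # Positional arguments: breed, population, data directory.
    cmd += [brd, pop, datadir]
    return cmd


def run_gebvtest(cmd, programs_dir="."):
    """Run the command from the programs directory and return its exit code."""
    return subprocess.run(cmd, cwd=programs_dir).returncode
```

For the sample data, build_gebvtest_command("hol", "abc", "../sample_data", outdir="../Results", verbose=True, mergefiles=True) yields the same arguments as the example command in the installation section.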
Output files
file735 - results from the GEBV test for all traits tested (Format: File735) (Example)
gebvtest_log - summary of the calculations (Example)
Submission zip file - gebvtest.py generates a zip file including all input and output files which should be sent to the Interbull Centre as the official data submission for the GEBV test. The zip file will be named gtYYMM_POPBRD.zip, where YY and MM are year and month of test date, POP is the population code and BRD is the breed code.
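The naming pattern gtYYMM_POPBRD.zip can be written out as a short function, for example to locate the expected file before sending it. This is a sketch under assumptions: the function name is hypothetical, the test date is supplied by the caller, and the codes are used exactly as passed, since the source does not say how gebvtest.py handles letter case.

```python
from datetime import date


def submission_zip_name(test_date, pop, brd):
    """Return the zip name following the gtYYMM_POPBRD.zip pattern.

    Hypothetical helper: test_date is a datetime.date; pop and brd are
    inserted unchanged (case handling by gebvtest.py is not documented here).
    """
    return "gt{:%y%m}_{}{}.zip".format(test_date, pop, brd)


if __name__ == "__main__":
    print(submission_zip_name(date(2024, 9, 1), "ABC", "HOL"))
```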
GEBV test data submission
Interbull customers wishing to participate in the GEBV test must send the following files to the Interbull Centre at interbull@slu.se:
Submission zip file - generated by the gebvtest.py program.
Form GENO - one form for each trait group validated.
Troubleshooting/FAQ
- Double-check your data files and make sure the file formats are correct.
- In some cases special characters in bull names make the program crash. One workaround is to leave out the bull names, as they are not used anyway.
- All your files should contain a “country sending this information” field, and the code should be consistent across all files. If this field is left blank instead of containing a code, the file is not read.
- Make sure to use the -v flag and check the log files carefully (look for files with 0 records, for example).
- If Python crashes with an error message:
  - if any "import" statement causes an error, Python or one of the modules is not correctly installed.
- If bulls seem to be missing or in excess in the candidate or test groups, use the -v -m options, but not -C, and check the merged files.
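Two of the checks above (special characters and files with 0 records) can be done before running the program. The following is a minimal sketch, not part of the distributed software: the helper names are hypothetical, "special characters" is taken to mean any byte outside plain ASCII, and a record is taken to be any non-blank line, which are assumptions rather than rules from the file format documents.

```python
from pathlib import Path


def summarize_lines(lines):
    """Return (record_count, line_numbers_with_non_ascii_bytes).

    lines is an iterable of bytes objects, one per line of a data file.
    Blank lines are not counted as records (assumption).
    """
    records = 0
    flagged = []
    for number, raw in enumerate(lines, start=1):
        if not raw.strip():
            continue
        records += 1
        if any(byte > 127 for byte in raw):
            flagged.append(number)
    return records, flagged


def scan_directory(datadir):
    """Summarize every regular file found directly in datadir."""
    report = {}
    for path in sorted(Path(datadir).iterdir()):
        if path.is_file():
            with open(path, "rb") as handle:
                report[path.name] = summarize_lines(handle)
    return report


if __name__ == "__main__":
    for name, (records, flagged) in scan_directory("../sample_data").items():
        print(name, "records:", records, "lines with non-ASCII bytes:", flagged)
```

A file reported with 0 records, or with flagged lines in the bull name field, is a candidate for a closer look before the test is re-run.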
If you need assistance, please do not hesitate to contact us at interbull@slu.se.