TrendTest Software
The trend validation procedures are described in the Interbull Code of Practice, https://wiki.interbull.org/public/CoPAppendixIII?action=print.
Contents
This software consists of two programs to convert legacy file formats to new formats (ttconvert1/3.py), three programs to perform trend validation by methods 1 - 3 (trendtest1-3.py), a program to combine the results across methods and prepare a zip file ready for submission to the Interbull Centre (ttzip.py), and a utility module used by those programs (ibutils.py). The conversion programs process sets of legacy files (file01x and file04x) for all trait groups for a single breed and population of evaluation and create a single set of files in a trait-independent format. The remaining programs perform the trend validation tests for all traits for one breed and population and then create a zip file with the input and output files, ready for submission to the Interbull Centre.
Note: In the future, organizations may prefer to prepare the data for the trendtest1-3.py programs directly, bypassing the creation of the legacy file formats and the use of the ttconvert1/3.py programs.
Installation and testing
The programs have been tested under Python 2.6, 2.7, 3.2 and 3.3. As a minimum you will need these extra Python modules installed on your system: NumPy and, for Python 2.6 only, argparse.
Download the attached trendtest20131017.zip file.
Create a working directory and unzip the zip file in that directory. Two subdirectories will be created, programs and sample_data. Typing, for example,
- python trendtest1.py --help 
from a command line prompt, from within the programs directory, should print a brief help message if the installation has been successful.
Some sample data for breed HOL and population ABC are available in the sample_data directory. The two programs for method 1 can be run from the programs directory as follows:
python ttconvert1.py -v -s'.abc' hol abc ../sample_data
python trendtest1.py -v -m hol abc ../sample_data
In this example data, parameters and output are all in the sample_data directory. Files can be read from other locations and output written to other locations as well. Please see the following sections for further information.
The outputs should match those in the source zip file.
Detailed descriptions of the individual programs are given in the following sections.
Control File
Shortly after the beginning of each test run, the Interbull Centre will send a control file, called file305_POPBRD, to every organization that has to provide validation results for a given population and trait. Validation is required when an organization is testing significant changes in its model, is participating for the first time, or last conducted a validation more than two years ago. The format of the file is given in APPENDIX I, and an example is presented below:
#grp trt evaldate herit  siresd   gm x mh md byr1 miny maxy corr preval chg
uder scs 20130630 0.2240 0.38579  B- N 10 20 1981 1999 2003 0.99 09-may N
work msp 20130719 0.0890 26.23801 B+ N 10 20 1981 1999 2003 0.99 ------ Y
Usage notes
The siresd contained in the file is the MACE sire standard deviation as calculated in the current test run evaluation. 
The fields preval and chg together indicate why validation is required:
- preval is set and chg is N = validation is required because more than two years have passed since the last validation (date shown in the preval field)
- preval is not set and chg is Y = validation is required because significant model changes are being tested or the population is participating for the first time.
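The reading of preval and chg described above can be sketched in Python. This is only an illustration: it assumes whitespace-separated columns and a header line starting with '#', as in the example, and it assumes that a preval made up only of dashes means "not set". The authoritative layout is the one in APPENDIX I, and the function names are hypothetical.

```python
# Sketch only: column layout inferred from the example control file,
# not from APPENDIX I. Function names are hypothetical.
SAMPLE = """\
#grp trt evaldate herit siresd gm x mh md byr1 miny maxy corr preval chg
uder scs 20130630 0.2240 0.38579 B- N 10 20 1981 1999 2003 0.99 09-may N
work msp 20130719 0.0890 26.23801 B+ N 10 20 1981 1999 2003 0.99 ------ Y
"""


def parse_control(text):
    """Return one dict per trait, keyed by the header names."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].lstrip("#").split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def validation_reason(row):
    """Interpret preval/chg as described in the usage notes."""
    # Assumption: a preval consisting only of dashes means "not set".
    preval_set = row["preval"].strip("-") != ""
    if row["chg"] == "Y":
        return "model change or first participation"
    if preval_set:
        return "last validation older than two years (%s)" % row["preval"]
    return "unknown"


for row in parse_control(SAMPLE):
    print("%s: %s" % (row["trt"], validation_reason(row)))
```

All values are kept as strings here; converting herit, siresd, mh or md to numbers is left to the caller.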
Validation Method I
Definition: Comparison of genetic trends estimated using only first lactation versus all lactations in the routine national genetic evaluations.
Validation method I is handled by the program trendtest1.py. The program reads three files: file305_POPBRD (a control file sent by ITBC, see above), file300_POPBRD (alias file01x, see APPENDIX IIa) and file300FL_POPBRD (a new file following the same format as file300 but pertaining to first lactation only, see APPENDIX IIa). To make the file format transition as smooth as possible, the program ttconvert1.py converts the legacy file format 01x into the new file format 300_POPBRD.
TTCONVERT1.PY
The program is located in the programs directory. Typing
- python ttconvert1.py --help 
will give you a small summary of the program usage:
usage: ttconvert1.py [-h] [-v] [-s SUFFIX] [-e ENCODING] [-o OUTDIR]
                     brd pop datadir
positional arguments:
  brd                   evaluation breed code (BSW/GUE/JER/HOL/RDC/SIM)
  pop                   population code (same as country code except for
                        CHR/DEA/DFS/FRR/FRM)
  datadir               absolute or relative path to data files
optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         increase output verbosity
  -s SUFFIX, --suffix SUFFIX
                        suffix to add to all input file names, eg. ".usa" if
                        file names are like file010.usa (default=none)
  -e ENCODING, --encoding ENCODING
                        input file encoding (default=utf-8; try also
                        iso-8859-1 or other values listed at
                        http://docs.python.org/2/library/codecs.html#standard-
                        encodings)
  -o OUTDIR, --outdir OUTDIR
                        directory for output files (default=DATADIR)
Warning
- INPUT FILE NAME: the input files must be named file01x.cou (for all lactations) and file01xFL.cou (for first lactation), for example file010.can and file010FL.can. Any other names will cause the program to crash.
- ENCODING: the files are expected to be in utf-8 unless another encoding is given with the -e option.
- OLD FILE FORMAT: file01x.cou and file01xFL.cou must follow exactly the format described on our web page http://interbull2.slu.se/www/v1/index.php?option=com_content&view=article&id=55&Itemid=161
How to run the program
Go to the programs directory and type:
- python ttconvert1.py -v -s'.abc' hol abc ../testing/abc1401
In this example
- -s'.abc' = the suffix '.abc' of population 'abc' is added to the input file names
- breed of evaluation is HOL
- population is ABC
- input/output files are located in directory ../testing/abc1401
TRENDTEST1.PY
Typing
- python trendtest1.py --help 
within the programs directory will give you a small summary of the program usage:
usage: trendtest1.py [-h] [-v] [-c CONTROLFILE] [-m] [-M MERGEDIR]
                     brd pop datadir
positional arguments:
  brd                   evaluation breed code (BSW/GUE/JER/HOL/RDC/SIM)
  pop                   population code (same as country code except for
                        CHR/DEA/DFS/FRR/FRM)
  datadir               absolute or relative path to data files
optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         increase output verbosity
  -c CONTROLFILE, --controlfile CONTROLFILE
                        path/name of the control file
                        (default=DATADIR/file305_POPBRD)
  -m, --mergefiles      write merged data files (for independent data checks)
  -M MERGEDIR, --mergedir MERGEDIR
                        absolute or relative path for merged data files
                        (default=DATADIR/merged1)
Trendtest1 - How to run the program
Go to the programs directory and type:
- python trendtest1.py -v -m hol abc ../testing/abc1401
In this example
- -m is the option to have a merged file written out
- breed of evaluation is HOL
- population is ABC
- input/output files are located in directory ../testing/abc1401
Output files
The following files are written to the DATADIR, or to OUTDIR if specified. All files have a _POPBRD suffix, so that multiple sets of output files for different breeds or populations can co-exist in the same directory.
- tt1_POPBRD.log: a log containing all information on the parameters and statistical analysis performed (see Trendtest1 - Log and Result File) 
- file311_POPBRD: summarizing the trend test results for method I (see Trendtest1 - Log and Result File) 
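As a small illustration of the _POPBRD naming convention, the sketch below builds the log and result file names for a given method. The helper name is hypothetical, and writing the population and breed codes in upper case is an assumption based on the HOL/ABC values shown in the result file example; check the names actually produced by the programs.

```python
def output_names(method, pop, brd):
    """Build the expected output file names for trend test method 1, 2 or 3.

    Assumption: the suffix is the population code followed by the breed
    code, both in upper case (e.g. ABCHOL).
    """
    suffix = "%s%s" % (pop.upper(), brd.upper())
    return {
        "log": "tt%d_%s.log" % (method, suffix),
        "result": "file31%d_%s" % (method, suffix),
    }


names = output_names(1, "abc", "hol")
print(names["log"])
print(names["result"])
```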
Trendtest1 - Editings
The program reads the three input files, file305_POPBRD, file300_POPBRD and file300FL_POPBRD, and edits the data so that only the following bulls are selected for the test:
- Bulls reported in file300FL_POPBRD that also have an entry in file300_POPBRD
- Bulls with birth year equal to or higher than the byr1 specified in file305_POPBRD
- Bulls with type of proof equal to 11 or 12 and status of bull not equal to 20
- Bulls with number of herds equal to or higher than the minimum number of herds in file305_POPBRD (in both file300_POPBRD and file300FL_POPBRD)
- Bulls with number of daughters and EDC equal to or higher than the minimum number of daughters in file305_POPBRD (in both file300_POPBRD and file300FL_POPBRD)
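The selection rules above can be sketched as a simple filter. This is not the code of trendtest1.py: the record field names (birth_year, type_of_proof, status, herds, daughters, edc), the bull IDs and the sample values are all hypothetical, and records are represented as plain dicts for illustration.

```python
def passes_edits(rec, byr1, mh, md):
    """Apply the per-record edits to one bull record (a dict)."""
    return (
        rec["birth_year"] >= byr1
        and rec["type_of_proof"] in (11, 12)
        and rec["status"] != 20
        and rec["herds"] >= mh
        and rec["daughters"] >= md
        and rec["edc"] >= md
    )


def select_bulls(all_lact, first_lact, byr1, mh, md):
    """Return the sorted IDs of bulls that pass the edits in both files.

    all_lact and first_lact map bull ID -> record dict.
    """
    return sorted(
        bull for bull, rec in first_lact.items()
        if bull in all_lact
        and passes_edits(all_lact[bull], byr1, mh, md)
        and passes_edits(rec, byr1, mh, md)
    )


# Hypothetical records; byr1, mh and md would come from file305_POPBRD.
ok = dict(birth_year=1990, type_of_proof=11, status=10, herds=15, daughters=40, edc=35)
old = dict(ok, birth_year=1980)   # born before byr1
few = dict(ok, herds=5)           # too few herds
all_lact = {"BULL1": ok, "BULL2": old, "BULL3": ok, "BULL4": ok}
first_lact = {"BULL1": ok, "BULL2": old, "BULL3": few, "BULL5": ok}

print(select_bulls(all_lact, first_lact, byr1=1981, mh=10, md=20))
```

Only BULL1 survives: BULL2 is too old, BULL3 fails the herd minimum in the first-lactation file, and BULL5 has no entry in the all-lactations file.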
A merged file is created, called trt.csv (mil.csv for example), and placed under the DATADIR/merged1 directory, if not otherwise specified. The file can be used for further investigation; its format is given in APPENDIX IIIa.
Trendtest1 - Statistical test
The statistical test for method I is based on two regressions of EBV on birth year:
- model T: EBVT = b0 + bT*BYEAR + e (all lactations) 
- model 1: EBV1 = b0 + b1*BYEAR + e (first lactation) 
The criterion for passing the test is:
- [abs(bT-b1)/sdg] < 0.02 (if BV are used, otherwise 0.01), where sdg is the MACE genetic standard deviation. 
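A minimal sketch of this test, on made-up numbers, is given below. It fits the two regressions by ordinary unweighted least squares, which is an assumption (the program may weight or edit records differently), and the function names and toy data are hypothetical.

```python
def slope(x, y):
    """Ordinary least-squares slope of y on x (unweighted, an assumption)."""
    n = float(len(x))
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def method1_test(b_all, b_first, sdg, uses_bv=True):
    """Return (testval, passed) for the criterion abs(bT - b1) / sdg < limit."""
    limit = 0.02 if uses_bv else 0.01
    testval = abs(b_all - b_first) / sdg
    return testval, testval < limit


# Toy data: one average EBV per birth year (hypothetical values).
byear = [1986, 1987, 1988, 1989, 1990]
ebv_all = [0.0, 50.0, 100.0, 150.0, 200.0]    # all lactations
ebv_first = [0.0, 40.0, 80.0, 120.0, 160.0]   # first lactation

b_t = slope(byear, ebv_all)      # 50.0
b_1 = slope(byear, ebv_first)    # 40.0
testval, passed = method1_test(b_t, b_1, sdg=800.0)
print("testval=%.4f passed=%s" % (testval, passed))
```

With these numbers the test value is 0.0125, which passes under the 0.02 limit for BV but would fail under the 0.01 limit.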
Trendtest1 - Log and Result File
A log file, called tt1_POPBRD.log, is created and placed under DATADIR, unless otherwise specified. The file presents a summary of the information taken into consideration for all the traits analysed, such as
- birth year, minimum number of daughters, herds, genetic merit and MACE standard deviation used
- total number of records read from file300_POPBRD and file300FL_POPBRD
- summary statistics for each of the traits analysed
- results of the regression on year of birth for first and all lactations
- warning messages
- method I final result.
A result file, called file311_POPBRD, is created in the DATADIR, unless otherwise specified. The file contains an overall summary of the traits analysed, the settings used and the final outcome of the validation. An example is presented below:
rec brd pop tgrp trt testdate pass testval SDg     bv b_ALL  b_1ST  bulls stdALL  std1ST  x byr1 mh md warnings
311 HOL ABC prod mil 20131024 FAIL 0.021   434.925 BV 48.409 39.246 5569  490.928 539.577 N 1986 10 20 LACT1_SCALE_WARNING
In this example:
- brd is breed of evaluation 
- pop is Population 
- tgrp is traitgroup 
- trt is trait analysed 
- pass is the overall validation result for method I (PASS or FAIL) 
- testval is the result from [abs(bT-b1)/sdg] 
- SDg is MACE genetic standard deviation 
- bv is the genetic merit 
- b_ALL is the slope for the all-lactations model 
- b_1ST is the slope for the first-lactation model 
- bulls is total number of bulls considered 
- stdALL is the standard deviation for the all-lactations model 
- std1ST is the standard deviation for the first-lactation model 
- x indicates whether types of proof 21 or 22 are considered 
- byr1 is minimum bull's birth year 
- mh is minimum number of herds 
- md is minimum number of daughters 
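For independent checks it can be handy to read a result record into a dictionary keyed by column name. The sketch below assumes the file consists of whitespace-separated values with the column names as shown in the example; the helper name parse_result is hypothetical, and the authoritative format is the one given in the appendices.

```python
def parse_result(header_line, data_line):
    """Pair whitespace-separated column names with their values.

    Missing trailing values (for example an empty warnings field)
    are filled with an empty string.
    """
    names = header_line.split()
    values = data_line.split()
    values += [""] * (len(names) - len(values))
    return dict(zip(names, values))


header = ("rec brd pop tgrp trt testdate pass testval SDg bv b_ALL b_1ST "
          "bulls stdALL std1ST x byr1 mh md warnings")
row = ("311 HOL ABC prod mil 20131024 FAIL 0.021 434.925 BV 48.409 39.246 "
       "5569 490.928 539.577 N 1986 10 20 LACT1_SCALE_WARNING")

rec = parse_result(header, row)

# Recompute testval from the two slopes and the MACE standard deviation.
recomputed = abs(float(rec["b_ALL"]) - float(rec["b_1ST"])) / float(rec["SDg"])
print(rec["pass"], rec["testval"], round(recomputed, 3))  # FAIL 0.021 0.021
```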
Validation Method II
Definition: Analysis of within bull yearly Daughter Deviations (e.g. Daughter Yield Deviations, DYD), hereafter referred to as DD
Validation method II is handled by the program trendtest2.py. The program reads in three files: file305_POPBRD (a control file sent by ITBC, see the Control File section above), file300_POPBRD (alias file01x, see APPENDIX IIa) and file302_POPBRD (a new file format for submission of DD records, see APPENDIX IIb).
TRENDTEST2.PY
Typing
- python trendtest2.py --help 
within the programs directory will give you a small summary of the program usage:
usage: trendtest2.py [-h] [-v] [-c CONTROLFILE] [-m] [-M MERGEDIR]
                     brd pop datadir
positional arguments:
  brd                   evaluation breed code (BSW/GUE/JER/HOL/RDC/SIM)
  pop                   population code (same as country code except for
                        CHR/DEA/DFS/FRR/FRM)
  datadir               absolute or relative path to data files
optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         increase output verbosity
  -c CONTROLFILE, --controlfile CONTROLFILE
                        path/name of the control file
                        (default=DATADIR/file305_POPBRD)
  -m, --mergefiles      write merged data files (for independent data checks)
  -M MERGEDIR, --mergedir MERGEDIR
                        absolute or relative path for merged data files
                        (default=DATADIR/merged2)
Trendtest2 - How to run the program
Go to the programs directory and type:
- python trendtest2.py -v -m hol abc ../testing/abc1401
In this example
- -m is the option to have a merged file written out
- breed of evaluation is HOL
- population is ABC
- input/output files are located in directory ../testing/abc1401
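When several breeds or populations have to be tested, the same command line can be generated from a small script. The sketch below is an assumption about how one might do this, not part of the distributed programs: the helper names are hypothetical, only the documented -v and -m options are used, and the script is expected to be started from the programs directory.

```python
import subprocess


def trendtest2_command(brd, pop, datadir, verbose=True, mergefiles=True):
    """Build the argument list for one trendtest2.py run."""
    cmd = ["python", "trendtest2.py"]
    if verbose:
        cmd.append("-v")
    if mergefiles:
        cmd.append("-m")
    return cmd + [brd, pop, datadir]


def run_all(jobs, dry_run=True):
    """Build one command per (brd, pop, datadir) job; run them unless dry_run."""
    commands = [trendtest2_command(*job) for job in jobs]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if the program fails
            subprocess.run(cmd, check=True)
    return commands


jobs = [("hol", "abc", "../testing/abc1401"),
        ("jer", "abc", "../testing/abc1401")]
for cmd in run_all(jobs):
    print(" ".join(cmd))
```

With dry_run left at its default the commands are only printed, which makes it easy to review them before anything is executed.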
Output files
The following files are written to the DATADIR or OUTDIR, if specified. All files have a _POPBRD suffix, so that multiple sets of output files for different breeds or populations can co-exist in the same directory.
- tt2_POPBRD.log: a log containing all information on the parameters and statistical analysis performed (see Trendtest2 - Log and Result File) 
- file312_POPBRD: summarizing the trend test results for method II (see Trendtest2 - Log and Result File) 
Trendtest2 - Editings
The program reads the three input files, file305_POPBRD, file300_POPBRD and file302_POPBRD, and edits the data so that only the following bulls and records are used in the test:
- Bulls reported in file302_POPBRD that also have an entry in file300_POPBRD
- Bulls with birth year equal to or higher than the byr1 specified in file305_POPBRD 
- Bulls with type of proof equal to 11 or 12 and status of bull not equal to 20
- Bulls with number of herds equal to or higher than the minimum number of herds in file305_POPBRD (in both file300_POPBRD and file302_POPBRD)
- Bulls with number of daughters equal to or higher than the minimum number of daughters in file305_POPBRD (in both file300_POPBRD and file302_POPBRD)
- First year records only if the number of daughters is equal to or higher than 10
- Bulls with daughters in more than one qualifying year.
A merged file named after the trait (trt.csv, for example mil.csv) is created and placed under the DATADIR/merged2 directory, unless otherwise specified. The file can be used for further investigation; its format is given in APPENDIX IIIb.
Trendtest2 - Statistical test
The statistical test for method II is calculated as:
- yij = BULLi + b*j + eij 
The criterion for passing the test is:
- [abs(b) / sdg] < 0.01 
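One way to picture the model is as a regression of DD on year within bull, where the bull effects are absorbed by centring both variables within each bull. The sketch below does exactly that with unweighted least squares. It assumes that j is the year to which a DD record refers and that all records count equally; trendtest2.py may weight records differently, and the function names and data are made up.

```python
from collections import defaultdict


def within_bull_slope(records):
    """Estimate b in y_ij = BULL_i + b*j + e_ij by least squares.

    records is an iterable of (bull_id, year, dd) tuples. Bull effects
    are absorbed by centring year and DD within each bull.
    """
    by_bull = defaultdict(list)
    for bull, year, dd in records:
        by_bull[bull].append((year, dd))
    sxy = 0.0
    sxx = 0.0
    for obs in by_bull.values():
        mean_j = sum(j for j, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        sxy += sum((j - mean_j) * (y - mean_y) for j, y in obs)
        sxx += sum((j - mean_j) ** 2 for j, _ in obs)
    return sxy / sxx


def method2_test(records, sdg, threshold=0.01):
    """Return (b, testval, passed) for the criterion abs(b) / sdg."""
    b = within_bull_slope(records)
    testval = abs(b) / sdg
    return b, testval, testval < threshold


# Made-up DD records: both bulls drift by 0.2 units per year.
records = [
    ("A", 1, 10.0), ("A", 2, 10.2), ("A", 3, 10.4),
    ("B", 1, -5.0), ("B", 2, -4.8),
]
b, testval, passed = method2_test(records, sdg=21.496)
print(round(b, 3), round(testval, 3), passed)  # 0.2 0.009 True
```

Note that the large difference in level between bulls A and B does not affect the estimate, because only the within-bull change over years contributes to b.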
Trendtest2 - Log and Result File
A log file, called tt2_POPBRD.log, is created and placed under DATADIR, unless otherwise specified. The file presents a summary of the information taken into consideration for all the traits analysed, such as
- birth year, minimum number of daughters, herds, genetic merit and MACE genetic standard deviation used
- total number of records read from file300_POPBRD and file302_POPBRD 
- warning messages
- summary statistics for each of the traits analysed
- estimate of b from model yij=BULLi+b*j+eij
- Method II final result.
A result file, called file312_POPBRD, is created in the DATADIR, unless otherwise specified. The file contains an overall summary of the traits analysed, the settings used and the final outcome of the validation. An example is presented below:
rec brd pop tgrp trt testdate pass testval b     SDg    bv bulls std_DD x byr1 mh md warnings
312 HOL ABC prod fat 20131017 PASS 0.009   0.185 21.496 BV 153   19.186 N 1986 10 20 none
In this example:
- brd is breed of evaluation 
- pop is Population 
- tgrp is traitgroup 
- trt is trait analysed 
- pass is the overall validation result for method II (PASS or FAIL) 
- testval is the result from [abs(b) / sdg] 
- b is the estimate of b from model yij=BULLi+b*j+eij 
- SDg is MACE genetic standard deviation 
- bv is the genetic merit 
- bulls is total number of bulls considered 
- std_DD is the genetic standard deviation for DD 
- x indicates whether types of proof 21 or 22 are considered 
- byr1 is minimum bull's birth year 
- mh is minimum number of herds 
- md is minimum number of daughters 
Validation Method III
Definition: Analysis of official national predicted genetic merit variation across evaluation runs.
Validation method III is handled by the program trendtest3.py. The program reads in three files: file305_POPBRD (a control file sent by ITBC, see the Control File section above), file300_POPBRD (alias file01x, see APPENDIX IIa) and file303_POPBRD (alias file04x, see APPENDIX IIc). To make the file format transition as smooth as possible, the program ttconvert3.py converts the legacy file formats file01x and file04x into the new file formats file300_POPBRD and file303_POPBRD.
TTCONVERT3.PY
The program is located in the programs directory. Typing
- python ttconvert3.py --help 
will give you a small summary of the program usage:
usage: ttconvert3.py [-h] [-v] [-s SUFFIX] [-e ENCODING] [-o OUTDIR]
                     brd pop datadir
positional arguments:
  brd                   evaluation breed code (BSW/GUE/JER/HOL/RDC/SIM)
  pop                   population code (same as country code except for
                        CHR/DEA/DFS/FRR/FRM)
  datadir               absolute or relative path to data files
optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         increase output verbosity
  -s SUFFIX, --suffix SUFFIX
                        suffix to add to all input file names, eg. ".usa" if
                        file names are like fileC010f.usa (default=none)
  -e ENCODING, --encoding ENCODING
                        input file encoding (default=utf-8; try also
                        iso-8859-1 or other values listed at
                        http://docs.python.org/2/library/codecs.html#standard-
                        encodings)
  -o OUTDIR, --outdir OUTDIR
                        directory for output files (default=DATADIR)
Warning
- INPUT FILE NAME: the input files must be named file01x.cou (for the proofs file) and file04x.cou (for the validation file), for example file010.can and file040.can. Any other name will cause the program to crash.
- ENCODING: the files are expected to be in utf-8, unless another encoding is given with the -e option
- OLD FILE FORMAT: the format of file01x.cou and file04x.cou must follow exactly the format reported on our webpage http://interbull2.slu.se/www/v1/index.php?option=com_content&view=article&id=55&Itemid=161 
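If you are unsure which encoding your legacy files use, a quick check like the one below can help you choose the value for the -e option. It is only a sketch: the helper name detect_encoding is hypothetical and the candidate list is limited to the two encodings named in the program help. Since iso-8859-1 can decode any byte sequence, it acts as a fallback rather than as proof of the real encoding.

```python
def detect_encoding(raw, candidates=("utf-8", "iso-8859-1")):
    """Return the first candidate encoding that decodes the raw bytes, else None."""
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


# Made-up sample: a bull name containing a non-ASCII letter.
as_utf8 = "BJÖRN".encode("utf-8")
as_latin1 = "BJÖRN".encode("iso-8859-1")

print(detect_encoding(as_utf8))    # utf-8
print(detect_encoding(as_latin1))  # iso-8859-1
```

For a real file, read it in binary mode (for example open(path, "rb").read()) and pass the bytes to the helper.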
How to run the program
Go to the programs directory and type:
- python ttconvert3.py -v -s'.abc' hol abc ../testing/abc1401
In this example
- -s'.abc' = the suffix '.abc' of the population is added to the input file names
- breed of evaluation is HOL
- population is ABC
- input/output files are located in directory ../testing/abc1401
TRENDTEST3.PY
Typing
- python trendtest3.py --help 
within the programs directory will give you a small summary of the program usage:
usage: trendtest3.py [-h] [-v] [-s SAMPLES] [-c CONTROLFILE] [-m]
                     [-M MERGEDIR]
                     brd pop datadir
positional arguments:
  brd                   evaluation breed code (BSW/GUE/JER/HOL/RDC/SIM)
  pop                   population code (same as country code except for
                        CHR/DEA/DFS/FRR/FRM)
  datadir               absolute or relative path to data files
optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         increase output verbosity
  -s SAMPLES, --samples SAMPLES
                        number of bootstrap samples (default=1000)
  -c CONTROLFILE, --controlfile CONTROLFILE
                        path/name of the control file
                        (default=DATADIR/file305_POPBRD)
  -m, --mergefiles      write merged data files (for independent data checks)
  -M MERGEDIR, --mergedir MERGEDIR
                        absolute or relative path for merged data files
                        (default=DATADIR/merged3)
Trendtest3 - How to run the program
Go to the programs directory and type:
- python trendtest3.py -v -m hol abc ../testing/abc1401
In this example
- -m is the option to have a merged file written out
- breed of evaluation is HOL
- population is ABC
- input/output files are located in directory ../testing/abc1401
Output files
The following files are written to the DATADIR or OUTDIR, if specified. All files have a _POPBRD suffix, so that multiple sets of output files for different breeds or populations can co-exist in the same directory.
- tt3_POPBRD.log: a log containing all information on the parameters and statistical analysis performed (see Trendtest3 - Log and Result File) 
- file313_POPBRD: summarizing the trend test results for method III (see Trendtest3 - Log and Result File) 
Trendtest3 - Editings
The program reads the three input files, file305_POPBRD, file300_POPBRD and file303_POPBRD, and edits the data so that only bulls meeting all of the following conditions are selected for the test:
- Bulls reported in file303_POPBRD that also have an entry in file300_POPBRD
- Bulls with birth year within the miny and maxy reported in file305_POPBRD 
- Bulls with type of proof equal to 11 or 12 and status of bull not equal to 20
- Bulls with number of herds equal to or higher than the minimum number of herds in file305_POPBRD
- Bulls with number of daughters equal to or higher than the minimum number of daughters in file305_POPBRD (in both file300_POPBRD and file303_POPBRD)
- Bulls with at least 1 daughter added over the four-year period
- Bulls with a number of added daughters consistent with the total number of daughters.
A merged file named after the trait (trt.csv, for example mil.csv) is created and placed under the DATADIR/merged3 directory, unless otherwise specified. The file can be used for further investigation; its format is given in APPENDIX IIIc.
Trendtest3 - Statistical test
The statistical test for method III is calculated as:
- y = b0 + b1*x + b2*t + e, where y is the current EBV, x the previous EBV and t the TIME variate 
- Regression 95% C.I. for delta (the slope b2 of t): (b[2] - 1.96 * bse[2], b[2] + 1.96 * bse[2]) 
The criterion for passing the test is:
- [abs(b2)/sdg] < 0.02 (if BV are used, otherwise 0.01), or the statistical validation test is passed (95% C.I. contains 0) 
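The two-part decision rule can be written down compactly. The sketch below takes the slope b2 and its standard error as given and applies the interval and thresholds stated above; how trendtest3.py obtains the standard error (for instance from the bootstrap samples set with --samples) is not covered, and the function name and the value 0.007 used for bse[2] are illustrative assumptions.

```python
def method3_decision(b2, bse2, sdg, uses_bv=True):
    """Apply the method III criteria to a slope b2 with standard error bse2."""
    lower = b2 - 1.96 * bse2
    upper = b2 + 1.96 * bse2
    testval = abs(b2) / sdg
    threshold = 0.02 if uses_bv else 0.01
    biol = testval < threshold        # size of the drift relative to sdg
    stat = lower <= 0.0 <= upper      # 95% C.I. contains 0
    return {
        "lower": lower,
        "upper": upper,
        "testval": testval,
        "biol": biol,
        "stat": stat,
        "passed": biol or stat,       # either criterion is sufficient
    }


# Illustrative values: b2 = 0.013, assumed bse2 = 0.007, sdg = 0.564.
result = method3_decision(b2=0.013, bse2=0.007, sdg=0.564)
print(round(result["testval"], 3), result["biol"], result["stat"], result["passed"])
# 0.023 False True True
```

In this made-up case the drift exceeds the 0.02 limit, but the interval still contains 0, so the overall outcome is a pass.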
Trendtest3 - Log and Result File
A log file, called tt3_POPBRD.log, is created and placed under DATADIR, unless otherwise specified. The file presents a summary of the information taken into consideration for all the traits analysed, such as
- minimum and maximum birth year, minimum number of daughters, herds, genetic merit and MACE standard deviation used
- total number of records read from file300_POPBRD and file303_POPBRD
- summary statistics for each of the traits analysed
- regression of current EBV (y) on previous EBV (x) and TIME variate (t)
- warning messages
- method III final result.
A result file, called file313_POPBRD, is created in the DATADIR, unless otherwise specified. The file contains an overall summary of the traits analysed, the settings used and the final outcome of the validation. An example is presented below:
rec brd pop tgrp trt testdate pass delta lower  upper stat testval biol SDg   bv bulls std_y std_x x yyyy miny maxy herit  corr mh md nsamp warnings
313 HOL ABC conf sta 20131023 PASS 0.013 -0.001 0.027 PASS 0.023   FAIL 0.564 BV 581   0.463 0.328 N 2009 1991 1999 0.3700 0.86 10 20 1000
In this example:
- brd is breed of evaluation 
- pop is Population 
- tgrp is traitgroup 
- trt is trait analysed 
- pass is the overall validation result for method III (PASS or FAIL) 
- delta is the slope of the time variate t 
- lower is the 95% lower limit 
- upper is the 95% upper limit 
- stat is the result from the statistical test 
- testval is the result from [abs(b2)/sdg] 
- biol is the result of the test on [abs(b2)/sdg] from the model y = b0 + b1*x + b2*t + e 
- SDg is MACE genetic standard deviation 
- bv is the genetic merit 
- bulls is total number of bulls considered 
- std_y is the standard deviation for the current EBV 
- std_x is the standard deviation for EBV YYYY-4 
- x indicates whether types of proof 21 or 22 are considered 
- yyyy is YYYY-4 
- miny is minimum birthyear 
- maxy is maximum birthyear 
- herit is heritability of the trait 
- corr is the correlation between YYYY and YYYY-4 
- mh is minimum number of herds 
- md is minimum number of daughters 
- nsamp is the number of bootstrap samples 

