TrendTest Software

The trend validation procedures are described in the Interbull Code of Practice, https://wiki.interbull.org/public/CoPAppendixIII?action=print.

This software consists of two programs to convert legacy file formats to new formats (ttconvert1/3.py), three programs to perform trend validation by methods 1 - 3 (trendtest1-3.py), a program to combine the results across methods and prepare a zip file ready for submission to the Interbull Centre (ttzip.py), and a utility module used by those programs (ibutils.py). The conversion programs process sets of legacy files (file01x and file04x) for all trait groups for a single breed and population of evaluation and create a single set of files in a trait-independent format. The remaining programs perform the trend validation tests for all traits for one breed and population and then create a zip file with the input and output files, ready for submission to the Interbull Centre.

Note: In the future, organizations may prefer to prepare the data for the trendtest1-3.py programs directly, bypassing the creation of the legacy file formats and the use of the ttconvert1/3.py programs.


Installation and testing

The programs have been tested under Python 2.6, 2.7, 3.2 and 3.3. At a minimum, you will need these extra Python modules installed on your system: NumPy and, for Python 2.6 only, argparse.
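The requirement above can be checked before running anything. The following is a minimal sketch, not part of the TrendTest distribution; the function names missing_modules and required_modules are hypothetical helpers introduced here for illustration.

```python
import sys


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


def required_modules(version_info=sys.version_info):
    """Extra modules named in the text above.

    argparse is only an extra on Python 2.6; later versions ship it.
    """
    required = ["numpy"]
    if tuple(version_info[:2]) == (2, 6):
        required.append("argparse")
    return required


if __name__ == "__main__":
    print("Python %d.%d" % tuple(sys.version_info[:2]))
    absent = missing_modules(required_modules())
    print("missing modules: %s" % (", ".join(absent) or "none"))
```

An empty "missing modules" list only shows that the modules import; it does not verify their versions.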

Download the attached trendtest20131017.zip file.

Create a working directory and unzip the zip file in that directory. Two subdirectories will be created, programs and sample_data. Typing, for example,

from a command line prompt, from within the programs directory, should print a brief help message if the installation has been successful.

Some sample data for breed HOL and population ABC are available in the sample_data directory. The two programs for method 1 can be run from the programs directory as follows:

python ttconvert1.py -v -s'.abc' hol abc ../sample_data
python trendtest1.py -v -m hol abc ../sample_data

In this example, data, parameters and output are all in the sample_data directory. Files can be read from other locations and output written to other locations as well. Please see the following sections for further information.
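For repeated runs, the two method 1 commands shown above can be driven from a small script. This is a sketch under assumptions: method1_commands and run_method1 are hypothetical names, the suffix is passed as a separate argument rather than glued to -s, and the script is assumed to run with the programs directory as its working directory.

```python
import subprocess
import sys


def method1_commands(brd, pop, datadir, suffix=None):
    """Build the two method 1 command lines as argument lists."""
    convert = [sys.executable, "ttconvert1.py", "-v"]
    if suffix:
        convert += ["-s", suffix]
    convert += [brd, pop, datadir]
    test = [sys.executable, "trendtest1.py", "-v", "-m", brd, pop, datadir]
    return [convert, test]


def run_method1(brd, pop, datadir, suffix=None, programs_dir="."):
    """Run the conversion, then the test; raise on the first failing step."""
    for cmd in method1_commands(brd, pop, datadir, suffix):
        subprocess.check_call(cmd, cwd=programs_dir)
```

Calling run_method1("hol", "abc", "../sample_data", suffix=".abc") from the programs directory would then issue the same two commands as above.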

The outputs should match those in the source zip file.
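One way to check this is a byte-for-byte comparison of your outputs against a reference copy. The sketch below assumes you have unzipped a second, untouched copy of the distribution to serve as the reference; compare_outputs is a hypothetical helper, and files that embed a run date or time (log files, for instance) may differ for that reason alone.

```python
import filecmp
import os


def compare_outputs(reference_dir, output_dir):
    """Compare same-named files in two directories, byte for byte.

    Returns (matching, differing, missing); `missing` lists names found in
    reference_dir that could not be compared, e.g. absent from output_dir.
    """
    names = sorted(
        name for name in os.listdir(reference_dir)
        if os.path.isfile(os.path.join(reference_dir, name))
    )
    matching, differing, missing = filecmp.cmpfiles(
        reference_dir, output_dir, names, shallow=False
    )
    return matching, differing, missing
```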

Detailed descriptions of the individual programs are given in the following sections.


Control File

Shortly after the beginning of each test run, the Interbull Centre will send a control file, called file305_POPBRD, to every organization that has to provide validation results for a given population and trait. Validation is required when the organization is testing significant changes in its model, is participating for the first time, or conducted its last validation more than two years ago. The format of the file is available in APPENDIX I, and an example is presented below:

#grp trt evaldate  herit     siresd gm x mh md byr1 miny maxy corr preval chg
uder scs 20130630 0.2240    0.38579 B- N 10 20 1981 1999 2003 0.99 09-may N
work msp 20130719 0.0890   26.23801 B+ N 10 20 1981 1999 2003 0.99 ------ Y

Usage notes

The siresd contained in the file is the MACE sire standard deviation as calculated in the current test run evaluation.
The fields preval and chg together indicate why validation is required:

Validation Method I

Definition: Comparison of genetic trends estimated using only first lactation versus all lactations in the routine national genetic evaluations.

Validation method I is handled by the program trendtest1.py. The program reads three files: file305_POPBRD (a control file sent by ITBC, see APPENDIX IIa), file300_POPBRD (alias file01x, see APPENDIX IIb) and file300FL_POPBRD (a new file following the same format as file300 but pertaining to first lactations only, see APPENDIX IIb). To make the file format transition as smooth as possible, the program ttconvert1.py converts the legacy file format 01x into the new file format, file300_POPBRD.

TTCONVERT1.PY

The program is located in the programs directory. Typing

will give you a small summary of the program usage:

usage: ttconvert1.py [-h] [-v] [-s SUFFIX] [-e ENCODING] [-o OUTDIR]
                     brd pop datadir

positional arguments:
  brd                   evaluation breed code (BSW/GUE/JER/HOL/RDC/SIM)
  pop                   population code (same as country code except for
                        CHR/DEA/DFS/FRR/FRM)
  datadir               absolute or relative path to data files

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         increase output verbosity
  -s SUFFIX, --suffix SUFFIX
                        suffix to add to all input file names, eg. ".usa" if
                        file names are like file010.usa (default=none)
  -e ENCODING, --encoding ENCODING
                        input file encoding (default=utf-8; try also
                        iso-8859-1 or other values listed at
                        http://docs.python.org/2/library/codecs.html#standard-
                        encodings)
  -o OUTDIR, --outdir OUTDIR
                        directory for output files (default=DATADIR)
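When it is unclear which value to pass to -e, the input file can be test-decoded beforehand. The sketch below is not part of the distribution; first_working_encoding is a hypothetical helper. Note that iso-8859-1 accepts any byte sequence, so getting it back means only that utf-8 failed, not that the file is known to be iso-8859-1.

```python
def first_working_encoding(path, candidates=("utf-8", "iso-8859-1")):
    """Return the first candidate encoding that decodes the whole file.

    Returns None if no candidate decodes the file without error.
    """
    with open(path, "rb") as handle:
        raw = handle.read()
    for encoding in candidates:
        try:
            raw.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    return None
```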

Warning

How to run the program

Go to the programs directory and type:

In this example

TRENDTEST1.PY

Typing

within the programs directory will give you a small summary of the program usage:

usage: trendtest1.py [-h] [-v] [-c CONTROLFILE] [-m] [-M MERGEDIR]
                     brd pop datadir

positional arguments:
  brd                   evaluation breed code (BSW/GUE/JER/HOL/RDC/SIM)
  pop                   population code (same as country code except for
                        CHR/DEA/DFS/FRR/FRM)
  datadir               absolute or relative path to data files

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         increase output verbosity
  -c CONTROLFILE, --controlfile CONTROLFILE
                        path/name of the control file
                        (default=DATADIR/file305_POPBRD)
  -m, --mergefiles      write merged data files (for independent data checks)
  -M MERGEDIR, --mergedir MERGEDIR
                        absolute or relative path for merged data files
                        (default=DATADIR/merged1)

Trendtest1 - How to run the program

Go to the programs directory and type:

In this example

Output files

The following files are written to DATADIR, or to OUTDIR if specified. All files have a _POPBRD suffix, so that multiple sets of output files for different breeds or populations can co-exist in the same directory.

Trendtest1 - Editings

The program reads the three input files, file305_POPBRD, file300.cou and file300FL.cou, and edits the data so that only the following bulls are selected for the test:

A merged file, called trt.csv (for example, mil.csv), is created and placed in the DATADIR/merged1 directory unless otherwise specified. The file can be used for further investigation; its format is described in APPENDIX IIIa.

Trendtest1 - Statistical test

The statistical test for method I is calculated as:

The criterion for passing the test is:

Trendtest1 - Log and Result File

A log file, called tt1_POPBRD.log, is created and placed in DATADIR unless otherwise specified. The file presents a summary of the information taken into consideration for all the traits analysed, such as

A result file, called file311_POPBRD, is created in DATADIR unless otherwise specified. The file contains an overall summary of the traits analysed, the settings used and the final outcome of the validation. An example is presented below:

rec brd pop tgrp trt testdate pass testval      SDg bv    b_ALL    b_1ST bulls   stdALL   std1ST x byr1 mh md warnings
311 HOL ABC prod mil 20131024 FAIL   0.021  434.925 BV   48.409   39.246  5569  490.928  539.577 N 1986 10 20 LACT1_SCALE_WARNING
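The example record can be read back for independent checks. The sketch below assumes whitespace-separated fields with warnings as the last column, possibly empty or holding several tokens; the authoritative layout is the one in the appendices, and parse_result_record is a hypothetical helper. It also computes |b_ALL - b_1ST| / SDg, which in this one record agrees with the reported testval to three decimals. That is an observation about the example, not a statement of the formula the program uses.

```python
HEADER = ("rec brd pop tgrp trt testdate pass testval SDg bv b_ALL b_1ST "
          "bulls stdALL std1ST x byr1 mh md warnings")
RECORD = ("311 HOL ABC prod mil 20131024 FAIL   0.021  434.925 BV   48.409   "
          "39.246  5569  490.928  539.577 N 1986 10 20 LACT1_SCALE_WARNING")


def parse_result_record(header, record):
    """Map header names to field values; `warnings` becomes a list."""
    names = header.split()
    fields = record.split()
    fixed = len(names) - 1  # every column except the trailing warnings
    if len(fields) < fixed:
        raise ValueError(
            "expected at least %d fields, got %d" % (fixed, len(fields))
        )
    row = dict(zip(names[:fixed], fields[:fixed]))
    row[names[-1]] = fields[fixed:]  # assumed: zero or more warning tokens
    return row


row = parse_result_record(HEADER, RECORD)

# Difference between the two trend estimates, scaled by SDg.
ratio = abs(float(row["b_ALL"]) - float(row["b_1ST"])) / float(row["SDg"])
```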

In this example:

Sending results to Interbull Centre