TrendTest Software

The trend validation procedures are described in the Interbull Code of Practice, https://wiki.interbull.org/public/CoP_AppendixIII?action=print.

Contents

TrendTest Software
APPENDIX I
1. Format305 for control files for the TrendTest software
APPENDIX IIa
1. APPENDIX I - Format File300-EBV and File700-GEBV
APPENDIX IIb
1. Format302 for Submission of validation method II
APPENDIX IIc
1. Format303 for data file for validation method III
APPENDIX III
APPENDIX IVa
1. Format311 for TrendTest results for method 1
APPENDIX IVb
1. Format312 for TrendTest results for method 2
APPENDIX IVc
1. Format313 for TrendTest results for method 3
APPENDIX IVd
1. Format for file bdate_POPBRD for validation methods
Frequently Asked Questions

This software consists of two programs to convert legacy file formats to the new formats (ttconvert1.py and ttconvert3.py), three programs to perform trend validation by methods 1 to 3 (trendtest1.py, trendtest2.py and trendtest3.py), a program to combine the results across methods and prepare a zip file ready for submission to the Interbull Centre (ttzip.py), and a utility module used by those programs (ibutils.py). The conversion programs process sets of legacy files (file01x and file04x) for all trait groups for a single breed and population of evaluation and create a single set of files in a trait-independent format. The remaining programs perform the trend validation tests for all traits for one breed and population and then create a zip file with the input and output files, ready for submission to the Interbull Centre.

Note: In the future, organizations may prefer to prepare the data for the trendtest1-3 programs directly, bypassing the creation of the legacy file formats and the use of the ttconvert1/3.py programs. In this case organizations should also prepare an additional file, called bdate_POPBRD, containing the international IDs common to file 300 and the corresponding trend data file, together with the animals' birth dates. The format of the file is given in Appendix IVd.

Installation and testing

The programs have been tested under Python 2.6, 2.7, 3.2 and 3.3. As a minimum you will need these extra Python modules installed on your system: NumPy and, for Python 2.6 only, argparse.

Download the attached trendtest20150907.zip file.

Create a working directory and unzip the zip file in that directory. Two subdirectories will be created, programs and sample_data. Typing, for example,

python trendtest1.py --help

from a command line prompt, from within the programs directory, should print a brief help message if the installation has been successful.

Some sample data for breed HOL and population ABC are available in the sample_data directory. The two programs for method 1 can be run from the programs directory as follows:

python ttconvert1.py -v -s'.abc' hol abc ../sample_data
python trendtest1.py -v -m hol abc ../sample_data

In this example data, parameters and output are all in the sample_data directory. Files can be read from other locations and output written to other locations as well. Please see the following sections for further information.

The outputs should match those in the source zip file.

Detailed descriptions of the single programs are given in the following sections.

Control File

Shortly after the beginning of each test run, the Interbull Centre sends a control file, called file305_POPBRD, to every organization that has to provide validation results for a given population and trait. Validation is required when the organization is testing significant changes in its model, is participating for the first time, or last conducted a validation more than two years ago. The format of the file is given in APPENDIX I, and an example is presented below:

#grp trt evaldate  herit     siresd gm x mh md byr1 miny maxy corr preval chg
uder scs 20130630 0.2240    0.38579 B- N 10 20 1981 1999 2003 0.99 09-may N
work msp 20130719 0.0890   26.23801 B+ N 10 20 1981 1999 2003 0.99 ------ Y

Usage notes

The siresd value in the file is the MACE sire standard deviation as calculated in the current test run evaluation.
The fields preval and chg together indicate why validation is required:

preval is set and chg is N = validation is required because more than two years have passed since the last validation (date shown in the preval field)
preval is not set and chg is Y = validation is required because significant changes to the model are being tested or the population is participating for the first time.
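
The following is a minimal sketch, not part of the TrendTest software, of how a control file could be read and the preval/chg fields interpreted. The function names are hypothetical, the field names follow APPENDIX I, and the sketch assumes that the header line starts with '#' and that an unset preval is written as dashes, as in the example above.

```python
# Field names follow APPENDIX I; the function names below are hypothetical.
FIELDS = ["tgrp", "trt", "evaldate", "herit", "siresd", "merit", "type2x",
          "min_hrd", "min_dgh", "byr1", "miny", "maxy", "corr", "preval", "chg"]
INT_FIELDS = {"min_hrd", "min_dgh", "byr1", "miny", "maxy"}
FLOAT_FIELDS = {"herit", "siresd", "corr"}


def parse_control_line(line):
    """Split one whitespace-separated file305 record into a dict."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(values)))
    rec = dict(zip(FIELDS, values))
    for name in INT_FIELDS:
        rec[name] = int(rec[name])
    for name in FLOAT_FIELDS:
        rec[name] = float(rec[name])
    return rec


def read_control_file(lines):
    """Parse all records, skipping blank lines and the header (assumed to start with '#')."""
    return [parse_control_line(line) for line in lines
            if line.strip() and not line.startswith("#")]


def validation_reason(rec):
    """Interpret preval and chg as described in the usage notes."""
    preval_set = rec["preval"].strip("-") != ""  # assumption: dashes mean "not set"
    if rec["chg"] == "Y":
        return "model change or first participation"
    if preval_set:
        return "last validated more than two years ago (%s)" % rec["preval"]
    return "no validation reason given"


sample = [
    "#grp trt evaldate  herit     siresd gm x mh md byr1 miny maxy corr preval chg",
    "uder scs 20130630 0.2240    0.38579 B- N 10 20 1981 1999 2003 0.99 09-may N",
    "work msp 20130719 0.0890   26.23801 B+ N 10 20 1981 1999 2003 0.99 ------ Y",
]
records = read_control_file(sample)
for rec in records:
    print(rec["trt"], "->", validation_reason(rec))
```

Because the fields are separated by whitespace, a plain split is enough and no fixed column positions are needed.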

Validation Method I

Definition: Comparison of genetic trends estimated using only first lactation versus all lactations in the routine national genetic evaluations.

Validation method I is handled by the program trendtest1.py. The program reads four files: file305_POPBRD (a control file sent by ITBC, see above), file300_POPBRD (alias file01x, see APPENDIX IIa), file300FL_POPBRD (a new file following the same format as file300 but pertaining to first lactation only, see APPENDIX IIa) and bdate_POPBRD, which contains the international IDs common to file 300 and the corresponding trend data file, together with the animals' birth dates. The file bdate_POPBRD is created automatically by the ttconvert programs. If the ttconvert programs are not used, the user must create this file manually. The format of the file is given in Appendix IVd. To make the file format transition as smooth as possible, the program ttconvert1.py converts the legacy file format 01x into the new file format 300_POPBRD.

TTCONVERT1.PY

The program is located in the programs directory. Typing

python ttconvert1.py --help

will give you a small summary of the program usage:

usage: ttconvert1.py [-h] [-v] [-s SUFFIX] [-e ENCODING] [-o OUTDIR]
                     brd pop datadir

positional arguments:
  brd                   evaluation breed code (BSW/GUE/JER/HOL/RDC/SIM)
  pop                   population code (same as country code except for
                        CHR/DEA/DFS/FRR/FRM)
  datadir               absolute or relative path to data files

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         increase output verbosity
  -s SUFFIX, --suffix SUFFIX
                        suffix to add to all input file names, eg. ".usa" if
                        file names are like file010.usa (default=none)
  -e ENCODING, --encoding ENCODING
                        input file encoding (default=utf-8; try also
                        iso-8859-1 or other values listed at
                        http://docs.python.org/2/library/codecs.html#standard-
                        encodings)
  -o OUTDIR, --outdir OUTDIR
                        directory for output files (default=DATADIR)

Warning

INPUT FILE NAME: the input files must be named file01x.cou (for all lactations) and file01xFL.cou (for first lactation), for example file010.can and file010FL.can. Any other names will cause the program to crash.
ENCODING: the files are expected to be in utf-8 (use the -e option to specify another encoding)
OLD FILE FORMAT: the format of file01x.cou and file01xFL.cou must follow exactly the format given on our webpage http://www.interbull.org/ib/servicedocumentation
TRAIT CODE: trait codes must be in lower case

How to run the program

Go to the programs directory and type:

python ttconvert1.py -v -s'.abc' hol abc ../testing/abc1401

In this example

-s'.abc' = the suffix '.abc' of population 'abc' is appended to the input file names
breed of evaluation is HOL
population is ABC
input/output files are located in directory ../testing/abc1401

TRENDTEST1.PY

Typing

python trendtest1.py --help

within the programs directory will give you a small summary of the program usage:

usage: trendtest1.py [-h] [-v] [-c CONTROLFILE] [-m] [-M MERGEDIR]
                     brd pop datadir

positional arguments:
  brd                   evaluation breed code (BSW/GUE/JER/HOL/RDC/SIM)
  pop                   population code (same as country code except for
                        CHR/DEA/DFS/FRR/FRM)
  datadir               absolute or relative path to data files

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         increase output verbosity
  -c CONTROLFILE, --controlfile CONTROLFILE
                        path/name of the control file
                        (default=DATADIR/file305_POPBRD)
  -m, --mergefiles      write merged data files (for independent data checks)
  -M MERGEDIR, --mergedir MERGEDIR
                        absolute or relative path for merged data files
                        (default=DATADIR/merged1)

Trendtest1 - How to run the program

Go to the programs directory and type:

python trendtest1.py -v -m hol abc ../testing/abc1401

In this example

-m is the option to write out a merged file
breed of evaluation is HOL
population is ABC
input/output files are located in directory ../testing/abc1401

Output files

The following files are written to the DATADIR or OUTDIR, if specified. All files have a _POPBRD suffix, so that multiple sets of output files for different breeds or populations can co-exist in the same directory.

tt1_POPBRD.log: a log containing all information on the parameters and statistical analysis performed (see Trendtest1 - Log and Result File)
file311_POPBRD: a summary of the trend test results for method I (see Trendtest1 - Log and Result File)

Trendtest1 - Editings

The program reads the four input files, file305_POPBRD, file300_POPBRD, file300FL_POPBRD and bdate_POPBRD, and edits the data so that only bulls meeting all of the following conditions are selected for the test:

Bulls reported in file300FL_POPBRD that also have an entry in file300_POPBRD
Bulls with birth year equal to or higher than the byr1 specified in file305_POPBRD
Bulls with type of proof equal to 11 or 12 and status of bull not equal to 20
Bulls with number of herds equal to or higher than the minimum number of herds in file305_POPBRD (in both file300_POPBRD and file300FL_POPBRD)
Bulls with number of daughters and EDC equal to or higher than the minimum number of daughters in file305_POPBRD (in both file300_POPBRD and file300FL_POPBRD)
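
The edits listed above can be sketched as a single filter function. This is an illustration only and is not taken from trendtest1.py: the function name and the dict-based records are hypothetical, the keys follow the column names in APPENDIX I and APPENDIX IIa, and applying the type of proof and status check to the all-lactation record only is an assumption.

```python
def passes_method1_edits(rec_all, rec_fl, birth_year, ctrl):
    """Return True if a bull passes the method I edits (hypothetical helper).

    rec_all, rec_fl: dicts for the bull's file300 and file300FL records,
                     or None when the bull is missing from that file.
    birth_year:      birth year taken from bdate_POPBRD.
    ctrl:            dict with byr1, min_hrd and min_dgh from file305.
    """
    if rec_all is None or rec_fl is None:
        return False  # the bull must be present in both files
    if birth_year < ctrl["byr1"]:
        return False
    # Assumption: type of proof and status are checked on the all-lactation record.
    if rec_all["typ_prf"] not in (11, 12) or rec_all["status"] == 20:
        return False
    for rec in (rec_all, rec_fl):
        if rec["nhrd"] < ctrl["min_hrd"]:
            return False
        if rec["ndau"] < ctrl["min_dgh"] or rec["edc"] < ctrl["min_dgh"]:
            return False
    return True


ctrl = {"byr1": 1986, "min_hrd": 10, "min_dgh": 20}
all_lact = {"typ_prf": 11, "status": 10, "nhrd": 75, "ndau": 115, "edc": 133}
first_lact = {"typ_prf": 11, "status": 10, "nhrd": 60, "ndau": 90, "edc": 95}
print(passes_method1_edits(all_lact, first_lact, 1990, ctrl))
```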

When the -m option is used, a merged file named after the trait code (trt.csv, for example mil.csv) is created and placed in the DATADIR/merged1 directory, if not otherwise specified. The file can be used for further investigation; its format is given in APPENDIX III.

Trendtest1 - Statistical test

The statistical test for method I is based on two regressions of EBV on birth year:

model T: EBVT = b0 + bT*BYEAR + e (all lactations)

model 1: EBV1 = b0 + b1*BYEAR + e (first lactation)

The criterion for passing the test is:

[abs(bT-b1)/sdg] < 0.02 (if BV are used, otherwise 0.01).
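
As an illustration of the criterion, the sketch below estimates a birth-year slope with NumPy and applies the test. It is not the code of trendtest1.py; the function names are hypothetical, and the slopes and SDg used in the example are those of the sample file311 record in the section Trendtest1 - Log and Result File.

```python
import numpy as np


def birth_year_slope(byear, ebv):
    """Least-squares slope of EBV on birth year (simple linear regression)."""
    slope, _intercept = np.polyfit(np.asarray(byear, dtype=float),
                                   np.asarray(ebv, dtype=float), 1)
    return float(slope)


def method1_test(b_all, b_first, sdg, uses_bv=True):
    """Return (test value, passed) for the method I criterion."""
    limit = 0.02 if uses_bv else 0.01
    testval = abs(b_all - b_first) / sdg
    return testval, testval < limit


# b_ALL, b_1ST and SDg from the sample file311 record (trait mil, BV scale).
testval, passed = method1_test(48.409, 39.246, 434.925, uses_bv=True)
print(round(testval, 3), "PASS" if passed else "FAIL")  # prints: 0.021 FAIL
```

The result agrees with the testval and pass columns of that sample record.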

Trendtest1 - Log and Result File

A log file called tt1_POPBRD.log is created and placed in DATADIR, if not otherwise specified. The file summarizes the information taken into consideration for all the traits analysed, such as:

Birth year, minimum number of daughters, herds, genetic merit and MACE standard deviation used
Total number of records read from file300_POPBRD and file300FL_POPBRD
Summary statistics for each trait analysed
Results of the regression on year of birth for first and all lactations
Warning messages
Method I final result

A result file called file311_POPBRD is created in DATADIR, if not otherwise specified. The file contains an overall summary of the traits analysed, the settings used and the final outcome of the validation. An example is presented below:

rec brd pop tgrp trt testdate pass testval      SDg bv    b_ALL    b_1ST bulls   stdALL   std1ST x byr1 mh md warnings
311 HOL ABC prod mil 20131024 FAIL   0.021  434.925 BV   48.409   39.246  5569  490.928  539.577 N 1986 10 20 LACT1_SCALE_WARNING

A detailed description of the format for file311_POPBRD is available in APPENDIX IVa

Validation Method II

Definition: Analysis of within bull yearly Daughter Deviations (e.g. Daughter Yield Deviations, DYD), hereafter referred to as DD

Validation method II is handled by the program trendtest2.py. The program reads four files: file305_POPBRD (a control file sent by ITBC, see the Control File section above), file300_POPBRD (alias file01x, see APPENDIX IIa), file302_POPBRD (a new file format for submission of DD records, see APPENDIX IIb) and bdate_POPBRD, which contains the international IDs common to file 300 and the corresponding trend data file, together with the animals' birth dates. The user must create the bdate_POPBRD file manually. The format of the file is given in Appendix IVd.

TRENDTEST2.PY

Warning

ENCODING: the files are expected to be in utf-8
trait code must be in lower case

Typing

python trendtest2.py --help

within the programs directory will give you a small summary of the program usage:

usage: trendtest2.py [-h] [-v] [-c CONTROLFILE] [-m] [-M MERGEDIR]
                     brd pop datadir

positional arguments:
  brd                   evaluation breed code (BSW/GUE/JER/HOL/RDC/SIM)
  pop                   population code (same as country code except for
                        CHR/DEA/DFS/FRR/FRM)
  datadir               absolute or relative path to data files

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         increase output verbosity
  -c CONTROLFILE, --controlfile CONTROLFILE
                        path/name of the control file
                        (default=DATADIR/file305_POPBRD)
  -m, --mergefiles      write merged data files (for independent data checks)
  -M MERGEDIR, --mergedir MERGEDIR
                        absolute or relative path for merged data files
                        (default=DATADIR/merged2)

Trendtest2 - How to run the program

Go to the programs directory and type:

python trendtest2.py -v -m hol abc ../testing/abc1401

In this example

-m is the option to write out a merged file
breed of evaluation is HOL
population is ABC
input/output files are located in directory ../testing/abc1401

Output files

tt2_POPBRD.log: a log containing all information on the parameters and statistical analysis performed (see Trendtest2 - Log and Result File)
file312_POPBRD: summarizing the trend test results for method II (see Trendtest2 - Log and Result File)

Trendtest2 - Editings

The program reads the four input files, file305_POPBRD, file300_POPBRD, file302_POPBRD and bdate_POPBRD, and edits the data so that only bulls meeting all of the following conditions are selected for the test:

Bulls reported in file302_POPBRD that also have an entry in file300_POPBRD
Bulls with birth year equal to or higher than the byr1 specified in file305_POPBRD
Bulls with type of proof equal to 11 or 12 and status of bull not equal to 20
Bulls with number of herds equal to or higher than the minimum number of herds in file305_POPBRD (in both file300_POPBRD and file302_POPBRD)
Bulls with number of daughters equal to or higher than the minimum number of daughters in file305_POPBRD (in both file300_POPBRD and file302_POPBRD)
A bull's first-year record is included only if its number of daughters is equal to or higher than 10
Bulls with daughters in more than one qualifying year

When the -m option is used, a merged file named after the trait code (trt.csv, for example mil.csv) is created and placed in the DATADIR/merged2 directory, if not otherwise specified. The file can be used for further investigation; its format is given in APPENDIX III.

Trendtest2 - Statistical test

The statistical test for method II fits the following model, where yij is the DD of bull i in year j, BULLi is the effect of bull i and b is the regression on year:

```
yij = BULLi + b*j + eij
```

The criterion for passing the test is:

```
[abs(b) / sdg] < 0.01
```
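
A minimal sketch of the within-bull regression follows. It is not the code of trendtest2.py: the function names are hypothetical and the fit is an unweighted ordinary least squares, which is an assumption (the program may weight the yearly records differently). The bull effects are absorbed by centring year and DD within each bull, which gives the same b as fitting one intercept per bull.

```python
import numpy as np


def within_bull_slope(bull_ids, years, dd):
    """Estimate b in yij = BULLi + b*j + eij by unweighted least squares."""
    bull_ids = np.asarray(bull_ids)
    x = np.array(years, dtype=float)
    y = np.array(dd, dtype=float)
    for bull in np.unique(bull_ids):
        rows = bull_ids == bull
        x[rows] -= x[rows].mean()  # centre year within bull
        y[rows] -= y[rows].mean()  # centre DD within bull
    return float(np.sum(x * y) / np.sum(x * x))


def method2_test(b, sdg, limit=0.01):
    """Return (test value, passed) for the method II criterion."""
    testval = abs(b) / sdg
    return testval, testval < limit


# Two bulls at different levels, both drifting by 0.5 units per year.
ids = ["A", "A", "A", "B", "B", "B"]
yrs = [2001, 2002, 2003, 2002, 2003, 2004]
dds = [10.0, 10.5, 11.0, -3.0, -2.5, -2.0]
b = within_bull_slope(ids, yrs, dds)
print(round(b, 3))  # prints: 0.5

# b and SDg from the sample file312 record (trait fat).
testval, passed = method2_test(0.185, 21.496)
print(round(testval, 3), "PASS" if passed else "FAIL")  # prints: 0.009 PASS
```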

Trendtest2 - Log and Result File

A log file called tt2_POPBRD.log is created and placed in DATADIR, if not otherwise specified. The file summarizes the information taken into consideration for all the traits analysed, such as:

Birth year, minimum number of daughters, herds, genetic merit and MACE genetic standard deviation used
Total number of records read from file300_POPBRD and file302_POPBRD
Warning messages
Summary statistics for each trait analysed
Estimate of b from the model yij=BULLi+b*j+eij
Method II final result

A result file called file312_POPBRD is created in DATADIR, if not otherwise specified. The file contains an overall summary of the traits analysed, the settings used and the final outcome of the validation. An example is presented below:

rec brd pop tgrp trt testdate pass testval       b      SDg bv bulls   std_DD x byr1 mh md warnings
312 HOL ABC prod fat 20131017 PASS   0.009   0.185   21.496 BV   153   19.186 N 1986 10 20 none

A detailed description of the format for file312_POPBRD is available in APPENDIX IVb

Validation Method III

Definition: Analysis of official national predicted genetic merit variation across evaluation runs.

Validation method III is handled by the program trendtest3.py. The program reads four files: file305_POPBRD (a control file sent by ITBC, see the Control File section above), file300_POPBRD (alias file01x, see APPENDIX IIa), file303_POPBRD (alias file04x, see APPENDIX IIc) and bdate_POPBRD, which contains the international IDs common to file 300 and the corresponding trend data file, together with the animals' birth dates. The file bdate_POPBRD is created automatically by the ttconvert programs. If the ttconvert programs are not used, the user must create this file manually. The format of the file is given in Appendix IVd. To make the file format transition as smooth as possible, the program ttconvert3.py converts the legacy file formats 01x and 04x into the new file formats 300_POPBRD and 303_POPBRD.

TTCONVERT3.PY

The program is located in the programs directory. Typing

python ttconvert3.py --help

will give you a small summary of the program usage:

usage: ttconvert3.py [-h] [-v] [-s SUFFIX] [-e ENCODING] [-o OUTDIR]
                     brd pop datadir

positional arguments:
  brd                   evaluation breed code (BSW/GUE/JER/HOL/RDC/SIM)
  pop                   population code (same as country code except for
                        CHR/DEA/DFS/FRR/FRM)
  datadir               absolute or relative path to data files

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         increase output verbosity
  -s SUFFIX, --suffix SUFFIX
                        suffix to add to all input file names, eg. ".usa" if
                        file names are like fileC010f.usa (default=none)
  -e ENCODING, --encoding ENCODING
                        input file encoding (default=utf-8; try also
                        iso-8859-1 or other values listed at
                        http://docs.python.org/2/library/codecs.html#standard-
                        encodings)
  -o OUTDIR, --outdir OUTDIR
                        directory for output files (default=DATADIR)

Warning

INPUT FILE NAME: the input files must be named file01x.cou (for the proofs file) and file04x.cou (for the validation file), for example file010.can and file040.can. Any other names will cause the program to crash.
ENCODING: the files are expected to be in utf-8 (use the -e option to specify another encoding)
OLD FILE FORMAT: the format of file01x.cou and file04x.cou must follow exactly the format given on our webpage http://www.interbull.org/ib/servicedocumentation
TRAIT CODE: trait codes must be in lower case

How to run the program

Go to the programs directory and type:

python ttconvert3.py -v -s'.abc' hol abc ../testing/abc1401

In this example

-s'.abc' = the suffix '.abc' of population 'abc' is appended to the input file names
breed of evaluation is HOL
population is ABC
input/output files are located in directory ../testing/abc1401

TRENDTEST3.PY

Typing

python trendtest3.py --help

within the programs directory will give you a small summary of the program usage:

usage: trendtest3.py [-h] [-v] [-s SAMPLES] [-c CONTROLFILE] [-m]
                     [-M MERGEDIR]
                     brd pop datadir

positional arguments:
  brd                   evaluation breed code (BSW/GUE/JER/HOL/RDC/SIM)
  pop                   population code (same as country code except for
                        CHR/DEA/DFS/FRR/FRM)
  datadir               absolute or relative path to data files

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         increase output verbosity
  -s SAMPLES, --samples SAMPLES
                        number of bootstrap samples (default=1000)
  -c CONTROLFILE, --controlfile CONTROLFILE
                        path/name of the control file
                        (default=DATADIR/file305_POPBRD)
  -m, --mergefiles      write merged data files (for independent data checks)
  -M MERGEDIR, --mergedir MERGEDIR
                        absolute or relative path for merged data files
                        (default=DATADIR/merged3)

Trendtest3 - How to run the program

Go to the programs directory and type:

python trendtest3.py -v -m hol abc ../testing/abc1401

In this example

-m is the option to write out a merged file
breed of evaluation is HOL
population is ABC
input/output files are located in directory ../testing/abc1401

Output files

The following files are written to the DATADIR or OUTDIR, if specified. All files have a _POPBRD suffix, so that multiple sets of output files for different breeds or populations can co-exist in the same directory.

tt3_POPBRD.log: a log containing all information on the parameters and statistical analysis performed (see Trendtest3 - Log and Result File)
file313_POPBRD: a summary of the trend test results for method III (see Trendtest3 - Log and Result File)

Trendtest3 - Editings

The program reads the four input files, file305_POPBRD, file300_POPBRD, file303_POPBRD and bdate_POPBRD, and edits the data so that only bulls meeting all of the following conditions are selected for the test:

Bulls reported in file303_POPBRD that also have an entry in file300_POPBRD
Bulls with birth year between the miny and maxy reported in file305_POPBRD
Bulls with type of proof equal to 11 and status of bull not equal to 20 (foreign bulls, type of proof = 21, are considered only for very small populations and only if type2x in file305 is set to Y)
Bulls with number of herds equal to or higher than the minimum number of herds in file305_POPBRD
Bulls with number of daughters equal to or higher than the minimum number of daughters in file305_POPBRD (in both file300_POPBRD and file303_POPBRD)
Bulls with at least 1 daughter added over the four-year period
Bulls whose number of added daughters is consistent with the total number of daughters

When the -m option is used, a merged file named after the trait code (trt.csv, for example mil.csv) is created and placed in the DATADIR/merged3 directory, if not otherwise specified. The file can be used for further investigation; its format is given in APPENDIX III.

Trendtest3 - Statistical test

The statistical test for method III is based on the regression of the current EBV (y) on the previous EBV (x) and the TIME variate (t):

```
y = b0 + b1*x + b2*t + e
```

The 95% confidence interval for delta (the estimate of b2) is (b2 - 1.96*bse2, b2 + 1.96*bse2), where bse2 is the standard error of b2.

The default number of bootstrap samples is 1000.

The test is passed if either of the following holds:

[abs(b2)/sdg] < 0.02 (if BV are used, otherwise 0.01), or

the statistical validation test is passed (the 95% C.I. contains 0)
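
The sketch below illustrates the regression and the two pass criteria. It is not the code of trendtest3.py: the function names are hypothetical, and obtaining the standard error of b2 by resampling bulls with replacement is an assumption about how the bootstrap samples are used. It needs a recent NumPy (np.random.default_rng), so it will not run on the old Python versions listed under Installation and testing.

```python
import numpy as np


def fit_method3(y, x, t):
    """OLS estimates (b0, b1, b2) for y = b0 + b1*x + b2*t + e."""
    design = np.column_stack([np.ones(len(y)), x, t])
    coef, _res, _rank, _sv = np.linalg.lstsq(design, np.asarray(y, dtype=float),
                                             rcond=None)
    return coef


def bootstrap_se_b2(y, x, t, samples=1000, seed=0):
    """Standard error of b2 from resampling bulls with replacement (assumption)."""
    y, x, t = (np.asarray(a, dtype=float) for a in (y, x, t))
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty(samples)
    for k in range(samples):
        idx = rng.integers(0, n, size=n)
        draws[k] = fit_method3(y[idx], x[idx], t[idx])[2]
    return float(draws.std(ddof=1))


def confidence_interval(b2, bse2):
    """95% C.I. for delta as given in the text."""
    return b2 - 1.96 * bse2, b2 + 1.96 * bse2


def method3_decision(delta, lower, upper, sdg, uses_bv=True):
    """Return (testval, biological pass, statistical pass, overall pass)."""
    limit = 0.02 if uses_bv else 0.01
    testval = abs(delta) / sdg
    biol = testval < limit
    stat = lower <= 0.0 <= upper
    return testval, biol, stat, biol or stat


# delta, lower, upper and SDg from the sample file313 record (trait sta).
testval, biol, stat, overall = method3_decision(0.013, -0.001, 0.027, 0.564)
print(round(testval, 3), biol, stat, overall)  # prints: 0.023 False True True
```

In the sample record the biological criterion fails (0.023 is not below 0.02) but the confidence interval contains 0, so the overall result is PASS.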

Trendtest3 - Log and Result File

A log file called tt3_POPBRD.log is created and placed in DATADIR, if not otherwise specified. The file summarizes the information taken into consideration for all the traits analysed, such as:

Minimum and maximum birth year, minimum number of daughters, herds, genetic merit and MACE standard deviation used
Total number of records read from file300_POPBRD and file303_POPBRD
Summary statistics for each trait analysed
Regression of current EBV (y) on previous EBV (x) and TIME variate (t)
Warning messages
Method III final result

A result file called file313_POPBRD is created in DATADIR, if not otherwise specified. The file contains an overall summary of the traits analysed, the settings used and the final outcome of the validation. An example is presented below:

rec brd pop tgrp trt testdate pass delta lower   upper stat testval biol      SDg  bv bulls    std_y    std_x x yyyy miny maxy  herit corr mh md nsamp warnings
313 HOL ABC conf sta 20131023 PASS 0.013 -0.001  0.027 PASS  0.023  FAIL     0.564 BV   581    0.463    0.328 N 2009 1991 1999 0.3700 0.86 10 20  1000

A detailed description of the format for file313_POPBRD is available in APPENDIX IVc

Sending Results Back to Interbull Centre

Once you have finished running the validation for all the populations and traits you need, using one or all validation methods, the results must be summarized and sent back to the Centre. A program called ttzip.py takes care of that for you.

TTZIP.PY

Typing

python ttzip.py --help

within the programs directory will give you a small summary of the program usage:

usage: ttzip.py [-h] [-C] brd pop datadir

positional arguments:
  brd            evaluation breed code (BSW/GUE/JER/HOL/RDC/SIM)
  pop            population code (same as country code except for
                 CHR/DEA/DFS/FRR/FRM)
  datadir        absolute or relative path to data files

optional arguments:
  -h, --help     show this help message and exit
  -C, --cleanup  delete all files successfully added to the zip file

ttzip.py - How to run the program

Go to the programs directory and type:

python ttzip.py hol abc ../testing/abc1401

In this example

breed of evaluation is HOL
population is ABC
input/output files are located in directory ../testing/abc1401

Output file

The program creates a zip file called ttYYMM_POPBRD.zip (for example tt1310_ABCHOL.zip) containing the results for all validation methods for all populations available in DATADIR or OUTDIR, if specified. Please email the zip file ttYYMM_POPBRD.zip to the Interbull Centre (valentina.palucci@slu.se).

APPENDIX I

Format305 for control files for the TrendTest software

The file305_POPBRD files are prepared by ITBC early in a test run and distributed to the NGECs that need to perform conventional validation for at least one trait in a given population (POP) and breed of evaluation (BRD).

Col	Name	Format	Description
1	tgrp	char	Trait group code (prod/conf/uder/long/calv/fert/work)
2	trt	char	Trait code (see here)
3	evaldate	int	National evaluation date (yyyymmdd; from param file uploaded to IDEA)
4	herit	float	Heritability (from param file uploaded to IDEA)
5	siresd	float	Sire SD estimated at ITBC in current test run
6	merit	char	Genetic merit definition (B+/B-/T+/T-)
7	type2x	char	Whether foreign bulls (with type of proof 21 or 22) should be included (Y/N)
8	min_hrd	int	Minimum herds to include a bull
9	min_dgh	int	Minimum daughters to include a bull
10	byr1	int	First birth year to include for method 1 (1986 for HOL, 1981 for others)
11	miny	int	First birth year to include for method 3
12	maxy	int	Last birth year to include for method 3
13	corr	float	Correlation between new and old evaluations for method 3 (R=0.99)
14	preval	char	Date of previous validation (yy-mon) for traits last validated more than two years ago
15	chg	char	Change code (Y/N): whether validation is required because the population is included for the first time or because large changes were introduced in national evaluations for this trait

Notes

BRD: breed of evaluation (BSW/GUE/HOL/JER/RDC/SIM)
POP: population code (see here)
there is a header line which will be skipped by the software
there is an extra space between all fields to allow the file to be easily parsed without needing to specify fixed column positions

Sample data records

#grp trt evaldate  herit     siresd gm x mh md byr1 miny maxy corr preval chg
prod mil 20120101 0.2800  543.07922 B+ N 10 20 1986 1998 2002 0.98 ------ N
prod fat 20120101 0.2800   21.49578 B+ N 10 20 1986 1998 2002 0.98 ------ N
prod pro 20120101 0.2800   15.76838 B+ N 10 20 1986 1998 2002 0.98 ------ N
uder scs 20120101 0.1750   11.52474 B+ N 10 20 1986 1998 2002 0.98 ------ Y
conf sta 20120101 0.4500    0.95646 B+ N 10 20 1986 1998 2002 0.99 99-may N
conf usu 20120101 0.2100    0.90437 B+ N 10 20 1986 1998 2002 0.99 99-may N
conf loc 20120101 0.1200    1.00971 B+ N 10 20 1986 1998 2002 0.99 99-may N

APPENDIX IIa

APPENDIX I - Format File300-EBV and File700-GEBV

Col	Name	Start	Format	Description (footnote)	Example
1	rec type	1	a3	Record type ⁽¹⁾	300
2	brd_eval	5	a3	Breed of evaluation ⁽²⁾	HOL
3	pop	9	a3	Population code ⁽³⁾	USA
4	trt	13	a3	Trait of evaluation ⁽⁴⁾	mil
5	brd_anim	17	a3	Breed of animal	HOL
6	cou_orig	20	a3	Country of first registration	USA
7	sex	23	a1	Sex of animal	M
8	id_no	24	a12	Animal identification number	003000336289
9	typ_prf	37	i2	Type of proof ⁽⁵⁾	11
10	off_pub	40	a1	Official publication of proof ⁽⁶⁾	Y
11	status	42	i2	Animal status ⁽⁷⁾	10
12	ndau	44	i8	Number of daughters ⁽⁸⁾	115
13	nhrd	52	i8	Number of herds ⁽⁹⁾	75
14	edc	60	i8	Number of effective daughter contributions ⁽¹⁰⁾	133
15	rel	69	f7.4	Repeatability/Reliability ⁽¹¹⁾	82
16	ebv	76	f10.	National predicted genetic merit ⁽¹²⁾	2.780
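
The start columns and widths in the table above are enough to slice a record into fields. The following is a minimal sketch, not part of the TrendTest software: the function names and the key rec_type are hypothetical, the widths are read from the Format column, and the ebv field is simply taken as 10 characters wide because its number of decimals is not shown in the table.

```python
# (name, 1-based start column, width, converter) copied from the table above.
LAYOUT = [
    ("rec_type", 1, 3, str), ("brd_eval", 5, 3, str), ("pop", 9, 3, str),
    ("trt", 13, 3, str), ("brd_anim", 17, 3, str), ("cou_orig", 20, 3, str),
    ("sex", 23, 1, str), ("id_no", 24, 12, str), ("typ_prf", 37, 2, int),
    ("off_pub", 40, 1, str), ("status", 42, 2, int), ("ndau", 44, 8, int),
    ("nhrd", 52, 8, int), ("edc", 60, 8, int), ("rel", 69, 7, float),
    ("ebv", 76, 10, float),
]
RECORD_LENGTH = 85  # last column of the ebv field


def parse_record(line):
    """Slice one fixed-width file300/file700 line into a dict of typed values."""
    rec = {}
    for name, start, width, convert in LAYOUT:
        text = line[start - 1:start - 1 + width].strip()
        rec[name] = convert(text)
    return rec


def build_record(values):
    """Write right-justified values at their start columns (helper for the example)."""
    chars = [" "] * RECORD_LENGTH
    for name, start, width, _convert in LAYOUT:
        text = str(values[name]).rjust(width)[:width]
        chars[start - 1:start - 1 + width] = text
    return "".join(chars)


# Example values from the last column of the table.
example = {
    "rec_type": "300", "brd_eval": "HOL", "pop": "USA", "trt": "mil",
    "brd_anim": "HOL", "cou_orig": "USA", "sex": "M", "id_no": "003000336289",
    "typ_prf": 11, "off_pub": "Y", "status": 10, "ndau": 115, "nhrd": 75,
    "edc": 133, "rel": 82, "ebv": "2.780",
}
line = build_record(example)
record = parse_record(line)
print(record["id_no"], record["ndau"], record["ebv"])
```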

IMPORTANT NOTE

In the old file formats 01x-020 and 115, the national proofs were multiplied by a factor (prod=100; conf=100; udder=1000; long=1000; calv=1000; fert=1000; work=1000). This multiplication is no longer needed.

¹Valid record types:
- 300 for EBV
- 700 for GEBV
²Breed codes accepted:
- BSW=Brown Swiss type; GUE=Guernsey type; HOL=Holstein-Friesian (Black & White) type; JER=Jersey type; RDC=Red Dairy Cattle type ; SIM=Simmental type.
³Valid population codes: ARG AUS BEL CAN ^aCAM CHE CZE ^bDEA DEU ^cDFS ESP EST FIN FRA ^dFRM GBR HUN IRL ISR ITA JPN LTU LVA NLD NOR NZL POL PRT SVN SVK SWE USA URY ZAF
- where: ^a Canadian Milking Shorthorn; ^b Austria + Germany; ^c Denmark + Finland + Sweden; ^d France Montbeliarde
⁴Accepted traits abbreviations:
- Production ==> mil = milk; fat = fat; pro = protein
- Conformation ==> sta = stature; cwi = chest width; bde = body depth; ang = angularity; ran = rump angle; rwi = rump width; rls = rear-leg set; rlr = rear-leg rear view; fan = foot angle; hde = heel depth/hoof height; fua = fore udder attachment; ruh = rear udder height; ruw = rear udder width; usu = udder support; ude = udder depth; ftp = front teat placement; ftl = (front) teat length; rtp = rear teat placement; ous = overall udder score; ofl = overall feet&legs score; ocs = overall conformation score; bcs = body condition score; loc = locomotion
- Udder ==> scs = somatic cell; mas = mastitis
- Longevity ==> dlo = direct longevity
- Calving ==> dce = direct calving ease; mce = maternal calving ease; dsb = direct stillbirth; msb = maternal stillbirth; ges = direct gestation length
- Female fertility ==> hco = heifer conception; crc = cow recycling; cc1 = lactating cow's ability to conceive (1); cc2 = lactating cow's ability to conceive (2); int = interval traits
- Workability ==> msp = milking speed; tem = temperament
- SNP Training ==> cma = clinical mastitis
⁵Accepted codes: (Please note that from September 2026 onwards code 22 will no longer be valid)
- 00 (unknown);
- 11 (based on first crop sampling daughters or based on usage while having a genomic proof);
- 12 (based on first and second crop daughters);
- 13 (based on parent average and genomic information only);
- 21 (based on imported semen of proven bull, second crop daughters only, or based on imported daughters/embryos);
- 22 (based on mostly, more than 50%, imported daughters or daughters born from imported embryos.)
- 23 (GEBV with foreign PA) - specific to GMACE files (file700)
- 24 (GEBV with foreign proof) - specific to GMACE files (file700)
⁶Accepted abbreviations: (Please note that from September 2026 onwards code P will no longer be valid)
- Y (if bull proof meets national standards for official publication in the country sending information.);
- P (if bull is part of a simultaneous progeny-testing program, but the proof does not yet meet national standards for official publication);
- N (otherwise).
⁷Valid codes for status of bulls: (Please note that from September 2026 onwards code 20 will no longer be valid)
- 00 (other or unknown);
- 10 (national and international AI bulls);
- 15 (young bull, genomically tested, not yet selected for AI) - specific to GMACE files (file700)
- 20 (other bull. Records with “20” in this file will be excluded from the international evaluation, unless type of proof is “21”).
⁸Field for number of daughters should be positive. For missing value put 0.
⁹Field for number of herds should be positive. For missing value put 0.
¹⁰ Production, conformation, udder health, fertility, workability, and SNP training traits: The weighting factor used for these traits is “the effective daughter contribution (EDC)”, which is described in the Interbull document Code of practice, Appendix IV, “Weighting factor for international genetic evaluation”, updated April 27, 2004. EDC values should be rounded to the nearest integer value.
- Calving: The weighting factor used for calving traits is the total number of calvings for the direct effects and the number of daughters with calving for the maternal effect.
- Longevity: The weighting factor used for longevity traits depends on the national genetic evaluation model. For linear models the weighting factor is the same as described above for conformation, fertility, production, udder health and workability traits. For survival models number of culled daughters is used as the weighting factor.
¹¹Reliability values are nationally calculated reliability values expressed as percentages with 4 decimals. For missing values put 0.
¹²National predicted genetic merit values published domestically. For threshold models the submitted values are from the underlying scale. For missing values put 9999999999. Please note: in the old file formats 01x-020 and 115, the national proofs were multiplied by a factor (prod=100; conf=100; udder=1000; long=1000; calv=1000; fert=1000; work=1000). This multiplication is no longer needed.
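The scaling note above can be illustrated with a small sketch. It assumes that a legacy proof was stored already multiplied by the factor of its trait group, so dividing by that factor recovers the unscaled value expected in the new format. The dictionary and function names are hypothetical and are not part of the ttconvert programs; missing-value codes are not handled here.

```python
# Multiplication factors of the legacy formats, per trait group, as listed
# in the note above. Names below are illustrative only.
LEGACY_FACTORS = {
    "prod": 100, "conf": 100, "udder": 1000, "long": 1000,
    "calv": 1000, "fert": 1000, "work": 1000,
}

def unscale_legacy_proof(value, trait_group):
    """Return a legacy (multiplied) proof on its original scale."""
    return float(value) / LEGACY_FACTORS[trait_group]

# A made-up production proof stored as 53379 in a legacy file
print(unscale_legacy_proof(53379, "prod"))
```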

APPENDIX IIb

Format302 for Submission of validation method II

Col	Name	Start	Format	Description
1	rec	1	a3	Record type (302)
2	brd	5	a3	Breed of evaluation
3	pop	9	a3	Population code (see here)
4	trt	13	a3	Trait code (see here)
5	bullid	17	a19	International ID
6	calvyear	37	i4	calving year (YYYY)
7	ndau	42	i5	number of daughters
8	ave_DD	48	f10.4	average Daughter Yield Deviation

brd and pop must be in upper case
trt must be in lower case
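
As an illustration of the layout above, the following sketch writes and reads one Format302 record using the start columns and widths from the table. It is only a sketch under stated assumptions: the function names and the sample bull ID are made up, and integer and real fields are assumed to be right-justified within their width.

```python
# Column layout from the Format302 table: (name, 1-based start, width).
# Function names and the sample bull ID are hypothetical.
FIELDS_302 = [
    ("rec", 1, 3), ("brd", 5, 3), ("pop", 9, 3), ("trt", 13, 3),
    ("bullid", 17, 19), ("calvyear", 37, 4), ("ndau", 42, 5),
    ("ave_DD", 48, 10),
]

def format_302(brd, pop, trt, bullid, calvyear, ndau, ave_dd):
    """Build one fixed-width Format302 line (57 characters)."""
    # brd and pop are forced to upper case, trt to lower case
    return "302 {:<3} {:<3} {:<3} {:<19} {:4d} {:5d} {:10.4f}".format(
        brd.upper(), pop.upper(), trt.lower(), bullid,
        calvyear, ndau, ave_dd)

def parse_302(line):
    """Split a Format302 line into a dict of stripped strings."""
    return {name: line[start - 1:start - 1 + width].strip()
            for name, start, width in FIELDS_302}

line = format_302("hol", "abc", "MIL", "HOLABCM000000000001", 2010, 25, 12.5)
print(repr(line))
print(parse_302(line))
```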

APPENDIX IIc

Format303 for data file for validation method III

Record length = 90

Col	Name	Start	Format	Description
1	rec	1	a3	Record type (303)
2	brd	5	a3	Breed of evaluation (BSW/GUE/HOL/JER/RDC/SIM)
3	pop	9	a3	Population code (see here)
4	trt	13	a3	Trait code (see here)
5	bullid	17	a19	International ID
6	byear	37	i4	Bull's birth year
7	type_prf	42	i2	Type of proof
8	ndau	45	i7	Number of daughters in proof in YYYY-4
9	ebv	53	f9.3	National predicted genetic merit in YYYY-4
10	n1	63	i5	Number of daughters added in YYYY-3
11	n2	69	i5	Number of daughters added in YYYY-2
12	n3	75	i5	Number of daughters added in YYYY-1
13	n4	81	i5	Number of daughters added in YYYY
14	year1d	87	i4	Mean year of first calving of daughters in YYYY-4

Notes:

starting columns allow for an extra blank between all fields
brd and pop must be in upper case
trt must be in lower case
YYYY: year of the most recent routine genetic evaluation run whose results will be included in the international evaluation
n1-n4: number of new (first calving) daughters considered in the last available national genetic evaluation in each year
year1d: mean year of first calving of daughters on which the bull’s national evaluation in year YYYY-4 was based
- This field is not currently used by the trendtest software because it is not uniformly supplied by all NGECs. The field can be set to '0000'. The software replaces year1d by byear+4.
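
A minimal sketch of reading a Format303 record by column position, assuming the start columns and widths from the table above. The last step mirrors the documented behaviour of replacing year1d by byear+4; it is not the actual trendtest3 code, and the function name and sample record are invented for illustration.

```python
# (name, 1-based start, width, converter) from the Format303 table
FIELDS_303 = [
    ("rec", 1, 3, str), ("brd", 5, 3, str), ("pop", 9, 3, str),
    ("trt", 13, 3, str), ("bullid", 17, 19, str), ("byear", 37, 4, int),
    ("type_prf", 42, 2, int), ("ndau", 45, 7, int), ("ebv", 53, 9, float),
    ("n1", 63, 5, int), ("n2", 69, 5, int), ("n3", 75, 5, int),
    ("n4", 81, 5, int), ("year1d", 87, 4, int),
]

def parse_303(line):
    """Parse one 90-character Format303 line into a dict."""
    rec = {}
    for name, start, width, convert in FIELDS_303:
        rec[name] = convert(line[start - 1:start - 1 + width].strip())
    # The supplied year1d is ignored, as described in the note above
    rec["year1d"] = rec["byear"] + 4
    return rec

# Made-up record with year1d set to '0000'
sample = ("303 HOL ABC mil HOLABCM000000000001 1998 11 {:7d} {:9.3f} "
          "{:5d} {:5d} {:5d} {:5d} 0000").format(120, 533.79, 10, 8, 5, 2)
print(len(sample))
print(parse_303(sample))
```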

APPENDIX III

TrendTest Merged Files

The trendtest programs offer an option (-m, --mergefiles) to create a file of merged 300/300FL, 300/302 or 300/303 records as a convenience to the user. These files can make it easier to check the correctness of the input datasets, and they can be used to perform additional checks and/or statistical analyses.

One file is created for each trait present in the file305_POPBRD. There is a record for each bull present in the file300 born in/after the cutoff year specified in the file305_POPBRD. Flags are supplied to indicate whether the bull qualifies for the analysis or not. Please see the file formats below.

By default, the files are created in directory DATADIR/merged(1,2,3). The merged files do not have a _POPBRD extension, so if you would like to create files for more than one population or breed, you should also supply the -M, --mergedir option with a different destination directory for each population/breed. The destination directory can be an absolute path, or it can be relative to the programs directory (e.g. ../sample_data/my_merges). The directory will be created automatically if it does not exist.

File format merged1

The file is in comma-separated-variable (csv) format, using commas as the separator.

Column	Variable	Type	Description
1	aid	char(19)	animal ID
2	byear	int	Birth year
3	keep	char(1)	Bull qualifies for the analysis (Y/N)
4	top	char(2)	Type of proof (from file300)
5	off	char(1)	Official proof (Y/N; from file300)
6	sta	char(2)	Bull status (from file300)
7	AL	char(2)	Fixed separator for File300 records
8	nd	int	Number of daughters
9	nh	int	Number of herds
10	edc	int	EDC
11	rel	real	Reliability (x100)
12	ebv	real	Predicted genetic merit ("proof")
13	FL	char(2)	Fixed separator for file300FL records
14	nd	int	Number of daughters
15	nh	int	number of herds
16	edc	int	EDC
17	rel	real	Reliability (x100)
18	ebv	real	Predicted genetic merit ("proof")
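
Because several variable names (nd, nh, edc, rel, ebv) occur twice in a merged1 record, it is safer to address the columns by position than by name. The sketch below counts how many bulls are flagged as qualifying. It rests on assumptions not stated in this document: that the file has no header row, and that the separator columns hold the literal values shown in the made-up sample rows.

```python
import csv
import io

# 0-based positions derived from the merged1 table (column number minus 1)
KEEP = 2      # column 3: bull qualifies for the analysis (Y/N)
N_COLUMNS = 18

def count_kept(csv_text):
    """Return (qualifying bulls, total bulls) for merged1 CSV text."""
    kept = total = 0
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != N_COLUMNS:
            continue  # skip blank or unexpected lines
        total += 1
        if row[KEEP].strip() == "Y":
            kept += 1
    return kept, total

# Invented rows, for illustration only
sample = (
    "HOLABCM000000000001,1998,Y,11,Y,10,AL,120,80,95,85.0,533.79,"
    "FL,110,75,90,83.0,520.10\n"
    "HOLABCM000000000002,1999,N,11,Y,10,AL,15,9,12,55.0,101.20,"
    "FL,12,8,10,50.0,98.70\n"
)
print(count_kept(sample))
```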

File format merged2

The file is in comma-separated-variable (csv) format, using commas as the separator.

Column	Variable	Type	Description
1	aid	char(19)	animal ID
2	byear	int	Birth year
3	top	char(2)	Type of proof (from file300)
4	off	char(1)	Official proof (Y/N; from file300)
5	sta	char(2)	Bull status (from file300)
6	f300	char(4)	Fixed separator for File300 records
7	nd	int	Number of daughters
8	nh	int	Number of herds
9	edc	int	EDC
10	rel	real	Reliability (x100)
11	ebv	real	Predicted genetic merit ("proof")
12	f302	char(4)	Fixed separator for file302 records
13	year	int	year of calving
14	nd	int	number of daughters
15	dd	int	average Daughter Yield Deviation
16	nb	int	number of bulls
17	j	int

File format merged3

The file is in comma-separated-variable (csv) format, using commas as the separator.

Column	Variable	Type	Description
1	aid	char(19)	animal ID
2	byear	int	Birth year
3	keep	char(1)	Bull qualifies for the analysis (Y/N)
4	top	char(2)	Type of proof (from file300)
5	off	char(1)	Official proof (Y/N; from file300)
6	sta	char(2)	Bull status (from file300)
7	f300	char(4)	Fixed separator for File300 records
8	nd	int	Number of daughters
9	nh	int	Number of herds
10	edc	int	EDC
11	rel	real	Reliability (x100)
12	ebv	real	Predicted genetic merit ("proof")
13	f303	char(4)	Fixed separator for file303 records
14	topx	char(2)	Type of proof
15	nx	int	Number of daughters
16	ebv	real	Predicted genetic merit ("proof")
17	n1	int	added daughters in YYYY-3
18	n2	int	added daughters in YYYY-2
19	n3	int	added daughters in YYYY-1
20	n4	int	added daughters in YYYY
21	year1d	int	Mean year of first calving of daughters in YYYY-4
22	t_i	real	time variate
23	w_i	real	weight

APPENDIX IVa

Format311 for TrendTest results for method 1

Col	Name	Format	Description
1	rec	char	Record type (311)
2	brd	char	Breed of evaluation (BSW/GUE/HOL/JER/RDC/SIM)
3	pop	char	Population code (see here)
4	tgrp	char	Trait group code (prod/conf/uder/long/calv/fert/work)
5	trt	char	Trait code (see here)
6	testdate	int	Date on which trendtest1 was run (yyyymmdd)
7	pass	char	PASS or FAIL
8	testval	float	Test value: abs(b_ALL - b_1ST)/SDg
9	SDg	float	Genetic standard deviation
10	bvta	char	Genetic merit (BV/TA)
11	b_ALL	float	Genetic trend from ALL lactation evaluations
12	b_1ST	float	Genetic trend from 1ST lactation evaluations
13	bulls	int	Number of bulls included in the test
14	stdALL	float	Raw std of ALL lactation evaluations
15	std1ST	float	Raw std of 1ST lactation evaluations
16	type2x	char	Whether foreign bulls (with type of proof 21 or 22) were included (Y/N)
17	byr1	int	First birth year of bulls included
18	min_hrd	int	Minimum herds for bulls included
19	min_dgh	int	Minimum daughters for bulls included
20	warnings	char	Codes for warnings (see log file for details)

Notes:

there is a header line which will be skipped by the software
there is an extra space between all fields to allow the file to be easily parsed without needing to specify fixed column positions

Sample records

rec brd pop tgrp trt testdate pass testval      SDg bv    b_ALL    b_1ST bulls   stdALL   std1ST x byr1 mh md warnings
311 HOL ABC prod mil 20130930 PASS   0.006  543.079 BV   53.379   49.928   154  565.838  526.043 N 1986 10 20 none
311 HOL ABC prod fat 20130930 PASS   0.007   21.496 BV    1.084    0.943   154   18.614   18.003 N 1986 10 20 none
311 HOL ABC prod pro 20130930 FAIL   0.023   15.768 BV    1.513    1.157   154   15.979   16.561 N 1986 10 20 LACT1_SCALE_WARNING
311 HOL ABC uder scs 20130930 PASS   0.001   21.525 BV    0.106    0.095   154   10.698   10.572 N 1986 10 20 SDG_BV_WARNING
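
Since the fields are separated by blanks, a results file can be read by splitting each line on whitespace and using the header line for the field names. The sketch below does this for two of the sample records above and recomputes the test value abs(b_ALL - b_1ST)/SDg as a consistency check. The function names are hypothetical, and the sketch assumes that no field (including warnings) contains embedded blanks.

```python
SAMPLE_311 = """\
rec brd pop tgrp trt testdate pass testval      SDg bv    b_ALL    b_1ST bulls   stdALL   std1ST x byr1 mh md warnings
311 HOL ABC prod mil 20130930 PASS   0.006  543.079 BV   53.379   49.928   154  565.838  526.043 N 1986 10 20 none
311 HOL ABC prod fat 20130930 PASS   0.007   21.496 BV    1.084    0.943   154   18.614   18.003 N 1986 10 20 none
"""

def read_results(text):
    """Return one dict per record, keyed by the names in the header line."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    names = lines[0].split()
    return [dict(zip(names, ln.split())) for ln in lines[1:]]

def recompute_testval(row):
    """Test value for method 1: abs(b_ALL - b_1ST) / SDg."""
    return abs(float(row["b_ALL"]) - float(row["b_1ST"])) / float(row["SDg"])

for row in read_results(SAMPLE_311):
    print(row["trt"], row["testval"], round(recompute_testval(row), 3))
```

The same whitespace-splitting approach should also work for Format312 and Format313 files, whose notes describe the same blank-separated layout.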

APPENDIX IVb

Format312 for TrendTest results for method 2

Col	Name	Format	Description
1	rec	char	Record type (312)
2	brd	char	Breed of evaluation (BSW/GUE/HOL/JER/RDC/SIM)
3	pop	char	Population code (see here)
4	tgrp	char	Trait group code (prod/conf/uder/long/calv/fert/work)
5	trt	char	Trait code (see here)
6	testdate	int	Date on which trendtest2 was run (yyyymmdd)
7	pass	char	PASS or FAIL
8	testval	float	Biological test value: abs(b)/SDg
9	b	float	Slope of regression of DD on year within bull
10	SDg	float	Genetic standard deviation
11	bvta	char	Genetic merit (BV/TA)
12	bulls	int	Number of bulls included in the test
13	std_DD	float	Raw std of daughter deviations (DD)
14	type2x	char	Whether foreign bulls (with type of proof 21 or 22) were included (Y/N)
15	byr1	int	First birth year of bulls included
16	min_hrd	int	Minimum herds for bulls included
17	min_dgh	int	Minimum daughters for bulls included
18	warnings	char	Codes for warnings (see log file for details)

Notes:

there is a header line which will be skipped by the software
there is an extra space between all fields to allow the file to be easily parsed without needing to specify fixed column positions

Sample records

rec brd pop tgrp trt testdate pass testval       b      SDg bv bulls   std_DD x byr1 mh md warnings
312 HOL ABC prod mil 20131003 FAIL   0.014   7.346  543.079 BV   426  563.335 N 1986 10 20 none
312 HOL ABC prod fat 20131003 PASS   0.009   0.185   21.496 BV   426   19.186 N 1986 10 20 none
312 HOL ABC prod pro 20131003 PASS   0.001   0.022   15.768 BV   426   15.792 N 1986 10 20 none
312 HOL ABC uder scs 20131003 PASS   0.002   0.036   21.525 BV   402   10.598 N 1986 10 20 MISSING_BULLS

APPENDIX IVc

Format313 for TrendTest results for method 3

Col	Name	Format	Description
1	rec	char	Record type (313)
2	brd	char	Breed of evaluation (BSW/GUE/HOL/JER/RDC/SIM)
3	pop	char	Population code (see here)
4	tgrp	char	Trait group code (prod/conf/uder/long/calv/fert/work)
5	trt	char	Trait code (see here)
6	testdate	int	Date on which trendtest3 was run (yyyymmdd)
7	pass	char	PASS or FAIL
8	delta	float	Slope of regression on time variate (t)
9	lower	float	Lower limit of empirical 95% C.I. for delta
10	upper	float	Upper limit of empirical 95% C.I. for delta
11	stat	char	PASS or FAIL for statistical test
12	testval	float	Biological test value: abs(delta)/SDg
13	biol	char	PASS or FAIL for biological test
14	SDg	float	Genetic standard deviation
15	bvta	char	Genetic merit (BV/TA)
16	bulls	int	Number of bulls included in the test
17	std_y	float	Raw std of current evaluations (YYYY)
18	std_x	float	Raw std of previous evaluations (YYYY-4)
19	type2x	char	Whether foreign bulls (with type of proof 21 or 22) were included (Y/N)
20	yyyy	int	Current year of bulls included
21	miny	int	First birth year of bulls included
22	maxy	int	Last birth year of bulls included
23	herit	float	Heritability of the trait
24	corr	float	Correlation between previous and current evaluation methods (R)
25	min_hrd	int	Minimum herds for bulls included
26	min_dgh	int	Minimum daughters for bulls included
27	nsamp	int	Number of samples in bootstrap C.I.
28	warnings	char	Codes for warnings (see log file for details)

Notes:

there is a header line which will be skipped by the software
there is an extra space between all fields to allow the file to be easily parsed without needing to specify fixed column positions

Sample records

rec brd pop tgrp trt testdate pass   delta    lower   upper stat testval biol      SDg bv bulls    std_y    std_x x miny maxy  herit corr nh nd nsamp warnings
313 HOL ABC prod mil 20130930 PASS -10.989  -64.504  82.308 PASS  -0.020 FAIL  543.079 BV    24  422.935  433.905 N 1998 2002 0.2800 0.98 10 20  1000 none
313 HOL ABC prod fat 20130930 PASS   0.939   -3.918   1.751 PASS   0.044 FAIL   21.496 BV    24   14.890   14.913 N 1998 2002 0.2800 0.98 10 20  1000 SDG_BV_WARNING
313 HOL ABC prod pro 20130930 PASS   0.138   -2.374   2.073 PASS   0.009 PASS   15.768 BV    24   12.349   12.363 N 1998 2002 0.2800 0.98 10 20  1000 none
313 HOL ABC uder scs 20130930 PASS   2.107   -1.713   3.901 PASS   0.098 FAIL   21.525 BV    22   12.587   12.043 N 1998 2002 0.1750 0.98 10 20  1000 SDG_BV_WARNING

APPENDIX IVd

Format for file bdate_POPBRD for validation methods

Record length = 28

Col	Name	Start	Format	Description
1	ID	1	a19	International ID (1)
2	birth date	21	i8	birth date

Note:

International IDs in common between the proof file 300 and the relative trend data file used (either file300_FL, file302 or file303)
This file needs to be created ONLY if the converting programs are no longer used, otherwise it will be automatically generated by them.
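
For organizations that prepare this file themselves, the following sketch shows one way to build the records: take the IDs present in both the file300 and the trend data file, and write each with its birth date in the column positions given above. The function name and sample IDs are invented, and the birth date is assumed to be an integer in yyyymmdd form, which this document does not state explicitly.

```python
def bdate_lines(ids_300, ids_trend, birthdates):
    """Return 28-character records for the IDs common to both inputs.

    birthdates maps an international ID to an integer birth date
    (assumed here to be yyyymmdd).
    """
    common = sorted(set(ids_300) & set(ids_trend))
    # ID in columns 1-19, one blank, birth date in columns 21-28
    return ["{:<19} {:8d}".format(aid, birthdates[aid]) for aid in common]

# Made-up example data
ids_300 = ["HOLABCM000000000001", "HOLABCM000000000002"]
ids_302 = ["HOLABCM000000000001", "HOLABCM000000000003"]
dates = {
    "HOLABCM000000000001": 19980314,
    "HOLABCM000000000002": 19990101,
    "HOLABCM000000000003": 20000520,
}
for rec in bdate_lines(ids_300, ids_302, dates):
    print(rec)
```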

Frequently Asked Questions

Q. What do I do if my default version of Python is too old?

A. Typing "python --version" at the command line prompt shows the version of the default Python interpreter on your system. If that version is older than 2.6, or is 3.0 or 3.1, make sure that Python 2.6, 2.7 or >=3.2 is also installed on your system. Then, in all the examples in this documentation, replace "python" with the complete path to the more recent Python version. For example, on Windows this might be "C:\Python3.3\bin\python3.3" and on Linux it might be something like "/usr/local/bin/python3.3".
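
The version rule in this answer can also be checked from within Python. The snippet below is only a sketch of that rule (2.6, 2.7 or >=3.2 are accepted); the function name is hypothetical and it is not part of the TrendTest programs.

```python
import sys

def version_supported(version_info):
    """True for Python 2.6, 2.7 or >=3.2, per the answer above."""
    major_minor = tuple(version_info[:2])
    if major_minor[0] == 2:
        return major_minor >= (2, 6)
    return major_minor >= (3, 2)

if version_supported(sys.version_info):
    print("This interpreter is recent enough")
else:
    print("Please use Python 2.6, 2.7 or >=3.2")
```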

Q. I normally only submit results from validation method II. Is there a conversion program I can use for my file01x?

A. To convert your file01x for validation method II, use the program ttconvert1.py. The program requires a file01xFL.cou as an input file: simply make a copy of your file01x.cou, name it file01xFL.cou and run the program. You can then ignore the converted File300FL_POPBRD.

Q. I ran the ttconvert programs but no output was created.

A. Make sure that the names of the input files are correct. The following are the file names recognized by the ttconvert programs:

file01x.cou
file01xFL.cou
file04x.cou

Also double-check that the format of the input files is exactly as defined on our webpage http://www.interbull.org/ib/servicedocumentation

Q. I got the message: "warning: no DD records found" when running trendtest2.py, and no results are produced

A. This warning normally means that the file302 contains a different POP, BRD or TRT than those specified in the file305. The warning may also occur if POP, BRD and/or TRT are written in the file using the wrong case: POP and BRD must be upper case, while TRT must be lower case.
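
Both causes can be checked before running trendtest2.py. The sketch below scans file302 records and reports lines whose brd, pop or trt is in the wrong case or does not match the expected values. The function name is hypothetical, the expected population, breed and traits are passed in by hand rather than read from the file305, and the column positions are those of Format302 in Appendix IIb.

```python
def check_302_keys(lines, pop, brd, traits):
    """Return (line number, problem) pairs for suspicious file302 records."""
    problems = []
    for n, line in enumerate(lines, start=1):
        # brd in columns 5-7, pop in 9-11, trt in 13-15
        f_brd, f_pop, f_trt = line[4:7], line[8:11], line[12:15]
        if f_brd != f_brd.upper() or f_pop != f_pop.upper():
            problems.append((n, "brd/pop not in upper case"))
        elif f_trt != f_trt.lower():
            problems.append((n, "trt not in lower case"))
        elif (f_pop, f_brd) != (pop, brd) or f_trt not in traits:
            problems.append((n, "pop/brd/trt not among the expected values"))
    return problems

# Made-up records: the second has a lower-case breed, the third an
# upper-case trait, the fourth a trait that is not expected
records = [
    "302 HOL ABC mil HOLABCM000000000001 2010    25    12.5000",
    "302 hol ABC mil HOLABCM000000000001 2011    30    10.2500",
    "302 HOL ABC MIL HOLABCM000000000001 2012    28     9.7500",
    "302 HOL ABC scs HOLABCM000000000001 2012    28     0.1200",
]
for problem in check_302_keys(records, "ABC", "HOL", {"mil", "fat", "pro"}):
    print(problem)
```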