READ ME - CheckPerformance.py

General information

The Python program CheckPerformance.py checks an XML performance file for format correctness and, if no problems are detected, produces a compressed version of the performance file ready for upload to the IDEA site.

The program prepares a compressed tar file named IB-BEEF-ORGCODE-yyyymmddThhmmss.tbz2 containing the performance file (renamed to a name similar to the compressed tar file's name).
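The naming pattern above can be illustrated with a minimal sketch. The function name archive_name and its parameters are hypothetical and are not taken from CheckPerformance.py; the sketch only assumes that yyyymmddThhmmss is a plain date and time stamp, as in the sample output further down.

```python
from datetime import datetime


def archive_name(orgcode, when, extension="tbz2"):
    """Build a name of the form IB-BEEF-ORGCODE-yyyymmddThhmmss.<extension>.

    Hypothetical helper for illustration; not part of CheckPerformance.py.
    """
    stamp = when.strftime("%Y%m%dT%H%M%S")  # e.g. 20190402T070057
    return f"IB-BEEF-{orgcode}-{stamp}.{extension}"


print(archive_name("CMBCB", datetime(2019, 4, 2, 7, 0, 57)))
# prints: IB-BEEF-CMBCB-20190402T070057.tbz2
```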

The program requires internet access, specifically to a few functions/pages in the IDEA web application. First, to ensure that the user has the most recent version of the software, the program compares its internal version with the version stored on the Interbull server. If the versions do not match, a message is printed and the program exits; the new version of the program must be downloaded before trying again. The program also obtains from the IDEA web application the list of valid orgcodes and the list of breed-pop-trait combinations for which the user’s organization has performance upload authority. If errors occur, they are listed on the screen (unless redirected elsewhere) and no compressed tar file is created.

The compressed tar file produced is your checked data file to upload to the Interbull Centre IDEA database (https://idea.interbull.org/).

One performance file (and hence the corresponding compressed tar file) may contain records for multiple brd-pop-trt combinations, but this is not recommended because it leads to very large upload files.

It is essential that performance records for all animals are included in the same file for any one brd-pop-trt combination, whether the dataset is a first submission for the run in question or a re-submission with some problems corrected. Thus, it is not acceptable to submit a partial dataset in order to correct the evaluations of some subset of animals.

For more detail, see the description of the XML performance format.

Before Running the Program

Executing the Program

The commands shown below assume that the working directory in point c above is the current directory, i.e. start by calling cd <name of directory> (once) before running them.

First, let's look at a simple example of how to execute the program that is sufficient in most cases (to run it successfully, replace CMBCB with your organization's code and MyPerformanceFile.xml with the name of your performance file):

python3 CheckPerformance.py CMBCB MyPerformanceFile.xml

Synopsis of how to run the program:

python3 CheckPerformance.py [-l LOGPATH] [-B] [-N] ORGCODE DATAFILE

Arguments within square brackets (i.e. []) are optional.
Arguments are meant to be specified in the order indicated.
The DATAFILE argument is a filename that may be either relative (i.e. relative to the current directory) or absolute (i.e. starting with a slash).
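For readers who want to wrap or script the program, the synopsis above can be mirrored with a small argparse sketch. This is an assumption about how such a command line could be modelled, not the parser actually used by CheckPerformance.py; the function name build_parser and the attribute names logpath, no_output and no_partition are hypothetical.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented synopsis (illustration only)."""
    parser = argparse.ArgumentParser(prog="CheckPerformance.py")
    # Optional arguments, as listed in the synopsis.
    parser.add_argument("-l", dest="logpath", metavar="LOGPATH",
                        help="write log messages to LOGPATH")
    parser.add_argument("-B", dest="no_output", action="store_true",
                        help="check only; no output file is produced")
    parser.add_argument("-N", dest="no_partition", action="store_true",
                        help="do not partition DATAFILE inside the output file")
    # Required positional arguments, in the documented order.
    parser.add_argument("orgcode", metavar="ORGCODE")
    parser.add_argument("datafile", metavar="DATAFILE")
    return parser


args = build_parser().parse_args(
    ["-l", "check.log", "CMBCB", "MyPerformanceFile.xml"])
print(args.logpath, args.no_output, args.no_partition, args.orgcode, args.datafile)
# prints: check.log False False CMBCB MyPerformanceFile.xml
```

Note that argparse accepts the optional arguments in any position, whereas this document asks for them in the order indicated, so the sketch is more permissive than the documented usage.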

The -l LOGPATH argument is used to redirect the log messages to the specified file, but alternatively all output could be redirected to a file by using normal shell redirection (i.e. ending the command with >OUTFILE).

The -B argument is meant to be used in special situations in collaboration with the Interbull Centre, but as no output file is produced in this case it may also be used when testing the input file for correctness.

If the -B argument has not been provided, the generated output is put in a file in the current directory named IB-BEEF-ORGCODE-yyyymmddThhmmss.tbz2
(or possibly with the file extension .tgz or .tar instead of .tbz2 depending on what compression is available).
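How the choice between .tbz2, .tgz and .tar might depend on the available compression can be sketched as follows. This is only an illustration under the assumption that "available" means the corresponding Python compression module can be imported; the functions pick_compression and write_archive are hypothetical and do not describe the program's actual code.

```python
import tarfile


def pick_compression():
    """Return (tarfile mode, file extension) for the best available compression."""
    try:
        import bz2  # noqa: F401  (only checking that the module is present)
        return "w:bz2", "tbz2"
    except ImportError:
        pass
    try:
        import zlib  # noqa: F401  (gzip compression relies on zlib)
        return "w:gz", "tgz"
    except ImportError:
        return "w", "tar"  # plain, uncompressed tar as a last resort


def write_archive(datafile, arcname, basename):
    """Write datafile into <basename>.<ext> under the name arcname; return the path."""
    mode, ext = pick_compression()
    out_path = f"{basename}.{ext}"
    with tarfile.open(out_path, mode) as tar:
        tar.add(datafile, arcname=arcname)
    return out_path
```

A call such as write_archive("MyPerformanceFile.xml", "IB-BEEF-CMBCB-20190402T070057.xml", "IB-BEEF-CMBCB-20190402T070057") would then produce a single archive whose extension reflects the compression that was found. The name used for the file inside the archive is an assumption here.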

The -N argument is used to disable the mechanism that partitions DATAFILE inside the generated output file. This should not be used unless specifically requested by the Interbull Centre.

Running the following command will show a brief description of how to run the program if you need a reminder:

python3 CheckPerformance.py -h

Output From the Program

If no problems are discovered in the input file (and neither the -l ... argument nor the -B argument is used), the following message will be written:

Checking program: CheckPerformance.py version 2019-09-30 v2.1, provided by Interbull Centre


***************************************************************************
Detailed report from check of the trait definitions in the performance file
***************************************************************************


TRAITS DATA
***********
A total of 0 errors found


*****************************************************************
Detailed report from check of the animals in the performance file
*****************************************************************


ANIMALS DATA
************
A total of 0 errors found


No errors found in the performance file.


Record counts by breed/country/trait combination
************************************************
LIM/CZE/aww 12679

Writing IB-BEEF-ORGCODE-20190402T070057.tbz2 ...
Done.

and an output file (named as shown in the output) has been produced.

If problems are discovered, the tail of the above messages will be replaced by one or more messages each trying to describe what the problem is. At some points in the execution, any previously detected problem will cause immediate termination. No output file will be created until all problems are fixed.
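When the output has been saved (for example via the -l argument or shell redirection), a short script can summarise it. The sketch below assumes that the "A total of N errors found" and "Writing ... ..." lines keep the wording shown in the sample output above, also when errors are present; that wording for the error case is an assumption, and the function names are hypothetical.

```python
import re

# Matches lines such as "A total of 0 errors found" (singular form assumed possible).
TOTAL_RE = re.compile(r"^A total of (\d+) errors? found", re.MULTILINE)
# Matches lines such as "Writing IB-BEEF-ORGCODE-20190402T070057.tbz2 ...".
WRITING_RE = re.compile(r"^Writing (\S+) \.\.\.", re.MULTILINE)


def total_errors(report_text):
    """Sum the counts from every 'A total of N errors found' line."""
    return sum(int(n) for n in TOTAL_RE.findall(report_text))


def output_written(report_text):
    """Return the archive name from the 'Writing <name> ...' line, or None."""
    match = WRITING_RE.search(report_text)
    return match.group(1) if match else None


sample = """TRAITS DATA
***********
A total of 0 errors found

ANIMALS DATA
************
A total of 0 errors found

Writing IB-BEEF-ORGCODE-20190402T070057.tbz2 ...
Done.
"""
print(total_errors(sample), output_written(sample))
# prints: 0 IB-BEEF-ORGCODE-20190402T070057.tbz2
```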

public/CheckPerformance (last edited 2021-09-03 10:23:39 by Valentina)