README for CheckPedigree.py
Updated: 2012-11-14
Information about the program:
CheckPedigree.py performs a series of checks on your pedigree data to ensure that the data are valid. If no errors are detected, a zip file is created. This zip file contains your checked pedigree file, ready to upload to the Interbull Centre IDEA database.
The checks relate to:
- Check the international identification numbers (animal, sire and dam)
  - Correct three-character country code as in the ISO 3166 standard (no missing countries allowed)
  - Correct three-character breed code according to the Interbull breed codes
  - Correct construction of the numerical part of the ID (registration numbers, right justified, leading blanks as zeros)
  - Missing sires and dams shall be coded as UUUUUUUUUUUUUUUUUUU (i.e. 19 U characters)
- Check the animal's birth date
  - Has to be reported in the format YYYYMMDD
  - If you know only the year of birth then enter it as YYYY0000
  - If you know year and month of birth then enter them as YYYYMM00
  - Missing birth dates are coded as 00000000 (or blanks or a single 0)
- Check that a male animal appears only as a sire and a female animal only as a dam
- Check for inconsistent duplicate records (different sire, dam or birth date)
- Check that an animal is always younger than its parents and grandparents
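If you want to pre-check your data before running the program, the ID and birth date rules above can be sketched as follows. This is only an illustration and not the code used inside CheckPedigree.py. The ID layout (breed, country, sex, 12-character number) and the characters allowed in the number are assumptions, and the function names are hypothetical. The sketch does not look up country or breed codes in the official lists.

```python
import datetime
import re

UNKNOWN_PARENT = "U" * 19  # code for a missing sire or dam

# Assumed layout of the 19-character ID (not spelled out in this README):
# breed (3) + country (3) + sex (1) + registration number (12).
# Assumed: the number holds digits and uppercase letters only.
ID_PATTERN = re.compile(r"^([A-Z]{3})([A-Z]{3})([MF])([0-9A-Z]{12})\Z")
DATE_PATTERN = re.compile(r"^[0-9]{8}\Z")


def split_id(animal_id):
    """Return (breed, country, sex, number), or None if the layout does not match."""
    match = ID_PATTERN.match(animal_id)
    if match is None:
        return None
    return match.groups()


def pad_number(number):
    """Right-justify a registration number to 12 characters, filling with zeros."""
    return number.strip().rjust(12, "0")


def check_birth_date(value):
    """Return the date as an 8-character string, or None if it is malformed."""
    value = value.strip()
    if value in ("", "0", "00000000"):
        return "00000000"                    # missing birth date
    if not DATE_PATTERN.match(value):
        return None
    year, month, day = int(value[:4]), int(value[4:6]), int(value[6:])
    if year == 0 or month > 12:
        return None
    if month == 0:
        return value if day == 0 else None   # YYYY0000: only the year is known
    if day == 0:
        return value                         # YYYYMM00: year and month are known
    try:
        datetime.date(year, month, day)      # rejects dates such as 30 February
    except ValueError:
        return None
    return value
```

For example, pad_number("2123456") gives "000002123456", and check_birth_date("20050230") gives None because February has no 30th day.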
Before Running the Program:
- Install Python (version 2.5 to 2.7) if necessary
- Create a working directory/folder
- Download the CheckPedigree.py program from https://idea.interbull.org/software and copy it to your new directory
- Copy your pedigree file to the working directory
Running the Program:
- Ensure there is a working network connection
- Use the command: python CheckPedigree.py -m <ORGCODE> -f <filename>
- Use your uppercase ORGCODE as shown on the upper right-hand side of the IDEA page. Your organization code is reported within brackets beside the "Logged in as" information.
- The program checks its internal version against the version stored on the Interbull server. You will have to download the most recent version if there is a mismatch.
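If you start the program from a script of your own, the command can be assembled as in the sketch below. The helper name build_command, the organization code "abc" and the file name "pedigree.txt" are made up for illustration. Only the -m and -f options come from the command shown above.

```python
def build_command(orgcode, pedigree_file, python="python"):
    """Return the argument list for running CheckPedigree.py.

    The organization code is converted to uppercase, as the program expects.
    """
    return [python, "CheckPedigree.py", "-m", orgcode.upper(), "-f", pedigree_file]


# Possible use from the working directory (hypothetical code and file name):
#   import subprocess
#   subprocess.call(build_command("abc", "pedigree.txt"))
```

Passing the arguments as a list avoids quoting problems when the file name contains spaces.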
After Running the Program:
If no errors are detected, the pedigree file will be written into a zip file called IB-ORGCODE-yyyymmddThhmmss.zip. Upload the zip file to Interbull's data exchange site: https://idea.interbull.org/.
In case of errors, no zip file will be created. Please correct your data and re-run the program until the data successfully pass all required checks.
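To check from a script whether a run produced a zip file, you can look for the file name pattern given above. The following is a minimal sketch. The function name newest_zip is hypothetical, and the set of characters allowed in an ORGCODE (uppercase letters and digits) is an assumption.

```python
import re

# IB-ORGCODE-yyyymmddThhmmss.zip; the ORGCODE character set is assumed.
ZIP_NAME = re.compile(r"^IB-([A-Z0-9]+)-([0-9]{8}T[0-9]{6})\.zip\Z")


def newest_zip(file_names, orgcode):
    """Return the most recent zip file name for orgcode, or None if there is none."""
    candidates = []
    for name in file_names:
        match = ZIP_NAME.match(name)
        if match and match.group(1) == orgcode:
            # The time stamp has a fixed width, so string order is time order.
            candidates.append((match.group(2), name))
    if not candidates:
        return None
    return max(candidates)[1]


# Possible use in the working directory (hypothetical organization code):
#   import os
#   print(newest_zip(os.listdir("."), "ABC"))
```

A result of None after a run means that no zip file was written, so the error files should be inspected.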
Specific information about your pedigree data, descriptive statistics and a summary of errors are written to the file CheckPedigreeLog.txt.
All errors are listed in detail in the file called CheckPedigreeErrors.txt. The following table describes the brief error messages more fully:
Error message             | Description
Inconsistent duplicates   | An animal appears twice with different sire, dam or birth date
Warning duplicates        | An animal appears twice but with same sire, dam and birth date
Illegal character errors  | The numerical part of the international ID is not valid
Breed-country error       | The breed-country combination is not recognized
Sex coding error          | The sex code is neither M nor F
Parent sex error          | A male animal (or a female) appears in the dam (or sire) column
Birth date errors         | Malformed entry for birth date
Ancestor check            | Animal appears older than its parents or grandparents
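The difference between "Inconsistent duplicates" and "Warning duplicates" can be illustrated with a short sketch. It is not the routine used by CheckPedigree.py. The function name and the short animal IDs in the usage comment are hypothetical, and records are assumed to be available as (animal, sire, dam, birth date) tuples.

```python
def classify_duplicates(records):
    """Split repeated animals into inconsistent and harmless duplicates.

    records: iterable of (animal, sire, dam, birth_date) tuples.
    Returns two sorted lists: (inconsistent, warnings).
    """
    first_seen = {}
    inconsistent = set()
    warnings = set()
    for animal, sire, dam, birth_date in records:
        details = (sire, dam, birth_date)
        if animal not in first_seen:
            first_seen[animal] = details
        elif first_seen[animal] == details:
            warnings.add(animal)         # same sire, dam and birth date
        else:
            inconsistent.add(animal)     # sire, dam or birth date differs
    # An animal with any conflicting record counts as inconsistent only.
    return sorted(inconsistent), sorted(warnings - inconsistent)


# Example with shortened, made-up IDs:
#   classify_duplicates([
#       ("A", "S1", "D1", "20050101"),
#       ("A", "S1", "D1", "20050101"),
#       ("B", "S1", "D1", "20060101"),
#       ("B", "S2", "D1", "20060101"),
#   ])
#   returns (["B"], ["A"])
```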
Note:
Please do not modify the program to circumvent any checks. Doing so would be pointless because the same checking routine is used again inside IDEA to double-check the pedigree file uploaded in the zip file.
If you need assistance, please do not hesitate to contact us at interbull@slu.se.