Differences between revisions 49 and 50
Revision 49 as of 2013-11-22 10:58:35
Size: 69601
Editor: Hjerpe
Comment:
Revision 50 as of 2013-11-22 10:59:39
Size: 69651
Editor: Hjerpe
Comment:


USER MANUAL FOR THE IDEA EBV INTERFACE

Preface

The following is a manual to guide the user through the features of the new IDEA. IDEA stands for Interbull Data Exchange Area. IDEA is a restricted area accessible only to member countries through the Interbull website.

Software

The Software menu gives you access to the Interbull checking programs. By clicking on Software a drop down menu will open and you will be able to choose the type of checking program you are interested in, i.e. Pedigree or Proofs. Under Software you will also find information on the programs and instructions on how to run them.

Proofs Checking Program

A Python program called CheckProofsPara.py checks the 300/700 proof file and the associated 301/701 parameter file for format correctness against Appendices I and II of the IDEA EBV User Manual. If no errors are found in the files, the program prepares a zip file: IB-ORGCODE-IG-yyyymmddThhmmss.zip for conventional MACE and IB-ORGCODE-GG-yyyymmddThhmmss.zip for GMACE. The zip file represents your checked data file to upload to the Interbull Centre IDEA database (https://idea.interbull.org/).

The user instructions and file formats (see Appendix I and II) give details on how to run the program and on the checks performed.
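The naming pattern of the zip file can be illustrated with a short sketch. This is not code from CheckProofsPara.py: the function name `zip_name` is hypothetical, and the timestamp layout (four-digit year, then month, day, "T", hours, minutes, seconds) is an assumption inferred from the example file name IB-CDN-IG-20131021T152303.zip shown in the program output.

```python
from datetime import datetime


def zip_name(orgcode: str, genomic: bool, when: datetime) -> str:
    """Build a name following the IB-ORGCODE-IG/GG-<timestamp>.zip pattern.

    Hypothetical helper for illustration; "IG" marks conventional MACE
    data and "GG" marks GMACE data, as described in the text above.
    """
    kind = "GG" if genomic else "IG"
    # Timestamp layout is an assumption based on the example file name.
    return f"IB-{orgcode}-{kind}-{when:%Y%m%dT%H%M%S}.zip"


print(zip_name("CDN", False, datetime(2013, 10, 21, 15, 23, 3)))
# IB-CDN-IG-20131021T152303.zip
```

Reading the name back in the same way can help you confirm which of several zip files in a working directory is the most recent one.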

Proofs

The Proofs menu gives you access to the main proofs functions which are: Upload, Review, Messages.

Upload and Verify program

By uploading data in IDEA, users no longer need to run the Verify program prior to sending data to the Interbull Centre. The Verify program is run automatically in IDEA during uploading.

The upload functionality for parameter and proof files is available under 'Proofs/Upload'. The only file accepted by IDEA is the zip file (for example IB-ORGCODE-IG-yyyymmddThhmmss.zip) produced by CheckProofsPara.py; no other files are accepted. Once in 'Proofs/Upload', users can browse for the appropriate file and upload it by clicking on "Submit query".

The upload is not processed in real time: the data are placed in a queue and processed at a later time. Right after clicking "Submit query", a message on the screen displays the number of parameter and proof records submitted. A confirmation email will be sent to the email address associated with the user who uploaded the data.
The data go through the following steps during uploading:

  1. CheckProofsPara.py runs once again inside IDEA to ensure correctness of format and contents

  2. The Verify program checks your data against the previously available data.

Confirmation email

Either an "EBV upload success" or an "EBV upload failure" email will be sent to you upon completion of the uploading process. If the uploading is successful the "EBV upload success" email will list some basic information on your data such as:

  1. Number of rows in parameter file associated with proofs
  2. Number of records read from the proof file
  3. Number of animals found by real AID
  4. Number of animals found by alias ID
  5. Number of flagged rows in proof file

The email can contain up to three different WARNINGs, each associated with an attachment:

  1. Warning: [n] animal(s) were referenced in your file, but not present in the pedigree database. These animals were discarded! See a detailed list in the attached 'missing_animals.csv' file.
  2. Warning: [n] animal(s) needing updated pedigree records detected! See 'ped_needed.csv' for a complete list of the animals.
  3. Warning: Use of aliases detected! See 'ped_alias.csv' for a complete list of the animals.

A fourth warning, not associated with any attachment, is generated if the uploaded file re-uploads data for a Breed-Pop-Trait combination that has not been withdrawn first (see Withdraw data):

WARNING: This dataset re-uploads already existing data, of which some has not been withdrawn first. These combinations have been skipped; the proper way to re-upload combinations is to withdraw them first (or ask IBC staff to reset them). These combinations have been skipped:[......]

An important piece of information is the Number of flagged rows in proof file: this number represents the number of discrepancies found by the Verify program. If it is 0 (zero), no discrepancies have been found and your data are automatically submitted for the international genetic evaluation (IGE). If it is not 0 (zero), the Verify program has found discrepancies between your data and the previously available data, or the system has detected animals with missing pedigree. To double-check the data, log in to IDEA and go to Proofs/Review.

Animals not present in the IDEA pedigree or lacking pedigree information (i.e. present in the pedigree database but with sire and dam unknown) will be excluded from the international evaluation.


Table 1 summarizes the action needed upon receipt of a confirmation email with such warnings and attachments:

Attachment: missing_animals.csv
  Meaning: The animals listed in this attachment are not present in the IDEA pedigree.
  Action required:
    1. Log in to IDEA
    2. Go to Proofs/Review
    3. Withdraw the data file
    4. Prepare a file200 for these animals
    5. Upload the file in IDEA pedigree
    6. After receiving the pedigree confirmation email, upload the proof file again
  Consequences: If pedigree is not provided, animals are excluded from the international evaluation.

Attachment: ped_needed.csv
  Meaning: The animals listed in this attachment have sire/dam set to unknown.
  Action required: If you have pedigree information for these animals:
    1. Log in to IDEA
    2. Go to Proofs/Review
    3. Withdraw the data file
    4. Prepare a file200 for these animals
    5. Upload the file in IDEA pedigree
    6. After receiving the pedigree confirmation email, upload the proof file again
  If you do not have such information you can submit the file: you will be asked to write an explanation.
  Consequences: If pedigree is not provided, animals are excluded from the international evaluation.

Attachment: ped_alias.csv
  Meaning: The animal IDs listed in this attachment are alias IDs.
  Action required: You are requested to update your own database with the correct animal IDs.
  Consequences: Alias IDs are automatically switched to their corresponding official IDs.

Refer to the section 'Submit/Withdraw data' for more information

After uploading: What's next?

Uploading represents only the very first step for submitting your data for an IGE. Here is a description of the actions you need to follow to submit your data for an IGE.

Review your data

The Proofs/Review page contains all the information you need to review and submit your data for an IGE.

The page groups this information into three tables: the Filters table, the Central table and the Actions table.

The Filters table allows you to filter and display the information you have uploaded in the way that suits you best. By default the Central table displays all the information associated with your account, but you can customize the display by choosing among:

  • A list of the different breeds you have uploaded data for
  • A list of the populations for which you have uploaded data
  • A list of the traits uploaded
  • A list of the different statuses of the data uploaded
  • A list of the data set codes you have uploaded

Clicking "Reset all" will erase all your previous filters and display again all the information associated to your account.

Review.png

The Central table displays one row for each Breed-Pop-Trait combination you have uploaded. Each row shows the following information:

  • Status: the status of your data. There are five possible statuses: Pending, Submitted, Withdrawed, Accepted, Rejected
  • Flagged: the outcome of the Verify program. Values in this column are either YES or NO. Breed-Pop-Trait combinations flagged YES are also highlighted in yellow.
  • C/G: the nature of your data, Conventional/Genomic. At the moment only Conventional data are accepted.
  • Datasets: the dataset used to upload the given Breed-Pop-Trait combination
  • Reports: for each Breed-Pop-Trait combination you get access to a "Brief" and a "Full" output of the Verify program. The "Bulls" report lists all bulls highlighted by the Verify program.

The Breed-Pop-Trait combinations that are not highlighted and have Flagged=NO are combinations for which the Verify program did not find any discrepancies; they therefore automatically get status=Submitted. No further action is required from you for these combinations.

The Breed-Pop-Trait combinations highlighted in yellow require your attention. For each of them you need to check the Verify output. For your convenience, the "View Summary for Selected" option at the end of the Central table displays the key checking points of the Verify output in a new page. For this option to work you need to select some combinations, either manually or by clicking on "Reverse selection", and then click on "View Summary for Selected".

The Actions table displays the options you have for your data with status=PENDING. You can decide to submit or withdraw such data by selecting the desired action and clicking on "Submit".

Submit data

If, after checking the Verify output, you consider your data to be correct and want to include it in the IGE, select the box beside each Breed-Pop-Trait combination you want to include, select the action "Submit" and click on the "Submit" button.

Every time you submit pending data you are required to explain the reasons for the discrepancies found by the Verify program. If the reasons are breed-trait dependent, you need to process these cases one by one and provide the full explanation in the designated space. If, on the other hand, the same explanation applies to several Breed-Pop-Trait combinations, you can select them together in the Review page so that you only need to write the shared explanation once. When you click "Send message", your message is recorded under IDEA Proofs/Messages and is visible to you and the Interbull Centre staff.

In the Review page, the Breed-Pop-Trait combinations you have submitted will now be displayed with status=Submitted.

On the day of the data submission deadline for a given IGE, routine or test run, all your data in the Review page should have status=SUBMITTED. Your aim is therefore to check all pending data and either provide explanations or withdraw and upload new datasets before the data submission deadline.

Submit_pending.png

Withdraw data

If you realize that the data you have uploaded for some Breed-Pop-Trait combinations are wrong, or you want to upload pedigree information for the animals reported in the confirmation email attachments, you need to withdraw your data before doing anything else. To do so, select the affected Breed-Pop-Trait combinations in the Review page, select the action Withdraw and then press the button "Submit". In the Review page the given Breed-Pop-Trait combination will now have status=WITHDRAWED.

When you withdraw a given Breed-Pop-Trait combination, all records present in IDEA for that combination are deleted, so you have to re-upload the file before the data deadline. It is not acceptable to submit a partial dataset in order to correct the evaluations of some subset of bulls. It is essential that proof records for all bulls be included in the same file for any one brd-pop-trt combination, whether the dataset is a first submission for the IGE run in question or a re-submission with some problem corrected.

If you fail to withdraw all the traits you are going to re-upload in a new file, the confirmation email will contain the following warning:

  • WARNING: This dataset re-uploads already existing data, of which some has not been withdrawn first. These combinations have been skipped; the proper way to re-upload combinations is to withdraw them first (or ask IBC staff to reset them). These combinations have been skipped:[......]

Example:
You notice a problem with the temperament data for RDC. You withdraw only RDC-Tem from your Review table, but the new file you upload also includes data for RDC-Msp. As you did not withdraw the RDC-Msp combination before uploading the new file, all records referring to RDC-Msp are skipped, because data already exist in IDEA for that combination.
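The rule in this example can be expressed as a small sketch. It is only an illustration of the behaviour described above, not IDEA code: the function name `split_reupload` is hypothetical, the population code DFS is chosen arbitrarily for the demonstration, and it is an assumption that any existing status other than Withdrawed causes a combination to be skipped.

```python
def split_reupload(uploaded, existing):
    """Split uploaded combinations into those loaded and those skipped.

    uploaded: iterable of (brd, pop, trt) tuples found in the new file.
    existing: dict mapping (brd, pop, trt) -> current status in IDEA.
    Assumption: only combinations that are absent or Withdrawed are loaded.
    """
    loaded, skipped = [], []
    for combo in sorted(set(uploaded)):
        status = existing.get(combo)
        if status is not None and status != "Withdrawed":
            skipped.append(combo)  # data still present: withdraw first
        else:
            loaded.append(combo)
    return loaded, skipped


existing = {
    ("RDC", "DFS", "tem"): "Withdrawed",  # withdrawn before re-upload
    ("RDC", "DFS", "msp"): "Submitted",   # forgotten, still in IDEA
}
new_file = [("RDC", "DFS", "tem"), ("RDC", "DFS", "msp")]
loaded, skipped = split_reupload(new_file, existing)
print("loaded:", loaded)
print("skipped:", skipped)
```

Checking your own list of combinations in this way before uploading can help you avoid the warning shown above.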

Accept/Reject data by ITBC

Once you have submitted your data for a given IGE, it is up to the Interbull Centre (ITBC) staff to accept or reject it. In general, all data automatically submitted with FLAGGED=NO will also be accepted, as there are no pending issues with these data. For all data with FLAGGED=YES, ITBC staff will go through the explanations you have provided and, if they are found sufficient, will accept the given Breed-Pop-Trait combination. If more clarification is needed, ITBC staff will contact you via IDEA. Whenever the staff post a new message in your IDEA account, an email is sent to your email address informing you about the presence of unreplied messages in IDEA.

If the discrepancies found are considered too large, ITBC staff can reject your data. You can follow what happens to your data in the Review page, as the status changes according to the decision that has been made.

Messages

Proofs/Messages is the place where all your communication with the ITBC staff for a given IGE is displayed. All the explanations you provide for a given Breed-Pop-Trait combination are automatically listed here. Unreplied messages are marked in red. You can use this page to send further messages to ITBC staff. When ITBC staff reply, their messages are also listed here and you are notified by email about the presence of unreplied messages in IDEA.

messages.png

APPENDIX I - Format File300 - Proof file

Col  Name      Start  Format  Description                                      Example
1    rec_type  1      a3      Record type (1)                                  300
2    brd_eval  5      a3      Breed of evaluation (2)                          HOL
3    pop       9      a3      Population code (3)                              USA
4    trt       13     a3      Trait of evaluation (4)                          mil
5    brd_anim  17     a3      Breed of animal                                  HOL
6    cou_orig  20     a3      Country of first registration                    USA
7    sex       23     a1      Sex of animal                                    M
8    id_no     24     a12     Animal identification number                     840M003000336289
9    typ_prf   37     i2      Type of proof (5)                                11
10   off_pub   40     a1      Official publication of proof (6)                Y
11   status    42     i2      Animal status (7)                                10
12   ndau      44     i8      Number of daughters (8)                          115
13   nhrd      52     i8      Number of herds (9)                              75
14   edc       60     i8      Number of effective daughter contributions (10)  133
15   rel       69     f7.4    Repeatability/Reliability (11)                   82
16   ebv       76     f10.    National predicted genetic merit (12)            2.780

(1) Valid record type: 300

(2) Breed codes accepted: BSW = Brown Swiss type; GUE = Guernsey type; HOL = Holstein-Friesian (Black & White) type; JER = Jersey type; RDC = Red Dairy Cattle type; SIM = Simmental type.

(3) Valid population codes: ARG AUS BEL CAN CHE CHR(a) CZE DEA(b) DEU DFS(c) DNR(d) ESP EST FIN FRA FRM(e) FRR(f) GBR HUN IRL ISR ITA JPN LTU LVA NLD NZL POL PRT SVK SWE USA URY ZAF
where: (a) Swiss Red Holstein; (b) Austria + Germany; (c) Denmark + Finland + Sweden; (d) Denmark Red Holstein; (e) France Montbeliarde; (f) French Pie Rouge

(4) Accepted trait abbreviations: Production (mil = milk; fat = fat; pro = protein); Conformation (sta = stature; cwi = chest width; bde = body depth; ang = angularity; ran = rump angle; rwi = rump width; rls = rear-leg set; rlr = rear-leg rear view; fan = foot angle; hde = heel depth/hoof height; fua = fore udder attachment; ruh = rear udder height; ruw = rear udder width; usu = udder support; ude = udder depth; ftp = front teat placement; ftl = (front) teat length; rtp = rear teat placement; ous = overall udder score; ofl = overall feet & legs score; ocs = overall conformation score; bcs = body condition score; loc = locomotion); Udder (scs = somatic cell; mas = mastitis); Longevity (dlo = direct longevity); Calving (dce = direct calving ease; mce = maternal calving ease; dsb = direct stillbirth; msb = maternal stillbirth); Female fertility (hco = heifer conception; crc = cow recycling; cc1 = lactating cow's ability to conceive (1); cc2 = lactating cow's ability to conceive (2); int = interval traits); Workability (msp = milking speed; tem = temperament).

(5) Accepted codes: 00 (unknown); 11 (based on first crop sampling daughters); 12 (based on first and second crop daughters); 13 (based on parent average and genomic information only); 21 (based on imported semen of proven bull, second crop daughters only); 22 (based on mostly, more than 50%, imported daughters or daughters born from imported embryos).

(6) Y (if the bull proof meets national standards for official publication in the country sending information); P (if the bull is part of a simultaneous progeny-testing program, but the proof does not yet meet national standards for official publication); N (otherwise).

(7) Valid codes for status of bulls: 00 (unknown); 10 (bull randomly sampled through an official AI scheme); 15 (young bull, genomically selected); 20 (other bull. Records with “20” in this file will be excluded from the international evaluation, unless type of proof is “21”).

(8) The number of daughters should be positive. For a missing value put 0.

(9) The number of herds should be positive. For a missing value put 0.

(10) Production, conformation, udder health, fertility and workability traits: the weighting factor used for these traits is the effective daughter contribution (EDC), which is described in the Interbull document Code of practice, Appendix IV, “Weighting factor for international genetic evaluation”, updated April 27, 2004. EDC values should be rounded to the nearest integer value.
Calving: the weighting factor used for calving traits is the total number of calvings for the direct effect and the number of daughters with calvings for the maternal effect.
Longevity: the weighting factor used for longevity traits depends on the national genetic evaluation model. For linear models the weighting factor is the same as described above for conformation, fertility, production, udder health and workability traits. For survival models the number of culled daughters is used as the weighting factor.

(11) Reliability values are nationally calculated reliability values expressed in percent with 4 decimals. For a missing value put 0.

(12) National predicted genetic merit values published domestically. For threshold models the submitted values are from the underlying scale. For missing values put 9999999999
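The column positions in the table above can be turned into a simple fixed-width reader. The following is a minimal sketch, not the parser used by CheckProofsPara.py: the names `FILE300_LAYOUT`, `parse_file300_line` and `build_line` are hypothetical, the width of 10 for ebv is an assumption (the format entry in the table is truncated, but the missing-value code 9999999999 has ten digits), and the sample values are illustrative only.

```python
# (name, 1-based start column, width), taken from the Appendix I table.
# Assumption: ebv is 10 characters wide (its format entry is truncated).
FILE300_LAYOUT = [
    ("rec_type", 1, 3), ("brd_eval", 5, 3), ("pop", 9, 3), ("trt", 13, 3),
    ("brd_anim", 17, 3), ("cou_orig", 20, 3), ("sex", 23, 1),
    ("id_no", 24, 12), ("typ_prf", 37, 2), ("off_pub", 40, 1),
    ("status", 42, 2), ("ndau", 44, 8), ("nhrd", 52, 8), ("edc", 60, 8),
    ("rel", 69, 7), ("ebv", 76, 10),
]


def parse_file300_line(line: str) -> dict:
    """Slice one fixed-width proof record into stripped string fields."""
    return {name: line[start - 1:start - 1 + width].strip()
            for name, start, width in FILE300_LAYOUT}


def build_line(fields: dict) -> str:
    """Demo helper: write each value right-aligned at its column."""
    buf = [" "] * 85
    for name, start, width in FILE300_LAYOUT:
        value = str(fields[name])[:width].rjust(width)
        buf[start - 1:start - 1 + width] = value
    return "".join(buf)


sample = build_line({
    "rec_type": "300", "brd_eval": "HOL", "pop": "USA", "trt": "mil",
    "brd_anim": "HOL", "cou_orig": "USA", "sex": "M",
    "id_no": "003000336289", "typ_prf": 11, "off_pub": "Y", "status": 10,
    "ndau": 115, "nhrd": 75, "edc": 133, "rel": "82.0000", "ebv": "2.780",
})
record = parse_file300_line(sample)
print(record["id_no"], record["ndau"], record["ebv"])
```

Keeping the layout in a single list means the same definition can be used both to read a record and to write one, which makes it easier to spot a field that has drifted out of its column.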

APPENDIX II - Format File301 - Parameter file

Col  Name      Start  Format  Description                     Example
1    rec_type  1      a3      Record type (1)                 301
2    brd_eval  5      a3      Breed of evaluation (2)         HOL
3    pop       9      a3      Population code (3)             USA
4    trt       13     a3      Trait of evaluation (4)         scs
5    evdate    17     i8      National evaluation date (5)    20121201
6    herit     26     f8.6    Heritability (6)                0.12
7    refbase   35     a7      Reference base definition (7)   H10CB05
8    pgmdef    44     a2      Genetic merit definition (8)    T-
9    pub_rule  47     a1      Official publication rules (9)  Y

(1) Valid record type: 301

(2) Breed codes accepted: BSW = Brown Swiss type; GUE = Guernsey type; HOL = Holstein-Friesian (Black & White) type; JER = Jersey type; RDC = Red Dairy Cattle type; SIM = Simmental type.

(3) Valid population codes: ARG AUS BEL CAN CHE CHR(a) CZE DEA(b) DEU DFS(c) DNR(d) ESP EST FIN FRA FRM(e) FRR(f) GBR HUN IRL ISR ITA JPN LTU LVA NLD NZL POL PRT SVK SWE USA URY ZAF
where: (a) Swiss Red Holstein; (b) Austria + Germany; (c) Denmark + Finland + Sweden; (d) Denmark Red Holstein; (e) France Montbeliarde; (f) French Pie Rouge

(4) Accepted trait abbreviations: Production (mil = milk; fat = fat; pro = protein); Conformation (sta = stature; cwi = chest width; bde = body depth; ang = angularity; ran = rump angle; rwi = rump width; rls = rear-leg set; rlr = rear-leg rear view; fan = foot angle; hde = heel depth/hoof height; fua = fore udder attachment; ruh = rear udder height; ruw = rear udder width; usu = udder support; ude = udder depth; ftp = front teat placement; ftl = (front) teat length; rtp = rear teat placement; ous = overall udder score; ofl = overall feet & legs score; ocs = overall conformation score; bcs = body condition score; loc = locomotion); Udder (scs = somatic cell; mas = mastitis); Longevity (dlo = direct longevity); Calving (dce = direct calving ease; mce = maternal calving ease; dsb = direct stillbirth; msb = maternal stillbirth); Female fertility (hco = heifer conception; crc = cow recycling; cc1 = lactating cow's ability to conceive (1); cc2 = lactating cow's ability to conceive (2); int = interval traits); Workability (msp = milking speed; tem = temperament).

(5) National evaluation dates are expressed using the format YYYYMMDD

(6) Heritability for a specific trait in format f8.6. Should be larger than 0 and smaller than 0.999999

(7) Reference (genetic) base definition in the country sending information: breed initial (1 char); year established (YY); bull (B) or cow (C) (1 char); birth (B), calving (C) or evaluation (E) (1 char); year of event (YY; use the middle year if based on multiple years). For the breed initial see the breed codes in footnote (2) (use X if based on multiple breeds). E.g. H00BB95 means a base established in 2000, based on Holstein bulls born in 1995.

(8) Genetic merit definition consists of a letter and a sign. Letter: B = Breeding value, T = Transmitting ability. Sign: ‘+’ = higher values are desirable, ‘-’ = lower values are desirable

(9) Allowed characters: Y = Yes or N = No
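The footnotes above translate into a handful of simple field checks. The sketch below illustrates those rules and is not the checking routine of CheckProofsPara.py: the names `REFBASE_RE` and `check_param_fields` are hypothetical, the pattern accepts any upper-case letter as breed initial (an assumption, since the footnote only refers to the breed codes and X), and the heritability bounds are taken literally from footnote (6).

```python
import re

# breed initial, year established (YY), B/C (bull or cow),
# B/C/E (birth, calving or evaluation), year of event (YY)
REFBASE_RE = re.compile(r"[A-Z]\d{2}[BC][BCE]\d{2}")


def check_param_fields(herit, refbase, pgmdef, pub_rule):
    """Return a list of problems found in four file301 fields."""
    errors = []
    if not 0 < herit < 0.999999:
        errors.append(f"heritability out of range: {herit}")
    if not REFBASE_RE.fullmatch(refbase):
        errors.append(f"base definition error: <{refbase}>")
    if len(pgmdef) != 2 or pgmdef[0] not in "BT" or pgmdef[1] not in "+-":
        errors.append(f"genetic merit definition error: <{pgmdef}>")
    if pub_rule not in ("Y", "N"):
        errors.append(f"publication rule error: <{pub_rule}>")
    return errors


print(check_param_fields(0.12, "H10CB05", "T-", "Y"))   # no problems
print(check_param_fields(0.12, "BB12345", "T-", "Y"))   # bad base definition
```

The second call uses the malformed base definition BB12345 that also appears in the error example of the user instructions.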

APPENDIX III - Frequently Asked Questions

My data has been rejected, what can I do?

National data can sometimes be rejected by ITBC staff if changes in sire standard deviations between previous and current data are larger than 5% (in the case of a routine run), or if problems are found in the data. If your data are rejected, the previously available data will be used for the IGE. If you are able to fix the problem(s) and the data deadline has not yet passed, you can upload a new, corrected dataset. You will need to check the confirmation email and the Verify output, and submit the data in the Review page again.

I discovered a problem with a dataset that got automatically submitted, what do I do?

Inform ITBC staff about the problem and the breed-pop-trait combination affected by it. If the data submission deadline has not yet passed, ITBC staff will reset your data so that you can upload a new file. You will need to check the confirmation email and the Verify output, and submit the data in the Review page again.

General information

A Python program called CheckProofsPara.py checks the 300-MACE/700-GMACE proof file and the associated 301-MACE/701-GMACE parameter file for format correctness, as described in the IDEA EBV User Manual, Appendix I and Appendix II. If no errors are found in the files, the program prepares a zip file: IB-ORGCODE-IG-yyyymmddThhmmss.zip for conventional MACE and IB-ORGCODE-GG-yyyymmddThhmmss.zip for GMACE.

The zip file contains the input proof and parameter files, renamed to proof.dat and param.dat, respectively. The program requires access to the internet, specifically to a few functions/pages in the IDEA web application. First, to ensure that the user has the most recent version of the software, the program checks its internal version against the version stored on the Interbull server. If the versions do not match, a message is printed, the program exits, and the program must be downloaded again. Lists of valid orgcodes and of the breed-pop-trait combinations for which the user's organization has EBV upload authority are also obtained from the IDEA web application. If errors occur, they are listed on the screen and no zip file is created. The zip file represents your checked data file to upload to the Interbull Centre IDEA database (https://idea.interbull.org/). For technical reasons, the program rejects files containing more than a million records.

  • Please do not modify the program to circumvent any checks. Doing so would be pointless because the same checking routine is used again inside IDEA to double-check the data file uploaded in the zip file.

One zip file may contain records for as many or as few brd-pop-trt combinations as desired. All traits in a trait group can be put in the same file, as in the past, but this is not essential. There is also an option to upload evaluations for all breeds, populations and traits in a single file if that is more convenient. Every proof file must be accompanied by a parameter file, and each record in the proof file must have associated information for its brd-pop-trt combination in the parameter file. Extra records present in the parameter file are simply ignored. This allows users to maintain a single parameter file to be submitted with several proof files, if desired. A proof/parameter file may contain only conventional (300/301) or only GEBV (700/701) records, not both. CheckProofsPara.py will fail if both record types are found in one file. When checking the proof file, only the first 10 errors of each type are listed.
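The pairing rules in the paragraph above can be sketched as follows. This is an illustration under stated assumptions, not the logic of CheckProofsPara.py: the function name `check_file_pair` is hypothetical, and each record is assumed to have been reduced beforehand to a (rec_type, brd, pop, trt) tuple.

```python
from collections import Counter

CONVENTIONAL = {"300", "301"}
GENOMIC = {"700", "701"}


def check_file_pair(proof_keys, param_keys):
    """Check a proof/parameter pair; keys are (rec_type, brd, pop, trt)."""
    errors = []
    rec_types = {k[0] for k in proof_keys} | {k[0] for k in param_keys}
    if rec_types & CONVENTIONAL and rec_types & GENOMIC:
        errors.append("conventional and GEBV record types are mixed")
    param_combos = {k[1:] for k in param_keys}
    counts = Counter(k[1:] for k in proof_keys)  # records per combination
    for combo in sorted(counts):
        if combo not in param_combos:
            errors.append("no parameter record for " + " ".join(combo))
    return errors, counts


proof = [("300", "GUE", "CAN", trt)
         for trt in ("fat", "mil", "pro") for _ in range(103)]
# The extra scs parameter record is allowed and simply ignored.
param = [("301", "GUE", "CAN", trt) for trt in ("fat", "mil", "pro", "scs")]
errors, counts = check_file_pair(proof, param)
print(errors)
for combo in sorted(counts):
    print(*combo, counts[combo])
```

The per-combination counts mirror the "Record counts by breed_population_trait combination" lines that the program prints when no errors are found.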

It is essential that proof records for all bulls are included in the same file for any one brd-pop-trt combination, whether the dataset is a first submission for the IGE run in question or a re-submission with some problems corrected. When new data are uploaded for any combination, all records present in IDEA for that combination are deleted prior to loading the new records. Thus, it is not acceptable to submit a partial dataset in order to correct the evaluations of some subset of bulls. If no errors are found, a zip file is created with the name IB-ORGCODE-IG-yyyymmddThhmmss.zip (IB-ORGCODE-GG-yyyymmddThhmmss.zip for GMACE) containing the input proof and parameter files.

  • If you need assistance, please do not hesitate to contact us at interbull@slu.se .

Before Running the Programs

  • a. Ensure there is a working network connection
  • b. Install Python (Python3, minimum 3.6). Please note that Python2 is no longer supported by developers.

  • c. Create a working directory/folder
  • d. Download the CheckProofsPara.py program from https://idea.interbull.org/software and copy it to your new directory

  • e. Copy your proof and parameter files to the working directory

The Program

  • Execute: python CheckProofsPara.py -m <ORGCODE> -f <prooffile> -g <paramfile> [-o <outpath>] [-s <change_scale>]

    • where:
    • <ORGCODE>=the assigned member organization code (upper case) as shown on the upper right hand side of the IDEA page. Your organization code is reported within brackets beside the "Logged in as" information

    • <prooffile>= /path/to/filename of the format 300/700 national EBV/GEBV file

    • <paramfile>= /path/to/filename of the format 301/701 evaluation parameter file

    • <outpath>= optional path for creation of the zipfile for uploading

    • <change_scale> = optional; to be used if the new data have a different scale compared to the previous data uploaded in IDEA

    • Example on how to run the program:

    • For MACE write: python CheckProofsPara.py -m CDN -f file300.GUE.CAN.prod -g file301.GUE.CAN.prod

    • For GMACE write: python CheckProofsPara.py -m CDN -f file700.GUE.CAN.prod -g file701.GUE.CAN.prod

  • Output: Any errors are displayed on the screen and can be redirected to a log file if desired.

    • Example of output on the screen in case of no errors:
      • Running CheckProofsPara.py version 2013-06-04 v0.6, provided by the Interbull Centre

      • Organization code : CDN
      • Parameter file : file301.GUE.CAN.prod
      • Proof file : file300.GUE.CAN.prod
      • 0 errors in 3 lines from paramfile
      • 0 errors in 309 lines from prooffile
      • Record counts by breed_population_trait combination
      • GUE CAN fat 103
      • GUE CAN mil 103
      • GUE CAN pro 103
      • Everything OK. Zip file is ready for upload. IB-CDN-IG-20131021T152303.zip
    • In case of errors, no zip file will be created. Correct the data and re-run the program until the data successfully pass all required checks. The first 10 errors of each kind will be printed on the screen.
    • Example of output in case of errors.
      • Running CheckProofsPara.py version 2013-06-04 v0.6, provided by the Interbull Centre

      • Organization code : CDN
      • Parameter file : file701.HOL.CAN.work
      • Proof file : file700.HOL.CAN.work
      • 2 errors in 2 lines from paramfile
      • Parameter line 1. Base definition error: <BB12345>. See file format (eg. H00BB95)

      • Parameter line 2. Base definition error: <BB12345>. See file format (eg. H00BB95)

      • Error(s) in parameter file. Skipping proof file.
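If the check is run regularly, the command line described above can be assembled by a small wrapper script. The sketch below only builds the argument list from the documented options -m, -f, -g and -o; the function name `build_check_command` and the log file name check.log are hypothetical, and the -s option is left out because its exact usage is not described here.

```python
import subprocess
import sys


def build_check_command(orgcode, prooffile, paramfile, outpath=None):
    """Assemble the argument list for one CheckProofsPara.py run."""
    cmd = [sys.executable, "CheckProofsPara.py",
           "-m", orgcode, "-f", prooffile, "-g", paramfile]
    if outpath:
        cmd += ["-o", outpath]  # optional folder for the zip file
    return cmd


cmd = build_check_command("CDN", "file300.GUE.CAN.prod",
                          "file301.GUE.CAN.prod")
print(" ".join(cmd[1:]))

# To run the program and keep the screen output in a log file
# (not executed in this sketch):
# with open("check.log", "w") as log:
#     subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
```

Passing the arguments as a list avoids quoting problems when file paths contain spaces.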

General information

Interbull Centre is providing two convert programs to help member organizations with a smooth transition from the old file formats to the new database file formats:

  • 1) convert_params.py ==> converts the old parameter file format to the new parameter file format (301 for MACE and 701 for GMACE)

  • 2) convert_proofs.py ==> converts the old proof file format to the new proof file format (300 for MACE and 700 for GMACE)

The programs will be available in IDEA for the first 6 months of the EBV module introduction and can be run by an overall script for more efficiency.

Before Running the Programs

  • a. Install Python (Python3, minimum 3.6). Please note that Python2 is no longer supported by developers.

  • b. Create a working directory/folder
  • c. Copy the python programs to the new directory
  • d. Copy the files you want to convert to the working directory (This is optional. The path to the file can also be written by the prompt as /path/filename)

convert_params.py

  • General info: This program converts the traditional parameter files to the new 301(MACE) and 701(GMACE) file format that will be used for loading parameters to the Interbull Centre Database.


      • Please note that some extra fields need to be added in the current parameter file for the program to run correctly:

      • a) Genetic merit (GE) abbreviation (T or B ) after the publication column
      • b) Reference base definition (REF_BASE_DEF) as the last column
      • The new parameter file format after the adjustments will be (each field separated by a space):
      • EBRD POP TRAIT EVDATE HER MEANNEW SIGN STDNEW MEANOLD STDOLD PUB GE REF_BASE_DEF

      • An example:
      • RDC cou dlo 20130801 0.0975 100.0000 +1 5.0000 0.0115 0.0550 N B R13CB06

      • Useful links:

      • Explanation to parameter file codings: https://wiki.interbull.org/public/parameter%20file?action=print&rev=3

      • Description of REF_BASE_DEF : https://wiki.interbull.org/public/file010?action=print&rev=19

  • Execute: python convert_params.py name_of_parameterfile outputfile type ebrd


      • Where
      • parameterfile=parameter file to be converted
      • outputfile=name of the output file (file301.BRD.COU.trtgrp for MACE, file701.BRD.COU.trtgrp for GMACE)
      • type=whether the parameters are for GMACE or MACE
      • ebrd=breed of evaluation (BSW, GUE, JER, HOL, RDC or SIM)
      • Example on how to run the program:

      • For MACE write: python convert_params.py parameter.can file301.HOL.CAN.prod MACE HOL

      • For GMACE write: python convert_params.py parameter.can file701.HOL.CAN.prod GMACE HOL

  • Output screen: Information/errors from the program will be printed to the screen.


      • If the program runs without finding any errors in the infile, a message like the following is written to the screen (an example):

      • Now running the convert program for parameters
      • ****************************************************************
      • Infile : parameter.can
      • File is written for : GMACE
      • Breed : HOL
      • Number of records read from parameter file : 186
      • Number of records with bad fileformat : 0
      • Number of records written to outputfile : 3
      • Outputfile : file701.HOL.CAN.prod
      • ****************************************************************
      • Here are some error codes that can pop up:
      • Error: *** WARNING. Skipping bad row in parameter.can Check the file format!

      • RDC xxx dl 20130401 0.0975 100.0000 +1 5.0000 0.0138 0.0558 N B
      • Explanation: This message indicates that a row in the input file does not follow the expected file format (in this case the GE and ref_base_def are missing).

      • Error: Woops, You want to run for RDC but the output file is for HOL, Try again

      • Explanation: This error pops up if the breed name in the output file name differs from the breed name specified at the prompt as the “ebrd”. An example: python convert_org_params.py parameter.can file701.HOL.XXX.prod GMACE RDC

      • Error: Woops, file type 301 only goes with MACE and not GMACE. Try again

      • Explanation: This error pops up if there is a mismatch between the record type in the output file name and the “type” specified at the prompt. Record type 301 goes with type MACE and record type 701 goes with type GMACE. An example: python convert_org_params.py parameter.can file301.HOL.XXX.prod GMACE HOL would trigger the above error message.

  • Output file: By default, the output file is written to the directory where the program is located. In IDEA all traits are coded with a three-letter abbreviation. The program automatically converts the old trait codes to the new ones.
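The two "Woops" errors described above both come from arguments that disagree with the output file name. The sketch below shows one way to check this before running the converter. It is not the converter's own code: the helper name check_arguments and its messages are hypothetical, and it assumes the output file name has the form fileNNN.BRD.COU.trtgrp given in the argument list.

```python
# Record type expected in the output file name for each run type,
# as stated in the manual (301 for MACE, 701 for GMACE).
RECORD_TYPE = {"MACE": "301", "GMACE": "701"}


def check_arguments(outputfile, run_type, ebrd):
    """Return a list of problems found; an empty list means consistent.

    Hypothetical helper. Assumes the output file name has the form
    fileNNN.BRD.COU.trtgrp described in the manual.
    """
    parts = outputfile.split(".")
    if len(parts) != 4 or not parts[0].startswith("file"):
        return ["output file name is not of the form fileNNN.BRD.COU.trtgrp"]
    record_type = parts[0][len("file"):]
    breed = parts[1]
    problems = []
    if run_type not in RECORD_TYPE:
        problems.append("type must be MACE or GMACE")
    elif RECORD_TYPE[run_type] != record_type:
        problems.append(
            "record type %s does not go with %s" % (record_type, run_type))
    if breed != ebrd:
        problems.append(
            "output file is for %s but ebrd is %s" % (breed, ebrd))
    return problems


# A consistent call, followed by the two failing examples from the manual.
print(check_arguments("file701.HOL.CAN.prod", "GMACE", "HOL"))
print(check_arguments("file301.HOL.XXX.prod", "GMACE", "HOL"))
print(check_arguments("file701.HOL.XXX.prod", "GMACE", "RDC"))
```

Returning a list rather than stopping at the first problem lets a single run report a wrong record type and a wrong breed together.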

convert_proofs.py

  • General info: This program converts the traditional genetic merit files to the new 300 (MACE) and 700 (GMACE) file formats that will be used for loading breeding values to the Interbull Centre Database.

  • Execute: python convert_org_proofs.py proof_file outputfile type ebrd


Where:

  • proof_file=proof file to be converted
  • outputfile=name of the output file (file300.BRD.COU.trtgrp for MACE, file700.BRD.COU.trtgrp for GMACE)
  • type=whether the breeding values are for MACE or GMACE
  • ebrd=breed of evaluation (BSW, GUE, JER, HOL, RDC or SIM)

Example on how to run the program:

  • For MACE write: python convert_proofs.py file010gue.can file300.GUE.CAN.prod MACE GUE

  • For GMACE write: python convert_proofs.py file710gue.can file700.GUE.CAN.prod GMACE GUE

Output screen: Information/errors from the program will be printed to the screen.


  • If the program finds no errors in the input file, a message like the following is written to the screen:

    • Now running convert program for breeding values
    • **************************************************
    • Infile : file010.nld
    • Record type in file : 010
    • Traits : mil fat pro
    • Breed of interest : HOL
    • Total number of records read from file : 33782
    • Rec. read for the specific breed : 29022
    • Records below is the number of records read for the breed times number of traits
    • Total number of records read from file for the specific breed : 87066
    • Records with missing values skipped for the specific breed : 40254
    • Records written : 46812
    • Outputfile : file300.HOL.NLD.prod
    • **************************************************
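The counts in the example output above are related by simple arithmetic, which can help when reading your own output. The sketch below reproduces them. The relation "written = total for the breed minus skipped" is an inference from these example numbers, not something the manual states, and the variable names are illustrative.

```python
# Values copied from the example output above.
records_for_breed = 29022        # "Rec. read for the specific breed"
traits = ["mil", "fat", "pro"]   # "Traits"
skipped = 40254                  # records with missing values skipped

# One record per trait for each record read for the breed.
total_for_breed = records_for_breed * len(traits)

# Assumed relation: every record is either skipped or written.
written = total_for_breed - skipped

print(total_for_breed, written)
```

With the example values this gives 87066 records for the breed and 46812 records written, matching the figures printed by the program.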

Here are some error codes that can pop up.

  • Error: Woops, 300 only goes with MACE. Try again / Woops, 700 only goes with GMACE. Try again

  • Explanation: This error pops up if the record type does not match the type. Record type 300 goes with type MACE and record type 700 goes with type GMACE. An example: python convert_org_proofs.py file010gue.xxx file300.GUE.XXX.prod GMACE GUE

  • Error: Woops, the breed in the output filename is not consistent with the breed in the command line. Try again

  • Explanation: This error pops up if the breed name in the output file name differs from the breed specified as ebrd in the command. An example: python convert_org_proofs.py file010gue.can file700.GUE.CAN.prod GMACE HOL will trigger the above error message.

Output file: By default, the output file (file300.brdu.couu.trt for MACE and file700.brdu.couu.trt for GMACE) is written to the directory where the program is located.

public/IDEA_EBV_UserManual (last edited 2022-01-11 12:02:56 by Valentina)