Differences between revisions 43 and 61 (spanning 18 versions)
Revision 43 as of 2013-11-22 10:47:49
Size: 6441
Editor: Hjerpe
Comment:
Revision 61 as of 2013-12-11 13:11:53
Size: 6666
Editor: Valentina
Comment:
General information

A Python program called CheckProofsPara.py checks the 300-MACE/700-GMACE proof file format and the associated 301-MACE/701-GMACE parameter file for format correctness, as described in Appendix I and Appendix II of the IDEA EBV User Manual. If no errors are found, the program prepares a zip file named IB-ORGCODE-IG-yyyymmddThhmmss.zip for conventional MACE or IB-ORGCODE-GG-yyyymmddThhmmss.zip for GMACE.

The zip file contains the input proof and parameter files, renamed to proof.dat and param.dat, respectively. The program requires internet access, specifically to a few functions/pages in the IDEA web application. First, to ensure that the user has the most recent version of the software, the program checks its internal version against the version stored on the Interbull server. If the versions do not match, a message is printed and the program exits; the program must then be downloaded again. Lists of valid organization codes and of the breed-pop-trait combinations for which the user's organization has EBV upload authority are also obtained from the IDEA web application. If errors occur, they are listed on the screen and no zip file is created. The zip file is your checked data file, ready to upload to the Interbull Centre IDEA database (https://idea.interbull.org/). For technical reasons, the program rejects files containing more than one million records.

  • Please do not modify the program to circumvent any checks. Doing so would be pointless because the same checking routine is used again inside IDEA to double-check the data file uploaded in the zip file.

One zip file may contain records for as many or as few brd-pop-trt combinations as desired. All traits in a trait group can be put in the same file, as they have been in the past, but this is not essential. There is also an option to upload evaluations for all breeds, populations and traits in a single file if that is more convenient. Every proof file must be accompanied by a parameter file, and each record in the proof file must have associated information for a single brd-pop-trt combination in the parameter file. Extra records present in the parameter file are simply ignored. This allows users to maintain a single parameter file to be submitted with several proof files, if desired. A proof/parameter file may contain only conventional (300/301) or only GEBV (700/701) records, not both. CheckProofsPara.py will fail if both record types are found in one file. When checking the proof file, only the first 10 errors of each type are listed.
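The pairing rule between proof and parameter records can be pictured with a small sketch. This is not the code used by CheckProofsPara.py: the function name check_combinations and the representation of a record as a (breed, population, trait) tuple are assumptions made for illustration, and parsing of the real 300/301 or 700/701 layouts (Appendix I and II) is not shown.

```python
from collections import Counter


def check_combinations(proof_combos, param_combos):
    """Count proof records per combination and list combinations that
    have no matching entry in the parameter file.

    Both arguments are iterables of (breed, population, trait) tuples.
    This tuple form is hypothetical; it is not the file format itself.
    """
    counts = Counter(proof_combos)
    # Extra parameter entries are ignored; only proof-side gaps are reported.
    missing = sorted(set(counts) - set(param_combos))
    return counts, missing


proof = [("GUE", "CAN", "fat"), ("GUE", "CAN", "mil"), ("GUE", "CAN", "fat")]
param = [("GUE", "CAN", "fat"), ("HOL", "CAN", "pro")]
counts, missing = check_combinations(proof, param)
print(missing)
```

In this example the extra HOL CAN pro parameter entry is harmless, while the GUE CAN mil proof record is reported because it has no parameter entry.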

It is essential that proof records for all bulls are included in the same file for any one brd-pop-trt combination, whether the dataset is a first submission for the IGE run in question or a re-submission with some problems corrected. When new data are uploaded for any combination, all records present in IDEA for that combination are deleted prior to loading the new records. Thus, it is not acceptable to submit a partial dataset in order to correct the evaluations of some subset of bulls. If no errors are found, a zip file is created with the name IB-ORGCODE-IG-yyyymmddThhmmss.zip (IB-ORGCODE-GG-yyyymmddThhmmss.zip for GMACE) containing the input proof and parameter files.
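As an illustration of the naming convention and zip contents described above, here is a minimal sketch. It is not taken from CheckProofsPara.py: the function names zip_name and build_zip are hypothetical, and the timestamp layout is inferred from the sample name IB-CDN-IG-20131021T152303.zip shown in the program output.

```python
import io
import zipfile
from datetime import datetime


def zip_name(orgcode, genomic, when):
    """Build the upload file name: IG for conventional MACE, GG for GMACE."""
    kind = "GG" if genomic else "IG"
    return "IB-%s-%s-%s.zip" % (orgcode, kind, when.strftime("%Y%m%dT%H%M%S"))


def build_zip(proof_bytes, param_bytes):
    """Return zip data holding the two inputs under their fixed names."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("proof.dat", proof_bytes)  # renamed proof file
        zf.writestr("param.dat", param_bytes)  # renamed parameter file
    return buf.getvalue()


print(zip_name("CDN", False, datetime(2013, 10, 21, 15, 23, 3)))
```

A zip file built this way has not passed any checks; only the file produced by CheckProofsPara.py should be uploaded.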

  • If you need assistance, please do not hesitate to contact us at interbull@slu.se.

Before Running the Programs

  • a. Ensure there is a working network connection
  • b. Install Python (version 2.6 to 2.7) if necessary
  • c. Create a working directory/folder
  • d. Download the CheckProofsPara.py program from https://idea.interbull.org/software and copy it to your new directory
  • e. Copy your proof and parameter files to the working directory
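The preparation steps can be verified before running the checker with a short sketch like the one below. The helper names python_version_supported and missing_files are hypothetical and are not part of CheckProofsPara.py; the version range is the 2.6 to 2.7 requirement stated above.

```python
import os
import sys


def python_version_supported(version_info):
    """True if the interpreter version is within 2.6 to 2.7."""
    return (2, 6) <= tuple(version_info[:2]) <= (2, 7)


def missing_files(workdir, filenames):
    """Return the names from filenames that are not present in workdir."""
    return [name for name in filenames
            if not os.path.isfile(os.path.join(workdir, name))]


print(python_version_supported(sys.version_info))
print(missing_files(".", ["CheckProofsPara.py"]))
```

An empty list from missing_files means that every listed file was found in the working directory.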

The Program

  • Execute: python CheckProofsPara.py -m <ORGCODE> -f <prooffile> -g <paramfile> [-o <outpath>] [-s <change_scale>]

    • where:
    • <ORGCODE> = the assigned member organization code (upper case) as shown on the upper right-hand side of the IDEA page. Your organization code is reported within brackets beside the "Logged in as" information

    • <prooffile>= /path/to/filename of the format 300/700 national EBV/GEBV file

    • <paramfile>= /path/to/filename of the format 301/701 evaluation parameter file

    • <outpath>= optional path for creation of the zipfile for uploading

    • <change_scale> = optional; use it if the new data have a different scale from the data previously uploaded to IDEA

    • Example of how to run the program:

    • For MACE write: python CheckProofsPara.py -m CDN -f file300.GUE.CAN.prod -g file301.GUE.CAN.prod

    • For GMACE write: python CheckProofsPara.py -m CDN -f file700.GUE.CAN.prod -g file701.GUE.CAN.prod

  • Output: Any errors are displayed on the screen and can be redirected to a log file if desired.

    • Example of output on the screen in case of no errors:
      • Running CheckProofsPara.py version 2013-06-04 v0.6, provided by the Interbull Centre

      • Organization code : CDN
      • Parameter file : file301.GUE.CAN.prod
      • Proof file : file300.GUE.CAN.prod
      • 0 errors in 3 lines from paramfile
      • 0 errors in 309 lines from prooffile
      • Record counts by breed_population_trait combination
      • GUE CAN fat 103
      • GUE CAN mil 103
      • GUE CAN pro 103
      • Everything OK. Zip file is ready for upload. IB-CDN-IG-20131021T152303.zip
    • In case of errors, no zip file will be created. Correct the data and re-run the program until the data successfully pass all required checks. The first 10 errors of each kind will be printed on the screen.
    • Example of output in case of errors:
      • Running CheckProofsPara.py version 2013-06-04 v0.6, provided by the Interbull Centre

      • Organization code : CDN
      • Parameter file : file701.HOL.CAN.work
      • Proof file : file700.HOL.CAN.work
      • 2 errors in 2 lines from paramfile
      • Parameter line 1. Base definition error: <BB12345>. See file format (eg. H00BB95)

      • Parameter line 2. Base definition error: <BB12345>. See file format (eg. H00BB95)

      • Error(s) in parameter file. Skipping proof file.
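For users who prefer to script the call and keep the screen output in a log file, a minimal wrapper might look like the sketch below. The function names build_command and run_and_log are hypothetical. The options are the -m, -f, -g and -o options listed above; -s is left out because its accepted values are not described here. The meaning of the program's exit code is also not documented on this page, so the sketch returns it without interpreting it.

```python
import subprocess


def build_command(orgcode, prooffile, paramfile, outpath=None):
    """Assemble the argument list for CheckProofsPara.py."""
    cmd = ["python", "CheckProofsPara.py",
           "-m", orgcode, "-f", prooffile, "-g", paramfile]
    if outpath is not None:
        cmd += ["-o", outpath]  # optional directory for the zip file
    return cmd


def run_and_log(cmd, logfile):
    """Run the command, writing screen output and errors to logfile."""
    with open(logfile, "w") as log:
        return subprocess.call(cmd, stdout=log, stderr=subprocess.STDOUT)


print(" ".join(build_command("CDN", "file300.GUE.CAN.prod",
                             "file301.GUE.CAN.prod")))
```

Calling run_and_log(build_command(...), "check.log") from the working directory would then leave the messages shown in the examples above in check.log.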

public/CheckProofPara (last edited 2021-09-03 10:20:08 by Valentina)