← Revision 43 as of 2024-12-04 10:53:06 ⇥
Line 5: | Line 5: |
This document describes the procedures for the upload and download of the data for the [[https://genoex.org/|GenoEx-PSE]] service. | This document describes the procedures for the upload and download of the data for the [[https://genoex.org/|GenoEx-PSE]] service. The PSE Code of Practice can be found [[https://interbull.org/ib/pse_cop|here]]. |
Line 7: | Line 7: |
It describes in details required file formats, login process and usage of the service itself both for data upload and download. | The User Manual describes in detail the required file formats, login process and usage of the service itself both for data upload and data download. |
Line 11: | Line 11: |
== 1. File formats == The SNP genotype data and the related information submitted (uploaded) to !GenoEx-PSE is categorized in two groups: |
== Login == After you have signed the User’s Agreement and joined the service, you will receive login details from Interbull Center for the users you have requested access for. From this moment you are entitled to use the !GenoEx-PSE service via [[https://genoex.org/|genoex.org]]. |
Line 14: | Line 14: |
'''A)''' Information related to a group of animals with a SNP genotype to upload ('meta.csv' file) | The '''login''' tab can be found on the top right corner of the front page. If you have lost or forgotten your password, a new password can be requested from the '''Lost Password''' page, which can be accessed both under the login tab and from the login page. Once you log in you can change your password by pressing the '''Change password''' under your username tab in the top right corner. We highly recommend that you change the password from the one provided to you, at the very first login. |
Line 16: | Line 16: |
'''B)''' Information related to the actual SNP genotypes for the animals in A) ('snps.csv' file) | Under the '''About''' tab, you will find links to all documentation relevant for the !GenoEx-services. |
Line 18: | Line 18: |
At each upload from a Service User, both files must be zipped together and submitted at the same time. Data submitted in one file will not be processed until both files are available. | Under the '''Contact''' tab you will find information needed to contact Interbull Center should you have any questions, comments or concerns regarding the platform or the service. |
Line 20: | Line 20: |
Data extracted from the !GenoEx-PSE database generally follow the same format as the uploaded data. However, they contain an additional column with a unique ''Upload ID'' that makes it possible to match the meta.csv and snps.csv files coming from one upload event. The ''Upload ID'' therefore allows distinguishing multiple records of one individual coming from different sources and/or genotyping events. | On the '''System Data''' page, four tables are presented: ''SNP Arrays'', ''Laboratories'', ''Breeds'', and ''Countries''. All of the tables have clickable headers and a search field to make it easier to look up information. The information in these tables represents the only allowed values for SNP array, genotyping laboratory, breed and country in the files to be uploaded. See the section on file formats for more information. |
Line 22: | Line 22: |
=== 1A - meta.csv file - Information Related to a Group of Animals ===
This file is required to be a variable-length, comma-delimited file in .csv format that includes a single record for each animal whose SNP genotype details are reported in the snps.csv file. For example, if the snps.csv file (File 704-AB or 704-TOP) includes SNP genotype results for 100 animals, the meta.csv file will have 100 records, one per animal.

||'''Field Name''' ||'''Example''' ||'''Description''' ||
||Record Type ||702 ||Numeric ||
||Service User ||INRA ||Alphanumeric ||
||Source Country of Animals ||FRA ||Code of 3 CAPITAL letters according to Country List ||
||Animal ID - Breed Code ||BSW ||Code of 3 CAPITAL letters according to Breed List ||
||Animal ID - Nation Code ||AUS ||Code of 3 CAPITAL letters according to Country List ||
||Animal ID - Sex Code ||M ||Alpha (M or F) ||
||Animal ID - Registration ||A1234567890 ||Alphanumeric, maximum 18 characters ||
||Genotyping Laboratory Identification ||Genotyping Laboratory ||Alphanumeric according to the Laboratory List OR added as new* ||
||Sample ID ||R1234567890 ||Alphanumeric, sample numbers used in the genotyping laboratory ||
||Scan Date ||yyyymmdd ||Numeric, the date when the laboratory concluded the analysis. ||
||Platform ||Illumina ||Alphanumeric, SNP Chip Platform according to the Platform List* ||
||No. SNPs in Genotype ||54001 ||Total number of SNPs* ||
||Genotype Call Rate ||99.98 ||Percent call rate of the animal's full genotype used to create the SNP record for !GenoEx-PSE, two decimals. ||
||||||<style="text-align:center">'''! download only:''' ||
||''upload ID'' ||''1fdb3a18-4a0d-41ee-8530-dac3880d00cf'' ||Unique ID that allows matching multiple data files of the same individual by upload event. || |
The '''History''' tab directs you to the history page, where you find a table showing all the uploads or downloads done by any user of your organization. The history table as well as downloads and uploads, found under the PSE-tab, will be described in the following sections. |
Line 41: | Line 24: |
{{attachment:1_afterlogin.JPG}} | |
Line 42: | Line 26: |
== Data Upload == To access the '''Upload''' page, go to the '''PSE''' tab and click Upload. The data to be uploaded must be presented in a .zip-file as described in section 1 of the User's Manual. Details of the file format can be found [[https://interbull.org/ib/pse_file_formats|here]]. Click the '''Browse''' button, find and select the file to be uploaded, and then click UPLOAD. |
|
Line 43: | Line 29: |
{{attachment:2_upload.JPG||height="338",width="901"}} | |
Line 44: | Line 31: |
Table 1A. Format of the meta.csv file used for uploading general information for the group of animals for which SNP genotypes are submitted in the associated snps.csv file (format below). | After pressing the UPLOAD button, you will be redirected to a status-page showing the job-id, job type, current status and the date and time, as shown in the pictures below. |
Line 46: | Line 33: |
'''* new laboratories, platforms and arrays need to be communicated to Interbull by e-mail prior to data submission.''' | An uploading process can have four different statuses: ''Initiated'', ''Processing'', ''Finished'' or ''Failed''. At the initial redirection from the upload page, the current status will be shown as ''Initiated''. The duration of the upload process depends on the size of your file. To follow the progress of your upload, you can either refresh the status page as shown in the three following pictures, or go to the History table. When processing is complete and the upload has either succeeded or failed, you will receive a notification by e-mail at the address connected to your !GenoEx user account. |
Line 48: | Line 35: |
=== 1B - Information Related to the SNP Genotypes of Animals – snps.csv file ===
The snps.csv file contains the actual genotype data for the animals listed in the corresponding meta.csv file. The Service User may choose to upload and/or download SNP genotype data in either the "AB" or the "TOP" allele designation, which determines the content of the first field, Record Type, in the following file format (i.e. File 704-AB versus File 704-TOP, respectively). This file is exchanged as a variable-length, comma-delimited file in .csv format and includes a single record for each SNP included for each animal. For example, if the meta.csv file includes 100 animals and each SNP genotype includes the 200 SNPs recommended by ISAG for Parentage Verification, then this second file will include a maximum of 20,000 records (100 animals x 200 SNPs each). If a SNP was not "called" and the result is missing, then that SNP for that animal should not be included in the !GenoEx-PSE genotype exchange file. For this reason, even if the meta.csv file includes 100 animals, this second file may not necessarily have a total of 20,000 records.

||'''Field Name''' ||'''Example''' ||'''Description''' ||
||Record Type ||704-AB (or 704-TOP) ||Alphanumeric ||
||Animal ID - Breed Code ||BSW ||Code of 3 CAPITAL letters according to Breed List ||
||Animal ID - Nation Code ||AUS ||Code of 3 CAPITAL letters according to Country List ||
||Animal ID - Sex Code ||M ||Alpha (M or F) ||
||Animal ID - Registration ||A1234567890 ||Alphanumeric, maximum 18 characters ||
||SNP Name ||ARS-BFGL-BAC-19454 ||Alphanumeric (CAPITAL letters only) ||
||Allele 1 ||A/B or A/C/G/T ||Alphabetic (CAPITAL letters only) ||
||Allele 2 ||A/B or A/C/G/T ||Alphabetic (CAPITAL letters only) ||
||||||<style="text-align:center">'''! download only:''' ||
||''upload ID'' ||''1fdb3a18-4a0d-41ee-8530-dac3880d00cf'' ||Unique ID that allows matching multiple data files of the same individual by upload event. || |
{{attachment:3a_statusUpload_initating.JPG||height="297",width="901"}} |
Line 62: | Line 37: |
{{attachment:3b_statusUpload_processing.JPG||height="285",width="901"}} | |
Line 63: | Line 39: |
{{attachment:3c_statusUpload_finished.JPG||height="301",width="901"}} | |
Line 64: | Line 41: |
=== Failed upload === In the case where the upload is unsuccessful, the job status will be displayed as ''Failed''. The number of errors and the cause of the failed upload will be shown. In the case presented in the image below, the upload failed as the animal-ID used in the snps.csv-file did not correspond to the animal-ID used in the meta.csv-file. |
|
Line 65: | Line 44: |
Table 1B. Format to be used for uploading the specific SNP genotypes to the !GenoEx-PSE database. | {{attachment:3d_statusUpload_failed.JPG||height="318",width="900"}} |
Line 67: | Line 46: |
'''! Please note:''' since the input files are comma delimited, if any of your fields requires a comma, you must add the escape character ‘\’ before the comma, e.g. ‘Laboratory\, Branch’. | == Data extraction == To access the download page, go to the '''PSE'''-tab and click '''Download'''. |
Line 69: | Line 49: |
If you wish to upload '''more than one genotype''' for a given animal, you will have to do it in a separate set of files. This is because the database does not accept any duplicates, in order to allow distinguishing between various genotyping events. | On the download page, query the desired data by sex, breed and country. A selection must be made in all three fields for the extraction to be initiated. To select more than one criterion, use CTRL+click. |
Line 71: | Line 51: |
The information downloaded from !GenoEx-PSE follows the same formats both for the meta.csv and the snps.csv files with the addition of ''Upload ID'' (see above). | The data can be extracted either in TOP-format or in AB-format, independently of the format that it was submitted in. |
Line 73: | Line 53: |
== 2. Login == After contacting the Interbull Center and signing the User's Agreement, you receive the details for your login. From this moment you are entitled to use the !GenoEx-PSE service, via the main page: genoex.org |
In cases where an animal is represented by more than one genotype, the '''Best only''' option will give you the genotype with the highest call rate. By selecting '''All''' you will be provided with all the genotypes your organization has access to. |
Line 76: | Line 55: |
The login tab can be found in the top right corner of the page. Should you forget your password, you can send a password recovery request. | Specific time periods for when data was uploaded can be selected, for example to download only data that has become available since your latest download. The dates provided by default give you all data uploaded to the database that your organization has access to. |
Line 78: | Line 57: |
{{attachment:login.png}} | When all the extraction criteria are selected, click CREATE EXTRACTION. |
Line 80: | Line 59: |
After login you will see more options: | {{attachment:5_download.JPG||height="680",width="900"}} |
Line 82: | Line 61: |
{{attachment:loggedInMenu.png|loggenInMenu.png}} | After clicking CREATE EXTRACTION, you will be redirected to a status-page showing the job-id, job type, current status and the date and time, as shown in the pictures below. |
Line 84: | Line 63: |
Once you are logged in, you can view your account details or change your password by choosing the '''''Profile''''' or '''''Change Password''''' option under your user name. | A downloading process can have two different statuses: ''Processing'' or ''Finished''. At the initial redirection from the download page, the current status will be shown as ''Processing''. The duration of the download process depends on the amount of data that is extracted. To follow the progress of your download, you can either refresh the status page as shown in the two following pictures, or go to the History table. When processing is complete and the download is ''Finished'', you will receive a notification by e-mail at the address connected to your !GenoEx user account. |
Line 86: | Line 65: |
{{attachment:user_profile2.png}} | When the extraction is completed, you can fetch the extracted file by clicking on ''Download extracted file'' on the job's status page. The file can be fetched either directly, or at a later time and date. The data will be presented in a zipped folder containing the files snps.csv and meta.csv, following the file format as described in section 1. '''Note''' that the downloaded files have an extra column compared to the uploaded files, as they contain a unique identifier in order to separate several genotypes for the same animal. |
Line 88: | Line 67: |
== 3. Data Upload == In order to upload the data you shall click on the '''''Upload''''' menu on the top of the page: |
{{attachment:6a_statusDownload_processing.JPG||height="414",width="901"}} |
Line 91: | Line 69: |
{{attachment:top_upload13.png}} | {{attachment:6b_statusDownload_finished.JPG||height="346",width="901"}} |
Line 93: | Line 71: |
Upload the data as a .zip file as described in point 1. | == History Table == The progress of both uploads and downloads can also be followed in the history table. |
Line 95: | Line 74: |
Should you have any doubts about how the data should be formatted, or what values are allowed in a given field, click on the '''''About Upload Formats''''' menu (in the blue area on the lower part of the page) for more information. To initiate the upload, browse for the zipped files on your computer. | The table is by default sorted by date, with the most recent job shown first, but all headers can be clicked to change the sorting as you choose. The search field can be used to find specific jobs more easily. |
Line 97: | Line 76: |
{{attachment:helplist2.png}} | By clicking on the job-ID of the job of your interest, you will be re-directed to the status-page for that upload or download. |
Line 99: | Line 78: |
Once you initialized the upload, you will be redirected to the upload status page, where you can track the upload progress. | {{attachment:7_history_extraction.JPG||height="302",width="901"}} |
Line 101: | Line 80: |
{{attachment:status3.png}} | <<Include(public/PSE_file_formats)>> |
Line 103: | Line 82: |
Once the upload is finished you will be able to confirm the success of the data submission or see the reason for a possible failure. The status page contains information about the file name (used by you) and the job ID (assigned by the system).

{{attachment:status_success2.png}}

The duration of the upload process depends on the size of your file and your internet connection. If there was an error or missing information in your data, the system will notify you about it with a detailed error message.

{{attachment:Ufailed_platform1.png}}

Above you can see an example of an upload failure caused by an invalid platform/array in the data. The upload status page can also be reached via the '''''History''''' tab on the top menu.

{{attachment:top_history.png}}

On the '''''History''''' page you can see all upload and download actions from your organization, sorted by date and named by the job status ID. Clicking on the selected job will open the action’s details, as in the examples above.

== 4. Data extraction ==
You will find the data extraction tab, '''''Download''''', on the top menu bar.

{{attachment:top_download3.png}}

The data can be extracted either in TOP or in AB format, independently of the format (AB or TOP) that it was submitted in. You can query the data by sex, breed and country. Breed and country are to be selected from the list. To select more than one criterion, use ''CTRL+click''.

{{attachment:pse_extraction_2021.png}}

By default the system extracts the record with the highest call rate per animal. However, if you tick the Full box, you will receive all the available records for the queried combination.<<BR>> A very useful function, especially for users who download data from !GenoEx-PSE regularly, is the time frame choice for data download, which makes it possible to download only the data that was uploaded to the database within a defined time period. Once the extraction criteria are selected, click '''''Create extraction'''''. You will see the progress on the screen, just like with the upload. When the extraction is completed you will be able to download your data as a zipped file containing the snps.csv and meta.csv files.

{{attachment:extrfinal.png}}

The file can be downloaded right away or later on by finding the specific job under '''''History''''', where the events are sorted by action date.

== More information about file formats and allowed values can be found here: == |
== More information about allowed values in PSE-files can also be found here: == * [[https://interbull.org/ib/pse_file_formats|PSE File Formats]] |
Line 142: | Line 85: |
* [[http://kirste.userpage.fu-berlin.de/diverse/doc/ISO_3166.html|ISO country codes]] * [[https://interbull.org/ib/pse_platform_list|Platform arrays]] * [[https://interbull.org/ib/pse_laboratories|Laboratories]] |
* [[https://www.iso.org/obp/ui/#search|ISO country codes]] * [[https://genoex.org/display#arrays|Platform arrays]] * [[https://genoex.org/display#labs|Laboratories]] |
Line 146: | Line 89: |
* [[http://interbull.org/ib/pse_parentage_discovery_snps|Parentage discovery SNP]] |
GenoEx-PSE – User's Manual
This document describes the procedures for the upload and download of the data for the GenoEx-PSE service. The PSE Code of Practice can be found at https://interbull.org/ib/pse_cop.
The User Manual describes in detail the required file formats, login process and usage of the service itself both for data upload and data download.
Please note that the GenoEx service is under regular improvement and thus new functionalities may be introduced. It is therefore important to always use the latest version of this manual.
Login
After you have signed the User’s Agreement and joined the service, you will receive login details from Interbull Center for the users you have requested access for. From this moment you are entitled to use the GenoEx-PSE service via genoex.org.
The login tab can be found on the top right corner of the front page. If you have lost or forgotten your password, a new password can be requested from the Lost Password page, which can be accessed both under the login tab and from the login page. Once you log in you can change your password by pressing the Change password under your username tab in the top right corner. We highly recommend that you change the password from the one provided to you, at the very first login.
Under the About tab, you will find links to all documentation relevant for the GenoEx-services.
Under the Contact tab you will find information needed to contact Interbull Center should you have any questions, comments or concerns regarding the platform or the service.
On the System Data page, four tables are presented: SNP Arrays, Laboratories, Breeds, and Countries. All of the tables have clickable headers and a search field to make it easier to look up information. The information in these tables represents the only allowed values for SNP array, genotyping laboratory, breed and country in the files to be uploaded. See the section on file formats for more information.
The History tab directs you to the history page, where you find a table showing all the uploads or downloads done by any user of your organization. The history table as well as downloads and uploads, found under the PSE-tab, will be described in the following sections.
Data Upload
To access the Upload page, go to the PSE tab and click Upload. The data to be uploaded must be presented in a .zip-file as described in section 1 of the User's Manual. Details of the file format can be found at https://interbull.org/ib/pse_file_formats. Click the Browse button, find and select the file to be uploaded, and then click UPLOAD.
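As an illustration of preparing the .zip-file before clicking Browse, the following Python sketch bundles the two input files into one archive. It is not part of the GenoEx tooling; the function name build_upload_zip and the choice to store the files at the archive root under the names meta.csv and snps.csv are assumptions made for this example.

```python
import zipfile
from pathlib import Path


def build_upload_zip(meta_path, snps_path, zip_path):
    """Bundle a meta file and a snps file into one .zip archive for upload."""
    meta_path, snps_path = Path(meta_path), Path(snps_path)
    # Both files are needed, so fail early if either one is missing.
    for path in (meta_path, snps_path):
        if not path.is_file():
            raise FileNotFoundError(f"missing input file: {path}")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Assumption: the files sit at the archive root under these fixed names.
        archive.write(meta_path, arcname="meta.csv")
        archive.write(snps_path, arcname="snps.csv")
    return Path(zip_path)
```

The resulting archive is the single file to select on the Upload page.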
After pressing the UPLOAD button, you will be redirected to a status-page showing the job-id, job type, current status and the date and time, as shown in the pictures below.
An uploading process can have four different statuses: Initiated, Processing, Finished or Failed. At the initial redirection from the upload page, the current status will be shown as Initiated. The duration of the upload process depends on the size of your file. To follow the progress of your upload, you can either refresh the status page as shown in the three following pictures, or go to the History table. When processing is complete and the upload has either succeeded or failed, you will receive a notification by e-mail at the address connected to your GenoEx user account.
Failed upload
In the case where the upload is unsuccessful, the job status will be displayed as Failed. The number of errors and the cause of the failed upload will be shown. In the case presented in the image below, the upload failed as the animal-ID used in the snps.csv-file did not correspond to the animal-ID used in the meta.csv-file.
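A mismatch of this kind can be caught locally before uploading. The Python sketch below compares the animal-IDs in the two files. It rests on several assumptions that are not confirmed by the service: the files have no header row, they are UTF-8 encoded, the animal-ID is the four fields breed, nation, sex and registration (fields 4 to 7 in meta.csv and fields 2 to 5 in snps.csv), and a comma inside a field is escaped as '\,'. The function names are hypothetical.

```python
import csv


def read_records(path):
    """Read a comma-delimited file in which a backslash escapes a comma."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, quoting=csv.QUOTE_NONE, escapechar="\\")
        return [row for row in reader if row]


def animal_ids(rows, first_id_column):
    """Collect (breed, nation, sex, registration) tuples from the rows."""
    return {tuple(row[first_id_column:first_id_column + 4]) for row in rows}


def unmatched_animal_ids(meta_path, snps_path):
    """Return (IDs only in snps.csv, IDs only in meta.csv), each sorted."""
    # Assumption: the animal-ID starts at index 3 in meta.csv and index 1 in snps.csv.
    meta_ids = animal_ids(read_records(meta_path), 3)
    snps_ids = animal_ids(read_records(snps_path), 1)
    return sorted(snps_ids - meta_ids), sorted(meta_ids - snps_ids)
```

If both returned lists are empty, every animal in one file has a counterpart in the other. This check only covers the animal-ID and does not replace the validation performed by the service.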
Data extraction
To access the download page, go to the PSE-tab and click Download.
On the download page, query the desired data by sex, breed and country. A selection must be made in all three fields for the extraction to be initiated. To select more than one criterion, use CTRL+click.
The data can be extracted either in TOP-format or in AB-format, independently of the format that it was submitted in.
In cases where an animal is represented by more than one genotype, Best only gives you the genotype with the highest call rate. Selecting All provides all the genotypes your organization has access to.
A specific time period during which data was uploaded can be selected, for example to download only data that has become available since your latest download. The dates provided by default give you all data uploaded to the database that your organization has access to.
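For users who download All genotypes and want to reproduce a "highest call rate" selection on their own side, a minimal sketch follows. It is not the service's implementation: it assumes the downloaded meta.csv layout from the PSE File Formats section (animal ID in columns 4 to 7, call rate in column 13, upload ID in column 14), the function name `best_genotypes` is hypothetical, and ties are resolved by keeping the first record seen, which is an arbitrary choice.

```python
def best_genotypes(meta_rows):
    """Keep, for each animal, the downloaded meta.csv record with the highest call rate."""
    best = {}
    for row in meta_rows:
        animal_id = tuple(row[3:7])   # columns 4-7: breed, country, sex, registration
        call_rate = float(row[12])    # column 13: call rate in percent
        if animal_id not in best or call_rate > float(best[animal_id][12]):
            best[animal_id] = row
    return best

rows = [
    ["702", "INRA", "FRA", "BSW", "AUS", "M", "0001234567", "Lab", "S1",
     "20220425", "Illumina", "54001", "98.50", "upload-a"],
    ["702", "INRA", "FRA", "BSW", "AUS", "M", "0001234567", "Lab", "S2",
     "20230110", "Illumina", "54001", "99.98", "upload-b"],
]
selected = best_genotypes(rows)
```

The upload ID in column 14 of each selected record can then be used to pick the matching records from snps.csv.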
When all the extraction criteria are selected, click CREATE EXTRACTION.
After clicking CREATE EXTRACTION, you will be redirected to a status-page showing the job-id, job type, current status and the date and time, as shown in the pictures below.
A download can have two different statuses: Processing or Finished. At the initial redirection from the download page, the current status will be shown as Processing. The duration of the download process depends on the amount of data that is extracted. To follow the progress of your download, you can either refresh the status page as shown in the two following pictures, or go to the History table. When the processing is complete, and the download is Finished, you will receive a notification via e-mail to the e-mail address connected to your GenoEx user account.
When the extraction is completed, you can fetch the extracted file by clicking on Download extracted file on the job status page. The file can be fetched either directly, or at a later time and date. The data will be presented in a zipped folder containing the files snps.csv and meta.csv, following the file format described in the PSE File Formats section. Note that the downloaded files have an extra column compared to the uploaded files, as they contain a unique identifier in order to separate several genotypes for the same animal.
History Table
The progress of both uploads and downloads can also be followed in the history table.
The table is sorted by date by default, with the most recent job shown first, but any header can be clicked to change the sorting. The search field can be used to find specific jobs more easily.
By clicking the job-ID of the job you are interested in, you will be redirected to the status page for that upload or download.
PSE File Formats
Submissions to GenoEx-PSE must be uploaded as a ZIP archive containing two files: meta.csv (file702) and snps.csv (file704). Every SNP-specific record in the snps.csv file must have a corresponding record in the meta.csv file, and the country and breed information for a given individual must agree in both files. The fields in both files should be comma separated and no blank fields are allowed. Note that SNP names must be written in capital letters.
At each upload from a Service User, both files must be zipped together and submitted at the same time. Data submitted in one file will not be processed until both files are available.
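Packaging the two files can be scripted. The sketch below uses Python's standard `zipfile` module; the function name `build_submission` is hypothetical, and placing both files at the top level of the archive is an assumption rather than a documented requirement.

```python
import zipfile
from pathlib import Path

def build_submission(meta_path, snps_path, zip_path):
    """Write meta.csv and snps.csv into one ZIP archive and return its path."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # arcname fixes the names inside the archive, whatever the local files are called.
        archive.write(meta_path, arcname="meta.csv")
        archive.write(snps_path, arcname="snps.csv")
    return Path(zip_path)
```

For example, `build_submission("herd_meta.csv", "herd_snps.csv", "upload.zip")` would produce an archive holding exactly the two expected file names (the local file names here are placeholders).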
Data extracted from the GenoEx-PSE database generally follow the same format as the data uploaded. The downloaded files do, however, contain one additional column with a unique Upload ID, which makes it possible to match meta.csv and snps.csv records coming from the same uploading event. The Upload ID therefore distinguishes multiple records of one individual coming from different sources and/or genotyping events. If you wish to upload more than one genotype for a given animal, you must do so in a separate set of files, as the database does not accept duplicates; this keeps different genotyping events distinguishable.
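As an illustration of how the Upload ID separates several genotypes of one animal, the sketch below groups downloaded snps.csv records by animal ID and Upload ID. It assumes the downloaded snps.csv layout described in this section (animal ID in columns 2 to 5, SNP name and alleles in columns 6 to 8, Upload ID in column 9); the function name `group_snps_by_genotype` is hypothetical.

```python
from collections import defaultdict

def group_snps_by_genotype(snps_rows):
    """Group downloaded snps.csv records by (animal ID, upload ID)."""
    genotypes = defaultdict(dict)
    for row in snps_rows:
        animal_id = tuple(row[1:5])                  # columns 2-5
        snp_name, allele1, allele2 = row[5], row[6], row[7]
        upload_id = row[8]                           # column 9, downloads only
        genotypes[(animal_id, upload_id)][snp_name] = allele1 + allele2
    return dict(genotypes)

rows = [
    ["704-AB", "BSW", "AUS", "M", "0001234567", "ARS-BFGL-BAC-19454", "A", "A", "upload-a"],
    ["704-AB", "BSW", "AUS", "M", "0001234567", "ARS-BFGL-BAC-19454", "A", "B", "upload-b"],
]
grouped = group_snps_by_genotype(rows)
```

Each key in the result corresponds to one genotyping event of one animal, so the same key can be used to look up the matching record in the downloaded meta.csv.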
! Please note: Since the input files are comma delimited, avoid using commas within a data field. If any of your fields requires a comma, add the escape character ‘\’ before the comma, e.g. ‘Laboratory\, Branch’.
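When the files are generated by a script, the escaping can be left to the CSV library. The sketch below uses Python's standard `csv` module with quoting disabled and `\` as the escape character; the helper names `write_rows` and `read_rows` are hypothetical, and the three-field record is only a fragment for illustration, not a complete meta.csv record.

```python
import csv
import io

def write_rows(rows):
    """Serialise rows as comma-delimited text, escaping commas inside fields with a backslash."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_NONE,
                        escapechar="\\", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()

def read_rows(text):
    """Parse text written by write_rows, turning escaped commas back into plain commas."""
    reader = csv.reader(io.StringIO(text, newline=""),
                        quoting=csv.QUOTE_NONE, escapechar="\\")
    return list(reader)

line = write_rows([["702", "INRA", "Laboratory, Branch"]])
print(line, end="")
```

Reading the line back with `read_rows` restores the field as `Laboratory, Branch`, so the comma survives a round trip without being mistaken for a field separator.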
file 702 - meta.csv
File 702, named meta.csv, is required to be a variable length, comma delimited file in .csv format including a single record for each animal for which SNP genotype details are reported in the snps.csv file. For example, if the snps.csv file (File 704-AB or 704-TOP) includes SNP genotype results for 100 animals, the meta.csv file will have 100 records, one per animal.
Col | Name | Format | Description | Example |
1 | Record type | Numeric | Record type | 702 |
2 | Service user | Alphanumeric | Name of uploading organisation | INRA |
3 | Source country/sending country | Alphanumeric | 3 letter country code² | FRA |
4 | Animal ID¹ - breed code | Alphanumeric | 3 letter breed code³ | BSW |
5 | Animal ID¹ - country code | Alphanumeric | 3 letter country code² | AUS |
6 | Animal ID¹ - sex code | Alpha | 1 letter sex code, M or F | M |
7 | Animal ID¹ - registration | Alphanumeric | Animal identification, maximum 18 characters | 0001234567 |
8 | Genotyping Laboratory | Alphanumeric | Genotyping laboratory⁴ | Weatherbys Ireland |
9 | Sample ID | Alphanumeric | Sample numbers used in the genotyping laboratory | R1234567890 |
10 | Scan Date | Numeric | Date when the laboratory concluded the analysis, format yyyymmdd | 20220425 |
11 | Platform | Alphanumeric | SNP platform⁵ | Illumina |
12 | SNP array | Numeric | SNP array, named by number of SNPs⁵ | 54001 |
13 | Call Rate | Numeric | Percent call rate of the animal's full genotype used to create the SNP record for GenoEx-PSE, two decimals | 99.98 |
!downloads only! |
(14) | upload ID | Alphanumeric | Unique ID allowing to match multiple data files of the same individual by upload event | 1fdb3a18-4a0d-41ee-8530-dac3880d00cf |
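The rules in the table above can be checked record by record before upload. The following is a minimal sketch under the assumption that an upload record has exactly the 13 columns listed; it does not check the allowed values on the System Data page, and the function names `is_yyyymmdd` and `check_meta_record` are hypothetical.

```python
from datetime import datetime

def is_yyyymmdd(text):
    """True if text is an 8 digit calendar date in yyyymmdd form."""
    if len(text) != 8 or not text.isdigit():
        return False
    try:
        datetime.strptime(text, "%Y%m%d")
    except ValueError:
        return False
    return True

def check_meta_record(fields):
    """Return a list of problems found in one meta.csv upload record."""
    if len(fields) != 13:
        return [f"expected 13 fields, found {len(fields)}"]
    problems = []
    if any(field.strip() == "" for field in fields):
        problems.append("blank fields are not allowed")
    if fields[0] != "702":
        problems.append("record type must be 702")
    for col in (2, 3, 4):  # source country, breed code, animal country code
        if len(fields[col]) != 3 or not fields[col].isalpha():
            problems.append(f"column {col + 1} must be a 3 letter code")
    if fields[5] not in ("M", "F"):
        problems.append("sex code must be M or F")
    if len(fields[6]) > 18:
        problems.append("registration exceeds 18 characters")
    if not is_yyyymmdd(fields[9]):
        problems.append("scan date must use yyyymmdd")
    try:
        float(fields[12])
    except ValueError:
        problems.append("call rate must be numeric")
    return problems

example = ["702", "INRA", "FRA", "BSW", "AUS", "M", "0001234567",
           "Weatherbys Ireland", "R1234567890", "20220425",
           "Illumina", "54001", "99.98"]
print(check_meta_record(example))
```

An empty list means the record passed these structural checks; the service may still reject it for reasons this sketch does not cover, such as a laboratory or array that is not in the System Data tables.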
file 704 - snps.csv
File 704, named snps.csv, contains the actual genotype data for the animals listed in the corresponding meta.csv file.
The Service User may choose to upload and/or download SNP genotype data in either the "AB" or the "TOP" allele designation, which determines the content of the first field, Record Type, in the following file format (i.e. File 704-AB versus File 704-TOP, respectively). File 704-AB/704-TOP is exchanged as a variable length, comma delimited file in .csv format and includes a single record for each SNP included for each animal.
For example, if the meta.csv file includes 100 animals and each SNP genotype includes the 200 SNPs recommended by ISAG for Parentage Verification, then snps.csv will include a maximum of 20,000 records (100 animals x 200 SNPs each). In the event that any SNP was not "called" and the result is missing, that SNP for that animal should not be included in the GenoEx-PSE genotype exchange file. For this reason, even if the meta.csv file includes 100 animals, this second file may not necessarily have a total of 20,000 records.
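The rule that uncalled SNPs are simply left out can be sketched as follows. How a missing call is represented in your own data is not specified here, so the sketch assumes a hypothetical dictionary mapping each SNP name to a pair of alleles, with `None` standing for a SNP that was not called; the function name `called_records` is also hypothetical.

```python
def called_records(record_type, animal_id, calls):
    """Build snps.csv records for one animal, skipping SNPs without a call."""
    records = []
    for snp_name, alleles in calls.items():
        if alleles is None:  # hypothetical marker for a SNP that was not called
            continue
        records.append([record_type, *animal_id, snp_name.upper(),
                        alleles[0], alleles[1]])
    return records

calls = {
    "ARS-BFGL-BAC-19454": ("A", "B"),
    "ARS-BFGL-NGS-10000": None,  # made-up SNP name, used only as a placeholder
}
records = called_records("704-AB", ("BSW", "AUS", "M", "0001234567"), calls)
print(len(records))
```

With two SNPs of which one is uncalled, only one record is produced, which is why the total record count can fall below animals x SNPs.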
Col | Name | Format | Description | Example |
1 | Record type | Numeric | Record type | 704-AB or 704-TOP |
2 | Animal ID¹ - breed code | Alphanumeric | 3 letter breed code³ | BSW |
3 | Animal ID¹ - country code | Alphanumeric | 3 letter country code² | AUS |
4 | Animal ID¹ - sex code | Alpha | 1 letter sex code, M or F | M |
5 | Animal ID¹ - registration | Alphanumeric | Animal identification, maximum 18 characters | 0001234567 |
6 | SNP Name | Alphanumeric | SNP Name in CAPITAL letters⁶ | ARS-BFGL-BAC-19454 |
7 | Allele 1 | Alpha | A/B for 704-AB, A/C/G/T for 704-TOP | A |
8 | Allele 2 | Alpha | A/B for 704-AB, A/C/G/T for 704-TOP | A |
!downloads only! |
(9) | upload ID | Alphanumeric | Unique ID allowing to match multiple data files of the same individual by upload event | 1fdb3a18-4a0d-41ee-8530-dac3880d00cf |
¹ Columns 4, 5, 6 and 7 of meta.csv (columns 2, 3, 4 and 5 of snps.csv) together make up the animal identification. Interbull ID is not a requirement in PSE, but its use is highly recommended for animals that have such identification. For further information on Interbull ID, see the ID guidelines.
² Country code according to ISO country codes.
³ Breed code according to ICAR breed codes.
⁴ Allowed values according to the laboratory list. To request inclusion of new laboratories, e-mail Interbull Center at genoex@slu.se prior to upload.
⁵ Allowed values according to the platform/array list. To request inclusion of new platforms or arrays, e-mail Interbull Center at genoex@slu.se prior to upload.
⁶ Full SNP list for parentage verification and parentage discovery.
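The snps.csv layout described above can also be checked locally. The sketch below covers only the structural rules stated in this manual for an upload record (8 fields, a valid record type, alleles that fit the chosen designation, SNP name in capitals); it does not verify the SNP name against the full SNP list, and the names `ALLOWED_ALLELES` and `check_snps_record` are hypothetical.

```python
# Allele letters permitted for each record type, as listed in the snps.csv table.
ALLOWED_ALLELES = {"704-AB": set("AB"), "704-TOP": set("ACGT")}

def check_snps_record(fields):
    """Return a list of problems found in one snps.csv upload record."""
    if len(fields) != 8:
        return [f"expected 8 fields, found {len(fields)}"]
    problems = []
    if any(field.strip() == "" for field in fields):
        problems.append("blank fields are not allowed")
    alleles = ALLOWED_ALLELES.get(fields[0])
    if alleles is None:
        problems.append("record type must be 704-AB or 704-TOP")
    else:
        for col in (6, 7):  # Allele 1 and Allele 2
            if fields[col] not in alleles:
                problems.append(f"column {col + 1} is not a valid allele for {fields[0]}")
    if fields[5] != fields[5].upper():
        problems.append("SNP name must be in capital letters")
    return problems

example = ["704-AB", "BSW", "AUS", "M", "0001234567",
           "ARS-BFGL-BAC-19454", "A", "A"]
print(check_snps_record(example))
```

Running the same check with record type 704-TOP on alleles such as `B` reports a problem, which helps catch files where the allele designation and the record type have been mixed up.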