GenoEx-PSE – User's Manual
This document describes the procedures for uploading data to and downloading data from the GenoEx-PSE service.
It describes in detail the required file formats, the login process and the use of the service itself for both data upload and download.
Please note that the GenoEx service is under regular improvement, so new functionalities may be introduced. It is therefore important always to use the latest version of this manual.
1. File formats
The SNP genotype data and the related information submitted (uploaded) to GenoEx-PSE are divided into two groups:
A) Information related to a group of animals with a SNP genotype to upload ('meta.csv' file)
B) Information related to the actual SNP genotypes for the animals in A) ('snps.csv' file)
At each upload from a Service User, both files must be zipped together and submitted at the same time. Data submitted in one file will not be processed until both files are available.
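The packaging step above can be sketched in Python as follows. This is a minimal illustration only: the function name make_upload_archive is hypothetical, and storing both files at the top level of the archive under the names meta.csv and snps.csv is an assumption based on the file names used in this manual, not a documented requirement of the service.

```python
import zipfile
from pathlib import Path


def make_upload_archive(meta_path, snps_path, archive_path):
    """Zip a meta.csv and a snps.csv file together into one archive."""
    meta_path, snps_path = Path(meta_path), Path(snps_path)
    # Both files are required; refuse to build an archive with only one.
    for path in (meta_path, snps_path):
        if not path.is_file():
            raise FileNotFoundError(f"missing required file: {path}")
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Assumption: files sit at the top level under the manual's names.
        zf.write(meta_path, arcname="meta.csv")
        zf.write(snps_path, arcname="snps.csv")
    return Path(archive_path)
```

The resulting .zip file is what you would then select on the upload page.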
Data extracted from the GenoEx-PSE database generally follow the same format as the uploaded data. They do, however, contain an additional column with a unique Upload ID, which makes it possible to match the meta.csv and snps.csv files that come from one upload event. The Upload ID therefore allows multiple records of one individual coming from different sources and/or genotyping events to be distinguished.
1A - meta.csv file - Information Related to a Group of Animals
This file must be a variable-length, comma-delimited file in .csv format containing a single record for each animal whose SNP genotype is reported in the snps.csv file. For example, if the snps.csv file (File 704-AB or 704-TOP) includes SNP genotype results for 100 animals, the meta.csv file will have 100 records, one per animal.
Field Name | Example | Description |
Record Type | 702 | Numeric |
Service User | INRA | Alphanumeric |
Source Country of Animals | FRA | Code of 3 CAPITAL letters according to Country List |
Animal ID - Breed Code | BSW | Code of 3 CAPITAL letters according to Breed List |
Animal ID - Nation Code | AUS | Code of 3 CAPITAL letters according to Country List |
Animal ID - Sex Code | M | Alpha (M or F) |
Animal ID - Registration | A1234567890 | Alphanumeric, maximum 18 characters |
Genotyping Laboratory Identification | Genotyping Laboratory | Alphanumeric according to the Laboratory List OR added as new* |
Sample ID | R1234567890 | Alphanumeric, sample numbers used in the genotyping laboratory |
Scan Date | yyyymmdd | Numeric, the date when the laboratory concluded the analysis. |
Platform | Illumina | Alphanumeric, SNP Chip Platform according to the Platform List* |
No. SNPs in Genotype | 54001 | Total number of SNPs* |
Genotype Call Rate | 99.98 | Percent call rate of the animal's full genotype used to create the SNP record for GenoEx-PSE, two decimals. |
! download only: | | |
upload ID | 1fdb3a18-4a0d-41ee-8530-dac3880d00cf | Unique ID allowing to match multiple data files of the same individual by upload event. |
Table 1A. Format of the meta.csv file used for uploading general information for the group of animals for which SNP genotypes are submitted in the associated snps.csv file (format below).
* new laboratories, platforms and arrays need to be communicated to Interbull by e-mail prior to data submission.
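As an illustration of Table 1A, the sketch below builds one meta.csv record and applies the format checks that the table states explicitly. The function name format_meta_record and the laboratory value used in the usage example are hypothetical. The sketch assumes that the fields appear in the order of the table, and it does not check codes against the Country, Breed, Laboratory or Platform lists.

```python
import re
from datetime import datetime


def format_meta_record(service_user, source_country, breed, nation, sex,
                       registration, laboratory, sample_id, scan_date,
                       platform, n_snps, call_rate):
    """Return one meta.csv line (Record Type 702) after basic format checks."""
    for label, code in (("source country", source_country),
                        ("breed", breed), ("nation", nation)):
        if not re.fullmatch(r"[A-Z]{3}", code):
            raise ValueError(f"{label} must be 3 capital letters: {code!r}")
    if sex not in ("M", "F"):
        raise ValueError(f"sex must be M or F: {sex!r}")
    if not re.fullmatch(r"[A-Za-z0-9]{1,18}", registration):
        raise ValueError(f"registration must be 1-18 alphanumeric characters: {registration!r}")
    if not re.fullmatch(r"\d{8}", scan_date):
        raise ValueError(f"scan date must be yyyymmdd: {scan_date!r}")
    datetime.strptime(scan_date, "%Y%m%d")  # raises ValueError for impossible dates
    # Assumption: field order follows Table 1A; call rate is written with two decimals.
    # Fields that contain a comma would also need the '\' escape before the comma.
    fields = ["702", service_user, source_country, breed, nation, sex,
              registration, laboratory, sample_id, scan_date, platform,
              str(int(n_snps)), f"{float(call_rate):.2f}"]
    return ",".join(fields)
```

For example, format_meta_record("INRA", "FRA", "BSW", "AUS", "M", "A1234567890", "LAB1", "R1234567890", "20200131", "Illumina", 54001, 99.98) produces a single comma-delimited line beginning with 702.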
1B - Information Related to the SNP Genotypes of Animals – snps.csv file
The snps.csv file contains the actual genotype data for the animals listed in the corresponding meta.csv file. The Service User may choose to upload and/or download SNP genotype data in either the "AB" or the "TOP" allele designation, which determines the content of the first field, Record Type, in the following file format (i.e. File 704-AB versus File 704-TOP, respectively). This file is exchanged as a variable-length, comma-delimited file in .csv format and includes a single record for each SNP of each animal.
For example, if the meta.csv file includes 100 animals and each SNP genotype includes the 200 SNPs recommended by ISAG for Parentage Verification, then this second file will include a maximum of 20,000 records (100 animals x 200 SNPs each). If a SNP was not "called" and its result is missing, that SNP should not be included for that animal in the GenoEx-PSE genotype exchange file. For this reason, even if the meta.csv file includes 100 animals, this second file will not necessarily contain 20,000 records.
Field Name | Example | Description |
Record Type | 704-AB (or 704-TOP) | Alphanumeric |
Animal ID - Breed Code | BSW | Code of 3 CAPITAL letters according to Breed List |
Animal ID - Nation Code | AUS | Code of 3 CAPITAL letters according to Country List |
Animal ID - Sex Code | M | Alpha (M or F) |
Animal ID - Registration | A1234567890 | Alphanumeric, maximum 18 characters |
SNP Name | ARS-BFGL-BAC-19454 | Alphanumeric (CAPITAL letters only) |
Allele 1 | A/B or A/C/G/T | Alphabetic (CAPITAL letters only) |
Allele 2 | A/B or A/C/G/T | Alphabetic (CAPITAL letters only) |
! download only: | | |
upload ID | 1fdb3a18-4a0d-41ee-8530-dac3880d00cf | Unique ID allowing to match multiple data files of the same individual by upload event. |
Table 1B. Format to be used for uploading the specific SNP genotypes to the GenoEx-PSE database.
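The following sketch shows one way to produce snps.csv records for a single animal according to Table 1B, leaving out SNPs that were not called. The function name snps_records and the use of None to mark a missing call are illustrative assumptions, as is the field order, which is taken from the table.

```python
def snps_records(record_type, animal, calls):
    """Build snps.csv lines for one animal.

    animal: (breed, nation, sex, registration)
    calls:  dict mapping SNP name -> (allele1, allele2), or None if not called
    """
    if record_type not in ("704-AB", "704-TOP"):
        raise ValueError(f"unknown record type: {record_type!r}")
    # AB designation allows A/B; TOP designation allows A/C/G/T.
    allowed = set("AB") if record_type == "704-AB" else set("ACGT")
    lines = []
    for snp_name, call in calls.items():
        if call is None:
            continue  # missing results are not written to the file
        allele1, allele2 = call
        if allele1 not in allowed or allele2 not in allowed:
            raise ValueError(f"invalid alleles for {snp_name}: {call!r}")
        lines.append(",".join([record_type, *animal, snp_name.upper(), allele1, allele2]))
    return lines
```

With 200 SNPs per animal of which three are uncalled, this yields 197 lines for that animal, which is why the total record count can fall below the number of animals multiplied by the number of SNPs.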
! Please note: because the input files are comma delimited, any comma that occurs inside a field must be preceded by the escape character ‘\’, e.g. ‘Laboratory\, Branch’.
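A minimal sketch of this escaping rule is given below. The helper names escape_field and split_record are hypothetical. The sketch assumes that only the comma is escaped (the manual does not say how a literal backslash should be handled) and that downloaded files use the same convention.

```python
import re


def escape_field(value):
    """Put the escape character '\\' before every comma in a field value."""
    return str(value).replace(",", "\\,")


def split_record(line):
    """Split one record on commas that are not preceded by a backslash."""
    parts = re.split(r"(?<!\\),", line.rstrip("\n"))
    # Remove the escape character again to recover the original field text.
    return [part.replace("\\,", ",") for part in parts]
```

A field such as Laboratory, Branch is written as Laboratory\, Branch, and split_record turns the full record back into its individual fields.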
If you wish to upload more than one genotype for a given animal, you must do so in a separate set of files. This is because the database does not accept duplicates, which allows different genotyping events to be distinguished.
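One way to prepare such separate sets is sketched below: meta.csv records are distributed so that no animal appears twice within a set. The function name split_into_upload_sets is hypothetical, and the sketch assumes that an animal is identified by the four Animal ID fields (breed, nation, sex, registration) in the positions given in Table 1A.

```python
def split_into_upload_sets(meta_rows):
    """Distribute meta.csv rows (lists of fields) so each set holds an animal once."""
    sets = []
    for row in meta_rows:
        animal = tuple(row[3:7])  # breed, nation, sex, registration
        for upload_set in sets:
            if animal not in upload_set["animals"]:
                upload_set["animals"].add(animal)
                upload_set["rows"].append(row)
                break
        else:
            # Every existing set already contains this animal: start a new set.
            sets.append({"animals": {animal}, "rows": [row]})
    return [upload_set["rows"] for upload_set in sets]
```

Each returned list would then be written to its own meta.csv, together with the matching snps.csv records, and uploaded as its own .zip file.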
The information downloaded from GenoEx-PSE follows the same formats for both the meta.csv and the snps.csv files, with the addition of the Upload ID (see above).
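To illustrate how the Upload ID links the two downloaded files, the sketch below groups already-parsed records by that ID. The function name group_by_upload_id is hypothetical, and the sketch assumes that the Upload ID is the last field of every downloaded record, as the tables in Section 1 suggest.

```python
from collections import defaultdict


def group_by_upload_id(meta_rows, snps_rows):
    """Group parsed meta.csv and snps.csv rows (lists of fields) by Upload ID."""
    grouped = defaultdict(lambda: {"meta": [], "snps": []})
    for row in meta_rows:
        grouped[row[-1]]["meta"].append(row)  # assumption: Upload ID is the last field
    for row in snps_rows:
        grouped[row[-1]]["snps"].append(row)
    return dict(grouped)
```

Each key of the result then corresponds to one upload event, with its animal records under "meta" and its genotype records under "snps".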
2. Login
After contacting the Interbull Center and signing the User's Agreement, you will receive your login details. From that moment you are entitled to use the GenoEx-PSE service via the main page: genoex.org
The Login tab is in the top right corner of the page. Should you forget your password, you can send a password recovery request.
After login you will see more options:
Once logged in, you can view your account details or change your password by choosing Profile or Change Password, respectively, under your user name.
3. Data Upload
To upload data, click the Upload menu at the top of the page:
The data are uploaded as a single .zip file, as described in Section 1.
Should you have any doubts about how the data should be formatted, or which values are allowed in a given field, click the About Upload Formats menu (in the blue area in the lower part of the page) for more information. To initiate the upload, browse for the zipped file on your computer.
Once you have initiated the upload, you will be redirected to the upload status page, where you can track its progress.
Once the upload has finished, you will be able to confirm that the data submission succeeded or see the reason for a possible failure. The status page shows the file name (as chosen by you) and the job ID (assigned by the system).
The duration of the upload process depends on the size of your file and on your internet connection. If your data contain an error or are missing information, the system will notify you with a detailed error message.
Above you can see an example of an upload failure caused by an invalid platform/array in the data.
The upload status page can also be reached via the History tab in the top menu.
On the History page you can see all upload and download actions from your organization, sorted by date and named by the job status ID. Clicking a job opens the details of that action, as in the examples above.
4. Data extraction
The data extraction tab, Download, is on the top menu bar.
The data can be extracted in either TOP or AB format, independently of the format (AB or TOP) in which they were submitted.
You can query the data by sex, breed and country. Breed and country are selected from a list. To select more than one criterion, use CTRL+click.
By default, the system extracts the record with the highest call rate for each animal. However, if you tick the Full box, you will receive all available records for the queried combination.
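If you download with the Full box ticked and later want to reduce the data to one record per animal, the default selection can be approximated locally as sketched below. This is not the service's own implementation: the function name best_record_per_animal is hypothetical, and the field positions (Animal ID in fields 4 to 7, Genotype Call Rate in field 13) are assumed from Table 1A.

```python
def best_record_per_animal(meta_rows):
    """Keep, for each animal, the meta.csv row with the highest call rate."""
    best = {}
    for row in meta_rows:
        animal = tuple(row[3:7])      # breed, nation, sex, registration
        call_rate = float(row[12])    # Genotype Call Rate
        if animal not in best or call_rate > float(best[animal][12]):
            best[animal] = row
    return list(best.values())
```

The Upload ID of each retained row can then be used to pick the matching records from the downloaded snps.csv file.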
A very useful function, especially for users who download data from GenoEx-PSE regularly, is the time frame option, which restricts the download to data that were uploaded to the database within a defined time period.
Once the extraction criteria are selected, click Create extraction. You will see the progress on the screen, just as with the upload. When the extraction is complete, you will be able to download your data as a zipped file containing the snps.csv and meta.csv files.
The file can be downloaded right away or later, by finding the specific job under History, where the events are sorted by action date.