Differences between revisions 1 and 4 (spanning 3 versions)
Revision 1 as of 2013-01-31 21:48:30
Size: 608
Editor: mtl-93-187-66-202
Comment:
Revision 4 as of 2013-01-31 22:40:06
Size: 2678
Editor: mtl-93-187-66-202
Comment:
= GEBVtest Merged Files =

The [[https://wiki.interbull.org/public/gebvtest_py?action=print|gebvtest.py]] program offers an option (''-m, --mergefiles'') to create files of merged Cf/Df/Cr/Gr records as a convenience to the user. These files make it easier to check the input datasets for correctness, and they can be used to perform additional checks and/or statistical analyses, for example in R, as in this small script: [[attachment:test.R]].

One file is created for each trait present in the traits_POPBRD file and the file300Xy_POPBRD files. There is a record for each bull present in the Cf file born in/after the cutoff year specified in the traits_POPBRD file.
Flags are supplied to indicate whether the bull qualifies as a candidate or a test bull. Please see the file format below.

By default, the files are created in the directory DATADIR/merged. The merged files do not have a _POPBRD extension, so if you would like to create files for more than one population or breed, you should also supply the ''-M, --mergedir'' option with a different destination directory for each population/breed. The destination directory can be an absolute path, or it can be relative to the programs directory (e.g. ../sample_data/my_merges). The directory will be created automatically if it does not exist.
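
As a rough illustration of the point about separate destination directories, the following Python sketch builds one command line per population/breed, each with its own merge directory. Only the ''--mergefiles'' and ''--mergedir'' options come from this page. The `python` launcher, the helper name `build_merge_commands`, the placeholder population/breed codes and the `merged_<code>` directory naming are assumptions made for the example, and any other arguments that gebvtest.py may require are left out.

```python
from pathlib import PurePosixPath


def build_merge_commands(popbrds, merge_root="../sample_data"):
    """Return one argv list per population/breed code.

    Hypothetical helper: each run gets its own --mergedir, because the
    merged files carry no _POPBRD extension to tell them apart.
    """
    commands = []
    for popbrd in popbrds:
        # Assumed naming scheme: one sub-directory per population/breed.
        mergedir = PurePosixPath(merge_root) / f"merged_{popbrd}"
        commands.append(
            ["python", "gebvtest.py", "--mergefiles", "--mergedir", str(mergedir)]
        )
    return commands


if __name__ == "__main__":
    # Placeholder codes, not real population/breed identifiers.
    for argv in build_merge_commands(["POP1BRD", "POP2BRD"]):
        print(" ".join(argv))
```

The resulting lists can be passed to `subprocess.run` once the remaining arguments for a real run have been added.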

== File format ==

The file is in comma-separated values (CSV) format.

||'''Column'''||'''Variable'''||'''Type'''||'''Description'''||
||1||aid||char(19)||Animal ID||
||2||byear||int||Birth year||
||3||geno||char(1)||Bull has a GEBV (Gr) record (Y/N)||
||4||cand||char(1)||Bull qualifies as a candidate bull (Y/N)||
||5||test||char(1)||Bull qualifies as a test bull (Y/N)||
||6||top||char(2)||Type of proof (from Cf file)||
||7||off||char(1)||Official proof (Y/N; from Cf file)||
||8||sta||char(2)||Bull status (from Cf file)||
||9||Cf||char(2)||Fixed separator for Cf info||
||10||edc||int||EDC from Cf file||
||11||rel||real||Reliability from Cf file (x100)||
||12||ebv||real||Predicted genetic merit ("proof") from Cf file||
||13||Df||char(2)||Fixed separator for Df info||
||14||edcd||int||EDC from Df file||
||15||reld||real||Reliability from Df file (x100)||
||16||dpgm||real||DD or deregressed proof from Df file||
||17||Cr||char(2)||Fixed separator for Cr info||
||18||edcr||int||EDC from Cr file||
||19||relr||real||Reliability from Cr file (x100)||
||20||ebvr||real||Predicted genetic merit ("proof") from Cr file||
||21||Gr||char(2)||Fixed separator for Gr info||
||22||edcg||int||EDC from Gr file||
||23||relg||real||Reliability from Gr file (x100)||
||24||gebv||real||GEBV from Gr file||
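
The layout above maps onto a small reader. What follows is a minimal Python sketch, not part of gebvtest.py. It assumes that the file has no header row, that every record carries all 24 fields, that the separator fields hold the literal text Cf, Df, Cr and Gr, and that an empty field marks a missing value; none of these points is stated on this page. The function names and the sample record are made up for illustration.

```python
import csv
import io

# Column names in file order, as listed in the table above.
COLUMNS = [
    "aid", "byear", "geno", "cand", "test", "top", "off", "sta",
    "Cf", "edc", "rel", "ebv",
    "Df", "edcd", "reld", "dpgm",
    "Cr", "edcr", "relr", "ebvr",
    "Gr", "edcg", "relg", "gebv",
]
INT_COLUMNS = {"byear", "edc", "edcd", "edcr", "edcg"}
REAL_COLUMNS = {"rel", "ebv", "reld", "dpgm", "relr", "ebvr", "relg", "gebv"}


def read_merged(handle):
    """Yield one dict per bull from an open merged file (assumed headerless)."""
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(row)}")
        record = {}
        for name, raw in zip(COLUMNS, row):
            value = raw.strip()
            if value == "":
                record[name] = None  # assumption: empty field means missing
            elif name in INT_COLUMNS:
                record[name] = int(value)
            elif name in REAL_COLUMNS:
                record[name] = float(value)
            else:
                record[name] = value
        yield record


def select_test_bulls(records):
    """Keep only the bulls flagged as test bulls (test == 'Y')."""
    return [r for r in records if r["test"] == "Y"]


if __name__ == "__main__":
    # Made-up record, for illustration only.
    sample = (
        "HOLUSAM000000000001,2005,Y,N,Y,11,Y,10,"
        "Cf,120,85.0,1.25,Df,110,80.0,1.10,"
        "Cr,60,70.0,0.95,Gr,75,72.0,1.05\n"
    )
    bulls = select_test_bulls(read_merged(io.StringIO(sample)))
    print(len(bulls), bulls[0]["aid"], bulls[0]["gebv"])
```

For a real file, pass a handle opened with `open(path, newline="")` instead of the `io.StringIO` sample. If the files turn out to contain a header row or a different missing-value marker, `read_merged` needs to be adjusted accordingly.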


public/gebvtest_mergefiles (last edited 2025-03-27 16:08:41 by Valentina)