Differences between revisions 1 and 2

GEBVtest Merged Files

The gebvtest.py program offers an option (-m, --mergefiles) to create a file of merged Cf/Df/Cr/Gr records as a convenience to the user. These files make it easier to check the correctness of the input datasets, and they can be used to perform additional checks and/or statistical analyses, for example using R as in this small test.R.

One file is created for each trait present in the traits_POPBRD file and the file300Xy_POPBRD files. There is a record for each bull present in the Cf file born in/after the cutoff year specified in the traits_POPBRD file. Flags are supplied to indicate whether the bull qualifies as a candidate or a test bull. Please see the file format below.

By default, the files are created in directory DATADIR/merged. The merged files do not have a _POPBRD extension, so if you would like to create files for more than one population or breed, you should also supply the -M, --mergedir option with a different destination directory for each population/breed. The destination directory can be an absolute path, or it can be relative to the programs directory (e.g. ../sample_data/my_merges). The directory will be created automatically if it does not exist.
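
The directory rule described above can be sketched in Python. This is only an illustration under stated assumptions, not the code used by gebvtest.py: the function name resolve_merge_dir and its three parameters are hypothetical, and it assumes that a relative path is resolved against the directory that holds the programs.

```python
from pathlib import Path


def resolve_merge_dir(mergedir, programs_dir, datadir):
    """Return the directory for merged files, creating it if needed.

    Hypothetical helper: mergedir stands for the value given to
    -M/--mergedir, or None when the option is omitted.
    """
    if mergedir is None:
        # Default location described in the text: DATADIR/merged
        target = Path(datadir) / "merged"
    else:
        target = Path(mergedir)
        if not target.is_absolute():
            # Assumption: relative paths are taken relative to the programs directory
            target = Path(programs_dir) / target
    # Create the directory (and any missing parents) if it does not exist
    target.mkdir(parents=True, exist_ok=True)
    return target
```

Using a separate destination per population/breed (for example ../sample_data/my_merges for one and a differently named directory for the next) keeps files from different runs from overwriting each other, since the merged file names carry no _POPBRD extension.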

File format

The file is in comma-separated values (CSV) format.

Column	Variable	Type	Description
1	aid	char(19)	Animal ID
2	byear	int	Birth year
3	gflag	char(1)	Bull has a GEBV (Gr) record (Y/N)
4	cflag	char(1)	Bull qualifies as a candidate bull (Y/N)
5	tflag	char(1)	Bull qualifies as a test bull (Y/N)
6	top	char(2)	Type of proof (from Cf file)
7	off	char(1)	Official proof (Y/N; from Cf file)
8	sta	char(2)	Bull status (from Cf file)
9	Cf	char(2)	Fixed separator for Cf info
10	edc	int	EDC from Cf file
11	rel	real	Reliability from Cf file (x100)
12	ebv	real	Predicted genetic merit ("proof") from Cf file
13	Df	char(2)	Fixed separator for Df info
14	edcd	int	EDC from Df file
15	reld	real	Reliability from Df file (x100)
16	dpgm	real	DD or deregressed proof from Df file
17	Cr	char(2)	Fixed separator for Cr info
18	edcr	int	EDC from Cr file
19	relr	real	Reliability from Cr file (x100)
20	ebvr	real	Predicted genetic merit ("proof") from Cr file
21	Gr	char(2)	Fixed separator for Gr info
22	edcg	int	EDC from Gr file
23	relg	real	Reliability from Gr file (x100)
24	gebv	real	GEBV from Gr file
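
As a rough illustration of how a merged file could be read outside R, the following Python sketch names the 24 fields in the order listed in the table above and converts the int and real columns. Several things here are assumptions rather than documented behaviour: that the file has no header row, that a missing value appears as an empty field, and the function name read_merged. The sample record is invented purely to exercise the code.

```python
import csv
import io

# Field names in file order, taken from the table above (24 columns).
COLUMNS = [
    "aid", "byear", "gflag", "cflag", "tflag", "top", "off", "sta",
    "Cf", "edc", "rel", "ebv",
    "Df", "edcd", "reld", "dpgm",
    "Cr", "edcr", "relr", "ebvr",
    "Gr", "edcg", "relg", "gebv",
]
INT_COLS = {"byear", "edc", "edcd", "edcr", "edcg"}
REAL_COLS = {"rel", "ebv", "reld", "dpgm", "relr", "ebvr", "relg", "gebv"}


def read_merged(handle):
    """Yield one dict per bull from an open merged file.

    Assumptions: no header row; an empty field means "missing" and is
    returned as None.
    """
    for row in csv.reader(handle):
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(row)}")
        rec = dict(zip(COLUMNS, (field.strip() for field in row)))
        for name in INT_COLS:
            rec[name] = int(rec[name]) if rec[name] else None
        for name in REAL_COLS:
            rec[name] = float(rec[name]) if rec[name] else None
        yield rec


# Invented record, for illustration only (not real data).
sample = io.StringIO(
    "HOLUSAM000000000001,2005,Y,N,Y,11,Y,10,"
    "Cf,120,85.0,1.5,Df,100,80.0,1.2,"
    "Cr,60,70.0,1.1,Gr,60,75.0,1.3\n"
)

# Keep only the bulls flagged as test bulls.
test_bulls = [rec for rec in read_merged(sample) if rec["tflag"] == "Y"]
```

The fixed separator fields (Cf, Df, Cr, Gr) are kept as ordinary columns here; they can double as a quick sanity check that the fields of a record are aligned.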

- Revision 1 as of 2013-01-31 21:48:30
  Size: 608
  Editor: mtl-93-187-66-202
  Comment:
+ Revision 2 as of 2013-01-31 22:32:32
  Size: 2683
  Editor: mtl-93-187-66-202
  Comment:
 Line 3:
-The [[public/gebvtest.py?action=print|gebvtest.py]] program offers an option (''-m, --mergefiles'') to create a file of merged Cf/Df/Cr/Gr records as a convenience to the user. 
These files can make it easier to check for the correctness of the input datasets and they can be used for easier
+The [[https://wiki.interbull.org/public/gebvtest.py?action=print|gebvtest.py]] program offers an option (''-m, --mergefiles'') to create a file of merged Cf/Df/Cr/Gr records as a convenience to the user. These files can make it easier to check for the correctness of the input datasets and they can be used to perform additional checks and/or statistical analyses, using R for example as in this small [[attachment:test.R]].
-Line 6:
+Line 5:
-, either in directDATADIR/merged by   
By default, the 
Note: The merged files do not have a _POPBRD extension, so if you would like to create files for more than one population or breed, you should also supply the ''-M, --mergedir'' option with a different destination directory
+One file is created for each trait present in the traits_POPBRD file and the file300Xy_POPBRD files. There is a record for each bull present in the Cf file born in/after the cutoff year specified in the traits_POPBRD file.
Flags are supplied to indicate whether the bull qualifies as a candidate or a test bull. Please see the file format below.

By default, the files are created in directory DATADIR/merged. The merged files do not have a _POPBRD extension, so if you would like to create files for more than one population or breed, you should also supply the ''-M, --mergedir'' option with a different destination directory for each population/breed. The destination directory can be an absolute path, or it can be relative to the programs directory (e.g. ../sample_data/my_merges). The directory will be created automatically if it does not exist.

== File format ==

The file is in comma-separated values (CSV) format.

||'''Column'''||'''Variable'''||'''Type'''||'''Description'''||
||1||aid||char(19)||Animal ID||
||2||byear||int||Birth year||
||3||gflag||char(1)||Bull has a GEBV (Gr) record (Y/N)||
||4||cflag||char(1)||Bull qualifies as a candidate bull (Y/N)||
||5||tflag||char(1)||Bull qualifies as a test bull (Y/N)||
||6||top||char(2)||Type of proof (from Cf file)||
||7||off||char(1)||Official proof (Y/N; from Cf file)||
||8||sta||char(2)||Bull status (from Cf file)||
||9||Cf||char(2)||Fixed separator for Cf info||
||10||edc||int||EDC from Cf file||
||11||rel||real||Reliability from Cf file (x100)||
||12||ebv||real||Predicted genetic merit ("proof") from Cf file||
||13||Df||char(2)||Fixed separator for Df info||
||14||edcd||int||EDC from Df file||
||15||reld||real||Reliability from Df file (x100)||
||16||dpgm||real||DD or deregressed proof from Df file||
||17||Cr||char(2)||Fixed separator for Cr info||
||18||edcr||int||EDC from Cr file||
||19||relr||real||Reliability from Cr file (x100)||
||20||ebvr||real||Predicted genetic merit ("proof") from Cr file||
||21||Gr||char(2)||Fixed separator for Gr info||
||22||edcg||int||EDC from Gr file||
||23||relg||real||Reliability from Gr file (x100)||
||24||gebv||real||GEBV from Gr file||