XML tutorials and examples
Introduction
When fixed-width data file formats become too complex, for example when the number of columns depends on values in previous columns or on data in other files, writing parsers that read and write the information and interpret it correctly becomes very time- and resource-consuming. One possible solution is to use an XML format for the data exchange files, in order to draw on the large number of packages and libraries already written to parse such documents. On this page you will find links and examples that will help you get familiar with the XML format and with writing tools that use it.
XML Overview
See the Wikipedia entry for XML for a description of the XML markup language. Take particular note of the terminology used for the different parts of the XML structure.
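To make the terminology concrete, the following minimal Python sketch parses a small XML fragment with the standard library module xml.etree.ElementTree and prints each part. The fragment itself, including the element names animals, animal and name and the id attribute, is invented for illustration and is not a format defined on this page.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment, made up only to illustrate the terminology.
DOCUMENT = """<?xml version="1.0"?>
<animals>
  <animal id="LIMFRAF001521469226">
    <name>Rosa</name>
  </animal>
</animals>"""

root = ET.fromstring(DOCUMENT)   # the root element of the document
animal = root.find("animal")     # a child element of the root
name = animal.find("name")       # an element nested inside <animal>

print("root element: ", root.tag)
print("child element:", animal.tag)
print("attributes:   ", animal.attrib)  # name/value pairs in the start tag
print("text content: ", name.text)      # character data between the tags
```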
XML tutorials
An XML tutorial focusing on the web aspect: W3 Schools XML tutorial
A video XML tutorial on YouTube: XML basics
An advanced, in-depth tutorial on XML: http://www.xmlmaster.org/en/article/d01/
XML concrete example
In our hypothetical example we have two source files: one with the IDs and names of the animals, and another with associated data. The information in these two files should be combined into a single final XML file. The structure of both source files is very simple.
Our source files
Animal ID file
This is a fixed-width file with two columns: the animal ID (19 characters) followed by the animal name (30 characters). Each line describes exactly one animal, and each animal may appear only once in the file. Our example source file looks like this:
LIMFRAF001521469226 Rosa
HOLUSAF000017059414 Bossy
AANFINF000006314316 Muhmuh
HERFINF000003266465 Greta
LIMFRAF001930958553 Linda
CHAFINM000008365662 Cowlin
LIMFRAM003150038969 Bryan
CHAFRAF002350102162 Linda
SIMDEUF000922204654 Angel
HOLDEUF001006117458 Hermione
CHAFRAF004303055320 Samantha
HERFINM000008405652 Frodo
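A file with this layout can be read by slicing each line at the column boundaries. The sketch below is one possible way to do it in Python, not a reference implementation. It assumes that the space visible after the ID in the sample belongs to the name column and can be stripped. The function name parse_animal_lines is made up for this example.

```python
ID_WIDTH = 19    # width of the animal ID column, as described above
NAME_WIDTH = 30  # width of the animal name column, as described above


def parse_animal_lines(lines):
    """Return {animal_id: name} for an iterable of fixed-width lines."""
    animals = {}
    for line_no, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        animal_id = line[:ID_WIDTH]
        # Assumption: padding around the name is not significant.
        name = line[ID_WIDTH:ID_WIDTH + NAME_WIDTH].strip()
        if animal_id in animals:
            # Each animal may appear only once in the file.
            raise ValueError(f"line {line_no}: duplicate animal ID {animal_id}")
        animals[animal_id] = name
    return animals


# The first three records of the example file above.
sample = [
    "LIMFRAF001521469226 Rosa",
    "HOLUSAF000017059414 Bossy",
    "AANFINF000006314316 Muhmuh",
]
animals = parse_animal_lines(sample)
print(animals["HOLUSAF000017059414"])
```

The same function accepts an open file object, since iterating over a file yields its lines.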
Associated data file
This is a fixed-width file with three columns: the animal ID (19 characters), the name of the associated data item (15 characters) and the data value itself (10 characters). Our example source file is:
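The three-column layout can be parsed with the same slicing approach. The following Python sketch is only an illustration: the records in sample are hypothetical, including the data names weight and birthyear and their values, and the function name parse_data_lines is made up. It also assumes that one animal may have several data lines.

```python
from collections import defaultdict

ID_WIDTH = 19     # animal ID column
KEY_WIDTH = 15    # name of the associated data item
VALUE_WIDTH = 10  # the data value itself


def parse_data_lines(lines):
    """Return {animal_id: [(data_name, value), ...]} from fixed-width lines."""
    data = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        key_start = ID_WIDTH
        value_start = ID_WIDTH + KEY_WIDTH
        animal_id = line[:key_start]
        key = line[key_start:value_start].strip()
        value = line[value_start:value_start + VALUE_WIDTH].strip()
        data[animal_id].append((key, value))
    return dict(data)


# Hypothetical records, padded to the column widths described above.
sample = [
    f"{'LIMFRAF001521469226':<19}{'weight':<15}{'650':<10}",
    f"{'LIMFRAF001521469226':<19}{'birthyear':<15}{'2015':<10}",
    f"{'HOLUSAF000017059414':<19}{'weight':<15}{'580':<10}",
]
data = parse_data_lines(sample)
print(data["LIMFRAF001521469226"])
```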
Our final XML file
Fortran example of creating our XML file from the source files
Python example of creating our XML file from the source files
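As a minimal sketch of this step, the code below builds an XML document with the standard library module xml.etree.ElementTree. It starts from data that has already been read into dictionaries. The XML layout used here (an animals root, one animal element per ID with an id attribute, a name child and data children) is an assumption made for this example, since the final format is not shown on this page. The associated data values are hypothetical as well.

```python
import xml.etree.ElementTree as ET


def build_xml(animals, data):
    """Combine both sources into one element tree (assumed layout)."""
    root = ET.Element("animals")
    for animal_id, name in animals.items():
        animal = ET.SubElement(root, "animal", {"id": animal_id})
        ET.SubElement(animal, "name").text = name
        # Animals without associated data simply get no <data> children.
        for key, value in data.get(animal_id, []):
            item = ET.SubElement(animal, "data", {"name": key})
            item.text = value
    return root


# IDs and names taken from the animal ID file above.
animals = {
    "LIMFRAF001521469226": "Rosa",
    "HOLUSAF000017059414": "Bossy",
}
# Hypothetical associated data, invented for illustration.
data = {"LIMFRAF001521469226": [("weight", "650")]}

root = build_xml(animals, data)
if hasattr(ET, "indent"):  # ET.indent is available from Python 3.9
    ET.indent(root)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

To write the result to disk instead of printing it, ET.ElementTree(root).write("animals.xml", encoding="utf-8", xml_declaration=True) can be used, where the file name animals.xml is only a placeholder.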