XML Data File Technical Guideline
Introduction
This specification defines the syntax and semantics for Open Data XML files. An XML file uses markup (tags) to encode data (numbers and text) in a machine-readable format.
XML Data File Structure
An XML file must be well-formed, conforming to the Extensible Markup Language (XML) 1.0 specification.
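A well-formedness check can be sketched with the Python standard library. This is a minimal illustration, not a prescribed tool: the function name is_well_formed and the sample documents are assumptions made for the example. Note that xml.etree.ElementTree only checks well-formedness; it does not validate against a document type or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_bytes: bytes) -> bool:
    """Return True if the input parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError:
        # Raised for unclosed tags, mismatched tags, multiple roots, etc.
        return False
    return True


# Illustrative inputs: the second one closes <catalog> before closing <item>.
good = b'<?xml version="1.0" encoding="UTF-8"?><catalog><item>1</item></catalog>'
bad = b'<?xml version="1.0" encoding="UTF-8"?><catalog><item>1</catalog>'

print(is_well_formed(good))
print(is_well_formed(bad))
```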
UTF-8 Character Encoding
XML files must use UTF-8 character encoding. This ensures that special characters (e.g. accented French characters) are interpreted correctly. The encoding is declared in the XML declaration at the start of the file:
<?xml version="1.0" encoding="UTF-8" ?>
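The encoding requirement can be checked in two parts: the declaration names UTF-8, and the bytes actually decode as UTF-8. The sketch below assumes the file has been read as raw bytes; the helper name check_utf8 and the sample content are hypothetical, and the sketch does not handle a leading byte order mark.

```python
import re

# Matches an XML declaration at the very start of the file whose encoding
# is UTF-8 (case-insensitive, single or double quotes).
_DECLARATION = re.compile(
    rb'^<\?xml[^>]*\bencoding\s*=\s*["\']utf-8["\']', re.IGNORECASE
)


def check_utf8(xml_bytes: bytes) -> list:
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    if not _DECLARATION.match(xml_bytes):
        problems.append("missing XML declaration with encoding UTF-8")
    try:
        xml_bytes.decode("utf-8")
    except UnicodeDecodeError as exc:
        problems.append(f"invalid UTF-8 byte sequence at offset {exc.start}")
    return problems


text = '<?xml version="1.0" encoding="UTF-8" ?><nom>Québec</nom>'
print(check_utf8(text.encode("utf-8")))    # correctly encoded
print(check_utf8(text.encode("latin-1")))  # declares UTF-8 but is not UTF-8
```

The second call illustrates the common failure case: a file saved in a legacy encoding while the declaration still claims UTF-8.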
XML Document Type or Schema
XML files must include either a document type declaration or schema specification to define the structure of the XML file. The XML file must validate to the document type or schema.
XML Doctype
The following is an example of an XML Document Type Declaration:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE rdf:RDF PUBLIC "-//DUBLIN CORE//DCMES DTD 2002/07/31//EN"
"http://dublincore.org/documents/2002/07/31/dcmes-xml/dcmes-xml-dtd.dtd">
XML SchemaLocation
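The following is an example of an XML schema declaration using the xsi:schemaLocation syntax. The attribute value is a pair: the target namespace URI followed by the location of the schema (the example.com values are placeholders):
<?xml version="1.0" encoding="utf-8"?>
<catalog xmlns="http://www.example.com/MyData"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.example.com/MyData http://www.example.com/MyData.xsd">
</catalog>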
XML noNamespaceSchemaLocation
The following is an example of an XML schema declaration using the xsi:noNamespaceSchemaLocation syntax:
<?xml version="1.0" encoding="utf-8"?>
<data_dictionary dd_version="1.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="http://donnees-data.spac-pspc.gc.ca/dd/spac-pspc-dd.xsd">
</data_dictionary>
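As a rough pre-check before full validation, one can detect whether a file carries any of the three declarations described above. The sketch below uses the expat parser from the Python standard library; the function name structure_references and the sample documents are assumptions for illustration. It only reports that a declaration is present. Validating the file against the referenced DTD or schema requires a validating tool, which the standard library does not provide.

```python
import xml.parsers.expat

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def structure_references(xml_bytes: bytes) -> dict:
    """Report which structure declarations appear in the document (hypothetical helper)."""
    found = {"doctype": False, "schemaLocation": False,
             "noNamespaceSchemaLocation": False}
    state = {"seen_root": False}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found["doctype"] = True

    def on_start(name, attrs):
        if state["seen_root"]:
            return  # only the root element is inspected
        state["seen_root"] = True
        for key in ("schemaLocation", "noNamespaceSchemaLocation"):
            # With namespace processing on, expat reports "uri localname".
            if f"{XSI} {key}" in attrs:
                found[key] = True

    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    parser.Parse(xml_bytes, True)
    return found


with_dtd = (b'<?xml version="1.0" encoding="utf-8"?>'
            b'<!DOCTYPE catalog SYSTEM "catalog.dtd"><catalog/>')
with_xsd = (b'<?xml version="1.0" encoding="utf-8"?>'
            b'<catalog xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
            b'xsi:noNamespaceSchemaLocation="MyData.xsd"/>')
with_neither = b'<?xml version="1.0" encoding="utf-8"?><catalog/>'

for doc in (with_dtd, with_xsd, with_neither):
    print(any(structure_references(doc).values()))
```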
XML Data Quality
An XML file must meet the following data quality requirements.
Data Patterns
If a heading data pattern is specified in the data dictionary and there is an XML tag that matches the heading label, then the contents of the XML tag must conform to the pattern. If no data pattern is specified or if the XML tag does not match any heading label, then any content may appear in the tag.
The following example is invalid for a data dictionary that specifies that the content of heading field2 must contain digits only. In the second record, the value of the field2 tag (EEE) does not contain only digits:
<field1>aaa</field1><field2>123</field2><field3>ccc</field3>
<field1>ddd</field1><field2>EEE</field2><field3>fff</field3>
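The pattern rule can be sketched as a lookup from heading label to regular expression. The PATTERNS mapping below stands in for a data dictionary and is hypothetical, as are the records and record wrapper elements added so that the sample is a well-formed document. Tags with no matching pattern are skipped, matching the rule that any content may appear in them.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical data dictionary: heading label -> required pattern.
PATTERNS = {"field2": re.compile(r"[0-9]+")}


def pattern_violations(xml_text, patterns=PATTERNS):
    """Return (tag, value) pairs whose content does not match its pattern."""
    root = ET.fromstring(xml_text)
    violations = []
    for element in root.iter():
        pattern = patterns.get(element.tag)
        if pattern is None:
            continue  # no pattern for this tag: any content is allowed
        value = element.text or ""
        # fullmatch requires the whole value to conform, not just a prefix.
        if not pattern.fullmatch(value):
            violations.append((element.tag, value))
    return violations


sample = """<records>
  <record><field1>aaa</field1><field2>123</field2><field3>ccc</field3></record>
  <record><field1>ddd</field1><field2>EEE</field2><field3>fff</field3></record>
</records>"""

print(pattern_violations(sample))  # [('field2', 'EEE')]
```

In this sketch an empty field2 element is also reported, because the assumed pattern requires at least one digit.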
Validation Tools
The following tools may be useful for validating XML data files:
- PWGSC Open Data Tool - Web and Open Data Validator
- Free Online XML Validator Against XSD Schema - FreeFormatter.com
- XML - DTD and Schema Validator
- XML well-formedness checker and validator
- XML Validation
- W3 Schools XML Validator
Additional Resources
- Principles of XML design: When to use elements versus attributes - IBM
- Google XML Document Format Style Guide