XML Data File Technical Guideline

Introduction

This specification defines the syntax and semantics for Open Data XML files. An XML file uses markup (tags) to encode data, such as numbers and text, in a machine-readable format.

XML Data File Structure

An XML file must be well-formed and must conform to the Extensible Markup Language (XML) 1.0 specification.
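As a rough illustration of the well-formedness requirement, the following Python sketch uses the standard library parser to test whether a file's bytes parse as XML. The function name is_well_formed is hypothetical and not part of this guideline. The parser only checks well-formedness; it does not validate against a document type or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(data: bytes) -> bool:
    """Return True if the given bytes parse as well-formed XML.

    Hypothetical helper for illustration only. The standard library
    parser is non-validating: it checks syntax, not conformance to a
    DTD or schema.
    """
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True
```

For example, is_well_formed(b"<catalog><item>1</item></catalog>") is accepted, while a document with a missing closing tag is rejected.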

UTF-8 Character Encoding

XML files must use UTF-8 character encoding. This ensures that special characters (e.g. accented French characters) are interpreted correctly. The encoding is declared in the XML declaration on the first line of the file:

<?xml version="1.0" encoding="UTF-8" ?>
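One possible way to check the encoding requirement is sketched below in Python. It assumes two things that this guideline does not state: that a file with no encoding declaration is acceptable (XML 1.0 defaults to UTF-8 in that case), and that a leading byte order mark is tolerated. The function name is_utf8_xml is hypothetical.

```python
import re

# Matches the encoding pseudo-attribute inside the XML declaration.
_ENCODING_RE = re.compile(r'<\?xml[^>]*?encoding\s*=\s*["\']([^"\']+)["\']')


def is_utf8_xml(data: bytes) -> bool:
    """Return True if the bytes are valid UTF-8 and any declared encoding is UTF-8.

    Hypothetical helper for illustration only.
    """
    try:
        # "utf-8-sig" strips a leading byte order mark, if one is present.
        text = data.decode("utf-8-sig")
    except UnicodeDecodeError:
        return False
    match = _ENCODING_RE.match(text)
    # Assumption: no declaration means the XML 1.0 default (UTF-8) applies.
    return match is None or match.group(1).lower() == "utf-8"
```

A file that declares encoding="ISO-8859-1", or whose bytes are not valid UTF-8, fails this check.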

XML Document Type or Schema

XML files must include either a document type declaration or schema specification to define the structure of the XML file. The XML file must validate to the document type or schema.

XML Doctype

The following is an example of an XML Document Type Declaration:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE rdf:RDF PUBLIC "-//DUBLIN CORE//DCMES DTD 2002/07/31//EN"
"http://dublincore.org/documents/2002/07/31/dcmes-xml/dcmes-xml-dtd.dtd">

XML SchemaLocation

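The following is an example of an XML schema declaration using the xsi:schemaLocation syntax. The attribute value is a pair: the namespace URI followed by the location of the schema for that namespace.

<?xml version="1.0" encoding="utf-8"?>
<!-- The namespace URI http://www.example.com/MyData is illustrative only. -->
<catalog xmlns="http://www.example.com/MyData"
xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
xsi:schemaLocation="http://www.example.com/MyData http://www.example.com/MyData.xsd">
</catalog>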

XML noNamespaceSchemaLocation

The following is an example of an XML schema declaration using the xsi:noNamespaceSchemaLocation syntax:

<?xml version="1.0" encoding="utf-8"?>
<data_dictionary dd_version="1.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="http://donnees-data.spac-pspc.gc.ca/dd/spac-pspc-dd.xsd">
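The three declaration styles above can be detected with a short Python sketch. It only checks that a document type declaration or a schema location attribute is present. It does not validate the file against it, since the Python standard library does not include a validating parser. The function name declares_structure is hypothetical.

```python
from xml.parsers import expat

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def declares_structure(data: bytes) -> bool:
    """Return True if the XML has a DOCTYPE or a schema location on its root element.

    Hypothetical helper for illustration only: it detects the declaration
    but does not fetch the DTD or schema or validate against it.
    """
    found = {"doctype": False, "schema": False, "root_seen": False}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found["doctype"] = True

    def on_start(name, attrs):
        # Only the first start tag (the root element) is inspected.
        if not found["root_seen"]:
            found["root_seen"] = True
            found["schema"] = (
                f"{XSI} schemaLocation" in attrs
                or f"{XSI} noNamespaceSchemaLocation" in attrs
            )

    # With a namespace separator, expat reports names as "<namespace URI> <local name>".
    parser = expat.ParserCreate(namespace_separator=" ")
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    parser.Parse(data, True)
    return found["doctype"] or found["schema"]
```

Because the parser is also strict about well-formedness, a malformed file raises expat.ExpatError instead of returning a result.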

XML Data Quality

An XML file must meet the following data quality requirements.

Data Patterns

If a heading data pattern is specified in the data dictionary and there is an XML tag that matches the heading label, then the contents of the XML tag must conform to the pattern. If no data pattern is specified or if the XML tag does not match any heading label, then any content may appear in the tag.

The following is an example of an invalid XML file for a data dictionary that specifies that the content of the heading field2 must contain digits only. In the second line, the value of the field2 tag (EEE) does not contain only digits:

<field1>aaa</field1><field2>123</field2><field3>ccc</field3>
<field1>ddd</field1><field2>EEE</field2><field3>fff</field3>
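The data pattern rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the official validation procedure: the patterns are assumed to be regular expressions keyed by heading label, the function name find_pattern_violations is hypothetical, and the sample wraps the fields in hypothetical data and record elements so that the input is a well-formed document.

```python
import re
import xml.etree.ElementTree as ET


def find_pattern_violations(xml_text, patterns):
    """Return (tag, value) pairs whose content does not match the tag's pattern.

    Hypothetical helper: `patterns` maps a heading label to a regular
    expression that the whole tag content must match.
    """
    root = ET.fromstring(xml_text)
    violations = []
    for element in root.iter():
        pattern = patterns.get(element.tag)
        if pattern is None:
            continue  # No pattern for this tag: any content is allowed.
        value = element.text or ""
        if re.fullmatch(pattern, value) is None:
            violations.append((element.tag, value))
    return violations


# Hypothetical wrapper elements (data, record) around the fields shown above.
SAMPLE = (
    "<data>"
    "<record><field1>aaa</field1><field2>123</field2><field3>ccc</field3></record>"
    "<record><field1>ddd</field1><field2>EEE</field2><field3>fff</field3></record>"
    "</data>"
)

# field2 must contain digits only; field1 and field3 have no pattern.
violations = find_pattern_violations(SAMPLE, {"field2": r"[0-9]+"})
```

Here violations contains the single pair ("field2", "EEE"), which corresponds to the invalid value in the example above.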

Validation Tools

The following list of tools may be useful in validating XML data files.

Additional Resources
