ITS
Overview
The Internationalization Tag Set (ITS) is a W3C Recommendation that defines a set of elements and attributes you can use to specify internationalization- and localization-related aspects of your XML document. For instance, ITS lets you define which attribute values are translatable, which element content should be protected, which elements should be treated as nested sub-flows of text, and much more.
- The ITS 1.0 specification is available at http://www.w3.org/TR/its/
- The ITS 2.0 specification is available at http://www.w3.org/TR/its20/
Default Rules
By default, the filter processes XML documents based on the ITS defaults. That is:
- the content of all elements is translatable,
- and none of the attribute values is translatable.
To modify this behavior, you need to associate the document with ITS rules. This can be done in different ways:
- By including global and local rules inside the document.
- By including inside the document a link to external global rules.
- By associating the document with a parameters file when running the filter. The parameters file is a set of external ITS global rules.
When processing a document, the filter...
- Assumes that all element content is translatable, and none of the attribute values are translatable.
- Applies the global rules found in the (optional) parameters file associated with the input document.
- Applies the global rules found in the document.
- And finally, applies the local rules within the document.
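The processing order above can be illustrated with a small Python sketch. This is not the Okapi implementation: the function names (`explicit_flags`, `resolve`, `translatable_text`) and the sample document are hypothetical, only `its:translateRule` and the local `its:translate` attribute are handled, and selectors are limited to simple `//name` paths because the standard-library ElementTree module does not support XPath unions.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
ITS = f"{{{ITS_NS}}}"  # ElementTree writes qualified names as {uri}local

def explicit_flags(root, rule_sets):
    """Collect translate values set directly on elements, lowest priority first."""
    flags = {}
    # Global rules: the parameters file first, then the rules in the document.
    for rules in rule_sets:
        for rule in rules.iter(ITS + "translateRule"):
            value = rule.get("translate") == "yes"
            # No XPath union in ElementTree, so "//a|//b" is split by hand.
            # Note: ".//name" never matches the root element itself.
            for path in rule.get("selector").split("|"):
                for node in root.findall("." + path.strip()):
                    flags[node] = value
    # Local its:translate attributes are applied last, so they win.
    for node in root.iter():
        local = node.get(ITS + "translate")
        if local is not None:
            flags[node] = local == "yes"
    return flags

def resolve(node, flags, inherited=True, out=None):
    """Start from the default (translatable) and inherit from parent to child."""
    out = {} if out is None else out
    current = flags.get(node, inherited)
    out[node] = current
    for child in node:
        resolve(child, flags, current, out)
    return out

def translatable_text(root, rule_sets):
    """Return the text of every element that ends up translatable."""
    resolved = resolve(root, explicit_flags(root, rule_sets))
    return [n.text.strip() for n in root.iter()
            if resolved[n] and n.text and n.text.strip()]

# Hypothetical parameters file: protects everything under <head>.
param_rules = ET.fromstring(
    f'<its:rules xmlns:its="{ITS_NS}" version="1.0">'
    '<its:translateRule selector="//head" translate="no"/>'
    '</its:rules>')

# Hypothetical document with one embedded global rule and one local rule.
doc = ET.fromstring(
    f'<doc xmlns:its="{ITS_NS}">'
    '<its:rules version="1.0">'
    '<its:translateRule selector="//author" translate="yes"/>'
    '</its:rules>'
    '<head><update>2009-03-21</update><author>Mirabelle McIntosh</author></head>'
    '<body><p>Visible text.</p><p its:translate="no">Internal note.</p></body>'
    '</doc>')

# The document is passed as the second rule set so its embedded rules are read.
print(translatable_text(doc, [param_rules, doc]))
```

In this sketch the `update` element stays protected because it inherits from `head`, the `author` element becomes translatable again because the document's own global rule targets it directly, and the second paragraph is protected by its local attribute.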
Example
For example, assuming that ITSForDoc.xml is the ITS file associated with the input file Document.xml, the translatable text is listed below.
ITSForDoc.xml:
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
  <its:translateRule selector="//head|//code" translate="no"/>
  <its:withinTextRule selector="//b|//code|//img" withinText="yes"/>
</its:rules>
Document.xml:
<doc>
  <head>
    <update>2009-03-21</update>
    <author>Mirabelle McIntosh</author>
  </head>
  <body>
    <p>Paragraph with <img ref="eg.png"/> and <b>bolded text</b>.</p>
    <p>Paragraph with <code>data codes</code> and text.</p>
  </body>
</doc>
The resulting text units are (with the inline codes in XLIFF 1.2 notation):
1: "Paragraph with <x id='1'/> and <g id='2'>bolded text</g>."
2: "Paragraph with <g id='1'><x id='2'/></g> and text."
Validation
The Relaxed project includes an online validator for ITS.
Relaxed is an open-source project hosted on SourceForge.
Extensions
Several extensions have been defined by the ITS Interest Group. They are listed in the Issues and Proposed Features section of the Interest Group wiki.
The extension namespace is http://www.w3.org/2008/12/its-extensions
Proper Namespace Handling
If the input document uses a namespace, the selectors in the ITS file must use the same namespace. For example, if the input document looks like this:
<doc xmlns="http://xmlx.org/ns/xmlx">
  <head>
    <update>2009-03-21</update>
    <author>Mirabelle McIntosh</author>
  </head>
  <body>
    <p>Paragraph with <img ref="eg.png"/> and <b>bolded text</b>.</p>
    <p>Paragraph with <code>data codes</code> and text.</p>
  </body>
</doc>
Then the ITS file must declare that namespace with a prefix and use the prefix in its selectors, like this:
<its:rules xmlns:its="http://www.w3.org/2005/11/its"
  xmlns:xx="http://xmlx.org/ns/xmlx" version="1.0">
  <its:translateRule selector="//xx:head|//xx:code" translate="no"/>
  <its:withinTextRule selector="//xx:b|//xx:code|//xx:img" withinText="yes"/>
</its:rules>
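The reason for the prefix is that, in XPath, an unprefixed name only matches elements that are in no namespace. The following Python sketch shows the effect with the standard-library ElementTree module, used here purely as a convenient XPath-like engine and not as a model of how the filter evaluates selectors. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the namespaced input document shown above.
DOC = """<doc xmlns="http://xmlx.org/ns/xmlx">
  <head><update>2009-03-21</update><author>Mirabelle McIntosh</author></head>
  <body><p>Paragraph with <code>data codes</code> and text.</p></body>
</doc>"""

root = ET.fromstring(DOC)

# Unprefixed names do not match elements in the document's namespace.
unprefixed = root.findall(".//head") + root.findall(".//code")

# With a prefix bound to the document's namespace URI, both elements match.
ns = {"xx": "http://xmlx.org/ns/xmlx"}
prefixed = root.findall(".//xx:head", ns) + root.findall(".//xx:code", ns)

# The prefix itself is arbitrary; only the namespace URI has to be the same.
other_prefix = root.findall(".//any:head", {"any": "http://xmlx.org/ns/xmlx"})

print(len(unprefixed), len(prefixed), len(other_prefix))
```

The same reasoning applies to the ITS file: the `xx` prefix could be any name, as long as it is bound to the namespace URI used by the input document.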
ITS in the Okapi Framework
The Okapi Framework uses ITS in several places. For example:
- The XML Filter implements most of the ITS data categories for XML documents.
- The HTML5-ITS Filter implements most of the ITS data categories for HTML5 documents.
- Several pre-defined filter configurations are ITS files.
- Version 2.0 of ITS was implemented in Okapi as one of the deliverables of the MultilingualWeb-LT project, funded by the European Commission.
For an overview of the components with ITS capability, see the ITS Components page.