How to Add Languages to Rainbow
Rainbow comes with a default list of languages and language codes, but this list is only there to provide some defaults.
- You can always use a language code that is not listed in the defaults.
- You can edit the default list to add or modify entries.
The list is located in the lib/shared sub-directory under the main installation directory. The file is named languages.xml. When you edit it, make sure to save the file in UTF-8.
Each language entry is stored in a <language> element and looks like this:
<language code='ug-CN' lcid='1152' encoding='windows-1256' macEncoding='MacArabic' unixEncoding='iso-8859-6'>
  <name>Uighur (*China)</name>
</language>
The code attribute holds the BCP-47 language tag for the given language. BCP-47 is the standard way to represent languages in XML, HTML and many other technologies.
A language tag is composed of one or more subtags organized according to the BCP-47 specification. There is an official list of subtags.
- BCP-47 specification: http://tools.ietf.org/html/rfc4647
- List of the registered language subtags: http://www.iana.org/assignments/language-subtag-registry
- A good overview of BCP-47: http://www.w3.org/International/articles/language-tags/
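To make the subtag structure more concrete, here is a minimal Python sketch that splits a simple tag into language, script and region subtags. It is only an illustration and not part of Rainbow: the names TAG_PATTERN and split_tag are invented here, the pattern deliberately ignores variants, extensions and private-use subtags, and it checks only the shape of a tag, not whether its subtags are registered.

```python
import re

# Simplified shape: language, optional script, optional region.
# Variants, extensions and private-use subtags are deliberately ignored.
TAG_PATTERN = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)

def split_tag(tag):
    """Split a simple language tag into its subtags, or return None."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        return None
    # Keep only the subtags that are actually present in the tag.
    return {k: v for k, v in match.groupdict().items() if v is not None}

for tag in ("az", "az-AZ", "az-Cyrl-AZ", "ug-CN"):
    print(tag, split_tag(tag))
```

A tag that passes this check can still be invalid, so the subtag registry listed above remains the reference for the actual values.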
The lcid attribute is the Microsoft LCID value for the given language. Use -1 if you do not know the proper value.
The encoding, macEncoding and unixEncoding attributes hold the names of the encodings for Windows, Macintosh and Linux, respectively. (The attribute names reflect Rainbow's old heritage; don't pay attention to them.) You must set each value to an encoding that supports the given language on the given platform.
The best way to find the proper encoding name is to look at which encodings are supported by your Java system. You can get the list using the command Tools > List Available Encodings.
The <name> element holds the display name of the language. In many cases it also includes an associated region or country name. Use the * to indicate which region is the default one for the LCID value.
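As a rough illustration of how the attributes and the <name> element fit together, the following Python sketch reads one entry with the standard xml.etree.ElementTree module. It is not how Rainbow itself loads the file; the function name read_entry and the is_default_region key are invented for this example, and treating any * in the name as the default-region marker is an assumption based on the description above.

```python
import xml.etree.ElementTree as ET

# One entry, copied from the example earlier on this page.
ENTRY = """<language code='ug-CN' lcid='1152' encoding='windows-1256'
 macEncoding='MacArabic' unixEncoding='iso-8859-6'>
 <name>Uighur (*China)</name>
</language>"""

def read_entry(xml_text):
    """Return the main fields of one <language> element as a dict."""
    elem = ET.fromstring(xml_text)
    name = elem.findtext("name", default="")
    return {
        "code": elem.get("code"),
        # A missing lcid is treated as -1, the "unknown" value.
        "lcid": int(elem.get("lcid", "-1")),
        # Windows, Macintosh and Linux encodings, in that order.
        "encodings": (
            elem.get("encoding"),
            elem.get("macEncoding"),
            elem.get("unixEncoding"),
        ),
        "name": name,
        # Assumption: a '*' in the name marks the default region.
        "is_default_region": "*" in name,
    }

entry = read_entry(ENTRY)
print(entry["code"], entry["lcid"], entry["is_default_region"])
```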
You can have as many entries as needed to reflect the different variants of the language. It is recommended to always provide an entry for the language alone (without region variant), and for the main regional variants. You can also have entries for the different scripts used for the same language.
Example of a set of entries for a given language:
<language code='az' lcid='1068' encoding='windows-1254' macEncoding='MacTurkish' unixEncoding='iso-8859-9'>
  <name>Azerbaijani</name>
</language>
<language code='az-AZ' lcid='1068' encoding='windows-1254' macEncoding='MacTurkish' unixEncoding='iso-8859-9'>
  <name>Azerbaijani (*Azerbaijan)</name>
</language>
<language code='az-Cyrl-AZ' lcid='2092' encoding='windows-1251' macEncoding='MacCyrillic' unixEncoding='iso-8859-5'>
  <name>Azerbaijani (Azerbaijan, Cyrillic)</name>
</language>
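If you prefer to generate a new entry rather than type it by hand, a small script can build one. The Python sketch below is one possible approach, not a Rainbow feature: make_entry is a helper invented for this example, and the az-Latn-AZ variant is a hypothetical entry that simply reuses the encodings of the az-AZ entry above. The lcid is left at -1, the value to use when the proper one is not known.

```python
import xml.etree.ElementTree as ET

def make_entry(code, name, encoding, mac_encoding, unix_encoding, lcid=-1):
    """Build one <language> element; lcid defaults to -1 (unknown)."""
    elem = ET.Element("language", {
        "code": code,
        "lcid": str(lcid),
        "encoding": encoding,
        "macEncoding": mac_encoding,
        "unixEncoding": unix_encoding,
    })
    ET.SubElement(elem, "name").text = name
    return elem

# Hypothetical extra variant, reusing the encodings of the 'az-AZ' entry.
new_entry = make_entry(
    "az-Latn-AZ", "Azerbaijani (Azerbaijan, Latin)",
    "windows-1254", "MacTurkish", "iso-8859-9",
)

# Serialize to text, then to UTF-8 bytes (languages.xml must be UTF-8).
xml_text = ET.tostring(new_entry, encoding="unicode")
xml_bytes = xml_text.encode("utf-8")
print(xml_text)
```

The printed element can then be pasted into languages.xml next to the existing entries. Whatever editor or script writes the file, it must save it as UTF-8.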