Tool to align CLDR XML files?
Thread poster: Samuel Murray
Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
Aug 31, 2016

Hello everyone

Do you know of a tool that can align two language XML files from CLDR to produce a two-column table, with the names of languages, countries, regions, time zones, etc. in one language in one column and their translations in the other? Or do you know of a place where this sort of data can be downloaded?

Thanks
Samuel


 
Samuel Murray
TOPIC STARTER
Anyone, anyone? (-: Sep 6, 2018

Samuel Murray wrote:
Do you know of a tool that can align two language XML files from CLDR to produce a two-column table, with the names of languages, countries, regions, time zones, etc. in one language in one column and their translations in the other?


I'm doing another job for which this sort of thing would be incredibly useful.

Here's an example of the latest CLDR data:
http://unicode.org/Public/cldr/33.1/

In some of the folders you'll find XML files with language names. For example, the file cldr/cldr-common-33.1/common/main/af.xml contains all the time zone names in Afrikaans, and the corresponding en.xml file in the same folder contains all those time zone names in English. The files do not match 100%, but it's XML, and the content has the same labels in both files, so it should be possible to align at least the content that appears in both files.

Example from Afrikaans file:

<metazone type="Africa_Western">
  <long>
    <generic>Wes-Afrika-tyd</generic>
    <standard>Wes-Afrika-standaardtyd</standard>
    <daylight>Wes-Afrika-somertyd</daylight>
  </long>
  <short>
    <generic>WAT</generic>
    <standard>WAT</standard>
    <daylight>WAST</daylight>
  </short>
</metazone>

Example from corresponding English file:

<metazone type="Africa_Western">
  <long>
    <generic>West Africa Time</generic>
    <standard>West Africa Standard Time</standard>
    <daylight>West Africa Summer Time</daylight>
  </long>
</metazone>
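
To make the idea concrete, here is a rough sketch in Python (standard library only) of the kind of alignment meant here: each leaf element gets a key built from its tag names and attributes, and only keys present in both files are paired. The function names (leaf_texts, align, to_tsv) are made up for this illustration and are not part of any CLDR tooling. The sketch assumes that tag names plus attributes are enough to tell elements apart, and it has only been tried on the two snippets above, not on the full files.

```python
import csv
import io
import xml.etree.ElementTree as ET

# The two snippets quoted above, used as test data.
AF_SAMPLE = """<metazone type="Africa_Western">
<long><generic>Wes-Afrika-tyd</generic>
<standard>Wes-Afrika-standaardtyd</standard>
<daylight>Wes-Afrika-somertyd</daylight></long>
<short><generic>WAT</generic><standard>WAT</standard>
<daylight>WAST</daylight></short>
</metazone>"""

EN_SAMPLE = """<metazone type="Africa_Western">
<long><generic>West Africa Time</generic>
<standard>West Africa Standard Time</standard>
<daylight>West Africa Summer Time</daylight></long>
</metazone>"""


def leaf_texts(element, prefix=""):
    """Map a path-like key to the text of every leaf element (hypothetical helper).

    Assumption: tag names plus attributes identify an element; two siblings
    with the same tag and attributes would overwrite each other.
    """
    attrs = "".join(f'[@{k}="{v}"]' for k, v in sorted(element.attrib.items()))
    path = f"{prefix}/{element.tag}{attrs}"
    children = list(element)
    result = {}
    if not children:
        text = (element.text or "").strip()
        if text:
            result[path] = text
    for child in children:
        result.update(leaf_texts(child, path))
    return result


def align(source_xml, target_xml):
    """Return (key, source text, target text) for keys found in both inputs."""
    src = leaf_texts(ET.fromstring(source_xml))
    tgt = leaf_texts(ET.fromstring(target_xml))
    return [(key, src[key], tgt[key]) for key in src if key in tgt]


def to_tsv(rows):
    """Render the aligned pairs as a two-column, tab-separated table."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for _key, source_text, target_text in rows:
        writer.writerow([source_text, target_text])
    return buffer.getvalue()


rows = align(EN_SAMPLE, AF_SAMPLE)
print(to_tsv(rows), end="")
```

On the snippets above this pairs the three "long" names and skips the "short" ones, which exist only in the Afrikaans snippet. For the real af.xml and en.xml, the file contents would have to be read from disk first; how well the key scheme holds up across the whole file is something I have not checked.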

Surely this is something that must be done regularly in some part of the industry...?

Thanks
Samuel


 
Didier Briel
France
English to French
You can use OmegaT with the Okapi plugin Sep 6, 2018

Samuel Murray wrote:

In some of the folders you'll find XML files with language names. For example, the file cldr/cldr-common-33.1/common/main/af.xml contains all the time zone names in Afrikaans, and the corresponding en.xml file in the same folder contains all those time zone names in English. The files do not match 100%, but it's XML, and the content has the same labels in both files, so it should be possible to align at least the content that appears in both files.

If you define an XML filter using the generic XML filter of the Okapi plugin, you will be able to do the alignment with OmegaT.

Didier


 

