Common XML Locale Repository
Mark Davis & Steven Loomis - IBM Corporation
In the internationalization arena, Unicode has provided a lingua franca for communicating textual data. But there remain differences in the locale data used for a variety of tasks, such as formatting dates and times according to the conventions of different languages. Many of those differences are simply gratuitous: each variant is acceptable to human readers, but the platforms still produce different results. In many other cases there are outright errors. Whatever the cause, the differences can cause discrepancies to creep into a heterogeneous system. This is especially serious in the case of collation (sort order), where differing collations cause not only ordering differences but also different query results. For example, a query for customers with names between "Abbot, Cosmo" and "Arnold, James" can return very different lists on systems with different sort orders.

The Common XML Locale Repository is a project to define an XML format for the exchange of culturally sensitive (locale) information used in application and system development, and to gather, store, and make available data generated in that format. The project is a joint effort among members of the Linux Application Development Environment (LADE) Workgroup of the Free Standards Group's OpenI18N team (formerly the Linux Internationalization Initiative, or Li18nux).

This paper describes the goals and features of the Common XML Locale Repository project and gives an overview of the XML format for locale data exchange, the current status of the Repository, the comparison of existing data from different platforms, and the process of vetting data to produce a unified set of locale data.
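The collation point can be made concrete with a small sketch. The Python below runs the same "names between" query under two orderings: raw code point order, and a case- and accent-insensitive order. Everything here is an illustrative assumption: the customer list is made up, the helper names (`codepoint_key`, `folded_key`, `names_between`) are hypothetical, and the folded key is only a rough stand-in for a language-aware collation, not the Repository's data or any platform's actual sort order.

```python
import unicodedata


def codepoint_key(name: str) -> str:
    """Sort key for plain code point (binary) ordering."""
    return name


def folded_key(name: str) -> str:
    """Rough stand-in for a language-aware collation (assumption):
    drop combining accents, then ignore case."""
    decomposed = unicodedata.normalize("NFD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def names_between(names, low, high, key):
    """Return the names whose sort key lies in [key(low), key(high)], in key order."""
    lo, hi = key(low), key(high)
    return sorted((n for n in names if lo <= key(n) <= hi), key=key)


# Hypothetical customer list, for illustration only.
CUSTOMERS = [
    "Abbot, Cosmo",
    "adams, Zoe",
    "Álvarez, Luis",
    "Arnold, James",
    "Baker, Ann",
]

binary = names_between(CUSTOMERS, "Abbot, Cosmo", "Arnold, James", codepoint_key)
folded = names_between(CUSTOMERS, "Abbot, Cosmo", "Arnold, James", folded_key)

print(binary)  # ['Abbot, Cosmo', 'Arnold, James']
print(folded)  # ['Abbot, Cosmo', 'adams, Zoe', 'Álvarez, Luis', 'Arnold, James']
```

The same data and the same query bounds yield two customers under one ordering and four under the other, which is the kind of discrepancy a shared, vetted set of collation data is meant to remove.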