23rd Internationalization and Unicode Conference

Implementing the Unimplemented: Representing Unencoded Scripts in Unicode 3.2

John Jenkins - Apple Computer, Inc.

Presented by: Lee Collins - Apple Computer, Inc.

Intended Audience:	Managers, Software Engineers, Systems Analysts
Session Level:	Beginner, Intermediate

Sometimes it is necessary to implement support for a script before it has been encoded in Unicode. This presentation discusses the issues involved in doing so and the implementation techniques that can be used.

Summary:

Unicode 3.2 contains more than ninety-five thousand characters used in the writing of dozens of scripts and hundreds of different languages. As such, it provides full coverage for virtually every significant writing system in use in the world today. At the same time, there are specialized markets where it is necessary to write using scripts not yet encoded in Unicode. These include the writing systems used by minority groups in various countries, ancient or dead scripts, and fantasy or other artificial scripts.

Many such scripts are already in the pipeline for encoding in the next edition of the Unicode standard, but even here the encoding process can be greatly simplified by developing actual implementations and solving many of the practical issues involved. We will discuss a number of the issues involved in providing temporary, private-use implementations of unencoded writing systems. These include determining the target audience, analyzing the structure of the script, and determining the repertoire of characters needed. There is also the matter of getting actual support for the script in existing programs running on existing platforms, which means producing fonts and keyboards at a minimum. Finally, there are issues with producing an implementation with an eye towards forwarding it as a completed proposal to the Unicode Technical Committee for inclusion in the standard.
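
As a rough illustration of what a temporary, private-use implementation can involve, the following Python sketch assigns a small repertoire to consecutive code points in the Private Use Area of the Basic Multilingual Plane (U+E000 through U+F8FF). The character names, the helper functions assign_private_use and encode, and the choice of starting code point are all hypothetical and are not taken from the presentation; a real implementation would also need a font and a keyboard layout that agree on the same assignments.

```python
import unicodedata

# Private Use Area of the Basic Multilingual Plane.
PUA_START, PUA_END = 0xE000, 0xF8FF


def assign_private_use(names, base=PUA_START):
    """Map each (hypothetical) character name to a consecutive PUA code point."""
    if len(set(names)) != len(names):
        raise ValueError("repertoire contains duplicate names")
    if base < PUA_START or base + len(names) - 1 > PUA_END:
        raise ValueError("repertoire does not fit in the BMP Private Use Area")
    return {name: chr(base + i) for i, name in enumerate(names)}


def encode(names, mapping):
    """Build a string from a sequence of character names using the mapping."""
    return "".join(mapping[name] for name in names)


# Hypothetical repertoire for an unencoded script (names are placeholders).
repertoire = ["LETTER A", "LETTER B", "LETTER C"]
mapping = assign_private_use(repertoire)

text = encode(["LETTER B", "LETTER A"], mapping)

# Every character in the result carries the general category "Co" (private use).
categories = [unicodedata.category(ch) for ch in text]
```

Keeping the assignments in one table like this makes it easier to remap existing data later, once the script receives permanent code points in the standard.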

Examples will be drawn from a number of scripts, including some already approved for future inclusion in the Unicode standard.

12 December 2002, Webmaster