Nineteenth International Unicode Conference

Unicode and Indic Scripts: How to Hide a Good Implementation

Michael Kaplan - Trigeminal Software, Inc.

Intended Audience:	Manager, Software Engineer, Systems Analyst, Marketer
Session Level:	Intermediate

The Unicode Consortium has worked hard to encode all of the world's scripts. Almost nowhere is that work as carefully thought out, technically impressive, and full-featured as in the Indic scripts: Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada, and Malayalam.

However, adoption of Unicode in the regions that use these scripts has not been as widespread as everyone might have hoped. This presentation will seek to explain many of the issues behind that slow adoption:

the problem of trying to describe all Indic scripts through the language semantics of Devanagari
issues with collation and making collation work properly
the strange and terrible [dis]connection with ISCII
why the other solutions ("font hacks") end up being chosen

In the end, there will be a much better understanding of how Unicode has "hidden" its technically acceptable implementation so that none of its primary customers even know it is there -- and thus how this could be changed.

When the world wants to talk, it speaks Unicode

International Unicode Conferences are organized by Global Meeting Services, Inc. (GMS). GMS is pleased to be able to offer the International Unicode Conferences under an exclusive license granted by the Unicode Consortium. All responsibility for conference finances and operations is borne by GMS. The independent conference board serves solely at the pleasure of GMS and is composed of volunteers active in Unicode and in international software development. All inquiries regarding International Unicode Conferences should be addressed to info@global-conference.com.

Unicode and the Unicode logo are registered trademarks of Unicode, Inc. Used with permission.

22 Jun 2001, Webmaster