Unicode and Indic Scripts: How to Hide a Good Implementation
Intended Audience: Manager, Software Engineer, Systems Analyst, Marketer
Session Level: Intermediate
The Unicode Consortium has worked hard to encode all of the world's scripts.
There is almost no place in the world where that work is as carefully
thought out, technically impressive, and full-featured as in the Indic
scripts: Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu,
Kannada, and Malayalam.
However, adoption of Unicode in the regions that use these scripts has not
been as widespread as everyone might have hoped. This presentation will seek
to explain many of the issues behind that slow adoption:
- the problem of trying to describe all Indic scripts through the language
semantics of Devanagari
- issues with collation and making collation work properly
- the strange and terrible [dis]connection with ISCII
- why the other solutions ("font hacks") end up being chosen
In the end, there will be a much better understanding of how Unicode has
"hidden" its technically acceptable implementation so that none of its
primary customers even know it is there -- and thus how this could be
changed.