Representation of Balochi, Punjabi, Pushto, Sindhi Characters
in Unicode
Khaver Zia &
Sameen Reza - Beaconhouse-Informatics Computer Institute
Intended Audience:
Software Engineers, Systems Analysts, Font Designers, Technical
Writers
Session Level:
Beginner, Intermediate
The coded character set of Pakistan's national language, Urdu,
has been standardized, and its mapping into Unicode was
discussed in an earlier paper published in IUC22. This paper
carries out a similar exercise for four important regional languages
of Pakistan, namely Balochi, Punjabi, Pushto and Sindhi. The paper
provides background information about these languages: their
history, characteristics and usage along with samples from
published material. It presents character sets of these languages
and identifies mappings of their characters to Unicode.

The issues discussed in the paper will provide a basis for
future research towards codifying the characters of these
languages, identifying their relationship with Unicode, and
formulating sets of collating sequences and associated
algorithms.

Keywords
Balochi, Character codes, Code table, Coded character set,
Encoding, Multilingual Processing, Punjabi, Pushto, Sindhi,
Standardization, Unicode

Conclusion
ISO/IEC 10646/Unicode is fast becoming the standard for
representing national character codes. It is imperative that all
the major languages of the South Asian region are represented in
Unicode so that associated multilingual application development can
take place. This paper takes a step in this direction by
identifying the character sets of four important regional languages
of Pakistan and mapping their characters to Unicode.
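
As a rough illustration of what a character-to-Unicode mapping table looks like in practice, the following Python sketch lists a few Arabic-script letters together with their code points and Unicode character names. The sample letters and their assignment to languages are this example's own assumptions, chosen for illustration; they are not reproduced from the paper's tables, and the names SAMPLE_LETTERS, describe and build_mapping are hypothetical.

```python
import unicodedata

# Illustrative sample only: the letters and language labels below are
# assumptions made for this sketch, not the paper's character sets.
SAMPLE_LETTERS = {
    "Balochi": ["\u0688"],            # DDAL
    "Punjabi": ["\u0679", "\u06BA"],  # TTEH, NOON GHUNNA
    "Pushto":  ["\u067C", "\u069A"],  # TEH WITH RING, SEEN WITH DOT BELOW AND DOT ABOVE
    "Sindhi":  ["\u067B", "\u0680"],  # BEEH, BEHEH
}


def describe(ch):
    """Return (code point as 'U+XXXX', Unicode character name) for one character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


def build_mapping(samples):
    """Build a per-language list of (code point, name) pairs."""
    return {lang: [describe(ch) for ch in chars] for lang, chars in samples.items()}


if __name__ == "__main__":
    for lang, rows in build_mapping(SAMPLE_LETTERS).items():
        for code_point, name in rows:
            print(f"{lang:8} {code_point}  {name}")
```

A full mapping would cover every letter, digit and diacritic of each language's character set; the same table could then serve as input when defining a collating sequence.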