Unicode Support in Oracle9i Database - New Unicode Feature
Intended Audience: Manager, Software Engineer, Systems Analyst, Marketer
Session Level: Intermediate
As global e-business grows into every part of industry as an
infrastructure for information and business management, it is becoming
crucial to build Internet applications with multilingual capability.
Unicode is widely used as the basis for this capability.
Oracle has supported Unicode in the UTF-8 encoding since Oracle7. In the
newest Oracle9i release, this support is further enhanced to better
serve global e-business needs. Codepoint semantics is introduced for
text data so that UTF-16 semantics can be built on top of the UTF-8
encoding, which makes it easy to support application servers built on
UTF-16. The benefits of this solution are reduced migration effort and
better storage efficiency for Latin data. A Unicode data type is
introduced for building Unicode applications independently of the
database character set. This data type lets existing applications
migrate to Unicode gradually, and it can use either UTF-8 or UTF-16 as
its encoding, whichever is more storage-efficient for the data
distribution.
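The storage trade-off between the two encodings, and the difference
between counting bytes, UTF-16 code units, and code points, can be
illustrated outside the database. The following is a minimal Python
sketch, not Oracle's implementation; the helper name `measure` and the
sample strings are assumptions chosen for illustration.

```python
def measure(text: str) -> dict:
    """Return the size of text under several counting rules."""
    utf8 = text.encode("utf-8")
    utf16 = text.encode("utf-16-le")  # little-endian, no byte order mark
    return {
        "code_points": len(text),        # Python str length counts code points
        "utf16_units": len(utf16) // 2,  # one UTF-16 code unit is 2 bytes
        "utf8_bytes": len(utf8),
        "utf16_bytes": len(utf16),
    }


# Hypothetical sample data: Latin text, CJK text, one supplementary character.
samples = {
    "latin": "database",
    "cjk": "\u6570\u636e\u5e93",      # three CJK ideographs in the BMP
    "supplementary": "\U00020000",    # one code point outside the BMP
}

for name, text in samples.items():
    print(name, measure(text))
```

For the Latin sample, UTF-8 needs 8 bytes and UTF-16 needs 16. For the
three CJK characters, UTF-8 needs 9 bytes and UTF-16 needs 6. The
supplementary character is one code point but two UTF-16 code units,
which is why a length limit gives different results depending on
whether it is counted in bytes, code units, or code points.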
This paper describes the functionality of codepoint semantics and the
new Unicode data type in the Oracle9i database release. Design choices,
such as codepoint semantics versus the Unicode data type and UTF-8
versus UTF-16, are discussed. A brief description of the new Unicode
access interface is also given.