Unicode is a character-encoding standard maintained by the Unicode Consortium and kept synchronized with ISO/IEC 10646, the corresponding standard of the International Organization for Standardization (ISO). It assigns codes to text characters (letters, syllabic characters, ideographs, etc.). Unicode aims to collect all known text characters worldwide in a single character set.
The Unicode character set is divided into 17 planes of 65,536 code points each. Most characters in everyday use lie in the first plane, called the "Basic Multilingual Plane" (BMP).
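As a minimal sketch of the plane layout: since each plane holds 65,536 (2^16) code points, the plane number of a character is its code point shifted right by 16 bits. The function name plane_of is a hypothetical helper chosen for illustration, not part of any library.

```python
def plane_of(code_point: int) -> int:
    """Return the plane number (0..16) of a Unicode code point.

    plane_of is an illustrative helper name, not a standard API.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a valid Unicode code point")
    # Each plane spans 0x10000 code points, so the plane is the
    # value of the bits above the low 16.
    return code_point >> 16


print(plane_of(ord("A")))   # 0 (BMP)
print(plane_of(0x4E2D))     # 0 (BMP, a CJK ideograph)
print(plane_of(0x20000))    # 2 (first code point of plane 2)
```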
In Unicode, related characters are grouped into so-called scripts:
The Latin alphabet is followed by Greek, Cyrillic, Hebrew, etc., up to the CJK scripts. CJK is an acronym for "Chinese, Japanese and Korean". What these three languages have in common is that they all use a considerable number of ideographs that go back historically to the Chinese Han dynasty. These Han ideographs are called hanzi in China, kanji in Japan and hanja in Korea, and they are unified as CJK characters in Unicode.
Unicode 2.0/2.1
CJK Unified Ideographs covers 20,902 Chinese ideographs.
Published as ISO/IEC 10646-1:1993 plus its later amendments = Unicode 2.0.
Unicode 3.0
Extension A of CJK Unified Ideographs includes 6,582 Chinese ideographs.
Published as ISO/IEC 10646-1:2000 = Unicode 3.0.
Unicode 3.1
Extension B of CJK Unified Ideographs covers 42,711 Chinese ideographs.
Published as Unicode 3.1.
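Extension B is located in Group 00, Plane 02, the Supplementary Ideographic Plane (SIP) for supplementary Chinese ideographs.
On Windows 2000, which stores text as UTF-16, such ideographs lie outside the BMP and are represented as surrogate pairs, so surrogate support is needed to display them.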
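The surrogate mechanism mentioned above can be sketched as follows. In UTF-16, a code point beyond the BMP is reduced by 0x10000 and the resulting 20-bit value is split into a high surrogate (0xD800 plus the upper 10 bits) and a low surrogate (0xDC00 plus the lower 10 bits). The function name to_surrogate_pair is a hypothetical helper used only for illustration.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a supplementary-plane code point into a UTF-16 surrogate pair.

    to_surrogate_pair is an illustrative helper name, not a standard API.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points beyond the BMP need surrogates")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # upper 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # lower 10 bits
    return high, low


# U+20000 is the first ideograph of CJK Extension B.
high, low = to_surrogate_pair(0x20000)
print(f"{high:04X} {low:04X}")  # D840 DC00

# Cross-check against Python's built-in UTF-16 encoder.
assert chr(0x20000).encode("utf-16-be") == bytes([0xD8, 0x40, 0xDC, 0x00])
```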
Unicode 4.0
Same as Unicode 3.1 with respect to Han unified ideographs:
20,902 CJK Unified Ideographs
6,582 CJK Extension A
42,711 CJK Extension B
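The three counts can be checked against the assigned code point ranges, as in the rough sketch below. The ranges are those assigned as of Unicode 3.1/4.0 (later versions added further ideographs to these blocks). The names CJK_RANGES and cjk_block_of are hypothetical and serve only as illustration.

```python
# Assigned ideograph ranges as of Unicode 3.1/4.0 (first, last), inclusive.
# CJK_RANGES and cjk_block_of are illustrative names, not a standard API.
CJK_RANGES = {
    "CJK Unified Ideographs": (0x4E00, 0x9FA5),
    "CJK Extension A": (0x3400, 0x4DB5),
    "CJK Extension B": (0x20000, 0x2A6D6),
}


def cjk_block_of(code_point: int):
    """Return the name of the CJK range containing code_point, or None."""
    for name, (first, last) in CJK_RANGES.items():
        if first <= code_point <= last:
            return name
    return None


# Number of ideographs per range: last - first + 1.
sizes = {name: last - first + 1 for name, (first, last) in CJK_RANGES.items()}
for name, size in sizes.items():
    print(f"{size:>6,} {name}")
print(f"{sum(sizes.values()):>6,} total")  # 70,195 total

print(cjk_block_of(0x4E2D))   # CJK Unified Ideographs
print(cjk_block_of(0x20000))  # CJK Extension B
```

Together the three ranges hold 20,902 + 6,582 + 42,711 = 70,195 Han unified ideographs.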