Number of Chinese Characters

The total number of Chinese characters from past to present remains unknowable because new ones are developed all the time; for instance, brands may create new characters when none of the existing ones convey the intended meaning. Chinese characters are theoretically an open set, and anyone can create new characters as they see fit. Such inventions are, however, often excluded from official character sets. The number of entries in major Chinese dictionaries is the best means of estimating the historical growth of the character inventory.

Number of characters in Chinese dictionaries

Year    Name of dictionary     Number of characters
100     Shuowen Jiezi          9,353
543?    Yupian                 12,158
601     Qieyun                 16,917
1011    Guangyun               26,194
1039    Jiyun                  53,525
1615    Zihui                  33,179
1716    Kangxi Zidian          47,035
1916    Zhonghua Da Zidian     48,000
1989    Hanyu Da Zidian        54,678
1994    Zhonghua Zihai         85,568
2004    Yitizi Zidian          106,230

Number of Chinese characters in non-Chinese dictionaries

Year    Country        Name of dictionary     Number of characters
2003    Japan          Dai Kan-Wa jiten       50,000+
2008    South Korea    Han-Han Dae Sajeon     53,667

Comparing the Shuowen Jiezi (9,353 characters) and the Hanyu Da Zidian (54,678 characters) reveals that the number of characters recorded in dictionaries has increased by roughly 485 percent, to nearly six times the original count, over 1,900 years. Depending upon how one counts variants, 50,000+ is a good approximation for the current total number. This agrees with the most comprehensive Japanese and Korean dictionaries of Chinese characters: the Dai Kan-Wa jiten has some 50,000 entries, and the Han-Han Dae Sajeon has over 53,000. The latest behemoth, the Zhonghua Zihai, records a staggering 85,568 single characters, although even this fails to list all known characters: it ignores the roughly 1,500 Japanese-made kokuji given in the Kokuji no Jiten as well as the Chữ Nôm inventory formerly used only in Vietnam.
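As a quick check on the growth figure, the following minimal Python sketch computes the relative increase between the Shuowen Jiezi and Hanyu Da Zidian counts given in the table. The function name percent_increase is only illustrative. The sketch reports both the percent increase and the plain ratio, since the two are easy to confuse.

```python
def percent_increase(old: int, new: int) -> float:
    """Relative growth from old to new, expressed in percent."""
    return (new - old) / old * 100


# Character counts taken from the dictionary table above.
shuowen_jiezi = 9_353      # year 100
hanyu_da_zidian = 54_678   # year 1989

growth = percent_increase(shuowen_jiezi, hanyu_da_zidian)
ratio = hanyu_da_zidian / shuowen_jiezi

print(f"increase: {growth:.0f}%")    # percent growth over the original count
print(f"ratio:    {ratio:.2f}x")     # new count as a multiple of the old one
```

With the table's figures this gives an increase of about 485 percent, that is, a final count about 5.85 times the original.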

Modified radicals and obsolete variants are two common reasons for the ever-increasing number of characters. There are about 300 radicals, of which about 100 are in common use. Creating a new character by modifying the radical is an easy way to disambiguate homographs among xíngshēngzì pictophonetic compounds. This practice began long before the standardization of Chinese script by Qin Shi Huang and continues to the present day. The traditional 3rd-person pronoun tā (他 "he; she; it"), which is written with the "person radical", illustrates modifying significs to form new characters. In modern usage, there is a graphic distinction between tā (她 "she") with the "woman radical", tā (牠 "it") with the "animal radical", tā (它 "it") with the "roof radical", and tā (祂 "He") with the "deity radical". One consequence of modifying radicals is the fossilization of rare and obscure variant logographs, some of which are not even used in Classical Chinese. For instance, hé 和 "harmony; peace", which combines the "grain radical" with the "mouth radical", has infrequent variants 咊 with the radicals reversed and 龢 with the "flute radical".

Chinese

It is usually said that about 2,000 characters are needed for basic literacy in Chinese (for example, to read a Chinese newspaper), and that a well-educated person will know well in excess of 4,000 to 5,000 characters. Note that Chinese characters should not be confused with Chinese words. The majority of modern Chinese words, unlike their Old Chinese and Middle Chinese counterparts, are multi-morphemic and multi-syllabic compounds; that is, most Chinese words are written with two or more characters, each character representing one syllable. Knowing the meanings of the individual characters of a word will often allow the general meaning of the word to be inferred, but this is not invariably the case.

In the People's Republic of China, which uses Simplified Chinese characters, the Xiàndài Hànyǔ Chángyòng Zìbiǎo (现代汉语常用字表; Chart of Common Characters of Modern Chinese) lists 2,500 common characters and 1,000 less-than-common characters, while the Xiàndài Hànyǔ Tōngyòng Zìbiǎo (现代汉语通用字表; Chart of Generally Utilized Characters of Modern Chinese) lists 7,000 characters, including the 3,500 characters already listed above. GB2312, an early version of the national encoding standard used in the People's Republic of China, has 6,763 code points. GB18030, the modern, mandatory standard, has a much higher number. The Hànyǔ Shuǐpíng Kǎoshì proficiency test covers approximately 5,000 characters.

In the Republic of China (Taiwan), which uses Traditional Chinese characters, the Ministry of Education's Chángyòng Guózì Biāozhǔn Zìtǐ Biǎo (常用國字標準字體表; Chart of Standard Forms of Common National Characters) lists 4,808 characters; the Cì Chángyòng Guózì Biāozhǔn Zìtǐ Biǎo (次常用國字標準字體表; Chart of Standard Forms of Less-Than-Common National Characters) lists another 6,341 characters. The Chinese Standard Interchange Code (CNS11643), the official national encoding standard, supports 48,027 characters, while the most widely used encoding scheme, BIG-5, supports only 13,053.
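The practical effect of these differing repertoires can be seen by testing whether a given character is representable in a given encoding. The sketch below is illustrative only: it relies on the gb2312, gb18030, and big5 codecs that ship with Python's standard library, which are assumed here to be reasonable stand-ins for the standards named above (CNS11643 is not covered). The helper names is_encodable and coverage are hypothetical.

```python
# Codecs from Python's standard library, used as stand-ins for the
# encoding standards discussed above (an assumption of this sketch).
CODECS = ["gb2312", "gb18030", "big5"]


def is_encodable(char: str, codec: str) -> bool:
    """Return True if char can be represented in the named codec."""
    try:
        char.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def coverage(char: str) -> dict:
    """Map each codec name to whether it can represent char."""
    return {codec: is_encodable(char, codec) for codec in CODECS}


# Two common Chinese characters and one hangul syllable for contrast.
for ch in ["他", "她", "한"]:
    print(ch, coverage(ch))
```

Common characters such as 他 are covered by all three codecs, whereas a hangul syllable such as 한 is representable only in GB18030, which is designed to cover the full Unicode range.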

In Hong Kong, which uses Traditional Chinese characters, the Education and Manpower Bureau's Soengjung Zi Zijing Biu (常用字字形表), intended for use in elementary and junior secondary education, lists a total of 4,759 characters.

In addition, there is a large corpus of dialect characters, which are not used in formal written Chinese but represent colloquial terms in non-Mandarin varieties of spoken Chinese. One such variety is Written Cantonese, in widespread use in Hong Kong even for certain formal documents, due to the former British colonial administration's recognition of Cantonese for official purposes. In Taiwan, there is also an informal body of characters used to represent the spoken Hokkien (Min Nan) dialect. Many dialects have specific characters for words exclusive to the dialect; for example, the vernacular Hakka character pronounced cii11 means "to kill". Shanghainese also has its own set of written forms, such as 㑚, 哎垯, and 呒没, all of which are widely known and used by Shanghainese speakers; these seldom appear in actual texts, however, Mandarin being the preference in all mainland regions.

Japanese

In Japanese there are 1,945 Jōyō kanji (常用漢字 lit. "frequently used kanji") designated by the Japanese Ministry of Education; these are taught during primary and secondary school. The list is a recommendation, not a restriction, and many characters missing from it are still in common use.

The one area where character usage is officially restricted is in names, which may contain only government-approved characters. Since the Jōyō kanji list excludes many characters which have been used in personal and place names for generations, an additional list, referred to as the Jinmeiyō kanji (人名用漢字 lit. "kanji for use in personal names"), is published. It currently contains 983 characters, bringing the total number of government-endorsed characters to 2,928. (See also the Names section of the kanji article.)

Today, a well-educated Japanese person may know upwards of 3,500 kanji. The kanji kentei (日本漢字能力検定試験 Nihon Kanji Nōryoku Kentei Shiken or Test of Japanese Kanji Aptitude) tests a speaker's ability to read and write kanji. The highest level of the kanji kentei covers 6,000 kanji, though in practice few people attain (or need to attain) this level.

Written Japanese also includes a pair of syllabic scripts known as kana, which are used in combination with kanji. Not all words in modern Japanese can be expressed with kanji alone, requiring the use of kana in written communication.

Korean

Until the 15th century, before the creation of hangul, the Korean alphabet, Literary Chinese was the dominant form of written communication in Korea. Much of the vocabulary, especially in the realms of science and sociology, comes directly from Chinese, comparable to Latin or Greek root words in European languages. However, because Korean lacks tones, many dissimilar characters took on identical sounds as the words were imported from Chinese, and subsequently identical spellings in hangul. Chinese characters are sometimes used to this day, either for practical clarification or to give a distinguished appearance, as knowledge of Chinese characters is considered a high-class attribute and an indispensable part of a classical education. A preference for Chinese characters is also regarded as conservative and Confucian.

In Korea, 한자 hanja have become a politically contentious issue, with some Koreans urging a "purification" of the national language and culture by totally abandoning their use. These individuals encourage the exclusive use of the native hangul alphabet throughout Korean society and the end to character education in public schools.

In South Korea, educational policy on characters has swung back and forth, often swayed by education ministers' personal opinions. At times, middle and high school students have been formally exposed to 1,800 to 2,000 basic characters, albeit with the principal focus on recognition, with the aim of achieving newspaper-literacy. Since there is little need to use hanja in everyday life, young adult Koreans are often unable to read more than a few hundred characters.

There is a clear trend toward the exclusive use of hangul in day-to-day South Korean society. Hanja are still used to some extent, particularly in newspapers, weddings, place names and calligraphy. Hanja are also extensively used in situations where ambiguity must be avoided, such as academic papers, high-level corporate reports, government documents, and newspapers; this is due to the large number of homonyms that have resulted from extensive borrowing of Chinese words.

The issue of ambiguity is the main hurdle in any effort to "cleanse" the Korean language of Chinese characters. Characters convey meaning visually, while alphabets convey guidance to pronunciation, which in turn hints at meaning. As an example, in Korean dictionaries, the phonetic entry for 기사 gisa yields more than 30 different entries. In the past, this ambiguity had been efficiently resolved by parenthetically displaying the associated hanja.

In the modern hangul-based Korean writing system, Chinese characters are no longer used to represent native morphemes.

In North Korea, the government, wielding much tighter control than its sister government to the south, has banned Chinese characters from virtually all public displays and media, and mandated the use of hangul in their place.

Vietnamese

Chinese characters have three names in Vietnamese: chữ Hán, chữ Nho, and Hán tự.

Although now nearly extinct in Vietnam, varying scripts of Chinese characters (hán tự) were once in widespread use to write the language; from the 19th century onward, hán tự became limited to ceremonial uses. As in Japan and Korea, Chinese (especially Literary Chinese) was used by the ruling classes, and the characters were eventually adapted to write Vietnamese. To express native Vietnamese words which had different pronunciations from the Chinese, Vietnamese developed the Chữ Nôm script, which used various methods to distinguish native Vietnamese words from Chinese. Vietnamese is currently written exclusively in the Vietnamese alphabet, a derivative of the Latin alphabet.

