Saturday, October 4, 2008

Section headers of a Chinese dictionary

Section headers, also known as ''index keys'' or ''classifiers'', are graphic portions of Chinese characters used to organize the entries of Chinese dictionaries into sections whose characters all share the same graphic part. In practice the most common term for these is ''radical''; however, that term has been used in many different ways, leading to great confusion, as explained at Radical. For disambiguation purposes, the term ‘radical’ is thus avoided here.

Since Chinese is not written alphabetically, another means of organizing the characters for dictionary purposes is needed. In organizing his etymological dictionary, the ''Shuōwén Jiězì'', the scholar Xu Shen categorized all the characters using a system of 540 graphic elements that he called ''bùshǒu'', the literal translation of which is ''section header''. These were component parts found in different characters, often reflecting some common characteristic but also often just a shared graphic element such as a horizontal stroke. Some were even artificially extracted groups of strokes, termed ''glyphs'' by Serruys, which never had an independent existence other than being listed in the ''Shuōwén''. Each character was listed under only one element, which is then referred to as "the" section header for that character. For example, characters containing 女 nǚ ‘female’ or 木 mù ‘tree; wood’ are often grouped in the corresponding section.

Over time, Chinese lexicographers continued to refine this system for indexing Chinese characters, in order to collect and document them. For convenience, the list of section headers was later trimmed to 214 in the 1615 dictionary ''Zìhuì''. The Kāngxī dictionary of 1716 was indexed using the ''Zìhuì'' section headers, and they form the standard list of 214 section headers still used by many dictionaries today. Although there is some variation in such lists, depending primarily on which secondary section headers are also indexed, the canonical 214 headers of the Kāngxī dictionary still serve as the basis for most modern Chinese dictionaries. Many dictionaries combine some graphically similar section headers, such as 月 yuè ‘moon’ and the 月 form of 肉 ròu ‘meat; flesh’. Mei Yingzuo's ''Zìhuì'' was also the first dictionary to order the characters under each section header by stroke count: the "section-header-and-stroke-count" method still used in the vast majority of present-day Chinese dictionaries.

Shape and position of section headers in characters


:Note: ''The section below uses Unicode characters from the Kangxi Radicals block. These characters are not always available in common fonts.''

In the examples above, five of the six characters have the section header on the left side, but it appears at the bottom in 妾. There is no fixed rule about where a section header can go: it may appear in any position in a character. However, there is one pair of section headers that have the same shape but are indexed as different section headers depending on where they appear in the character: 阝 (the abbreviated section header form of 邑 yì ‘city’, as in 都 dū ‘metropolis’) is always on the right side of characters, while 阝 (the abbreviated section header form of 阜 fù ‘mound, hill’, as in 陸 lù ‘land’) is always on the left.


In writing, many components are distorted or change in form in order to fit into a block with other components. They may be narrowed, shortened, or may have different shapes entirely. Changes in shape, rather than simple distortion, may reduce the number of strokes used to write the component. In some cases, these written forms may have several variants. The actual shape of a component when it is used in a character can depend on its placement with respect to the other elements in the character. In the image to the right, the color blue is used for "irregular" forms.

Some of the most important variant written forms:
* 刀 "knife" → 刂 when placed to the right of other elements:
** examples: 分, 召 ~ 刖
** counter-example: 切
* 人 "man" → 亻 on the left:
** 囚, 仄, 坐 ~ 他
* 心 "heart" → 忄 on the left:
** 杺, 您, 恭* ~ 快
: * 心 becomes ⺗ when written at the bottom of a character.
* 手 "hand" → 扌 on the left:
** 杽, 拏, 掱 ~ 扡
** counter-example: 掰
* 水 "water" → 氵 on the left:
** 汆, 呇, 沊 ~ 池
** counter-example: 沝
* 火 "fire" → 灬 at the bottom:
** 伙, 緋, 灱 ~ 黑
** counter-example: 災
* 犬 "dog" → 犭 on the left:
** 伏, 突, 嵇 ~ 狙
* 目 "eye" → rotated 90°:
** 助, 見, 盲 ~ 曼
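
The positional variants listed above can be sketched as a small lookup table. This is only an illustration: the table holds just the forms named in the list, the position labels ("left", "right", "bottom") and the function name `usual_written_form` are assumptions made for this example, and the counter-examples in the list (切, 掰, 沝, 災) show that position is a tendency, not a rule.

```python
# Variant written forms keyed by (section header, position in the character).
# Entries are taken from the list above; position labels are hypothetical.
VARIANT_FORMS = {
    ("刀", "right"): "刂",
    ("人", "left"): "亻",
    ("心", "left"): "忄",
    ("手", "left"): "扌",
    ("水", "left"): "氵",
    ("火", "bottom"): "灬",
    ("犬", "left"): "犭",
}


def usual_written_form(header, position):
    """Return the variant usually written at this position.

    Falls back to the unchanged header when no variant is recorded.
    Real characters can deviate from this (see the counter-examples).
    """
    return VARIANT_FORMS.get((header, position), header)
```

For instance, `usual_written_form("水", "left")` gives 氵, while `usual_written_form("水", "bottom")` falls back to 水.
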
The character simplification adopted in the People's Republic of China and elsewhere has modified a number of components, including those used as section headers. This has created a number of new section header forms. For instance, 食 is written 飠 when it forms a part of other traditional characters, but is written 饣 in simplified characters.

Limitations of the section header system


Some of the section headers used in Chinese dictionaries, even in the era of Kāngxī, were not genuinely distinctive graphic elements. They served only to index certain unique characters that do not have more obvious possible section headers. The section header 鬯, for example, is used to index only one character: 鬱. Modern dictionaries tend to eliminate section headers of this kind when a more widely used graphic element can be found under which the character can be categorized. In addition, some modern dictionaries index a character under more than one section header to make it easier to find.

Dictionary lookup


Most dictionaries use section header classification to index and look up characters, although many present-day dictionaries supplement it with other methods as well. Following the "section-header-and-stroke-count" method of Mei Yingzuo, characters are listed by their section header and then ordered by the number of strokes needed to write them.

The steps involved in looking up a character are:
#Identify the section header under which the character is most likely to have been indexed. If one has no idea, then the component on the left side or top is often a good first guess.
#Find the section of the dictionary associated with that section header.
#Count the number of strokes in the remaining portion of the character.
#Find the pages listing characters under that section header that have that number of additional strokes.
#Find the appropriate entry or experiment with different choices for steps 1 and 3.

For example, consider the character 信 xìn, meaning "truth", "faith", "sincerity", and "trust". Its section header is 亻 rén ‘human’, and there are 7 additional strokes in the remaining portion (言). To look this character up in a dictionary, one finds the section header for "human" in the part of the dictionary that indexes section headers. The section headers are organized by the number of strokes they themselves contain. 人 and its compressed version 亻 contain only two strokes, so it will be near the beginning of the list. Locating it, one can see the page for the index on that section header, and one then normally passes through the lists of characters with one additional stroke, two additional strokes, and so on, until one reaches the entries with seven additional strokes. If the section header chosen by the user matches the one used by the dictionary compiler, and if both the user and the compiler count strokes the same way, the entry will be in that list, next to an entry number or page number where the full dictionary entry for that character can be found.
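
The lookup procedure can be sketched in a few lines of Python. This is a minimal illustration, not how any particular dictionary is built: the four entries in `ENTRIES` are a hand-made toy index, and the names `build_index` and `lookup` are hypothetical. The loop over several guesses mirrors step 5, where the reader experiments with different choices of section header and stroke count.

```python
from collections import defaultdict

# Toy index: (character, section header, additional strokes).
# Variant forms such as 亻 are filed under their canonical header 人.
ENTRIES = [
    ("他", "人", 3),
    ("信", "人", 7),
    ("快", "心", 4),
    ("池", "水", 3),
]


def build_index(entries):
    """Group characters by section header, then by additional stroke count."""
    index = defaultdict(lambda: defaultdict(list))
    for char, header, extra_strokes in entries:
        index[header][extra_strokes].append(char)
    return index


def lookup(index, header_guesses, stroke_guesses, target):
    """Try each (header, stroke count) guess until the target is found.

    Returns the matching (header, stroke count) pair, or None if no
    guess leads to the character.
    """
    for header in header_guesses:
        for extra_strokes in stroke_guesses:
            if target in index.get(header, {}).get(extra_strokes, []):
                return header, extra_strokes
    return None


INDEX = build_index(ENTRIES)
```

A reader who first guesses 言 and then 人 for 信 would call `lookup(INDEX, ["言", "人"], [7], "信")` and land on the pair `("人", 7)`; guessing only 言 finds nothing in this toy index.
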

As a rule of thumb, components on the left or at the top of the character, or elements which surround the rest of the character, are the ones most likely to be used as the section header. For example, 信 is typically indexed under the left-side component 人 instead of the right-side 言; and 套 is typically indexed under the top 大 instead of the bottom 長. There are, however, idiosyncratic differences between dictionaries, and except for simple cases, the same character cannot be assumed to be indexed the same way in two different dictionaries.

In order to further ease dictionary lookup, dictionaries sometimes list section headers both under the number of strokes used to write their canonical form and under the number of strokes used to write their variant forms. For example, 心 can be listed as a four-stroke section header but might also be listed as a three-stroke section header because it is usually written as 忄 when it forms a part of another character. This means that the dictionary user need not know that the two are etymologically identical.

It is sometimes possible to find a single character indexed under multiple section headers. For example, many dictionaries list 義 under either 羊 or 戈. Furthermore, with digital dictionaries, it is now possible to search for characters by cross-reference. Using this ''multi-component method'', a relatively new development enabled by computing technology, the user can select ''all'' of a character's components from a table and the computer will present a list of matching characters. This eliminates the guesswork of choosing the correct section header and calculating the correct stroke count, and cuts down searching time significantly. One can query for characters containing both 羊 and 戈, and get back only five characters to search through. The Academia Sinica’s 漢字構形資料庫 (Chinese character structure database) also works this way, returning only seven characters in this instance. Harbaugh’s Chinese Characters dictionary similarly allows searches based on any component.
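
In essence, the multi-component method is a subset test over a table of character decompositions. The sketch below assumes a tiny hand-written table (`COMPONENTS`) and a hypothetical function name; real databases cover far more characters, which is why they return five or seven matches for 羊 plus 戈 where this toy table returns one.

```python
# Toy decomposition table: character -> set of components it contains.
# Hand-written for illustration only; not taken from any real database.
COMPONENTS = {
    "信": {"人", "言"},
    "他": {"人", "也"},
    "池": {"水", "也"},
    "義": {"羊", "我", "戈"},
}


def find_by_components(selected, table=COMPONENTS):
    """Return every character whose components include all selected ones."""
    wanted = set(selected)
    return sorted(char for char, parts in table.items() if wanted <= parts)
```

Selecting 羊 and 戈 with `find_by_components(["羊", "戈"])` yields only 義 here. No section header has to be guessed and no strokes counted, because any known component narrows the search.
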

Variations in the number of section headers


Though section headers are widely accepted as a method to categorize Chinese characters and to locate a certain character in a dictionary, there is no universal agreement about either the exact number of section headers, or the set of section headers. This is because section headers are merely arbitrarily chosen categories for lexicographical purposes.

The Kāngxī section headers act as a ''de facto'' standard, which may not be duplicated exactly in every Chinese dictionary, but which few dictionary compilers can afford to ignore completely. They serve as the basis for many computer encoding systems. Specifically, the Unicode standard's radical-stroke charts are based on the Kangxi radicals or section headers.
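
As a small illustration of how this standard surfaces in software, Python's standard `unicodedata` module can be used to inspect the Kangxi Radicals block (U+2F00 to U+2FD5), which holds one code point per Kāngxī section header. The sketch relies on the compatibility mappings that Unicode defines for these code points; the function name `kangxi_to_unified` is hypothetical.

```python
import unicodedata

# The Kangxi Radicals block: one code point for each of the 214 headers.
KANGXI_FIRST, KANGXI_LAST = 0x2F00, 0x2FD5
KANGXI_BLOCK = [chr(cp) for cp in range(KANGXI_FIRST, KANGXI_LAST + 1)]


def kangxi_to_unified(ch):
    """Map a Kangxi Radicals code point to the ordinary character.

    Uses NFKC compatibility normalization, so that the dedicated radical
    code point for 'woman' (U+2F25) becomes the common character 女 (U+5973).
    """
    return unicodedata.normalize("NFKC", ch)
```

The radical code points look almost identical to the ordinary characters, so text that mixes the two can fail to match in searches; normalizing with `kangxi_to_unified` before comparing is one way to avoid that.
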

The number of commonly used section headers in modern abridged dictionaries is often fewer than 214. The ''Oxford Concise English-Chinese Dictionary'', for example, has 188. A few dictionaries also introduce new section headers based on the principles first used by Xu Shen, treating groups of section headers that are used together in many different characters as a kind of section header.

In modern practice, section headers are primarily used as tools and as learning aids when writing characters. They have become increasingly disconnected from etymology and phonetics.

Unicode
