Date: Thu, 23 Oct 1997 05:39:01 -0500 (EST)
Message-Id: <199710231039.FAA23941@locke.ccil.org>
Reply-To: "Lee Sau Dan ~{@nJX6X~}"
Sender: Lojban list
From: "Lee Sau Dan ~{@nJX6X~}"
Subject: Re: Dvorak (& Lojban)
X-To: Lojban list
To: John Cowan
In-Reply-To: HACKER G N's message of Thu, 23 Oct 1997 18:22:14 +1000

>>>>> "HACKER" == HACKER G N writes:

    HACKER> Does this mean that there are only as many keys on a
    HACKER> Chanjie keyboard as there are on a QWERTY keyboard?

Yes.  Indeed, it uses only 25 keys (excluding the space bar).

To input Chinese into the computer, the typical device is a normal
(e.g. US 101-key) keyboard.  Nowadays, there are other devices, such
as the pen input tablet (a digitizer, usually used by serious CAD
users) and voice input (employing voice recognition technology).
Although these latter devices are more intuitive to use, their input
speeds are far below that of a keyboard.

Many "methods" have been invented for inputting Chinese into the
computer using the standard keyboard.  Some of them require the user
to type in a phonetic transcription (e.g. Pinyin) of a character and
then list all characters with that pronunciation for the user to
select from.  These methods are generally slow, because the selection
lists are long and the user is distracted by having to select.
Typing "blind", without watching the screen, is not possible.

Changjie, however, is based on character shape (as mentioned before).
There are around 100 Changjie radicals, classified into _24_ groups,
with each group mapping to ONE of the 26 letter keys of a US keyboard,
ignoring the distinction between upper and lower case.  One more key
is used for special cases (those characters that cannot easily be
broken up into radicals).  So, only 25 keys are needed.

In Changjie, the average number of keystrokes per character is 3.5
(excluding the space bar, which has to be pressed after keying in
the radicals of each character).
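To make the contrast between the two families of methods concrete,
here is a rough Python sketch.  The tables are tiny toy samples chosen
for illustration, and the function names (pinyin_candidates,
changjie_candidates, needs_selection) are made up for this example;
they do not come from any real input method.  The Changjie entries
assume the usual key assignments 日=A, 月=B and 木=D.

```python
# Toy tables only; a real input method ships thousands of entries.

# Phonetic method: one (toneless) Pinyin syllable maps to many characters,
# so the user must pick from a candidate list.
PINYIN_TABLE = {
    "ming": ["明", "名", "命", "鳴", "銘"],
    "lin": ["林", "臨", "鄰", "淋"],
}

# Shape-based method: a Changjie code spells out the radicals of a character.
# Assumed key assignments for this sample: 日=A, 月=B, 木=D.
CHANGJIE_TABLE = {
    "A": ["日"],
    "B": ["月"],
    "AB": ["明"],   # 日 + 月
    "D": ["木"],
    "DD": ["林"],   # 木 + 木
    "DDD": ["森"],  # 木 + 木 + 木
}


def pinyin_candidates(syllable):
    """Return every character in the toy table with this pronunciation."""
    return PINYIN_TABLE.get(syllable.lower(), [])


def changjie_candidates(code):
    """Return every character in the toy table with this Changjie code.

    Upper and lower case are treated alike, as on the real keyboard.
    """
    return CHANGJIE_TABLE.get(code.upper(), [])


def needs_selection(candidates):
    """True when the user would have to choose from a list."""
    return len(candidates) > 1


if __name__ == "__main__":
    for label, cands in [
        ("pinyin 'ming'", pinyin_candidates("ming")),
        ("changjie 'ab'", changjie_candidates("ab")),
    ]:
        print(label, cands, "selection needed:", needs_selection(cands))
```

In this sample the phonetic lookup for "ming" returns five candidates,
while the shape code "AB" identifies a single character.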
Touch-typing is possible, because the Changjie codes of different
characters seldom collide.  So, the user seldom has to select from
candidate lists as in the phonetic-based methods.  The typing speed
can be up to 60 char/min for trained typists, and up to 200 char/min
for wizards.

    HACKER> Because if so, that's pretty impressive, considering how
    HACKER> many characters there are in Chinese.

There aren't many Chinese characters in common use.  There are
altogether tens of thousands of them, if we count all those that have
appeared over the past 3000 years, and count variants of the same
character as different characters.  However, the number of characters
frequently used by newspapers is around 6000, which is also the number
of characters that an ordinary student should have learnt in primary
school.  For casual writing (e.g. personal letters, diaries, etc.)
this count would drop to 3000, or even fewer.

On the computer, around 20000 Chinese characters are available.  So,
theoretically, if we use only the 26 Latin letters to encode the
characters, we can at best achieve a code length of

        log_{26} (20000) = 3.04

for each character.  Some people design input methods with as many as
40 base symbols (mapped to the 40 keys on and below the digit row,
1 through 0), and for such methods the minimum code length is about:

        log_{40} (20000) = 2.68

Since it is possible to assign shorter codes to more frequently used
characters, the *weighted* average code length can approach _2_,
i.e. an average of 2 keystrokes per character is possible (excluding
the space bar, which terminates the input of a character).  Such
input methods exist, but are not as common as Changjie, which is
available on almost all systems supporting BIG5 encoding.

-- 
Lee Sau Dan                     'u&u40(Big5)                    ~{@nJX6X~}(HZ)
.----------------------------------------------------------------------------.
| http://www.cs.hku.hk/~sdlee                        e-mail: sdlee@cs.hku.hk |
`----------------------------------------------------------------------------'
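The two code-length bounds quoted in the message can be checked with a
few lines of Python.  This is only a sketch of the arithmetic: the
helper names (min_code_length, codes_available) are invented for this
example, and the figure of 20000 characters is the round number used
in the message.

```python
import math


def min_code_length(num_chars, num_keys):
    """Lower bound on the length of a fixed-length code:
    log base num_keys of num_chars."""
    return math.log(num_chars) / math.log(num_keys)


def codes_available(num_keys, max_len):
    """Number of distinct codes of length 1..max_len over num_keys keys."""
    return sum(num_keys ** n for n in range(1, max_len + 1))


if __name__ == "__main__":
    print(round(min_code_length(20000, 26), 2))  # 3.04
    print(round(min_code_length(20000, 40), 2))  # 2.68
    # With 26 keys, codes of up to 3 letters cannot cover 20000 characters,
    # which is why the bound comes out slightly above 3.
    print(codes_available(26, 3))  # 18278
    print(codes_available(40, 3))  # 65640
```

The second helper shows the same limit from the other side: 26 keys
give only 18278 codes of length three or less, so some of the 20000
characters must take a fourth keystroke.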