Date: Wed, 03 Dec 1997 16:30:06 -0500
From: John Cowan
Organization: Lojban Peripheral
To: Lojban List
Subject: Types and tokens (was: What the ...)

la mark. clsn. cusku di'e

> I got a little lost with this type/token stuff you're using here, and I
> thought I sort of understood how texts worked in Lojban. Do you mean
> "type" and "token" sort of like non-terminal and terminal in a formal
> grammar? Or like a terminal and the specific instance? (e.g. KOhA-type,
> with the token "da" instantiating it) Or something else entirely? Just
> trying to keep up with the Rostas...

Something else. Tokens are actual instances of things, and types are
classes whose membership criterion is equality. Usually the terms are
only applied to linguistic objects, or rather the graphical instances
thereof. Thus in "The cat sat on the mat", there are 6 tokens at the
word level and 22 tokens at the letter level, but only 5 word types
(<the>, <cat>, <sat>, <on>, and <mat>) and 10 letter types ignoring case
(space, <t>, <h>, <e>, <c>, <a>, <s>, <o>, <n>, <m>).

For Unix weenies, the command "wc -w" counts word tokens, and the
command "tr A-Z a-z | tr -cs a-z '\n' | sort -u | wc -l" counts word
types.

I was pointing out that you can consider <The cat sat on the mat> to be
a type too, a sentence type. Are sentence types composed of word types?
That seems intuitive, since sentence tokens are obviously composed of
word tokens. But it leads to a nasty problem: are there five or six
word types in <The cat sat on the mat>? If five, then it seems to be the
same type as <...>, which is obviously false; if six, then there are two
distinct <the>-types, which contradicts the definition of "type".

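The counts quoted for "The cat sat on the mat" can be checked with a few lines of Python. This is only a minimal sketch: the function names are illustrative and not from the original post, and it assumes the same conventions as the Unix pipeline (fold case, treat any run of the letters a-z as a word, and count the space as a letter-level token).

```python
import re

def word_tokens(text):
    # Fold case and keep each run of a-z as one word token,
    # roughly what the tr pipeline does before sorting.
    return re.findall(r"[a-z]+", text.lower())

def letter_tokens(text):
    # Every character counts as a token, spaces included, case folded.
    return list(text.lower())

def count_tokens_and_types(tokens):
    # Tokens are the instances; types are the distinct values among them.
    return len(tokens), len(set(tokens))

sentence = "The cat sat on the mat"
word_counts = count_tokens_and_types(word_tokens(sentence))
letter_counts = count_tokens_and_types(letter_tokens(sentence))

print("words:   tokens=%d types=%d" % word_counts)
print("letters: tokens=%d types=%d" % letter_counts)
```

Taking the set of tokens is what collapses the two instances of "the" into one word type, which is exactly the step that becomes awkward once the sentence itself is treated as a type.
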
(Bonus: the lovely phrase "hapax legomenon" means a type with only a
single token in a given body of writing, typically all the writing that
exists in a particular dead language.)

-- 
John Cowan    http://www.ccil.org/~cowan    cowan@ccil.org
    e'osai ko sarji la lojban