Subject: TECH: missing part of morphology paper
To: lojban@cuvmb.cc.columbia.edu (Lojban List)
From: cowan
Date: Wed, 8 Nov 1995 12:52:41 -0500 (EST)

When I posted the morphology paper, section 13 (examples of lujvo making) was unwritten. The Synopsis version was clearly unacceptable and I had nothing ready as a substitute. Here is that section of the paper. Commentators, please note.

=====cut here=====

13. lujvo-making Examples

This section contains examples of making and scoring lujvo. First, we will start with the tanru "gerku zdani" ("dog house") and construct a lujvo meaning "doghouse", that is, a house where a dog lives. We will use a brute-force application of the algorithm in Section 12, using every possible rafsi.

The rafsi for "gerku" are "-ger-", "-ge'u-", "-gerk-", and "-gerku". The rafsi for "zdani" are "-zda-", "-zdan-", and "-zdani". Step 1 of the algorithm directs us to use "-ge'u-" and "-gerk-" as possible rafsi for "gerku"; Step 2 directs us to use "-zda-" and "-zdani" as possible rafsi for "zdani". The four possible forms of the lujvo are then: "ge'u-zda", "ge'u-zdani", "gerk-zda", and "gerk-zdani".

We must then insert appropriate hyphens in each case. The first form, "ge'u-zda", needs no hyphen, because even though the first rafsi is CVV, the second one is CCV, so there is a consonant cluster in the first five letters. So "ge'uzda" is this form of the lujvo. The second form, "ge'u-zdani", however, requires an "r"-hyphen; otherwise, the "ge'u-" part would fall off as a cmavo. So this form of the lujvo is "ge'urzdani". The last two forms require "y"-hyphens, as all 4-letter rafsi do, and so are "gerkyzda" and "gerkyzdani" respectively.

The scoring algorithm is heavily weighted in favor of short lujvo, so we might expect that "ge'uzda" would win.
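
The enumeration and hyphenation of these four forms can be sketched in Python. This is only an illustration of the cases described above, not the full algorithm of Section 12: the helper names `shape` and `join_pair` are hypothetical, the rafsi lists are simply the ones that Steps 1 and 2 select for this tanru, and hyphen rules that do not come up in this example are left out.

```python
from itertools import product

VOWELS = set("aeiou")

def shape(rafsi):
    """Return the consonant/vowel shape of a rafsi, e.g. "ge'u" -> "CV'V"."""
    return "".join(
        "'" if ch == "'" else "V" if ch in VOWELS else "C" for ch in rafsi
    )

def join_pair(first, second):
    """Join two rafsi, covering only the hyphen cases described in the text."""
    if shape(first) in ("CVCC", "CCVC"):
        # All 4-letter rafsi take a "y"-hyphen.
        return first + "y" + second
    if shape(first) in ("CVV", "CV'V") and shape(second) != "CCV":
        # Without the "r"-hyphen the CVV part would fall off as a cmavo.
        return first + "r" + second
    return first + second

# Rafsi chosen by Steps 1 and 2 for "gerku zdani" (taken from the text).
first_rafsi = ["ge'u", "gerk"]
second_rafsi = ["zda", "zdani"]

forms = [join_pair(a, b) for a, b in product(first_rafsi, second_rafsi)]
print(forms)
```

Run as written, this prints the same four hyphenated forms given above, of which the shortest is "ge'uzda".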
For "ge'uzda", the L score is 7, the A score is 1, the H score is 0, the R score is 13, and the V score is 3, for a final score of 26133. The other forms have scores of 22994, 24492, and 22453, respectively. Consequently, this lujvo would probably appear in the dictionary in the form "ge'uzda".

For the next example, we will use the tanru "bloti klesi" ("boat class"), presumably referring to the category (rowboat, motorboat, cruise liner) into which a boat falls. We will omit the long rafsi from the process, since lujvo containing long rafsi are almost never preferred by the scoring algorithm. The rafsi for "bloti" are "-lot-", "-blo-", and "-lo'i-"; for "klesi" they are "-kle-" and "-lei-". Both these gismu are among the handful which have both CVV-form and CCV-form rafsi, so there is an unusual number of possibilities available for a two-part tanru:

    lotkle      blokle      lo'ikle
    lotlei      blolei      lo'irlei

Only "lo'irlei" requires hyphenation (to avoid confusion with the cmavo sequence "lo'i lei"). All six forms are valid versions of the lujvo, as are the six further forms using long rafsi; however, the scoring algorithm produces the following results:

    lotkle      26622
    blokle      26642
    lo'ikle     26113
    lotlei      25633
    blolei      25653
    lo'irlei    25044

So the form "blokle" is preferred, but only by a tiny margin over "lotkle"; the next three forms are only slightly worse, and only "lo'irlei" suffers because of its hyphen.

Our third example will result in forming both a lujvo and a name from the tanru "logji bangu girzu", or "logical-language group" in English. The available rafsi are "-loj-" and "-logj-"; "-ban-", "-bau-", and "-bang-"; and "-gri-" and "-girzu", plus (for name purposes only) "-gir-" and "-girz-". The resulting 12 lujvo possibilities are:

    loj-ban-gri       loj-bau-gri       loj-bang-gri
    logj-ban-gri      logj-bau-gri      logj-bang-gri
    loj-ban-girzu     loj-bau-girzu     loj-bang-girzu
    logj-ban-girzu    logj-bau-girzu    logj-bang-girzu

and the 12 name possibilities are:

    loj-ban-gir.      loj-bau-gir.      loj-bang-gir.
    logj-ban-gir.     logj-bau-gir.     logj-bang-gir.
    loj-ban-girz.     loj-bau-girz.     loj-bang-girz.
    logj-ban-girz.    logj-bau-girz.    logj-bang-girz.

After hyphenation, we have:

    lojbangri         lojbaugri         lojbangygri
    logjybangri       logjybaugri       logjybangygri
    lojbangirzu       lojbaugirzu       lojbangygirzu
    logjybangirzu     logjybaugirzu     logjybangygirzu
    lojbangir.        lojbaugir.        lojbangygir.
    logjybangir.      logjybaugir.      logjybangygir.
    lojbangirz.       lojbaugirz.       lojbangygirz.
    logjybangirz.     logjybaugirz.     logjybangygirz.

The only fully reduced lujvo forms are "lojbangri" and "lojbaugri", of which the latter has a slightly higher score: 23704, versus 23673 for "lojbangri". However, for the name of the organization, we chose to make sure the name of the language was embedded in it, and to use the clearer long-form rafsi for "girzu", producing "lojbangirz."

Finally, here is a four-part lujvo with a cmavo in it, due to James Cooke Brown: "nakni ke cinse ctuca" or "male (sexual teacher)". The "ke" cmavo ensures the interpretation "teacher of sexuality who is male", rather than "teacher of male sexuality". Here are the possible forms of the lujvo, both before and after hyphenation:

    nak-kem-cin-ctu        nakykemcinctu
    nak-kem-cin-ctuca      nakykemcinctuca
    nak-kem-cins-ctu       nakykemcinsyctu
    nak-kem-cins-ctuca     nakykemcinsyctuca
    nakn-kem-cin-ctu       naknykemcinctu
    nakn-kem-cin-ctuca     naknykemcinctuca
    nakn-kem-cins-ctu      naknykemcinsyctu
    nakn-kem-cins-ctuca    naknykemcinsyctuca

Of these forms, "nakykemcinctu" is the shortest and is preferred by the scoring algorithm. On the whole, however, it might be better to just make a lujvo for "cinse ctuca" (which would be "cinctu"), since the sex of the teacher is rarely important. If there were a reason to specify "male", then the simpler tanru "nakni cinctu" ("male sexual-teacher") would be appropriate. This tanru is actually shorter than the four-part lujvo, since the "ke" required for grouping need not be expressed.

-- 
John Cowan          cowan@ccil.org
          e'osai ko sarji la lojban.