From nobody@digitalkingdom.org Mon Aug 24 23:52:15 2009 Received: with ECARTIS (v1.0.0; list lojban-list); Mon, 24 Aug 2009 23:52:15 -0700 (PDT) Received: from nobody by chain.digitalkingdom.org with local (Exim 4.69) (envelope-from ) id 1Mfpt8-0006tM-Js for lojban-list-real@lojban.org; Mon, 24 Aug 2009 23:52:14 -0700 Received: from mail-vw0-f179.google.com ([209.85.212.179]) by chain.digitalkingdom.org with esmtp (Exim 4.69) (envelope-from ) id 1Mfpt4-0006sp-Hf for lojban-list@lojban.org; Mon, 24 Aug 2009 23:52:14 -0700 Received: by vws9 with SMTP id 9so2355028vws.25 for ; Mon, 24 Aug 2009 23:52:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:content-type :content-transfer-encoding; bh=JAcGL4nZoC+crnp36TGBXhk8AxlROMawYS5lHI1qdRY=; b=Zj+MshVaXf+GMteN1JZuvKwmqKjYL8Sk/0JZCsUT+wU4Spz9znJApFFKX1CqXzLyF3 nQuPIcUiE6hKlNi1IJV+zE5bVch3+7Mrsv6Te5s9pa1PXN4jl62TGzQqiNBm248VEsiD 7MU/XnWASJ+WtDZV2iVZ55gKjBYffz1aOZ4MU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type:content-transfer-encoding; b=uQeCi5EH1w+EEvYLcim+5ds3dfNbiUZdFJvi61OQYRsyR6swL5GXxzXxuhzPG1f+QT jv1x8jyvqCeL5xu0a6zEKjL0QkXyKgONdKPrMBNKhnanjmPgJ8czpm2/4KM5QYnmpwV3 8ObDwknxxdBi+8UxXp6AZmMcpjhgZSt89feO8= MIME-Version: 1.0 Received: by 10.220.79.84 with SMTP id o20mr7463145vck.82.1251183124314; Mon, 24 Aug 2009 23:52:04 -0700 (PDT) In-Reply-To: <5715b9300908241856w6a420187n865b106923685bc0@mail.gmail.com> References: <5715b9300908241856w6a420187n865b106923685bc0@mail.gmail.com> Date: Tue, 25 Aug 2009 02:52:04 -0400 Message-ID: <95b2fb130908242352y3439b2d4u9d7f53c093ae9647@mail.gmail.com> Subject: [lojban] Re: How many possible gismu? From: "H. Felton" To: lojban-list@lojban.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-archive-position: 16013 X-ecartis-version: Ecartis v1.0.0 Sender: lojban-list-bounce@lojban.org Errors-to: lojban-list-bounce@lojban.org X-original-sender: fagricipni@gmail.com Precedence: bulk Reply-to: lojban-list@lojban.org X-list: lojban-list On Mon, Aug 24, 2009 at 9:56 PM, Luke Bergen wrote: > I've read your previous emails and am also not sure where O(n!) came from. > Could you put the requirements (input, expected output, rules, etc...) into > a short, concise list? You'll be happy to know that I found a lower bound, search for the text string "LOWER BOUND" to quickly reach it, but a brief summary is that there is no way to use less than 7429 additional gismu to completely block up the gismu list: "So even given the rate at which I imagine that new gismu might be created; it would be extremely unusual for there to be more than 10 per century, usually less -- 10 per century would be 100 per millennium, leading to at least 7429/100 = 74+ millennia before there is even a possibility of running out of gismu." You think that this is too small; it is; as I discuss under the title mentioned above, there is no chance that the "limit" of 7429 is the actual limit; I have merely proved it to be a lower bound; the actual value can not be less than 7429; there is no doubt in my mind that it is larger; I just can't determine how much larger. There are two questions I am considering: 1) What is the _smallest_ number of new gismu that could fill up gismu space? 2) What is the _smallest_ number of gismu that could fill gismu space, starting from an empty gismu space? The rules are that a gismu blocks the use of some possible gismu that differ by one letter. The blocked possible gismu (eg, "gismu") are: 1) gismu that differ only in the last letter (eg, "gisma", "gisme", "gismi", "gismo" 2) The gismu that differ by one letter for the sets below: b p, v c j, s d t f p, v g k, x j c, z k g, x l r m n n m p b, f r l s c, z t d v b, f x g, k z j, s (eg, "kismu", "xismu", "gicmu", "gizmu", "gisnu") There are two gismu forms CVCCV and CCVCV. Since the 2nd and 3rd characters are VC in the CVCCV and CV in to CCVCV, they can not differ in just one letter. So a CVCCV can not block a CCVCC, nor can a CCVCV block a CVCCV, so at least these two sets can be considered separately. Consider the CVCCV. The CC is a permissible consonant pair; there are 7 unvoiced consonants, 6 voiced consonants, and 4 syllabic consonants. A permissible consonant pair can be made with: 1) Two unvoiced consonants; given the restriction on double consonants: this means 7*6 pairs or 42 of this type. 2) Two voiced consonants; given the restriction on double consonants: this means 6*5 pairs or 30 of this type. 3) A unvoiced consonant followed by a syllabic consonant: 7*4 = 28 4) A syllabic consonant followed by a unvoiced consonant: 4*7 = 28 5) A voiced consonant followed by a syllabic consonant: 6*4 = 24 6) A syllabic consonant followed by a voiced consonant: 4*6 = 24 7) Two syllabic consonants; given the restriction on double consonants: 4*3=12 The sum of these numbers is 188. However, the otherwise legal pairs; "cs" ,"sc" ,"jz" ,"zj", "cx", "xc", "kx", "xk", and "mz" are forbidden. Subtracting out these 9 leaves 179 possible values for the CC in CVCCV. For each V there are 5 possibilites, for the solitary C there are 17. The total possible CVCCV gismu is 17*5*179*5 = 76075. In order to make sure that I get the worst possible case; ie, the one that uses the fewest gismu, I have to test all possibilities. Let's use the second question, I make a list of all the possibles: "babda" to "zuzvu". Then start by placing "babda" in place and marking off all gismu that prevents: "babde", "babdi", "babdo", "babdu", "pabdu", "tabdu". The gismu "bapda" and "babta" are already probitted because "pd" and "bt" are not permissible consonant pairs in the first place. So then, I place the next possible word on the list; it turns out to be "babga", mark the now unusable gismu, and so on. Finally I get to count the gismu when space runs out. Then I have remove the last added gismu remove the mark from the ones that excluded and try the next possible one of the now available gismu, and see if that fills gismu space, then when I finish counting all possibles that are made by taking one off the list. Then I need to take not only the last one added, but the next to the last one added. Then I place a new one in the next available space left in alphabetical order. Then try other possible next gismu for that. Because remember, it is not likely that I will get the _smallest_ number to fully pack the CVCCV gismu space on the first try; so eventually I have to test all possibles and see what the smallest count is. It has been a long time since I first considered this problem; I have forgotten why I concluded that the algorithm was O(n!), rather than O(e^n), which is worse. The procedure for packing CCVCV gismu space is the same, except there are only 48 permissible initial pairs: 48*5*17*5 or only 20400 slots there. It seems to me though that at least in the case that one starts with empty gismu space, one should be able to use the fact that there are 6 collision cycles: b-p-f-v-back to b, c-j-z-s-, g-k-x-, d-t-, l-r-, m-n-. Notice all the cycles but the first end with "-" because each cycle catches its own tail; the first catches its own tail, too, I just explicitly noted that in the cycle back at the end. The first problem is that while "nz" in a place should block "mz", "mz" is already forbidden as a permissible pair. One could try to use the fact that in a dual voiced pair; eg, "bg", "b" can not block "p" because the other consonant, "g", is voiced and "p" is unvoiced; but in "tablu", "b" does block "p", because "pl" is a permissible pair; so "taplu" would otherwise be permitted. The problem is made worse by the fact that some pairs that would be permitted under the unvoiced/voiced/syllabic rule are forbidden; I mentioned "mz"; there are 8 more. This problem is even worse for the CCVCV gismu space, since there are 128 otherwise legal spaces that are forbidden; and the rest don't follow any truly regular pattern. There may be a way to figure out a "most efficient"; ie, using the fewest gismu; to block up gismu space without searching through all possibles; I'd think something could be done with ranking the consonant pairs by how efficiently they block other consonant pairs; ie, how many consonant pairs does "bl" block? When used as a permissible pair in the CVCCV space problem, and then when used as a permissible initial pair in the CCVCV space problem? And I haven't even considered the first question, in which the current gismu are already "fixed" in place, in the analysis. LOWER BOUND I can readily believe that the first question -- What is the _smallest_ number of new gismu that could fill up gismu space? (Assumes that the current gismu are "fixed") -- is too computationally intensive to be solved with current computers. The second question -- What is the _smallest_ number of gismu that could fill gismu space, starting from an empty gismu space? -- is not likely to be so; I strongly suspect that I am missing some trick that could vastly reduce the number of operations required. Remember, of course, these are really only interesting question as mathematical puzzles; even if someone does come up with the answer, or even comes up with an idea that helps me reach an answer; the answer won't be used for any practical purpose. The one "minimum" calculation that I have been able to come up with is that each of the 3 consonant can block _at most_ 2 gismu each, the final vowel rule means that each gismu blocks another 4 gismu, so no gismu can block more than 10 other gismu; using this worst-worst case analysis each gismu can hold at most 11 slots in gismu space; 1 for the gismu itself, 10 for the other blocked slots. Given 76075 CVCCV slots, this means that there is no way to cover that space with less than 6916 gismu; 6915 could block up only 6915*11 = 76065; there are now 10 slots left, so, in theory, the 6915th gismu could cover them all. In CCVCV space, there are 20400 slots; this leads to an absolute minimum 1855 gismu to lock out all possible slots; 1854 could block up 1854*11 = 20394 slots, leaving the 1855th one to block up the remaining 6 open slots. That makes a total of 1855+6916 = 8771. Having thought of this analysis, there is room for the number of gismu to expand by another 5 times its current size; including the brod-series, there are 1342 current gismu; even if there were room for only 8771 or 8771-1342 = 7429. 7429/1342 = 5.5+. So even given the rate at which I imagine that new gismu might be created; it would be extremely unusual for there to be more than 10 per century, usually less -- 10 per century would be 100 per millennium, leading to at least 7429/100 = 74+ millennia before there is even a possibility of running out of gismu. Of course, there is no way in censored that each gismu is going to block up 10 slots of its very own in addition to itself, so this is a very conservative minimum. I'm more than comfortable with the room to grow. To unsubscribe from this list, send mail to lojban-list-request@lojban.org with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if you're really stuck, send mail to secretary@lojban.org for help.