From: Robin Lee Powell
Date: Sun, 8 Feb 2004 12:01:31 -0800
To: lojban-list@lojban.org
Subject: [lojban] Re: Regular expression for brivla?
Message-ID: <20040208200131.GO20322@digitalkingdom.org>
References: <20040208091817.GN20322@digitalkingdom.org>

On Sun, Feb 08, 2004 at 04:51:45PM +0100, Arnt Richard Johansen wrote:
> On Sun, 8 Feb 2004, Robin Lee Powell wrote:
>
> > Is there a known regular expression that covers all classes of
> > brivla?
> >
> > What about one that's just really close?
> >
> > Perl extensions are fine.
>
> From my (admittedly poor) understanding of regexes suggest that it
> is going to be very long-winded.

The one you presented is, to my mind, a tiny itty-bitty thing.
*This* is a long one; from 'man procmailrc':

    If the regular expression contains `^FROM_DAEMON' it will be
    substituted by

    `(^(Mailing-List:|Precedence:.*(junk|bulk|list)|To: Multiple recipients of |(((Resent-)?(From|Sender)|X-Envelope-From):|>?From )([^>]*[^(.%@a-z0-9])?(Post(ma?(st(e?r)?|n)|office)|(send)?Mail(er)?|daemon|m(mdf|ajordomo)|n?uucp|LIST(SERV|proc)|NETSERV|o(wner|ps)|r(e(quest|sponse)|oot)|b(ounce|bs\.smtp)|echo|mirror|s(erv(ices?|er)|mtp(error)?|ystem)|A(dmin(istrator)?|MMGR|utoanswer))(([^).!:a-z0-9][-_a-z0-9]*)?[%@>\t ][^<)]*(\(.*\).*)?)?$([^>]|$)))',

    which should catch mails coming from most daemons (how's that
    for a regular expression :-).

> Basically, you have to look at only the first five letters,
> exclude apostrophes, and see if there are at least two
> adjacent consonants.

OK.

> This, of course, presupposes that you have already determined that
> the string you are checking is a valid Lojban word.

Any idea how to do that?

-Robin

-- 
Me: http://www.digitalkingdom.org/~rlpowell/  ***  I'm a *male* Robin.
"Constant neocortex override is the only thing that stops us all
from running out and eating all the cookies."  -- Eliezer Yudkowsky
http://www.lojban.org/  ***  .i cimo'o prali .ui
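
The heuristic quoted above (take the first five letters, ignoring apostrophes, and look for two adjacent consonants) can be sketched roughly as follows. This is only an illustration in Python, not anyone's actual implementation: the function name `looks_like_brivla` and the consonant set are assumptions, the letter `y` is simply treated as a non-consonant, and, as the quoted text says, the input is presumed to already be a valid Lojban word.

```python
import re

# Assumption: the Lojban consonants are these 17 letters.
CONSONANTS = "bcdfgjklmnprstvxz"

# Two consonants in a row, anywhere in the string being searched.
CONSONANT_PAIR = re.compile(f"[{CONSONANTS}]{{2}}")


def looks_like_brivla(word: str) -> bool:
    """Apply the 'first five letters' heuristic to a single word.

    Hypothetical helper: it does NOT check that `word` is a valid
    Lojban word in the first place; that is assumed by the caller.
    """
    # Drop apostrophes so they do not count towards the five letters.
    letters = word.lower().replace("'", "")
    # Only the first five remaining letters are examined.
    return CONSONANT_PAIR.search(letters[:5]) is not None


if __name__ == "__main__":
    for w in ["klama", "prali", "lojban", "mi", "cimo'o"]:
        print(w, looks_like_brivla(w))
```

Under these assumptions, `klama`, `prali`, and `lojban` pass the check, while `mi` and `cimo'o` do not. Deciding whether a string is a valid Lojban word to begin with is a separate problem that this sketch does not attempt.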