From: Alex Rozenshteyn
To: lojban@googlegroups.com
Date: Wed, 26 Jan 2011 22:18:52 -0500
Subject: Re: [lojban] proposed grammar definition for ZOhOI
In-Reply-To: <20110127023614.GE38730@alice.local>

Regarding the response to the third point, couldn't you just have an
exhaustive list of the shorthand transformations (wouldn't you need one
anyway) and preprocess the text, transforming the shorthand to the words
they represent?

I feel like the biggest problem is not that of parsing.
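
To make the preprocessing idea above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how camxes or Alan's parser works: the function name `preprocess`, the mode names, and the `LOJBAN_CHARS` set are hypothetical, and the shorthand table holds just the three example pairs mentioned in the quoted message below (no standard shorthand exists).

```python
# Assumed Lojban alphabet: its letters plus apostrophe, comma and period.
LOJBAN_CHARS = set("abcdefgijklmnoprstuvxyz',.")

# Hypothetical shorthand table; only these three example pairs come from the thread.
SHORTHAND = {"?": "xu", "(": "to", ")": "to'o"}


def preprocess(text, mode="mnemonic"):
    """Return text for a parser that only understands plain Lojban words."""
    if mode not in ("strict", "mnemonic", "shorthand"):
        raise ValueError(f"unknown mode: {mode!r}")

    if mode == "shorthand":
        # Expand each shorthand character into its cmavo, padded with
        # spaces so that it remains a separate word.
        text = "".join(f" {SHORTHAND[ch]} " if ch in SHORTHAND else ch for ch in text)

    # Every remaining non-space character outside the alphabet is "foreign".
    foreign = sorted({ch for ch in text
                      if not ch.isspace() and ch.lower() not in LOJBAN_CHARS})

    if mode == "strict" and foreign:
        raise ValueError(f"characters not allowed in strict mode: {foreign}")

    # Mnemonic and shorthand modes treat foreign characters as whitespace.
    cleaned = "".join(" " if ch in foreign else ch for ch in text)
    return " ".join(cleaned.split())
```

With this sketch, `preprocess("? do klama (le zarci)", "shorthand")` expands the question mark and parentheses into {xu}, {to} and {to'o} before any parsing happens, so the grammar itself would not need to change. The shorthand characters are expanded before the whitespace step, which matches the point that defined shorthand punctuation can no longer be treated as whitespace.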

On Wed, Jan 26, 2011 at 9:36 PM, .alyn.post. <alyn.post@lodockikumazvati.org> wrote:
> On Wed, Jan 26, 2011 at 08:18:02PM -0500, Alex Rozenshteyn wrote:
> >    pe'i there should be 3 ways of writing lojban:
> >
>
> After a brief brainstorm, I could support these three modes in my
> parser should that be desirable.
>
> >     1. Strict: the only characters allowed (barring alphabet shifts) are
> >        lojban characters.
>
> The PEG grammar currently allows digits and some punctuation.  I'd
> need to add an immediate rule when these productions are matched to
> reject those productions if strict mode was enabled and forbidden
> characters appear.
>
> >     2. Visually mnemonic: characters such as quotation marks and parentheses
> >        etc. are allowed to make skimming the text easier; there is no need to
> >        standardize (although suggestions might be welcome) what means what
> >        because the characters will be ignored (treated as whitespace) by the
> >        parser, and so every spoken syllable will still need to be spelled
> >        out.
>
> This is how the PEG grammar works now.  I believe my parser allows
> more punctuation than camxes, which is a trivial fix should that be
> a problem.
>
> >     3. Visual shorthand: It will develop anyway, so it's best to standardize
> >        it. e.g. {xu} can be *replaced* by a question mark, {to} and {to'o}
> >        might be *replaced* by left and right parentheses, etc. It would make
> >        sense to speak of {xubu}, the grapheme representing the cmavo {xu}
> >
>
> This would require defining what this visual shorthand was and
> modifying any rule affected.  It would also require not permitting
> the defined shorthand punctuation to be whitespace.
>
> -Alan
> --
> .i ko djuno fi le do sevzi





--
Alex R

--
You received this message because you are subscribed to the Google Groups "lojban" group.
To post to this group, send email to lojban@googlegroups.com.
To unsubscribe from this group, send email to lojban+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/lojban?hl=en.