From thinkit8@lycos.com Tue Sep 18 00:39:57 2001
Date: Tue, 18 Sep 2001 03:29:28 -0000
To: lojban@yahoogroups.com
Subject: lujvo banro...python programing to expand lujvo
Message-ID: <9o6f2o+a270@eGroups.com>
From: thinkit8@lycos.com
X-Yahoo-Message-Num: 10823

ok, i went ahead and hacked 2 python programs to expand all lujvo into component cmavo/gismu, using vlatai from jbofi'e.

first, the batch file. %1 is the infile, %2 is the outfile:

    python space.py < %1 > temp1.txt
    vlatai -el < temp1.txt > temp2.txt
    python sub.py < %1 > %2

note that sub.py expects "temp2.txt" to have the vlatai output.

next, space.py, which puts each word of the input on its own line for vlatai:

    import re
    while 1:
        try:
            s=raw_input()
            # split on anything that can't be part of a word
            a=re.split("[^a-zA-Z\',]+",s);
            for x in a:
                print x
        except EOFError:
            break

next, sub.py, which reads the lujvo and their expansions out of temp2.txt and substitutes them into the original text:

    import re,string
    qb=[]   # lujvo as they appear in the text
    qa=[]   # their expansions
    f=open("temp2.txt")
    s=f.readline()
    while s!="":
        if re.search(": lujvo :",s) is not None:
            # the word itself is at the start of the line
            res=re.match("[a-z\',]+",s)
            qb.append(res.group(0))
            # the expansion follows the "["
            res=re.search("\[[a-z\',\+\?]+",s)
            s2=res.group(0)
            qa.append(string.replace(s2,"[",""))
        s=f.readline()
    # read the whole original text from stdin
    s2=""
    while 1:
        try:
            s2+=raw_input()
            s2+="\n"
        except EOFError:
            break
    for x in range(len(qa)):
        s2=re.sub(qb[x],qa[x],s2)
    print s2

much thanks to richard curnow for providing vlatai. it's a bit roundabout because all i have is a binary.
hopefully i'll get working with the source and can do something more elegant, and/or (gi'a) make it web accessible.
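
for anyone who wants to try the same idea without the temp-file juggling, here is a rough python 3 sketch of the two steps as plain functions (the scripts above are python 2: raw_input and the print statement). the function names are made up for illustration, and the vlatai line format is only assumed from the regexes in sub.py (word at the start of the line, ": lujvo :" somewhere in it, expansion after a "["), so check it against real vlatai output before trusting it. one deliberate difference: it replaces whole words only, whereas re.sub on the raw text would also rewrite a lujvo that happens to sit inside a longer word.

```python
import re

# characters that space.py treats as part of a word
WORD = r"[a-zA-Z',]+"


def split_words(text):
    """Return the words of text, one per entry (what space.py prints)."""
    return re.findall(WORD, text)


def parse_vlatai(lines):
    """Build a lujvo -> expansion table from vlatai output lines.

    Assumption: the line format is whatever sub.py's regexes imply,
    not something taken from the vlatai documentation.
    """
    table = {}
    for line in lines:
        if ": lujvo :" not in line:
            continue
        word = re.match(r"[a-z',]+", line)
        expansion = re.search(r"\[([a-z',+?]+)", line)
        if word and expansion:
            table[word.group(0)] = expansion.group(1)
    return table


def expand(text, table):
    """Replace each whole word found in table by its expansion."""
    def swap(match):
        return table.get(match.group(0), match.group(0))
    return re.sub(WORD, swap, text)
```

to use it, feed the output of split_words to vlatai (one word per line), pass the lines vlatai prints to parse_vlatai, then call expand on the original text with the resulting table.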