From: Robin Lee Powell <rlpowell@digitalkingdom.org>
Date: Tue, 18 May 2004 14:09:58 -0700
To: lojban-list@lojban.org
Subject: [lojban] Actual features for my parser, and a Java request

I've added some things to my parser worthy of being called 'features', basically to trim down its verbosity. The sentence I'm testing here is "mi cusku zo do si si de".

Default output:

text1=( paragraphs=( paragraph=( statement=( statement1=( statement2=( statement3=( sentence=( terms=( terms1=( terms2=( term=( sumti=( sumti1=( sumti2=( sumti3=( sumti4=( sumti5=( sumti6=( KOhA=( mi ) ) ) ) ) ) ) ) ) ) ) ) bridiTail1=( bridiTail2=( bridiTail3=( selbri=( selbri1=( selbri2=( selbri3=( selbri4=( selbri5=( selbri6=( tanruUnit=( tanruUnit1=( tanruUnit2=( BRIVLA=( cusku ) ) ) ) ) ) ) ) ) ) ) tailTerms=( terms=( terms1=( terms2=( term=( sumti=( sumti1=( sumti2=( sumti3=( sumti4=( sumti5=( sumti6=( KOhA=( de ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) )

Full output (used to be the only option):

text1=( paragraphs=( paragraph=( statement=( statement1=( statement2=( statement3=( sentence=( terms=( terms1=( terms2=( term=( sumti=( sumti1=( sumti2=( sumti3=( sumti4=( sumti5=( sumti6=( KOhA=( KOhAWords=( mi ) ) ) ) ) ) ) ) ) ) ) ) ) bridiTail1=( bridiTail2=( bridiTail3=( selbri=( selbri1=( selbri2=( selbri3=( selbri4=( selbri5=( selbri6=( tanruUnit=( tanruUnit1=( tanruUnit2=( BRIVLA=( consonant=( c ) vowel=( u ) consonant=( s ) consonant=( k ) brivlaTail=( vowelNotY=( u ) spacing=( siClause=( zoClauseNoSIHandling=( ZO=( ZOWords=( zo ) ) anyWord=( CMAVONoAbsorb=( CMAVOLetters=( do ) ) ) ) SI=( SIWords=( si ) ) SI=( SIWords=( si ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) tailTerms=( terms=( terms1=( terms2=( term=( sumti=( sumti1=( sumti2=( sumti3=( sumti4=( sumti5=( sumti6=( KOhA=( KOhAWords=( de ) postCmavo=( spacingOpt=( spacing=( spaces=( spaceChars=( ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) )

There are options for showing morphological breakdown, things that act as spaces (i.e. the SI clause), and the PARSERParen things (which are of no use to anyone but me).

If anyone wants to contribute Java to turn this format into something more vertical/readable, that would be great.

-Robin

-- 
http://www.digitalkingdom.org/~rlpowell/ *** I'm a *male* Robin.
"Many philosophical problems are caused by such things as the simple
inability to shut up." -- David Stove, liberally paraphrased.
http://www.lojban.org/ *** loi pimlu na srana .i ti rokci morsi
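
As a rough illustration of the Java reformatter requested above, here is a minimal sketch. It assumes (from the sample output only, not from the parser's source) that the output is a stream of whitespace-separated tokens in which anything ending in "=(" opens a node and a lone ")" closes one. The class name ParseTreeIndenter and the method indent are made up for this example and are not part of the parser.

```java
final class ParseTreeIndenter {

    /**
     * Rewrites one-line parser output so that every token sits on its own
     * line, indented two spaces per nesting level.
     * Assumption: tokens are separated by whitespace, "name=(" opens a node
     * and ")" closes it, as in the sample output in this message.
     */
    static String indent(String flat) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        for (String token : flat.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // only happens for blank input
            }
            if (token.equals(")")) {
                // A closing paren lines up with the node it closes.
                depth = Math.max(0, depth - 1);
            }
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(token).append('\n');
            if (token.endsWith("=(")) {
                depth++; // children of this node go one level deeper
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A small fragment of the default output shown above.
        System.out.print(indent("sumti6=( KOhA=( mi ) )"));
        // prints:
        // sumti6=(
        //   KOhA=(
        //     mi
        //   )
        // )
    }
}
```

This puts every token on its own line, which is very vertical indeed; a friendlier version might keep a leaf such as KOhA=( mi ) on a single line, but that choice is left open here.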