
Re: [lojban] CLL diffs



On Tue, Sep 07, 2010 at 09:08:43PM -0700, Robin Lee Powell wrote:
> On Tue, Sep 07, 2010 at 09:59:51PM -0600, Alan Post wrote:
> > My favorite change so far is the following:
> > 
> > [-forbidden.-] {+forbilien .+}
> > 
> > Someone changed forbidden to forbilien, twice no less.
> > 
> > My largest challenge in this project is that I did not get
> > consistent conversion of non-ASCII characters, so the wdiff
> > patch is very noisy--anytime a non-ASCII character, or an ASCII
> > character with a non-ASCII representation (e.g., single and double
> > quote) appears, it shows up as a diff.  I've managed to remove
> > certain classes of these, and am still finding patterns as I go.
> 
> There's a Unix command called "recode" which can almost certainly
> fix those problems, just so you know.
> 

This particular problem happened when I converted the .doc to .rtf.
I'm aware of iconv's support for squashing non-ASCII characters.  I
did not know about recode.

I'll review that part of my pipeline and see if I can catch the
program doing it.  It *looks* like Word just turned all the non-ASCII
characters into '?', but I could also be seeing an effect of not
having full UTF-8 support in my terminal/editor.
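
For what it's worth, here is a rough sketch of the kind of cleanup pass
that could run on both texts before wdiff, so that typographic quotes
and dashes stop showing up as differences.  This is not the pipeline I
actually used; Python, the function names, and the exact character
mapping are all assumptions made for illustration.

```python
import unicodedata

# Hypothetical mapping (an assumption, not taken from the real pipeline):
# typographic punctuation -> plain ASCII stand-ins.
PUNCT_MAP = str.maketrans({
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # ellipsis
    "\u00a0": " ",    # no-break space
})


def normalize_for_diff(text):
    """Fold typographic punctuation to ASCII so both inputs compare equal."""
    # NFC first, so composed and decomposed accents do not differ.
    return unicodedata.normalize("NFC", text).translate(PUNCT_MAP)


def remaining_non_ascii(text):
    """Return the sorted list of distinct non-ASCII characters still present."""
    return sorted({ch for ch in text if ord(ch) > 127})
```

Running remaining_non_ascii() on the normalized output lists whatever
the mapping missed, which is one way to keep finding patterns as you
go.  Note that it cannot recover characters that were already replaced
by '?' upstream; those have to be fixed at the conversion step.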

-Alan
-- 
ko djuno fi le do sevzi
