Re: [lojban] CLL diffs
On Tue, Sep 07, 2010 at 09:08:43PM -0700, Robin Lee Powell wrote:
> On Tue, Sep 07, 2010 at 09:59:51PM -0600, Alan Post wrote:
> > My favorite change so far is the following:
> >
> > [-forbidden.-] {+forbilien .+}
> >
> > Someone changed forbidden to forbilien, twice no less.
> >
> > My largest challenge in this project is the fact that I did not
> > get consistent conversion of non-ASCII characters, so the wdiff
> > patch is very noisy--anytime a non-ASCII character, or an ASCII
> > character with a non-ASCII representation (e.g., single and double
> > quote) appears, it shows up as a diff. I've managed to remove
> > certain classes of these, and am still finding patterns as I go.
>
> There's a unix command called "recode" which can almost certainly
> fix those problems, just so you know.
>
This particular problem happened when I converted the .doc to .rtf.
I'm aware of iconv's support for squashing non-ASCII characters, but
I did not know about recode.
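
For the quote-style noise in particular, a small normalization pass over
both texts before running wdiff may be enough. The following is a minimal
sketch, not what iconv or recode do internally: the helper name
asciify_punctuation and the contents of PUNCT_MAP are assumptions chosen
for illustration, and the mapping is certainly not complete.

```python
# Hypothetical helper: fold common typographic punctuation to ASCII so that
# wdiff does not flag it. The mapping is an assumption, not a full list.
PUNCT_MAP = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # no-break space
}


def asciify_punctuation(text: str) -> str:
    """Replace the characters in PUNCT_MAP; leave everything else alone."""
    return text.translate(str.maketrans(PUNCT_MAP))
```

Running both sides of the comparison through the same function keeps the
diff symmetric; characters outside the map (accented letters, for example)
pass through unchanged and would still show up if the two texts disagree.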
I'll review that part of my pipeline and see if I can find which
program is doing it. It *looks* like Word just turned all the non-ASCII
characters into '?', but I could also be seeing an effect of not having
full Unicode support in my terminal/editor.
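
One way to tell those two cases apart is to look at the raw bytes of the
converted text rather than at what the terminal renders. This is a rough
sketch under stated assumptions: count_suspects is a hypothetical helper,
and it assumes the text was meant to be UTF-8.

```python
def count_suspects(data: bytes) -> dict:
    """Count literal '?' bytes and non-ASCII bytes, and test UTF-8 validity."""
    try:
        data.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {
        "question_marks": data.count(b"?"),
        "non_ascii_bytes": sum(1 for b in data if b >= 0x80),
        "valid_utf8": valid_utf8,
    }
```

It could be called as count_suspects(open("cll.txt", "rb").read()), where
cll.txt is a placeholder file name. If the file contains no non-ASCII bytes
and has '?' where quotes should be, the conversion replaced the characters;
if non-ASCII bytes are present, the file is probably intact and the display
is at fault.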
-Alan
--
ko djuno fi le do sevzi