From: Charles Duffy
Date: Sat, 26 May 2007 12:03:57 -0500
To: lojban-beginners@lojban.org
Subject: [lojban-beginners] Re: my first Lojban words 1.1

m.kornig@sondal.net wrote:
> Currently, I use a simple text editor (it's actually bloc-notes)
> to create my HTML files.
>
> Can I still use this if I go for UTF8 or UTF16? And will the
> Japanese characters be distinguishable in the source file?
>

Vim has had good UTF-8 support since version 6.0 (:set encoding=utf-8); Emacs has support from 21.3 (prefer-coding-system utf-8 for global defaults, or set-buffer-file-coding-system utf-8 for an individual file). Those are the only text editors that exist, right?

Now, mind you, if you had some program that claimed to be a text editor and which didn't have proper Unicode support, UTF-8 would look completely normal except for high characters (where it would depend on whether that program was UTF-8 aware -- which quite a lot are these days), whereas UTF-16 would look funky even for standard ASCII, since every ASCII character is stored as two bytes, one of them zero.

Generally, it makes sense to use UTF-8 if a document is mostly low ASCII characters, or UTF-16 if it's mostly characters outside low ASCII, such as Japanese, which take three bytes each in UTF-8 but two in UTF-16.

(bloc-notes is Notepad, right? I won't recognize its claim to text-editordom, but according to http://fr.wikipedia.org/wiki/Notepad, it *does* have both UTF-8 and UTF-16 support.)
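To make the size trade-off concrete, here is a rough Python sketch that compares how many bytes the same text takes in each encoding. The helper name encoded_sizes and both sample strings are made up for illustration; UTF-16 is encoded little-endian without a byte-order mark, so a real file saved by an editor may be two bytes longer.

```python
def encoded_sizes(text: str) -> dict:
    """Return the byte length of text in UTF-8 and in UTF-16 (little-endian, no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


# Hypothetical samples: one pure-ASCII string, one of five hiragana characters.
ascii_sample = "coi rodo"
japanese_sample = "こんにちは"

print(encoded_sizes(ascii_sample))     # ASCII: UTF-16 is twice the size of UTF-8
print(encoded_sizes(japanese_sample))  # kana: UTF-8 is 1.5 times the size of UTF-16

# Raw bytes, roughly what an editor with no Unicode support would be handed.
print(ascii_sample.encode("utf-8"))      # identical to plain ASCII
print(ascii_sample.encode("utf-16-le"))  # a zero byte after every ASCII character
```

For an HTML file the markup itself is all ASCII, so it is worth measuring the whole file rather than just the Japanese portions before picking an encoding.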