Subject: Re: [lojban] Spaces in jbovlaste
To: lojban@googlegroups.com
From: Ilmen
Message-ID: <3c86d96b-e0ea-af6b-2ee8-51d4e0741fe5@gmail.com>
Date: Thu, 27 Jul 2017 13:04:33 +0200

If spell checkers are only concerned with identifying what is and isn't
a correct word, then you should disregard Jbovlaste entries containing
whitespace (they are multi-word lexemes) or, even better, check each of
the words that compose them to see whether any is missing from your
spell-check whitelist (I strongly suspect there exist bu and zei
compounds containing words that appear nowhere else in the dictionary…).

"re zei zgabube" is indeed a sequence of three words. It is in the
dictionary because it is an independent lexeme: you cannot accurately
derive its meaning from its parts. This occurs all the time in natlangs;
think, for example, of the English "take off".

As for cmavo sequences, people are allowed to chain them without
whitespace in between (this causes no ambiguity), although nowadays it
seems more common to always separate them with whitespace. For a
spell checker, two strategies are possible. The lazy one would be to
enforce the style of putting whitespace between every cmavo, thus
marking e.g. "lonu" as incorrect. The second, more involved strategy
would be to check any unknown letter string to see whether it matches a
sequence of cmavo, and allow it only if it does (e.g. the program hits
"calonu", finds that it can be the cmavo sequence ca+lo+nu, and only
then allows it).
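
The two checks described above (looking up each component of a multi-word entry, and testing whether an unknown string splits into known cmavo) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names and the tiny CMAVO and WHITELIST sets are hypothetical sample data rather than a real jbovlaste export, and the code does plain dictionary segmentation without implementing Lojban morphology rules.

```python
from typing import List, Optional, Set

# Hypothetical sample data; a real checker would load these from a dictionary export.
CMAVO: Set[str] = {"ca", "lo", "nu", "tai", "da'i", "zei", "bu", "re"}
WHITELIST: Set[str] = CMAVO | {"zgabube"}


def missing_components(entry: str, whitelist: Set[str]) -> List[str]:
    """Return the words of a multi-word entry that are absent from the whitelist."""
    return [word for word in entry.split() if word not in whitelist]


def split_cmavo(text: str, cmavo: Set[str]) -> Optional[List[str]]:
    """Return one way to split text into known cmavo, or None if there is none."""
    # best[i] holds a segmentation of text[:i], or None if text[:i] cannot be split.
    best: List[Optional[List[str]]] = [None] * (len(text) + 1)
    best[0] = []
    longest = max((len(c) for c in cmavo), default=0)
    for end in range(1, len(text) + 1):
        # Only look back as far as the longest known cmavo.
        for start in range(max(0, end - longest), end):
            prefix = best[start]
            if prefix is not None and text[start:end] in cmavo:
                best[end] = prefix + [text[start:end]]
                break
    return best[len(text)]


if __name__ == "__main__":
    print(split_cmavo("calonu", CMAVO))
    print(split_cmavo("taida'i", CMAVO))
    print(missing_components("re zei zgabube", {"re", "zei"}))
```

A checker built this way would accept an unknown string only when split_cmavo returns a non-empty list, and would report the words returned by missing_components as candidates to add to the whitelist.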
But I don't know whether the software you're using can perform that
check without an explicit and systematic list of all allowable cmavo
strings…

If the software needs an explicit and exhaustive list of allowed words,
I guess it wouldn't be very handy for highly synthetic languages (e.g.
Turkish, Quechua, Greenlandic…), which might have an infinite number of
valid words.

Ilmen

On 27/07/2017 10:49, sukender1@gmail.com wrote:
> coi ro do
>
> I found entries with spaces in jbovlaste. This is an issue for spell
> checking dictionaries (actually in "aspell"). I know that spaces are
> not relevant when parsing Lojban, but they're still important for
> human reading. This is why I would not like a rule like "import every
> entry and remove spaces everywhere"...
>
> So, I understand that it may be normal for compound cmavo, like "tai
> da'i", but can't these be written without space ("taida'i") without
> breaking the reading flow?
> However, some entries seem very strange to me, such as "re zei
> zgabube". Aren't these 3 separate words?
>
> Thank you for your explanations.
>
> co'o