[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

dictionary editing system (was: Lojban dictionary in TEI)



On Thu, 2 May 2002, Allan Bailey wrote:

> new db structure:
>
> word: [gismu/cmavo/fu'vla/...]
> english-gloss:
> esperanto-gloss:
> ...
> <rodbau>-gloss:
>
> place_structure:
> synonyms_related_words:
> rafsi:

I've attached a copy of the PostgreSQL tables I'd come up when working on
this.

A feature of particular note is that I separated out the entry for the
Lojban word (that information is contained in the words table), and the
content in the other language. Each row in the definitions table records
the language of the definition.

One also needs to record a keyword, or keywords for each place of brivla,
so that when you construct the <otherlanguage>->Lojban portion of the
dictionary, you can map words as appropriate.

Some of the design requirements I'd come up when working on jbovlaste:

* Users need to be presented with a semantic catagorization of the words,
whether or not the database stores it. (I was thinking a Wiki-esque
front end, which pulled definitions out of the database when indicated
by appropriate tags.)
* There definitately needs to be a way for users to associate
near-arbitrary HTML with everything, so that they can link to, if
nothing else, mailing list archives.
* Multilinguality of content. If you're going to implement this, you might
as well implement it so that it doesn't have to be redone for some other
language. (Supporting existing natlangs and close approximations thereof,
like Esperanto, is good enough.)
* A (threaded) comment system, so users could provide feedback on
definitions of words, and the suitability of words for mapping to
particular concepts.
* A voting system for words, with something mojo-esque to calculate whose
votes a worth more, etc.
* The database needs to be designed so that it is easy to not only search
for words in any language the database has content in, but also to dump
the contents of the database to a file easily. (For export to a PDF, and
thus, a printer. See http://miranda.org/~jkominek/dict_en.pdf for an
example of my prototype's output.)

I'm probably missing a lot.

- Jay Kominek <jay.kominek@colorado.edu>
Plus ça change, plus c'est la même chose
BEGIN;

CREATE TABLE words (
 wordId serial primary key,
 word text not null unique,
 type int2 not null,
 added int4 not null,
 image text not null,
 xrefs text not null,
 typespecific text not null,
 etymologicorigin text not null
);

CREATE TABLE authors (
 authorId serial primary key,
 name text not null,
 username text not null,
 email text not null,
 superuser bool not null
);

CREATE TABLE definitions (
 definitionId serial primary key,
 wordId int4 references words,
 language varchar(128) not null,
 authorId int4 references authors,
 entrycomment text not null,
 added int4 not null,

 definition text not null,
 explanation text not null,
 xrefs text not null
);

CREATE TABLE etymologies (
 wordId int4 references words,
 language varchar(128) not null,
 etymology text not null
);

CREATE TABLE votes (
 definitionId int4 references definitions,
 voteval int2 not null,
 authorId int4 references authors
);

CREATE TABLE keywords (
 definitionId int4 references definitions,
 place int2 not null,
 keyword text not null
);

CREATE TABLE lujvomap (
 wordId int4 references words,
 place int2 not null,
 component int4 references words,
 componentplace int2 not null
);

CREATE TABLE rafsi (
 rafsi char(4),
 wordId int4 references words,
 llgApproved bool not null default 'f'
);

INSERT INTO authors (name, username, email, superuser) VALUES
('auto import scripts', 'importer', 'none@quu.xx', 't');

-- Don't forget to COMMIT if it all went well,
-- or ROLLBACK if something failed.