office-gobmx/lingucomponent/source/thesaurus/mythes/README
Jens-Heiner Rechtien 3900aa902b INTEGRATION: CWS ooo20031216 (1.1.2); FILE ADDED
2003/12/10 14:26:51 khendricks 1.1.2.1: Issue number:  None
Submitted by:  Kevin B. Hendricks
Reviewed by:   project owner (me)

adding in the new thesaurus implementation that removes all of the
old hardcoded limits and problems with the old thesaurus

Kevin
2004-02-04 12:06:56 +00:00

MyThes is a simple thesaurus that uses a structured
text data file and an index file with binary search
to look up words and phrases and return information
on part of speech, meanings, and synonyms.

MyThes was written to provide a thesaurus for the
OpenOffice.org project.
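
The index lookup described above can be sketched roughly as follows.
This is only an illustration, not the MyThes source: the IndexEntry
type and the find_offset function are hypothetical names, and the
index is assumed to be already loaded into memory and sorted by word.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical in-memory index entry (not a MyThes type): a word and
// the byte offset of its record in the thesaurus data file.
struct IndexEntry {
    std::string word;
    long offset;
};

// Binary search a word-sorted index. Returns the record offset,
// or -1 if the word is not present.
long find_offset(const std::vector<IndexEntry>& index, const std::string& word) {
    // Binary search is only valid on sorted input (checked in debug builds).
    assert(std::is_sorted(index.begin(), index.end(),
                          [](const IndexEntry& a, const IndexEntry& b) {
                              return a.word < b.word;
                          }));
    auto it = std::lower_bound(
        index.begin(), index.end(), word,
        [](const IndexEntry& e, const std::string& w) { return e.word < w; });
    if (it != index.end() && it->word == word) {
        return it->offset;
    }
    return -1;
}
```

With an offset in hand, a caller would seek to that position in the
data file and read the record for the word from there.
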
The main features of MyThes are:

1. It is written in C++ to make it easier to interface with
   Pspell, OpenOffice, AbiWord, etc.
2. It is stateless, uses no static variables, and
   should be completely reentrant with no ifdefs.
3. It compiles with -ansi, -pedantic, and -Wall
   with no warnings, so it should be quite portable.
4. It uses a Perl program to read the structured
   text file and create the index needed for binary
   searching (see dictionaries/en_US/th_gen_idx.pl).
5. It is very simple, with *lots* of comments.
   The main "smarts" are in the structure of the
   text file that makes up the thesaurus data.
6. It comes with a ready-to-go structured thesaurus
   data file for en_US extracted from the WordNet-2.0 data
   (see dictionaries/en_US/th_en_US_new.dat).
   Please see WordNet_license.txt and WordNet_readme.txt
   (found in dictionaries/en_US/) for more information
   on this very useful project!
7. The source code has a BSD license (with no advertising clause).
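
To illustrate feature 4, here is a rough C++ sketch of what an index
generator has to do: scan the structured text once, note the byte
offset at which each entry starts, and sort the entries by word so
they can be binary searched. It is not a port of th_gen_idx.pl. The
record layout shown in the comments is an assumption made for this
sketch (this README does not specify it), and build_index is a
hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Assumed, simplified record layout (not taken from this README):
//   word|N           header line, N = number of meaning lines
//   (pos)|syn|syn    one line per meaning, N lines in total
using Index = std::vector<std::pair<std::string, std::size_t>>;

// Scan the data text and record the byte offset of each header line,
// then sort the entries by word.
Index build_index(const std::string& data) {
    Index index;
    std::size_t pos = 0;
    while (pos < data.size()) {
        std::size_t eol = data.find('\n', pos);
        if (eol == std::string::npos) eol = data.size();
        const std::string header = data.substr(pos, eol - pos);
        const std::size_t bar = header.find('|');
        if (bar == std::string::npos) break;  // malformed header: stop scanning
        // std::stoi throws if the count is not numeric.
        const int meanings = std::stoi(header.substr(bar + 1));
        index.emplace_back(header.substr(0, bar), pos);
        // Skip the header line, then the meaning lines that follow it.
        pos = eol + 1;
        for (int i = 0; i < meanings && pos < data.size(); ++i) {
            const std::size_t next = data.find('\n', pos);
            pos = (next == std::string::npos) ? data.size() : next + 1;
        }
    }
    std::sort(index.begin(), index.end());  // orders by word, then offset
    return index;
}
```

Keeping only (word, offset) pairs in the index means the large data
file never has to be loaded or scanned at lookup time.
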
MyThes has the world's simplest Makefile and no
configure support. It does come with a simple example
program that looks up some words and returns meanings
and synonyms.
To build it, simply do the following:

    unzip mythes.zip
    cd mythes
    make

To run the example program:

    ./example th_en_US_new.idx th_en_US_new.dat checkme.lst
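
For readers curious how a meaning and its synonyms might be pulled
out of one line of the data file, here is a small sketch. It assumes
a "(pos)|synonym|synonym" line layout, which this README does not
spell out, and the Meaning type and parse_meaning function are
hypothetical names, not part of the MyThes API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical holder for one meaning (not a MyThes type).
struct Meaning {
    std::string part_of_speech;         // e.g. "(noun)"
    std::vector<std::string> synonyms;  // remaining '|'-separated fields
};

// Split an assumed "(pos)|syn1|syn2|..." line on '|'.
// The first field is the part of speech; empty synonym fields are skipped.
Meaning parse_meaning(const std::string& line) {
    Meaning m;
    std::size_t start = 0;
    bool first = true;
    while (start <= line.size()) {
        std::size_t bar = line.find('|', start);
        if (bar == std::string::npos) bar = line.size();
        const std::string field = line.substr(start, bar - start);
        if (first) {
            m.part_of_speech = field;
            first = false;
        } else if (!field.empty()) {
            m.synonyms.push_back(field);
        }
        start = bar + 1;
    }
    return m;
}
```
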
Please play around with it and let me know
what you think.
Thanks,
Kevin Hendricks
kevin.hendricks@sympatico.ca