[elephant-devel] full text indexing

Gábor Melis mega at hotpop.com
Thu Mar 10 07:57:49 UTC 2005


On Thursday 10 March 2005 07:15, Ben wrote:
> i don't know how to do this sort of thing.  AFAICT the "update
> problem" is an issue for most indexing systems.  have you looked at
> lucene et al?

Lucene needs a delete and re-add for every update. Hopefully, I will do a bit
better than that. I've begun implementing a simple, index-based solution.
Elephant was modified to allow key-fn to return multiple secondary keys (any
objections to this? It was great to have the index maintenance on the Lisp
side). In the full text indexing case the secondary keys are of course words.
On update, all secondary-key/primary-key pairs that belong to the user being
updated are removed and then re-added (for now :-)).
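
To make the scheme concrete, here is a minimal in-memory sketch in Python. It
is not Elephant's code: MultiKeyIndex, words_key_fn, put and lookup are
hypothetical names made up for illustration, and plain dictionaries stand in
for the Berkeley DB tables. It only shows a key-fn that returns several
secondary keys, plus the "remove all pairs, then re-add" update.

```python
from collections import defaultdict


def words_key_fn(text):
    """Hypothetical key-fn: every distinct lowercase word is a secondary key."""
    return set(text.lower().split())


class MultiKeyIndex:
    """Illustrative stand-in for a primary table plus one secondary index."""

    def __init__(self, key_fn):
        self.key_fn = key_fn
        self.primary = {}                  # primary key -> stored value
        self.secondary = defaultdict(set)  # secondary key -> set of primary keys

    def _remove_pairs(self, pkey):
        # Drop every (secondary key, primary key) pair derived from the old value.
        for word in self.key_fn(self.primary[pkey]):
            pkeys = self.secondary[word]
            pkeys.discard(pkey)
            if not pkeys:
                del self.secondary[word]

    def put(self, pkey, value):
        # On update: remove all old pairs first, then re-add from the new value.
        if pkey in self.primary:
            self._remove_pairs(pkey)
        self.primary[pkey] = value
        for word in self.key_fn(value):
            self.secondary[word].add(pkey)

    def lookup(self, word):
        # Return the primary keys whose value contains the word.
        return sorted(self.secondary.get(word.lower(), ()))


idx = MultiKeyIndex(words_key_fn)
idx.put("user1", "likes lisp and databases")
idx.put("user2", "likes lisp")
idx.put("user1", "likes databases")  # update: the old pairs for user1 are gone
print(idx.lookup("lisp"))
```

After the update only user2 is still indexed under "lisp", while both users
remain under "likes". A smarter update would touch only the words that
actually changed, which is the part left open by "for now".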

This seems to work, but it is all rather preliminary. One open question is the
performance of db_del_kv in the presence of a lot of duplicates (i.e. a lot
of users with the same word). I asked in the berkeley-db newsgroup whether
this is a problem and whether DB_DUPSORT helps with the performance. No
answers so far.
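
To illustrate why sorted duplicates might matter, here is a small Python
sketch of the concern. It says nothing about how Berkeley DB actually stores
or searches duplicates (that is exactly the open question); it only contrasts
finding one primary key among the duplicates of a single word by linear scan
versus by binary search. Both function names are made up for illustration.

```python
import bisect


def delete_pair_unsorted(dups, pkey):
    """Find pkey by scanning the duplicate list front to back: O(n) comparisons."""
    for i, candidate in enumerate(dups):
        if candidate == pkey:
            del dups[i]
            return True
    return False


def delete_pair_sorted(dups, pkey):
    """Find pkey by binary search in a sorted duplicate list: O(log n) comparisons.

    Python's list deletion still shifts the tail, so only the search step
    gets cheaper in this toy model.
    """
    i = bisect.bisect_left(dups, pkey)
    if i < len(dups) and dups[i] == pkey:
        del dups[i]
        return True
    return False


# Duplicates for one common word: many users share it.
unsorted_dups = ["user%05d" % n for n in range(9999, -1, -1)]
sorted_dups = sorted(unsorted_dups)

print(delete_pair_unsorted(unsorted_dups, "user00000"))  # worst case: last entry
print(delete_pair_sorted(sorted_dups, "user00000"))
```

If deleting a key/value pair behaves like the first function, removing a
user's pairs for common words gets more expensive as more users share those
words; if it behaves like the second, the cost should stay manageable.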

Gábor


