[pylucene-dev] Question on design of a long-running BSD-based
reader/writer class
Peter Fein
pfein at pobox.com
Thu Jan 4 21:59:22 PST 2007
On Thursday 04 January 2007 14:13, Terry Jones wrote:
> I have a fairly simple question about trying to write a long-running
> BSD-based reader/writer class. I've written something that works, but it's
I haven't used the BSDDB[0] directory, but I had implemented something like
this using FSDirectory. Add one doc, do some searching, add a doc, rinse,
repeat. The performance was decidedly bad. The key lesson here: Lucene is
not a database.[1] Closing / opening writers & readers is not speedy & loses
the benefits of caching.
If you want to go the route you're going, I'd recommend setting up a single
class to manage the Directory, IndexSearcher & IndexWriter. Basically, the
idea is to provide access to the searcher & writer via properties. Accessing
one closes the other if it's open.
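A minimal sketch of that manager, under some assumptions: open_searcher and
open_writer are hypothetical zero-argument factories you'd supply yourself,
wrapping however you construct your IndexSearcher and IndexWriter over the
shared Directory. The only thing the class relies on is that the objects they
return have a close() method. The IndexManager name is also made up for
illustration.

```python
class IndexManager(object):
    """Owns at most one open searcher or writer; opening one closes the other.

    open_searcher / open_writer are caller-supplied factories (hypothetical
    names). Whatever they return must provide close().
    """

    def __init__(self, open_searcher, open_writer):
        self._open_searcher = open_searcher
        self._open_writer = open_writer
        self._searcher = None
        self._writer = None

    @property
    def searcher(self):
        # Close the writer first so pending adds are flushed before searching.
        if self._writer is not None:
            self._writer.close()
            self._writer = None
        if self._searcher is None:
            self._searcher = self._open_searcher()
        return self._searcher

    @property
    def writer(self):
        # A searcher opened before the next write would be stale, so drop it.
        if self._searcher is not None:
            self._searcher.close()
            self._searcher = None
        if self._writer is None:
            self._writer = self._open_writer()
        return self._writer

    def close(self):
        # Release whichever handle is currently open.
        for handle in (self._searcher, self._writer):
            if handle is not None:
                handle.close()
        self._searcher = None
        self._writer = None
```

Repeated access to the same property reuses the open object, so a run of adds
or a run of searches pays the open/close cost only once. Alternating between
the two on every document still pays it every time, which is the slow case
described above.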
Andi's suggestion of using a RAMDirectory & merging is probably best. If you need
immediate access to the just-added doc, you can do that with a MultiSearcher.
Though to be honest, I'd suggest thinking about other ways to structure your
application.[2]
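For the buffer-and-merge approach mentioned above, here is a toy model of the
pattern in plain Python. Nothing in it is PyLucene API: the BufferedIndex
class, the merge_every parameter and the two dicts are stand-ins I've made up,
with "main" playing the on-disk index and "buffer" the in-memory one. It only
shows the shape: adds go to the small buffer, searches consult both, and the
merge into the main index happens in batches.

```python
class BufferedIndex(object):
    """Toy model of a small in-memory index in front of a large main index.

    Both indexes are plain dicts of doc_id -> text (an assumption made for
    illustration; a real setup would use Lucene directories).
    """

    def __init__(self, merge_every=100):
        self.main = {}        # stand-in for the on-disk index
        self.buffer = {}      # stand-in for the in-memory index
        self.merge_every = merge_every
        self.merges = 0       # how many batch merges have run

    def add(self, doc_id, text):
        # Cheap path: only the small buffer is touched on each add.
        self.buffer[doc_id] = text
        if len(self.buffer) >= self.merge_every:
            self.merge()

    def merge(self):
        # Expensive path, done once per batch rather than once per document.
        self.main.update(self.buffer)
        self.buffer.clear()
        self.merges += 1

    def search(self, term):
        # Consult both indexes so a just-added document is visible right away.
        hits = []
        for index in (self.main, self.buffer):
            hits.extend(doc_id for doc_id, text in index.items()
                        if term in text.split())
        return sorted(hits)
```

For example, with merge_every=3, a document added first is searchable
immediately, and the merge only runs once the third document arrives.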
--Pete
[0] Has anyone else had difficulties w/ bsddb (outside PyLucene)? Despite
its excellent reputation, it's given us lots of trouble. We've been using
gdbm instead.
[1] It's a little like a database though.
[2] My advice on such matters is available at the low, low price of hundreds
of dollars per hour. Act now!
--
Peter Fein pfein at pobox.com
773-575-0694 Jabber: peter.fein at gmail.com
http://www.pobox.com/~pfein/ irc://irc.freenode.net/#chipy