[pylucene-dev] Question on design of a long-running BDB-based reader/writer class

Andi Vajda vajda at osafoundation.org
Fri Jan 5 10:35:09 PST 2007


On Fri, 5 Jan 2007, Terry Jones wrote:

> | While there is a certain overhead with transactions and opening and closing
> | an index for every addition, I did notice that there was a fair amount of
> | thrashing around in the Lucene directory I/O and got things to be
> | considerably faster by batching all updates and doing them in a
> | RAMDirectory before adding the RAMDirectory contents to the DBDirectory via
> | the addIndexes API.
>
> I've thought about this (after reading the suggestion in Lucene in Action).
> I considered having an open RAMDirectory that is always being written to
> and which is merged into a FSDirectory whenever a search takes place. That
> would be ok for some cases, but not in general. Also, buffering approaches
> using RAMDirectory seem not to support transactions - at least not at the
> level of single additions to the RAMDirectory. That's something of a
> problem for me, but adding some sort of transaction mechanism might work.

The transaction applies to the DBDirectory, not to the RAMDirectory. Use a
RAMDirectory to batch all changes for a given thread and, once done, merge
the RAMDirectory into the DBDirectory within a transaction.
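
As a rough illustration of that pattern only, and not PyLucene code: the
sketch below uses a plain per-thread Python list as a stand-in for the
RAMDirectory and an sqlite3 database as a stand-in for the transactional
DBDirectory. The class name BatchedWriter, the docs table, and the add/flush
methods are all hypothetical names made up for this example.

```python
import sqlite3
import threading


class BatchedWriter:
    """Buffer additions per thread in memory, then merge them in one transaction.

    Hypothetical stand-ins: the list plays the role of a RAMDirectory,
    the sqlite3 connection plays the role of a transactional DBDirectory.
    """

    def __init__(self, conn):
        self.conn = conn
        self.local = threading.local()  # one buffer per thread
        conn.execute("CREATE TABLE IF NOT EXISTS docs (body TEXT)")
        conn.commit()

    def add(self, body):
        # Cheap in-memory append; the persistent store is not touched yet.
        if not hasattr(self.local, "buffer"):
            self.local.buffer = []
        self.local.buffer.append(body)

    def flush(self):
        # Merge the calling thread's batch inside a transaction:
        # either every buffered document is stored, or none is.
        batch = getattr(self.local, "buffer", [])
        with self.conn:  # commits on success, rolls back on exception
            self.conn.executemany(
                "INSERT INTO docs (body) VALUES (?)", [(b,) for b in batch]
            )
        self.local.buffer = []
        return len(batch)


conn = sqlite3.connect(":memory:")
writer = BatchedWriter(conn)
writer.add("first document")
writer.add("second document")

# Nothing is visible in the store until the batch is merged.
pending = conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0]
merged = writer.flush()
stored = conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0]
```

The individual add() calls are not transactional, which matches the concern
raised above; only the merge step is. If the merge fails, the batch stays in
the buffer and can be retried.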

Andi..

