[pylucene-dev] Locked index
Andi Vajda
vajda at osafoundation.org
Sun Feb 27 09:55:34 PST 2005
>> The Lucene locking code doesn't use the OS locking APIs offered by Java
>> 1.5, yet. Until then, that code is a little brittle. If you need more
>> reliability in this area, try using a database for your index such as the
>> DbDirectory implementation built around Berkeley DB. PyLucene supports it.
>
> How does performance compare?
Berkeley DB is pretty fast but I didn't do a comparison with FSDirectory.
Berkeley DB is not SQL-based. The DbDirectory uses two B-trees for all its
storage needs. I always use DbDirectory and it's fast enough.
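To give a rough idea of what "two B-trees" can mean for a file store, here is a minimal sketch. It is not the actual DbDirectory code: plain Python dicts stand in for the two Berkeley DB B-trees, and the class name, the key layout and the block size are all assumptions made for illustration. One tree maps a file name to its length, the other maps (file name, block number) to a fixed-size block of bytes.

```python
BLOCK_SIZE = 16 * 1024  # hypothetical block size, chosen only for this sketch


class TwoTreeStore:
    """Hypothetical file store backed by two key/value maps.

    Dicts stand in for B-trees here; a real store would keep both
    maps in a transactional database.
    """

    def __init__(self):
        self.files = {}   # file name -> length in bytes
        self.blocks = {}  # (file name, block number) -> bytes

    def _block_count(self, length):
        # Number of blocks needed to hold `length` bytes.
        return (length + BLOCK_SIZE - 1) // BLOCK_SIZE

    def delete_file(self, name):
        # Remove the metadata entry and every block belonging to the file.
        length = self.files.pop(name, None)
        if length is None:
            return
        for block_no in range(self._block_count(length)):
            del self.blocks[(name, block_no)]

    def write_file(self, name, data):
        # Replace any previous content, then split the data into blocks.
        self.delete_file(name)
        self.files[name] = len(data)
        for block_no, start in enumerate(range(0, len(data), BLOCK_SIZE)):
            self.blocks[(name, block_no)] = data[start:start + BLOCK_SIZE]

    def read_file(self, name):
        # Reassemble the file from its blocks, in block order.
        length = self.files[name]
        return b"".join(
            self.blocks[(name, block_no)]
            for block_no in range(self._block_count(length))
        )
```

With a layout like this, every index file lives inside the database, so writes can be covered by the database's transactions instead of relying on lock files in a directory.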
The overhead you're going to have to deal with comes from running a database.
There are going to be large files, transaction logs, backups, etc. to
manage. There are also a number of configuration options that affect
performance, which you'll need to tune within the constraints of your
application.
The advantages of a database such as Berkeley DB are well worth it, especially
for large indexes which take quite some time to rebuild in case of corruption.
> Will this work well for big indexes? I'm at 6 GB and it looks like it'll hit
> about 20 GB at the rate I'm going.
Berkeley DB claims to scale up to terabytes.
Andi..