[pylucene-dev] Help understanding "performance" issues.
vajda at osafoundation.org
Wed Feb 21 10:56:13 PST 2007
On Wed, 21 Feb 2007, Rune Hansen wrote:
> I've set up a MultiSearcher inside a patched CherryPy 3.0 server (patched
> with PythonThreads).
> Using a Queue, I've created searchables (a MultiSearcher spanning 10 indexes
> with approximately 900,000 documents combined) which are available through
> CherryPy's .thread_data facility for the server's 10 threads.
> When timing a search of medium complexity, one searchable returns after ~0.3
> The optimum seems to be two searchables. Increasing the number of searchables
> to three or more does not produce higher throughput; it actually slows all
> the requests down. If I reduce the number of searchables to one, it produces
> half the throughput of two searchables.
> For example:
> ab -n100 -c8 on one searchable available to 10 threads : Requests per second:
> 1.66 [#/sec] (mean)
> ab -n100 -c8 on two searchables available to 10 threads : Requests per
> second: 3.05 [#/sec] (mean)
> ab -n100 -c8 on three searchables available to 10 threads : Requests per
> second: 2.98 [#/sec] (mean)
> ab -n100 -c8 on four searchables available to 10 threads : Requests per
> second: 2.95 [#/sec] (mean)
> (average of 5 runs on each)
> I have a hard time understanding this behavior. Is it because of how Lucene
> accesses an IndexReader? Is it because of hardware limitations? Can it be
> programmed "smarter" at my end?
I'm not sure. There have been many threads about this on
java-dev at lucene.apache.org. A bunch of work was done in the area of locks and
indexes in Lucene 2.1, so I'd try to upgrade to PyLucene 2.1 as well.
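
For anyone trying to reproduce the setup, the arrangement described above (a fixed number of searchables handed out to 10 server threads through a Queue) might look roughly like the sketch below. This is not Rune's code and it does not call PyLucene at all: FakeSearchable, SearchablePool and run_requests are hypothetical names made up for illustration, and it assumes Python 3's queue module (in Python 2 the module is named Queue). A real version would put MultiSearcher instances in the pool instead of the stand-in class.

```python
import queue
import threading
from contextlib import contextmanager


class FakeSearchable:
    """Hypothetical stand-in for a MultiSearcher; not a PyLucene class."""

    def __init__(self, name):
        self.name = name

    def search(self, query):
        # A real searcher would run the query against its indexes here.
        return "%s:%s" % (self.name, query)


class SearchablePool:
    """Fixed-size pool; acquire() blocks until a searchable is free."""

    def __init__(self, searchables):
        self._free = queue.Queue()
        for searchable in searchables:
            self._free.put(searchable)

    @contextmanager
    def acquire(self):
        searchable = self._free.get()  # blocks while all are checked out
        try:
            yield searchable
        finally:
            self._free.put(searchable)  # always hand it back to the pool


def run_requests(pool_size, n_threads=10, requests_per_thread=5):
    """Run searches from n_threads workers sharing pool_size searchables."""
    pool = SearchablePool(
        [FakeSearchable("s%d" % i) for i in range(pool_size)]
    )
    results = []
    results_lock = threading.Lock()

    def worker(thread_id):
        for request_no in range(requests_per_thread):
            with pool.acquire() as searchable:
                hit = searchable.search("q%d-%d" % (thread_id, request_no))
            with results_lock:
                results.append(hit)

    threads = [
        threading.Thread(target=worker, args=(thread_id,))
        for thread_id in range(n_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling run_requests(1), run_requests(2) and so on under a timer gives a way to compare pool sizes in isolation from the web server, although with the stand-in class the timings say nothing about Lucene itself.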