[pylucene-dev] lucene.JavaError: java.lang.OutOfMemoryError: Java heap space
Brian Merrell
brian at merrells.org
Tue Jan 8 16:48:33 PST 2008
I get an OutOfMemoryError: Java heap space after indexing fewer than 40,000
documents. Here are the details:
PyLucene-2.2.0-2 JCC
Ubuntu 7.10 64-bit running on a Core 2 Duo with 4GB of RAM
Python 2.5.1
I am starting Lucene with the following:
lucene.initVM(lucene.CLASSPATH, maxheap='2048m')
MergeFactor: I've tried everything from 10 to 10,000.
MaxMergeDocs and MaxBufferedDocs are at their defaults.
I believe the problem somehow stems from a filter I've written that turns
tokens into bigrams (each token returns two tokens, the original token and a
new token created from concatenating the text of the current and previous
token). These bigrams add a lot of unique tokens, but I didn't think that
would be a problem (aren't they all flushed out to disk?).
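
For reference, the bigram logic described above can be sketched in plain Python, without any PyLucene classes. This is only an illustration, not the actual filter: the function name bigram_tokens and the sep parameter are hypothetical, and it assumes the first token, which has no predecessor, is emitted on its own.

```python
def bigram_tokens(tokens, sep=""):
    """Yield each token, followed by a bigram built from the
    previous token and the current one (hypothetical helper)."""
    previous = None
    for token in tokens:
        # Always emit the original token.
        yield token
        # Emit the bigram once there is a previous token to join with.
        if previous is not None:
            yield previous + sep + token
        previous = token


if __name__ == "__main__":
    words = "the quick brown fox jumps over the lazy dog".split()
    expanded = list(bigram_tokens(words, sep="_"))
    print(expanded)
    # Compare the number of distinct terms with and without bigrams.
    print(len(set(words)), len(set(expanded)))
```

For n input tokens this yields 2n - 1 output tokens, and the set of distinct terms can grow much faster than with single words alone, since most word pairs occur only a few times.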
Any ideas or suggestions would be greatly appreciated.
-brian