[pylucene-dev] Re: lucene.JavaError: java.lang.OutOfMemoryError: Java heap space

Andi Vajda vajda at osafoundation.org
Tue Jan 8 18:57:25 PST 2008


On Tue, 8 Jan 2008, Brian Merrell wrote:

> Thanks for the quick reply.  I haven't used Java in years so my apologies if
> I am not able to provide useful debug info without some guidance.
>
> Memory does seem to be running low when it crashes.  According to top,
> python is using almost all of the 4GB when it bails.

That may be misleading because top attributes all of that memory to the 
Python process, Java's included, since Java is loaded into the Python 
process via shared libraries.

> I don't know what Java VM I am using.  How do I determine this?

At the shell prompt enter: java -version
For example, on my Mac, I get:

   java version "1.5.0_13"
   Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_13-b05-237)
   Java HotSpot(TM) Client VM (build 1.5.0_13-119, mixed mode, sharing)

Also, what does 'which java' return?

> I will try running it calling gc.collect() and running optimize and see if
> that helps.  Any suggestions on how to debug _dumpRefs?

_dumpRefs() returns a dict with Java objects as keys and their ref counts as 
values. If this dict is unusually large, something's amiss. What is 
"unusually"? Time will tell :)
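
As a rough way to look at that dict, here is a minimal sketch. It only 
assumes a mapping of objects to ref counts, as described above; the helper 
name summarize_refs is made up for illustration and is not part of PyLucene, 
and how you obtain the dict depends on your PyLucene version.

```python
from collections import Counter


def summarize_refs(refs, top=5):
    """Summarize a dict mapping objects to reference counts.

    Returns the number of distinct objects and the `top` type names
    ranked by their summed reference counts.
    """
    by_type = Counter()
    for obj, count in refs.items():
        # Group by the Python-side type name of each key.
        by_type[type(obj).__name__] += count
    return len(refs), by_type.most_common(top)
```

Calling this every few thousand documents on the dict returned by 
_dumpRefs() and watching whether the total keeps climbing should give a 
first idea of what "unusually large" means for your data.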

> P.S.  My filter is implemented in Python.  In fact here is the code:

Another thing to try (proceeding by elimination) is to index your documents 
without your custom filter. Does it still run out of memory? If the answer 
is no, then the Python filter integration code (that is, the generated C++ 
for that code) needs to be looked at closely. Maybe something's leaking 
there.
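
The elimination test could be sketched roughly as below. Everything here is 
hypothetical scaffolding: index_batch stands for whatever function indexes 
one batch of your documents (once with the filter, once without), and sample 
stands for whatever you measure after each batch, for example the size of 
the dict returned by _dumpRefs().

```python
import gc


def sample_growth(index_batch, sample, batches):
    """Run index_batch(i) for each batch and record sample() after each."""
    sizes = []
    for i in range(batches):
        index_batch(i)
        # Collect Python garbage first so only live references are counted.
        gc.collect()
        sizes.append(sample())
    return sizes


def keeps_growing(sizes):
    """True if every sample is strictly larger than the one before it."""
    return len(sizes) > 1 and all(b > a for a, b in zip(sizes, sizes[1:]))
```

If the samples keep growing with the filter and level off without it, that 
points at the filter integration rather than at the indexing itself.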

Andi..

