[pylucene-dev] memory leak status

Andi Vajda vajda at osafoundation.org
Fri Jan 11 09:30:05 PST 2008

On Fri, 11 Jan 2008, Pete wrote:

> Ok, so if I have some code that manages the lifespan of a Lucene builtin OR
> Python extension (we're all duck-typers here), how is it supposed to know
> whether to call finalize or not?
> Though, hmm, if I'm reading correctly, I could just do:
> try:
>     lucene_or_python_obj.finalize()
> except AttributeError:
>     pass  # must have been a pure lucene object
> I guess that's ok.  Though it'd be really nice if you found a solution to
> automagically manage things without the explicit finalize()...   weakrefs
> maybe?

Did you look at test/test_PythonDirectory.py?
In there you can see that the way it's done is by keeping track of the
Python extension instances being allocated. Java never allocates these, you
do, so you can keep track of these objects and finalize() them "at some
point" of your choice.
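
As a rough illustration of that bookkeeping (not the code in
test/test_PythonDirectory.py), here is a minimal sketch. ExtensionRegistry
and FakeExtension are hypothetical names made up for this example; the only
thing assumed about a real extension instance is that it has the finalize()
method discussed above.

```python
class ExtensionRegistry:
    """Hypothetical helper: remembers extension instances for later cleanup."""

    def __init__(self):
        self._instances = []

    def track(self, obj):
        # Record the instance and hand it back so creation can be wrapped inline.
        self._instances.append(obj)
        return obj

    def finalize_all(self):
        # Call finalize() on every tracked instance, then forget it.
        count = 0
        while self._instances:
            self._instances.pop().finalize()
            count += 1
        return count


class FakeExtension:
    """Stand-in for a Python extension of a Lucene class (assumption)."""

    def __init__(self):
        self.finalized = False

    def finalize(self):
        self.finalized = True


registry = ExtensionRegistry()
first = registry.track(FakeExtension())
second = registry.track(FakeExtension())

# "At some point" of your choice, e.g. after closing the index:
finalized_count = registry.finalize_all()
```

Since only code you write creates these instances, wrapping each creation
in track() is enough to know later which objects need finalize().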

Yes, of course, this is (way) less than ideal and needs a better solution.
But it's better than leaking in the meantime, until a better solution is 
found and implemented.

Currently, all I can think of is a thread that runs every few seconds and
looks at all such Python extension instances. When it finds one that is not
referred to by anything other than the Java wrapper itself (and a global
list of such extensions), it would replace the Java reference that prevents
the Java side from being GC'ed with a Java weak reference. Once Java's GC
made the same determination, the Java side would be collected, finalize()
would be invoked by the Java GC, and finally the Python side would be
decref'd down to 0 and freed. I don't like this daemon thread idea very
much, but if I (or this list) can't come up with a better idea, I might just
have to implement it.
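
To make the shape of that idea concrete, here is a sketch of the periodic
sweep only. Nothing like this exists in PyLucene; Sweeper is a hypothetical
name, and both the "is anyone else still referring to this?" check and the
"swap the strong Java reference for a weak one" step are left as
caller-supplied callables, because how to do either is exactly the open
question.

```python
import threading


class Sweeper:
    """Hypothetical periodic sweep over tracked extension instances."""

    def __init__(self, is_unreferenced, weaken, interval=5.0):
        self._is_unreferenced = is_unreferenced  # assumed check, caller-supplied
        self._weaken = weaken                    # assumed strong-to-weak swap
        self._interval = interval
        self._tracked = []
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = None

    def track(self, obj):
        with self._lock:
            self._tracked.append(obj)
        return obj

    def sweep_once(self):
        # Split tracked instances into those still in use and those to release.
        with self._lock:
            keep, release = [], []
            for obj in self._tracked:
                (release if self._is_unreferenced(obj) else keep).append(obj)
            self._tracked = keep
        for obj in release:
            self._weaken(obj)
        return len(release)

    def start(self):
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Event.wait() returns False on timeout, True once stop() was called.
        while not self._stop.wait(self._interval):
            self.sweep_once()

    def stop(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()


in_use = {"a"}   # stand-in for "still referred to by other code"
weakened = []    # records which instances were handed to weaken()
sweeper = Sweeper(is_unreferenced=lambda o: o not in in_use,
                  weaken=weakened.append)
sweeper.track("a")
sweeper.track("b")
swept = sweeper.sweep_once()
```

Calling sweeper.start() would run the same sweep every interval seconds in
a daemon thread until sweeper.stop() is called.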

