[pylucene-dev] How to call finalize?

Andi Vajda vajda at osafoundation.org
Thu Jan 17 11:15:16 PST 2008


On Thu, 17 Jan 2008, anurag uniyal wrote:

> I have a custom analyzer which uses custom tokenizers in its tokenStream 
> method. (see attached code)

I couldn't find any attached code in your message. Maybe the list software 
is stripping attachments...

> Now tokenStream may be called several times over which i have no control.

This looks like the problem Brian is facing too.
I suggested he keep track of his BrianFilter instances and call finalize() 
on them after each call to indexWriter.addDocument(), by adding code for 
this purpose to his custom analyzer.

See http://lists.osafoundation.org/pipermail/pylucene-dev/2008-January/002232.html

> Otherwise I will wrap tokenStream method to keep track of 
> custometokenizers and finalize them once i get StopIteration.

It looks like another candidate for a decorator here.
   @finalizer
   def tokenStream(self):
       return stuff...

   and finalizer() would be defined to add the return value to some list that
   would then be iterated over with calls to finalize().

This is actually looking like the background thread idea I had suggested 
earlier in that I would add code to store all such extension instances in a 
list on the env object returned by initVM(). Then, the background thread 
would walk this list and finalize() anything that is only referenced by the 
list (in addition to the deadly embrace ref, of course).

It also looks like the FinalizerWrapper class suggested earlier will 
finalize() things that are in use by the Java VM but that are no longer 
referenced in Python. This would cause problems. finalize() should only be 
called once one is absolutely __sure__ that no one, neither the Python VM 
nor the Java VM, is using the objects in question.

The background thread idea I had suggested would, instead of finalize()'ing 
the objects once no other python refs are found, replace the global java ref 
part of the deadly embrace with a global java weak ref instead. This would
allow Java to retain the object until it's done with it itself. In other 
words, the actual finalization of the object would happen when Java 
eventually collects the object.
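
To illustrate the weak ref idea, here is a pure-Python analogy. It is not 
JCC code: Python's weakref module stands in for a Java weak global ref, and 
the list java_side stands in for whatever references the Java VM still 
holds. Extension, track_weakly and finalized are hypothetical names.

```python
import gc
import weakref


class Extension:
    """Stand-in for a Python extension object that Java may still use."""
    def __init__(self, name):
        self.name = name


finalized = []  # names of objects whose cleanup has run


def track_weakly(obj):
    """Keep only a weak reference; run cleanup once obj is collected."""
    name = obj.name  # copy the name so the callback does not keep obj alive
    return weakref.finalize(obj, finalized.append, name)


# The "Java" side holds the only strong reference to the object.
java_side = [Extension("tokenizer")]
handle = track_weakly(java_side[0])
still_alive_while_in_use = handle.alive

# "Java" is done with the object; cleanup runs when it is collected.
java_side.clear()
gc.collect()  # make collection explicit for the demonstration
```

The tracker never forces finalization. Cleanup happens only once the last 
real user lets go of the object.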

Andi..


