[pylucene-dev] memory leak status

Andi Vajda vajda at osafoundation.org
Fri Jan 11 12:24:12 PST 2008


On Fri, 11 Jan 2008, anurag uniyal wrote:

> I am using the latest trunk code but still I am facing 
> java.lang.OutOfMemoryError.
> 
> It may be due to problem in my code, so I have created and attached a 
> sample script which shows the problem.

> In my script I am just adding a simple document in threads.
> Without threading it works and also if document's field is UN_TOKENIZED it 
> works but TOKENIZED fails...

Your code is attempting to create 500 threads in a VM limited to 5 MB.
I get a crash after 27 threads.

Since your code waits for each thread to complete before creating the next 
one, you could reuse a single thread for all of the work. Or, if you 
really want to create 500 threads one after the other, you could detach the 
current thread from the VM before creating the next one. Detaching allows 
the finished thread to be collected and cleaned up.
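
For the first option, here is a rough sketch of what reusing one thread 
could look like. It is not taken from your script: FakeWriter, 
index_with_one_thread and the attach/detach hooks are hypothetical names I 
made up so the example runs without a VM. With PyLucene you would 
presumably pass lucene.getVMEnv().attachCurrentThread and 
lucene.getVMEnv().detachCurrentThread as the hooks and your real writer in 
place of the stand-in.

```python
import queue
import threading


class FakeWriter:
    """Hypothetical stand-in for the index writer; it only records documents."""

    def __init__(self):
        self.docs = []

    def addDocument(self, doc):
        self.docs.append(doc)


def index_with_one_thread(writer, docs, attach=lambda: None, detach=lambda: None):
    """Add every document from a single worker thread.

    attach and detach are assumed hooks. With PyLucene they would be the
    attachCurrentThread and detachCurrentThread methods of lucene.getVMEnv().
    Returns the list of exceptions raised while adding documents.
    """
    errors = []
    pending = queue.Queue()
    for doc in docs:
        pending.put(doc)

    def worker():
        attach()  # attach once for the whole batch, not once per document
        try:
            while True:
                try:
                    doc = pending.get_nowait()
                except queue.Empty:
                    break  # nothing left to index
                try:
                    writer.addDocument(doc)
                except Exception as e:
                    errors.append(e)  # keep going, report failures at the end
        finally:
            detach()  # always detach, even if something unexpected happened

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return errors


writer = FakeWriter()
errors = index_with_one_thread(writer, ["doc %d" % i for i in range(500)])
print(len(writer.docs), len(errors))
```

The point of the design is that only one thread ever exists, so the number 
of documents no longer determines the number of threads the VM has to keep 
track of.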

If I modify your DocThread class as follows, your code no longer crashes or 
fails.

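import threading
import lucene

class DocThread(threading.Thread):
    def __init__(self, writer):
        threading.Thread.__init__(self)
        self.writer = writer
        self.error = None

    def run(self):
        try:
            # Attach this Python thread to the Java VM before calling Lucene.
            lucene.getVMEnv().attachCurrentThread()
            # MyDocument is the document class defined in your script.
            self.writer.addDocument(MyDocument())
        except Exception as e:
            # Keep the error so the caller can inspect it after join().
            self.error = e
        finally:
            # Drop the writer reference and detach from the VM so this
            # thread can be collected and cleaned up.
            self.writer = None
            lucene.getVMEnv().detachCurrentThread()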

Andi..

