[pylucene-dev] TermDocs.read() method

Andi Vajda vajda at osafoundation.org
Tue Sep 9 11:59:55 PDT 2008


On Sep 9, 2008, at 11:21, Martin Bachwerk <bachwerk at i5.informatik.rwth-aachen.de> wrote:

> Yea, tried with maxheap=128m.. it stabilized at around 230MB RAM..  
> need to check performance though..
> The question is just.. when I iterate with .next() no memory is  
> eaten up.. it lives on around 30-40MB.. and like this it grows to  
> 800.. just strange.

For questions about the Lucene APIs themselves, you'd be better off  
asking java-user at lucene.apache.org, as there is more expertise  
hanging out there.

> But since I don't know much Java and this is all not so critical,  
> I'll just leave it be for now.. Thanks for the help! :)

Great !

Andi..

>
>
> Martin
>>
>> On Tue, 9 Sep 2008, Martin Bachwerk wrote:
>>
>>> Hello again,
>>>
>>> the index is kinda large indeed.. even though I have  
>>> Field.Store.NO set for the actual content.. (ok the documents are  
>>> 2-3k large on average, but it could be smaller still..)
>>>
>>> The memory use is just growing and growing.. though it doesn't go  
>>> into a critical area, it just ate up 800megs out of 1024 I have in  
>>> some 15 mins.. after that it stayed stable. I guess this would be  
>>> acceptable.. but I don't quite understand why it is the case..
>>
>> If it stabilized, it could just mean that this is the memory  
>> necessary for Java Lucene to work with your index. Have you tried  
>> reducing the max memory so that you use less but gc more often ?
>>
>>> The arrays are pretty much dependent on the term (i.e. word).. for  
>>> words like "is" they're around the size of the number of  
>>> documents.. for rare words they can be 1-2-3.. entries long..
>>>
>>> I don't have Java code to test all this sorry.
>>
>> It could be written :) It's pretty much a one-to-one mapping for  
>> the API calls. This is what I would do next to isolate this if I  
>> were to debug this further right now.
>>
>> Andi..
>>
>>
>
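
For reference, the two iteration styles compared in this thread can be
sketched as below. This is only an illustration under assumptions, not
Martin's actual code: FakeTermDocs is a hypothetical in-memory stand-in
so the sketch runs without an index, and the helper names are made up.
The method names next(), doc(), freq() and read(docs, freqs) follow the
Java TermDocs interface, where read() fills the two arrays and returns
the number of entries written (0 once the enumeration is exhausted).
With PyLucene the two buffers would presumably be int arrays such as
JArray('int')(n) rather than Python lists.

```python
class FakeTermDocs:
    """Hypothetical stand-in for a TermDocs enumerator (not part of PyLucene)."""

    def __init__(self, postings):
        self._postings = list(postings)  # [(doc_id, freq), ...]
        self._pos = -1                   # index of the current posting

    def next(self):
        # Advance to the next posting; False once the list is exhausted.
        self._pos += 1
        return self._pos < len(self._postings)

    def doc(self):
        return self._postings[self._pos][0]

    def freq(self):
        return self._postings[self._pos][1]

    def read(self, docs, freqs):
        # Fill the caller's buffers in place; return how many entries were written.
        n = 0
        limit = min(len(docs), len(freqs))
        while n < limit and self.next():
            docs[n] = self.doc()
            freqs[n] = self.freq()
            n += 1
        return n


def postings_with_next(term_docs):
    """Collect (doc, freq) pairs one posting at a time."""
    result = []
    while term_docs.next():
        result.append((term_docs.doc(), term_docs.freq()))
    return result


def postings_with_read(term_docs, docs, freqs):
    """Collect (doc, freq) pairs in batches, reusing the same two buffers."""
    result = []
    while True:
        n = term_docs.read(docs, freqs)
        if n == 0:  # 0 means the enumeration is exhausted
            break
        for i in range(n):
            result.append((docs[i], freqs[i]))
    return result


if __name__ == "__main__":
    sample = [(0, 2), (3, 1), (7, 4)]
    print(postings_with_next(FakeTermDocs(sample)))
    # Fixed-size buffers of 2 entries, reused for every batch.
    print(postings_with_read(FakeTermDocs(sample), [0, 0], [0, 0]))
```

Reusing one small pair of buffers across calls, rather than allocating
arrays sized to each term, keeps the batch variant's allocations
bounded. Whether that explains the growth described above is untested.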

