[pylucene-dev] TermDocs.read() method

Martin Bachwerk bachwerk at i5.informatik.rwth-aachen.de
Tue Sep 9 11:21:13 PDT 2008


Yeah, I tried with maxheap=128m.. it stabilized at around 230MB RAM.. I 
still need to check performance though..
The question is just.. when I iterate with .next() no memory is eaten 
up.. it stays at around 30-40MB.. but with .read() it grows to 800MB.. 
just strange.
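
For anyone who wants to compare the two access patterns side by side, 
here is a minimal sketch. FakeTermDocs is a hypothetical in-memory 
stand-in, loosely modeled on the Java Lucene TermDocs interface 
(next()/doc()/freq() and read(docs, freqs), where read() returns the 
number of entries it filled and 0 once exhausted). It does not touch a 
real index or the JVM, so it says nothing about where the memory 
actually goes; it only shows that both loops walk the same postings.

```python
class FakeTermDocs:
    """Hypothetical stand-in for a postings enumerator (not the real Lucene class)."""

    def __init__(self, postings):
        self._postings = list(postings)  # [(doc_id, freq), ...]
        self._pos = -1

    def next(self):
        # Advance by one posting; False once the list is exhausted.
        self._pos += 1
        return self._pos < len(self._postings)

    def doc(self):
        return self._postings[self._pos][0]

    def freq(self):
        return self._postings[self._pos][1]

    def read(self, docs, freqs):
        # Fill the caller's buffers in place; return how many entries were written.
        n = 0
        while n < len(docs) and self._pos + 1 < len(self._postings):
            self._pos += 1
            docs[n], freqs[n] = self._postings[self._pos]
            n += 1
        return n


def iterate_with_next(term_docs):
    # One posting per call, no intermediate buffers.
    out = []
    while term_docs.next():
        out.append((term_docs.doc(), term_docs.freq()))
    return out


def iterate_with_read(term_docs, batch_size=2):
    # The same two buffers are reused for every batch.
    docs = [0] * batch_size
    freqs = [0] * batch_size
    out = []
    while True:
        n = term_docs.read(docs, freqs)
        if n == 0:
            break
        out.extend(zip(docs[:n], freqs[:n]))
    return out


# Made-up postings for illustration only.
POSTINGS = [(0, 3), (4, 1), (7, 2), (9, 5), (12, 1)]
```

If the real read() loop allocates fresh arrays per call instead of 
reusing them, that would be one place to look, but that is only a guess.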

But since I don't know much Java and this is all not so critical, I'll 
just leave it be for now.. Thanks for the help! :)

Martin
>
> On Tue, 9 Sep 2008, Martin Bachwerk wrote:
>
>> Hello again,
>>
>> the index is kinda large indeed.. even though I have Field.Store.NO 
>> set for the actual content.. (ok, the documents are 2-3k in size on 
>> average, but it could be smaller still..)
>>
>> The memory use is just growing and growing.. though it doesn't go into 
>> a critical area, it just ate up 800 megs out of the 1024 I have in some 
>> 15 mins.. after that it stayed stable. I guess this would be 
>> acceptable.. but I don't quite understand why it is the case..
>
> If it stabilized, it could just mean that this is the memory necessary 
> for Java Lucene to work with your index. Have you tried reducing the 
> max memory so that you use less but gc more often?
>
>> The arrays are pretty much dependent on the term (i.e. word).. for 
>> words like "is" they're around the size of the number of documents.. 
>> for rare words they can be 1-2-3.. entries long..
>>
>> I don't have Java code to test all this, sorry.
>
> It could be written :) It's pretty much a one-to-one mapping for the 
> API calls. This is what I would do next to isolate this if I were to 
> debug this further right now.
>
> Andi..
>
>


