[pylucene-dev] TermDocs.read() method
Martin Bachwerk
bachwerk at i5.informatik.rwth-aachen.de
Tue Sep 9 11:21:13 PDT 2008
Yeah, I tried with maxheap=128m.. it stabilized at around 230MB of RAM.. I
still need to check performance though..
The odd thing is just.. when I iterate with .next(), no memory gets eaten
up.. it stays at around 30-40MB.. but with .read() it grows to 800MB.. just
strange.
But since I don't know much Java and this is all not so critical, I'll
just leave it be for now.. Thanks for the help! :)
Martin
>
> On Tue, 9 Sep 2008, Martin Bachwerk wrote:
>
>> Hello again,
>>
>> the index is kinda large indeed.. even though I have Field.Store.NO
>> set for the actual content.. (ok the documents are 2-3k in size on
>> average, but it could be smaller still..)
>>
>> The memory use just keeps growing.. it doesn't reach a critical
>> level though, it just ate up 800MB of the 1024MB I have in some 15
>> mins.. after that it stayed stable. I guess this would be
>> acceptable.. but I don't quite understand why this happens..
>
> If it stabilized, it could just mean that this is the memory necessary
> for Java Lucene to work with your index. Have you tried reducing the
> max memory so that you use less but gc more often?
>
>> The array sizes pretty much depend on the term (i.e. word).. for
>> words like "is" they're around the size of the number of documents..
>> for rare words they can be just 1-3 entries long..
>>
>> I don't have Java code to test all this, sorry.
>
> It could be written :) It's pretty much a one-to-one mapping for the
> API calls. This is what I would do next to isolate this if I were to
> debug this further right now.
>
> Andi..
>
>
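For reference, the two access patterns compared in this thread can be sketched roughly as follows. This is only an illustration in plain Python, not PyLucene code: FakeTermDocs is a hypothetical stand-in that mimics the shape of the Java Lucene TermDocs interface (next(), doc(), freq(), read(docs, freqs)), and the chunk size and sample postings are made up. It assumes the buffers passed to read() can be a small fixed size that is reused, instead of arrays sized to the number of documents containing the term.

```python
# Pure-Python stand-in, NOT PyLucene. It only mimics the shape of the
# Java Lucene TermDocs interface: next(), doc(), freq(), read(docs, freqs).
class FakeTermDocs:
    """Hypothetical posting-list cursor over (doc, freq) pairs."""

    def __init__(self, postings):
        self._postings = list(postings)
        self._pos = -1

    def next(self):
        # Advance by one posting; returns False once the list is exhausted.
        self._pos += 1
        return self._pos < len(self._postings)

    def doc(self):
        return self._postings[self._pos][0]

    def freq(self):
        return self._postings[self._pos][1]

    def read(self, docs, freqs):
        # Fill the caller's buffers and return how many entries were
        # written; 0 means there is nothing left to read.
        n = 0
        while n < len(docs) and self.next():
            docs[n] = self.doc()
            freqs[n] = self.freq()
            n += 1
        return n


def total_freq_with_next(term_docs):
    # One posting at a time: no per-term arrays are allocated.
    total = 0
    while term_docs.next():
        total += term_docs.freq()
    return total


def total_freq_with_read(term_docs, chunk=1024):
    # Bulk reads into two fixed-size buffers that are reused for every
    # chunk, so the buffer size does not depend on how common the term is.
    docs = [0] * chunk
    freqs = [0] * chunk
    total = 0
    while True:
        n = term_docs.read(docs, freqs)
        if n == 0:
            break
        total += sum(freqs[:n])
    return total


# Made-up sample data: 5000 postings with frequencies cycling 1, 2, 3.
postings = [(d, 1 + d % 3) for d in range(5000)]
a = total_freq_with_next(FakeTermDocs(postings))
b = total_freq_with_read(FakeTermDocs(postings), chunk=256)
```

Both functions visit the same postings and produce the same total; the only difference is whether entries are fetched one by one or in reusable fixed-size chunks. Whether this changes the memory behaviour described above in a real PyLucene setup has not been verified here.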