[pylucene-dev] pylucene and 2gb limit of files

Andi Vajda vajda at osafoundation.org
Wed Aug 23 15:48:18 PDT 2006


On Wed, 23 Aug 2006, Julien Anguenot wrote:

> Andi Vajda wrote:
>>
>> On Tue, 1 Aug 2006, Yura Smolsky wrote:
>>
>>> I was only able to get the PyLucene that I used (2.0.0-1) to compile
>>> with gcc 3.4.6. I tried numerous times to get it to compile with gcc
>>> 4.x.x, with no luck.
>>>
>>> It seems that there is a 2gb file size limit issue with gcj in all gcc
>>> 3.x.x versions, and that it was not fixed until gcc 4.x.
>>>
>>> here is a reference
>>> http://lists.osafoundation.org/pipermail/pylucene-dev/2006-March/000933.html
>>>
>>>
>>> My point is, I don't see how to get PyLucene 2.0.0-1 working with a 4.x
>>> version of gcc on a Debian or Fedora box. I attempted to get this to
>>> compile for two weeks at the beginning of this month and had no luck.
>>
>> Yes, this is a known problem.
>>
>> I've been able to use gcj 4.1.0 on gentoo linux with a patch. I suspect
>> it would work just as well on any Linux such as Debian or Ubuntu.
>>
>> I applied the patch as described in this message:
>>     http://gcc.gnu.org/ml/java/2006-03/msg00190.html
>>
>> It seems, though, that this patch was superseded by the patch in bug 13212:
>>     http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13212
>> but I haven't tried it yet.
>>
>>
>> If this doesn't solve the problem with gcj 4.x on non-Red Hat Linux, you
>> then have two options:
>>
>>   1. dig deeper into finding why gcj 4.x doesn't work on non Red Hat Linux
>>      and get help from java at gcc.gnu.org (this is how I got the first patch
>>      mentioned above)
>>
>> or
>>
>>   2. implement an FSDirectory in python. For an example, see the
>>      Test_PythonDirectory.py unit tests. I don't expect python to have the
>>      same 2gb file size limit.
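
A quick way to check the assumption behind option 2 is to write a single byte
just past the 2gb offset from plain Python and read it back. This is only a
rough sketch: it exercises Python's own file I/O, uses no PyLucene or gcj code,
and the names can_write_past and TWO_GB are made up for illustration. It also
assumes the filesystem supports sparse files, otherwise the test file really
takes up 2gb for a moment.

```python
import os
import tempfile

TWO_GB = 2 ** 31  # first offset that no longer fits in a signed 32-bit int


def can_write_past(offset, directory=None):
    """Return True if one byte can be written at `offset` and read back.

    `directory` is where the temporary file is created; pass the directory
    that would hold the index to test that particular filesystem.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "r+b") as f:
            f.seek(offset)           # seek beyond the current end of file
            f.write(b"x")            # file now has size offset + 1
            f.flush()
            size = os.fstat(f.fileno()).st_size
            f.seek(offset)
            return size == offset + 1 and f.read(1) == b"x"
    except (OSError, OverflowError):
        # Raised when the platform or filesystem rejects the offset.
        return False
    finally:
        os.remove(path)
```

Calling can_write_past(TWO_GB) should return True if Python and the filesystem
both handle large files. It says nothing about whether a Directory
implemented in Python would be fast enough for a real index.
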
>
> Hi Andi,
>
> I just tried the patch you mentioned with gcc-4.1.0 and gcc-4.1.1

Which one? The one in the mail message
   (http://gcc.gnu.org/ml/java/2006-03/msg00190.html)
or the one in bug 13212
   (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13212)?

I have only tried the one in the mail message.
I should try the one in bug 13212.

> The patch applies cleanly to the gcc source from the release tarballs
> for both gcc versions, but when launching the PyLucene tests with both
> 1.9.1 and 2.0.0-1, with both versions of gcc above, I got the
> following warnings:
>
> GC Warning: Repeated allocation of very large block (appr. size 512000):
>        May lead to memory leak and poor performance.
> GC Warning: Repeated allocation of very large block (appr. size 512000):
>        May lead to memory leak and poor performance.
> GC Warning: Repeated allocation of very large block (appr. size 512000):
>        May lead to memory leak and poor performance.

I would try to put a breakpoint at the location of this warning to see who is 
allocating all this memory and what for. Then I'd contact java at gcc.gnu.org 
with a question about it.

I did not see this problem when I tried the patch in the mail message.

> Should I try out a 4.2.x snapshot?

It would be interesting to see what happens there...
I don't know how far you'd get, I haven't tried 4.2 in a long while.

Andi..

