Re: [pylucene-dev] HighFreqTerms from org.apache.lucene.misc

Dirk Rothe d.rothe at semantics.de
Tue Apr 1 10:55:46 PDT 2008


>> Ok, but by inspecting the java code, this was pretty trivial to
>> implement in Python. Just out of curiosity, do you think the java
>> version would be (significantly) faster? I'm not sure I understand the
>> performance implications of the jcc bridge.
>
> I don't know. How about measuring it ?
>
> The jcc bridge involves converting some literals from java to python  
> (such as strings), releasing the GIL (global interpreter lock) when  
> leaving python and reacquiring it when returning.
>
> The jcc bridge also keeps track of the java objects returned to python  
> so that they don't get garbage collected until python no longer uses  
> them. This is implemented via a C++ multimap.
>
> It's been shown before that using a python HitCollector (used in a very  
> tight loop by the Lucene core) is significantly slower than using the  
> java equivalent [1].

Ok, I will try to measure it.

I will get to it once I understand the makefile jar/java stuff better,
and I guess that's after my theoretical CS exams next week ;).
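
For the Python side, a rough sketch of how the measurement could be set up.
Everything here is an assumption for illustration: high_freq_terms and
best_time are hypothetical helper names, and the sample data is a stand-in
for the (term, docFreq) pairs that would really come from the index. The
java version would have to be timed the same way once it is built.

```python
import heapq
import timeit


def high_freq_terms(term_freqs, n=10):
    """Return the n (term, doc_freq) pairs with the highest doc_freq.

    Hypothetical pure-Python stand-in for the top-N selection; it takes
    any iterable of (term, doc_freq) pairs and does not touch Lucene.
    """
    return heapq.nlargest(n, term_freqs, key=lambda pair: pair[1])


def best_time(func, repeat=5, number=100):
    """Best-of-`repeat` wall time in seconds for `number` calls of func."""
    return min(timeit.repeat(func, repeat=repeat, number=number))


# Stand-in data (assumption): real pairs would be read from the index.
sample = [("term%d" % i, i % 997) for i in range(10000)]

elapsed = best_time(lambda: high_freq_terms(sample, 10), repeat=3, number=10)
print("best of 3 runs: %.6f s for 10 calls" % elapsed)
```

Taking the minimum over several runs should keep noise from other
processes out of the comparison.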

--dirk

