Re: [pylucene-dev] HighFreqTerms from org.apache.lucene.misc
Dirk Rothe
d.rothe at semantics.de
Tue Apr 1 11:28:10 PDT 2008
On Tue, 01 Apr 2008 20:20:56 +0200, Andi Vajda <vajda at osafoundation.org>
wrote:
>
> On Tue, 1 Apr 2008, Dirk Rothe wrote:
>
>>>> Ok, but by inspecting the Java code, this was pretty trivial to
>>>> implement in Python. Just out of curiosity: do you think the Java
>>>> version would be (significantly) faster? I'm not sure I understand
>>>> the performance implications of the jcc bridge.
>>> I don't know. How about measuring it?
>>> The jcc bridge involves converting some literals from Java to Python
>>> (such as strings), releasing the GIL (global interpreter lock) when
>>> leaving Python and reacquiring it when returning.
>>> The jcc bridge also keeps track of the java objects returned to
>>> python so that they don't get garbage collected until python no longer
>>> uses them. This is implemented via a C++ multimap.
>>> It's been shown before that using a python HitCollector (used in a
>>> very tight loop by the Lucene core) is significantly slower than using
>>> the java equivalent [1].
>>
>> Ok, I will try to measure it.
>>
>> After I understand the Makefile jar/java stuff better - and I guess
>> that's after my theoretical CS exams next week ;).
>
> To add a JAR file to the PyLucene build, look at line 171 in the
> Makefile for the current list of JAR files. Looking above that line
> should show you how to add another JAR file.
Yeah, I have seen that, it doesn't look that hard.
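
A minimal sketch of what a pure-Python version and its timing could look
like. This is not the actual HighFreqTerms code. It assumes the
(term, document frequency) pairs have already been read out of the index,
which is the part that would cross the jcc bridge and is not shown here.
The name high_freq_terms and the sample data are made up for illustration.

```python
import heapq
import timeit


def high_freq_terms(term_doc_freqs, n=100):
    """Return the n (term, doc_freq) pairs with the highest doc_freq."""
    # term_doc_freqs: any iterable of (term, doc_freq) pairs. With PyLucene
    # these would come from walking the index's terms (assumed, not shown).
    return heapq.nlargest(n, term_doc_freqs, key=lambda pair: pair[1])


# Hypothetical sample data standing in for a real index.
sample = [("lucene", 42), ("python", 17), ("jcc", 5), ("index", 30), ("gil", 1)]
top = high_freq_terms(sample, n=3)

# Time 1000 runs of the pure-Python selection. A Java-backed variant could
# be wrapped in a callable and timed the same way for comparison.
elapsed = timeit.timeit(lambda: high_freq_terms(sample, n=3), number=1000)
```

Timing only the selection step says nothing about the bridge itself. For a
fair comparison the timed callable would have to include the term
enumeration too, since that is where the per-call conversion cost sits.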
thnx, dirk