[pylucene-dev] Lucene 1.9.1 and mod_python issue

Andi Vajda vajda at osafoundation.org
Wed Jun 14 15:11:23 PDT 2006


On Wed, 14 Jun 2006, Robert Kaye wrote:

> Hi!
>
> I am upgrading the MusicBrainz searching functionality from Lucene 1.4.x to 
> 1.9.x and my command line tools for creating my indexes work peachy. But when 
> I load the searching scripts into apache2 with mod_python, and I try to 
> create my custom analyzer, the creation of that custom analyzer just plain 
> hangs. Never returns. I suspect that my custom analyzers are somehow at fault 
> and I hope that someone here can shed some light on what I am doing wrong.
>
> First, some info:
>
> - Linux 2.6.15, ubuntu dapper drake
> - Python 2.4.3
> - gcc/gcj/g++ 3.4.6 compiled from source by gcc 3.4.6 since gcj 3.4 is not
> part of dapper and I distrust gcc 4.0.x
> - PyLucene 1.9.1, without DB support, compiled by my own gcc/gcj
> - apache 2.0.58 (compiled with my own gcc)
> - mod_python 3.2.8 (also by my own gcc)
>
> Here is my mod_python handler:
>
> =====
> from mod_python import apache, util
> import analyzer
>
> def handler(req):
>   a = analyzer.ArtistAnalyzer()
> =====
>
> The call that creates the ArtistAnalyzer never returns. Run as a standalone
> script, it works just fine.
>
> My analyzer.py looks like this:
>
> =====
> import PyLucene
>
> class NoStopStandardAnalyzer(object):
>   def tokenStream(self, fieldName, reader):
>       res = PyLucene.StandardTokenizer(reader)
>       res = PyLucene.LowerCaseFilter(res)
>       return PyLucene.ISOLatin1AccentFilter(res)
>
> class ArtistAnalyzer(PyLucene.PerFieldAnalyzerWrapper):
>   def __init__(self):
>       PyLucene.PerFieldAnalyzerWrapper.__init__(self,
>                                                 NoStopStandardAnalyzer())
>       self.addAnalyzer("arid", PyLucene.KeywordAnalyzer())
>       self.addAnalyzer("p_artist", PyLucene.KeywordAnalyzer())
> =====
>
> If I use the StandardAnalyzer as a default analyzer for the 
> PerFieldAnalyzerWrapper, everything works as expected. Whenever my custom 
> analyzer gets created in mod_python, it grinds to a halt but the CPU never 
> gets pegged.
>
> Any ideas what I might be doing wrong? Anything I have overlooked? I figured 
> I can't get more paranoid about the compilers than compiling gcj by hand and 
> then building everything with it. Alas that didn't yield any results. :-(
>

Any Python thread in this process that accesses PyLucene needs to be known
to libgcj's garbage collector. In other words, any Python thread that runs
PyLucene code (which calls into libgcj) needs to be an instance of
PyLucene.PythonThread, which takes care of setting the thread up with libgcj.

How that is done under mod_python, I don't know. But failing to do so will
cause a crash, a hang, or other misbehavior as soon as any Java memory is
allocated.
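
A minimal sketch of the thread requirement, not a tested mod_python recipe:
run the PyLucene work inside a thread of a given class and wait for its
result. The helper name run_in_thread is hypothetical, and the sketch assumes
PyLucene.PythonThread accepts the same target= argument as threading.Thread.
The standard thread class stands in below so the snippet runs without
PyLucene.

```python
import sys
import threading


def run_in_thread(thread_class, func, *args, **kwargs):
    """Run func(*args, **kwargs) in a new thread and return its result.

    thread_class is assumed to be threading.Thread or a subclass with the
    same constructor (assumption: PyLucene.PythonThread is one).
    """
    outcome = {}

    def target():
        # Capture either the return value or the exception so the
        # calling thread can see what happened in the worker.
        try:
            outcome["value"] = func(*args, **kwargs)
        except Exception:
            outcome["error"] = sys.exc_info()[1]

    worker = thread_class(target=target)
    worker.start()
    worker.join()  # block the caller until the worker finishes

    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


# Stand-in demonstration with the standard thread class:
print(run_in_thread(threading.Thread, sum, [1, 2, 3]))  # prints 6
```

With PyLucene, the first argument would be PyLucene.PythonThread, and all
PyLucene calls (creating the analyzer, searching) would have to happen inside
func, since the calling thread itself would still be unknown to libgcj.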

This question has been asked many times on this list already, but I don't
remember seeing a more practical answer on how this is actually done.

Andi..

