[pylucene-dev] Lucene 1.9.1 and mod_python issue

Andi Vajda vajda at osafoundation.org
Wed Jun 14 15:24:58 PDT 2006


On Wed, 14 Jun 2006, Robert Kaye wrote:

> import PyLucene
>
> class NoStopStandardAnalyzer(object):
>   def tokenStream(self, fieldName, reader):
>       res = PyLucene.StandardTokenizer(reader)
>       res = PyLucene.LowerCaseFilter(res)
>       return PyLucene.ISOLatin1AccentFilter(res)
>
> class ArtistAnalyzer(PyLucene.PerFieldAnalyzerWrapper):
>   def __init__(self):
>       PyLucene.PerFieldAnalyzerWrapper.__init__(self, NoStopStandardAnalyzer())
>       self.addAnalyzer("arid", PyLucene.KeywordAnalyzer())
>       self.addAnalyzer("p_artist", PyLucene.KeywordAnalyzer())

Extending a PyLucene class, as your ArtistAnalyzer does, is not going to work 
well because the Java side is not aware of the extension. The 
PyLucene.PerFieldAnalyzerWrapper class is not an extension of the Java class 
but a proxy to it. If you override methods in your extension, the Java side 
is never going to call them, since you only extended the proxy.
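
To make the proxy point concrete, here is a toy model in plain Python. It does not use PyLucene at all: BackendWrapper, Proxy and MyProxy are hypothetical stand-ins for the Java object, the Python-side proxy class and a user subclass of that proxy. It is only a sketch of the dispatch behaviour described above, not how PyLucene is actually implemented.

```python
# Toy model only: none of these classes are part of PyLucene.

class BackendWrapper(object):
    """Stands in for the Java-side object that does the real work."""

    def tokenStream(self, field_name, text):
        # The backend's own implementation.
        return text.split()

    def index(self, field_name, text):
        # Internal call: dispatches on the backend object itself,
        # so it can never reach a method defined on a proxy subclass.
        return self.tokenStream(field_name, text)


class Proxy(object):
    """Stands in for the Python-side proxy: it only forwards calls."""

    def __init__(self):
        self._backend = BackendWrapper()

    def tokenStream(self, field_name, text):
        return self._backend.tokenStream(field_name, text)

    def index(self, field_name, text):
        return self._backend.index(field_name, text)


class MyProxy(Proxy):
    """A user subclass of the proxy, analogous to subclassing a proxy class."""

    def tokenStream(self, field_name, text):
        # This override is visible to Python callers only.
        return [t.upper() for t in text.split()]


p = MyProxy()
direct = p.tokenStream("artist", "the beatles")  # Python dispatch finds the override
indexed = p.index("artist", "the beatles")       # the backend ignores the override
```

Calling the method from Python reaches the override, but any call that originates inside the backend does not, which is the situation an overridden method on a subclass of a proxy class ends up in.
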
There are a number of extension points in PyLucene; your 
NoStopStandardAnalyzer is an example of an Analyzer extension. When passed 
into a PyLucene method expecting an Analyzer instance, your 
NoStopStandardAnalyzer instance is automatically wrapped by a 
PythonAnalyzer class, which is implemented as a Java extension of Lucene's 
Analyzer and whose methods are implemented in C++ in order to call your Python 
methods on the wrapped NoStopStandardAnalyzer instance.
Yes, this is tricky, but this is how it works.
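
The wrapping step can be sketched the same way, again in plain Python with no PyLucene involved. NativeAnalyzer, PythonAnalyzerAdapter, as_native and index_field are hypothetical names chosen for illustration; the real PythonAnalyzer is a Java class with C++ method implementations, which this toy does not attempt to reproduce.

```python
# Toy model only: these names are illustrative, not PyLucene's API.

class NativeAnalyzer(object):
    """Stands in for Lucene's Analyzer as seen by the host side."""

    def tokenStream(self, field_name, reader):
        raise NotImplementedError


class PythonAnalyzerAdapter(NativeAnalyzer):
    """Toy counterpart of a wrapper that is a genuine subclass of the
    host's analyzer type and forwards each call to a plain Python object."""

    def __init__(self, py_obj):
        self._py_obj = py_obj

    def tokenStream(self, field_name, reader):
        # Forward to the wrapped Python object's method.
        return self._py_obj.tokenStream(field_name, reader)


def as_native(analyzer):
    """Wrap a plain Python object unless it is already a native analyzer."""
    if isinstance(analyzer, NativeAnalyzer):
        return analyzer
    return PythonAnalyzerAdapter(analyzer)


class LowercaseWords(object):
    """A plain Python class that only needs to provide tokenStream()."""

    def tokenStream(self, field_name, reader):
        return [w.lower() for w in reader.split()]


def index_field(analyzer, field_name, text):
    """Stands in for a host method that expects an analyzer instance."""
    native = as_native(analyzer)  # the automatic wrapping step
    return native.tokenStream(field_name, text)


tokens = index_field(LowercaseWords(), "artist", "The Beatles")
```

Here the host only ever sees an object of its own analyzer type, and the Python method still runs because the adapter forwards the call. That is why a plain class with a tokenStream() method works as an extension point while a subclass of a proxy does not.
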

In order to debug the actual crasher, or hang, it would be helpful to first 
isolate the problem by running it in a regular Python process instead of 
inside Apache/mod_python.

Andi..


