[pylucene-dev] lucene.analysis

Dirk Rothe d.rothe at semantics.de
Fri Feb 22 22:02:22 PST 2008


On Fri, 22 Feb 2008 11:52:08 +0100, Andi Vajda <vajda at osafoundation.org>  
wrote:

>
> On Feb 21, 2008, at 23:26, "Dirk Rothe" <d.rothe at semantics.de> wrote:
>
>> On Fri, 22 Feb 2008 11:06:40 +0100, Andi Vajda <vajda at osafoundation.org>
>> wrote:
>>
>>>
>>> On Feb 21, 2008, at 22:10, "Dirk Rothe" <d.rothe at semantics.de> wrote:
>>>
>>>> I'm not sure if I'm missing something obvious, but how can I access
>>>> the classes in org.apache.lucene.analysis.de?
>>>>
>>>> http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/analysis/de/package-summary.html
>>>>
>>>> I haven't found it in the pylucene namespace.
>>>>
>>>
>>> The Java Lucene package structure is flattened in PyLucene. In other  
>>> words, just import the class name from lucene:
>>>    from lucene import GermanAnalyzer
>>>
>>> Andi..
>>
>> aah, I see, but there are two GermanStemmers:
>>
>> http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//allclasses-frame.html
>> [..]
>> FuzzyQuery.ScoreTermQueue
>> FuzzyTermEnum
>> German2Stemmer
>> GermanAnalyzer
>> GermanStemFilter
>> GermanStemmer
>> GermanStemmer
>> GradientFormatter
>> GreekAnalyzer
>> GreekCharsets
>> GreekLowerCaseFilter
>> HTMLDocument
>> HTMLParser
>> [..]
>>
>> One is from:
>> java.lang.Object
>>  net.sf.snowball.SnowballProgram
>>      net.sf.snowball.ext.GermanStemmer
>>
>> and the other from:
>> java.lang.Object
>>  org.apache.lucene.analysis.de.GermanStemmer
>>
>> pylucene seems to wrap only the second one.
>>
>
> Ugh. I'm afraid PyLucene wraps both but only one of them sticks. Both  
> the analyzer contrib package and the porter stemmer packages are part of  
> the build.
> You get to choose:
> - remove one of the jar files from your build in PyLucene's Makefile.
> - rename one of the classes with a patch to the sources if you need to  
> use both.
>

OK, I will do that.

If this is a JCC "feature", I guess it would be nice to have a resolution  
strategy for these cases in the future.
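
To illustrate the kind of clash I mean, here is a minimal sketch in plain
Python. It is not JCC's or PyLucene's actual code: the flatten() helper is
hypothetical, and the order of the class list is an assumption chosen so that
the org.apache.lucene class is the one that sticks, as observed above.

```python
# Hypothetical illustration only; this is not how JCC is implemented.
def flatten(qualified_names):
    """Map simple class names to qualified names and record any collisions."""
    flat = {}
    collisions = {}
    for qualified in qualified_names:
        simple = qualified.rsplit(".", 1)[-1]
        if simple in flat and flat[simple] != qualified:
            # Remember every qualified name that competes for this simple name.
            collisions.setdefault(simple, [flat[simple]]).append(qualified)
        flat[simple] = qualified  # a later entry silently replaces an earlier one
    return flat, collisions


# Assumed wrapping order: the snowball class first, the analysis.de class second.
classes = [
    "net.sf.snowball.ext.GermanStemmer",
    "org.apache.lucene.analysis.de.GermanStemmer",
    "org.apache.lucene.analysis.de.GermanAnalyzer",
]

flat, collisions = flatten(classes)
print(flat["GermanStemmer"])
print(collisions)
```

A resolution strategy could start from something like the collisions mapping:
warn about each clash at build time, or expose the losing class under a
different name.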

--dirk

