[pylucene-dev] SpanishAnalizer

Andi Vajda vajda at osafoundation.org
Thu Nov 24 08:57:45 PST 2005


On Thu, 24 Nov 2005, Victor Peinado wrote:

> Hello all,
>
> I'm indexing Spanish documents with Lucene and I need to avoid stop
> words. I'm quite new to PyLucene and so far the StandardAnalyzer
> has worked well enough.
>
> But now I need to do more complex things. Is there a SpanishAnalyzer
> in the official distribution of Lucene or PyLucene, like the ones for
> German or Russian? If there isn't, is it very difficult to extend
> Analyzer to implement a kind of SpanishAnalyzer? What issues should
> I keep in mind? Any tips, ideas or documentation I should read first?

I don't think there is a SpanishAnalyzer in Java Lucene 1.4.3.
There may be something in the snowball contrib package (also included in
PyLucene).

Creating a custom analyzer in Python with PyLucene can be pretty simple. See the
"Lucene in Action" samples ported to Python in the PyLucene distribution.
If all you want is a different set of stop words, it might even be very
simple.
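
To give a rough idea of what a stop-word analyzer does, here is a minimal
sketch in plain Python. It does not use the PyLucene API at all: the names
SPANISH_STOP_WORDS and analyze are made up for illustration, and the word
list is a tiny hypothetical sample, not a complete Spanish stop list. It only
shows the usual chain of steps: tokenize, lowercase, drop stop words.

```python
import re

# Hypothetical, deliberately short sample list (an assumption for this
# sketch); a real Spanish stop list would be much longer.
SPANISH_STOP_WORDS = frozenset([
    "a", "con", "de", "del", "el", "en", "la", "las",
    "los", "por", "que", "un", "una", "y",
])

# \w matches Unicode letters in Python 3, so accented characters
# such as "ñ" or "á" stay inside their tokens.
TOKEN_RE = re.compile(r"\w+")


def analyze(text, stop_words=SPANISH_STOP_WORDS):
    """Split text into lowercase word tokens and drop stop words."""
    tokens = (match.group(0).lower() for match in TOKEN_RE.finditer(text))
    return [token for token in tokens if token not in stop_words]


if __name__ == "__main__":
    print(analyze("El perro de la casa corre por el parque"))
```

In an actual analyzer the same three steps would be expressed with the
tokenizer and filter classes the library provides, with only the stop-word
set swapped out.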

For more specific information about a SpanishAnalyzer or how to go about
creating your own, you might ask the java-user at lucene.apache.org mailing list,
where such Lucene-specific (Java or not) questions are best addressed.

Andi..

