vitojph at gmail.com
Thu Nov 24 08:49:22 PST 2005
I'm indexing Spanish documents with Lucene and I need to filter out stop
words. I'm quite new to PyLucene, and so far the StandardAnalyzer has
worked well enough.
But now I need to do more complex things. Is there a SpanishAnalyzer
in the official distribution of Lucene or PyLucene, like the ones for
German or Russian? If there isn't, how difficult is it to extend
Analyzer to implement a kind of SpanishAnalyzer? What issues should
I keep in mind? Any tips, ideas, or documentation I should read first?
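To make the goal concrete, the behaviour I'm after is roughly the
following. This is only a plain Python sketch, not PyLucene code: the
analyze function, the SPANISH_STOP_WORDS name and its (far too short)
contents are placeholders I made up for illustration. A real analyzer
would presumably plug the same steps (tokenize, lowercase, drop stop
words) into Lucene's Analyzer chain.

```python
import re

# Hypothetical, deliberately tiny stop list; a real one would be much longer.
SPANISH_STOP_WORDS = frozenset([
    "de", "la", "que", "el", "en", "y", "a", "los",
    "del", "las", "un", "una", "por", "con", "para",
])

# \w matches accented letters and "ñ" for Python 3 str patterns.
TOKEN_RE = re.compile(r"\w+")


def analyze(text, stop_words=SPANISH_STOP_WORDS):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = TOKEN_RE.findall(text.lower())
    return [token for token in tokens if token not in stop_words]


if __name__ == "__main__":
    print(analyze("El perro de la casa"))
```

The stop list is passed as a parameter so that a longer list can be
swapped in without touching the tokenizing code.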
Thanks in advance. Best,
Víctor Peinado || <vitojph /> || http://nlp.uned.es/~victor
No researcher without a contract! http://www.precarios-madrid.org