[pylucene-dev] Re: slovenian stemmer for snowball for pylucene

Andi Vajda vajda at osafoundation.org
Tue Dec 26 16:32:08 PST 2006


On Wed, 27 Dec 2006, Andra Tori wrote:

Please post your questions to pylucene-dev at osafoundation.org; there may be 
more people on that list than just myself who can help you...

> Thanks. When is next tarball due?

Dunno, it all depends on how many desirable new things accumulate on either 
Lucene or PyLucene's side.

> Well it was submitted there more than a year ago (i got it from the list
> and fixed it up to handle utf8 properly), but Porter did not include it
> into the distribution because he wanted to clarify if all the rules are
> ok.

Oh well. That happens.
I saw no copyright notice on the source file you sent.
It might be good to add that first. If you agree to the License used by 
PyLucene, just copy the copyright notice from one of the other PyLucene source 
files.

> Well... by my own evaluation the stemmer is far from perfect, but still
> very useful for many uses (i already use it directly in my project), and
> way better than no stemmer at all... So it would be great if I could use
> it for the PyLucene which I use...

That makes it less than ideal for distribution by PyLucene alone.

Can you send a Python sample showing how this stemmer is/would be used with 
PyLucene? I know next to nothing about the Porter stemmers package. From that 
sample I can see what is needed to include it in PyLucene, and I can at least 
send you instructions on how to do it yourself if it comes to that...

> I am not fluent enough in the Java environment to be able to put it in a
> .jar myself. ... I have enough trouble trying to get PyLucene working
> on Debian unstable already :)

If you build your own PyLucene from sources, run 'make test' after you've 
compiled it. If you get errors, you have the wrong compiler.

On Linux, I've only been able to get sane PyLucene builds from gcj 3.4.x or 
from very recent gcj 4.2.0 snapshots I built myself. Any other gcj 4.x, I've 
had to patch first.

Andi..

