[pylucene-dev] Re: pylucene-dev Digest, Vol 39, Issue 9
Liang Xing
gorgonking at 163.com
Mon Aug 27 18:28:32 PDT 2007
Thanks for your reply.
PyLucene version is 2.0.0 (Lucene-java-2.0.0-453447)
Python version is 2.5.1
machine i686-pc-linux-gnu
gcc-3.4.6
LC_ALL=en_us
In fact, when I run "python samples/LuceneInAction/KeywordAnalyzerTest.py", it works well, as you mention, but that's not the point.
KeywordAnalyzerTest.py only corresponds to KeywordAnalyzer. The object I have doubts about is SimpleKeywordAnalyzer, which uses the Tokenizer and passes an instance of a self-defined class to it. However, it fails and gives me an error.
lia/analysis/SimpleKeyWordAnalyzer.py
Please do me a favor and check two ideas:
1. Can I simply use a Lucene-Java class as the base class and then override one of its methods (abstract or not)?
    class A(LetterTokenizer):
        def isTokenChar(self, ch):
            return ch.isalnum()
2. Can I pass a new-style plain Python object, which implements some methods of a Java Lucene class, as in:
    class A(object):
        def isTokenChar(self, ch):
            return ch.isalnum()
    LetterTokenizer(A(), reader)
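The README wording quoted below (a Python class whose instances are wrapped by Java proxies) can be illustrated without PyLucene at all. The following is a minimal pure-Python sketch of that delegation pattern, not PyLucene's actual implementation: CharTokenizerProxy, AlnumChars, KeepEverything and tokenize are hypothetical names invented for illustration, and the proxy only imitates how a character tokenizer might consult isTokenChar().

```python
# Pure-Python stand-in for the wrapping pattern; PyLucene is NOT used here.
# All class and function names below are hypothetical, chosen for this sketch.


class CharTokenizerProxy(object):
    """Plays the role of the Java-side proxy: it holds a Python object and
    calls that object's isTokenChar() for every character it reads."""

    def __init__(self, impl, text):
        if not callable(getattr(impl, "isTokenChar", None)):
            raise TypeError("impl must provide an isTokenChar(ch) method")
        self.impl = impl
        self.text = text

    def tokens(self):
        """Yield maximal runs of characters accepted by impl.isTokenChar."""
        current = []
        for ch in self.text:
            if self.impl.isTokenChar(ch):
                current.append(ch)
            elif current:
                yield "".join(current)
                current = []
        if current:
            yield "".join(current)


class AlnumChars(object):
    """Idea 2: a plain Python object that only supplies the predicate."""

    def isTokenChar(self, ch):
        return ch.isalnum()


class KeepEverything(object):
    """Accepts every character, so the whole input becomes one token."""

    def isTokenChar(self, ch):
        return True


def tokenize(impl, text):
    """Convenience helper: wrap impl in the proxy and collect its tokens."""
    return list(CharTokenizerProxy(impl, text).tokens())
```

In this sketch the proxy receives an instance (for example AlnumChars()), not the class itself, and the predicate object does not inherit from anything. Whether a real PyLucene build accepts such an object is exactly the question asked above.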
On 2007-08-28, pylucene-dev-request at osafoundation.org wrote:
>Today's Topics:
>
> 1. Support or not: extension from a Java Lucene
> Implementation(Tokenizer or CharTokenizer) (Liang Xing)
> 2. Re: Support or not: extension from a Java Lucene
> Implementation(Tokenizer or CharTokenizer) (Andi Vajda)
>
>
>----------------------------------------------------------------------
>
>Message: 1
>Date: Mon, 27 Aug 2007 10:22:31 +0800 (CST)
>From: "Liang Xing" <gorgonking at 163.com>
>Subject: [pylucene-dev] Support or not: extension from a Java Lucene
> Implementation(Tokenizer or CharTokenizer)
>To: pylucene-dev at osafoundation.org
>Message-ID:
> <24643926.93631188181351845.JavaMail.coremail at bj163app23.163.com>
>Content-Type: text/plain; charset="gbk"
>
> 1. Quote from the PyLucene README:
>
>"""Technically, the PyLucene programmer is not providing an 'extension'
>but a Python implementation of a set of methods encapsulated by a
>Python class whose instances are wrapped by the Java proxies provided
>by PyLucene.
>----http://svn.nuxeo.org/pub/vendor/PyLucene/tags/1.9rc1-1/README """
>
>To me, this almost means that I can't simply extend a Java class in my own
>Python implementation, such as FunnyTokenizer(PyLucene.CharTokenizer).
>
> 2. Test case
>
>Thanks to the link offered as
>http://svn.osafoundation.org/pylucene/trunk/samples/LuceneInAction/lia/analysis/keyword/SimpleKeywordAnalyzer.py
>I tried a test case as follows:
>
>#-------------------tester.py------------------
>from PyLucene import StringReader
>from PyLucene import CharTokenizer
>
>class SimpleKeywordAnalyzer(object):
>
>    def tokenStream(self, fieldName, reader):
>
>        class charTokenizer(object):
>            def isTokenChar(self, c):
>                return True
>
>        return CharTokenizer(charTokenizer(), reader)
>
>if __name__ == '__main__':
>    ca = SimpleKeywordAnalyzer()
>    strs = ca.tokenStream(' ', StringReader('bonne nuit Francais'))
>    print 'Merci'
>    for each in strs:
>        print each.termText(), each.type()
>#-------------------------------------------------------------
>
>Simple as it is, it did not work.
>
>Message:
>Traceback:
>File 'tester.py', line 21, in <module>
>    strs = ca.tokenStream(' ', StringReader('bonne nuit Francais'))
>File 'tester.py', line 16, in <module>
>    return CharTokenizer(charTokenizer(), reader)
>NotImplementedError:('instantiating java class', <type 'PyLuceneCharTokenizer'>
>-----------------------------------------------
>BTW: Environment: Python 2.5.1, PyLucene 2.0.0-3, i686-pc-linux-gnu
>Thread model: posix
>gcj-3.4.6
>
>------------------------------
>
>Message: 2
>Date: Sun, 26 Aug 2007 19:46:51 -0700 (PDT)
>From: Andi Vajda <vajda at osafoundation.org>
>Subject: Re: [pylucene-dev] Support or not: extension from a Java
> Lucene Implementation(Tokenizer or CharTokenizer)
>To: pylucene-dev at osafoundation.org
>Message-ID: <Pine.OSX.4.64.0708261945560.6362 at yuzu.local>
>Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
>
>
>
>It works fine for me. Please provide more information. Which version are
>you running? What operating system? etc...
>
> yuzu:vajda> python samples/LuceneInAction/KeywordAnalyzerTest.py
> ....
> ----------------------------------------------------------------------
> Ran 4 tests in 0.006s
>
> OK
>
>
>Andi..
>
>
>------------------------------
>