[pylucene-dev] Re:pylucene-dev Digest, Vol 39, Issue 9

Liang Xing gorgonking at 163.com
Mon Aug 27 18:28:32 PDT 2007


Thanks for your reply.
PyLucene version is 2.0.0(Lucene-java-2.0.0-453447)
Python version is 2.5.1
machine i686-pc-linux-gnu
gcc-3.4.6 
LC_ALL=en_us

In fact, when I run "python samples/LuceneInAction/KeywordAnalyzerTest.py", it works well, as you mention, but that's not the point.
KeywordAnalyzerTest.py only corresponds to KeywordAnalyzer. What I have doubts about is SimpleKeywordAnalyzer, which uses the Tokenizer and passes an instance of a self-defined class to it. However, it fails with an error.
lia/analysis/SimpleKeyWordAnalyzer.py

Please do me a favor and check two ideas:
1. Can I simply use a Lucene-Java class as the base class and then override one of its methods (abstract or not)?
class A(LetterTokenizer):
    def isTokenChar(self, ch):
        return ch.isalnum()

2. Can I pass in a new-style Python object that implements some methods of a Java-Lucene class, as in:
class A(object):
    def isTokenChar(self, ch):
        return ch.isalnum()

LetterTokenizer(A(), reader)
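
To make the second idea concrete, here is a minimal pure-Python sketch of the delegation pattern, for illustration only. ProxyCharTokenizer and AlnumChars are hypothetical names and are not PyLucene classes; the sketch also takes a plain string where the real tokenizer takes a reader. It only models a wrapper that calls back into a Python object's isTokenChar() method and says nothing about how PyLucene implements its proxies.

```python
class ProxyCharTokenizer:
    """Hypothetical stand-in for a wrapped tokenizer; NOT a PyLucene class."""

    def __init__(self, delegate, text):
        # The wrapper expects an object that provides isTokenChar(ch).
        if not callable(getattr(delegate, "isTokenChar", None)):
            raise TypeError("delegate must implement isTokenChar(ch)")
        self._delegate = delegate
        self._text = text  # assumption: a plain string instead of a reader

    def __iter__(self):
        token = []
        for ch in self._text:
            if self._delegate.isTokenChar(ch):
                # Character belongs to the current token.
                token.append(ch)
            elif token:
                # A non-token character ends the current token.
                yield "".join(token)
                token = []
        if token:
            yield "".join(token)


class AlnumChars:
    """Plain Python object supplying only the callback method."""

    def isTokenChar(self, ch):
        return ch.isalnum()


tokens = list(ProxyCharTokenizer(AlnumChars(), "bonne nuit, Francais!"))
print(tokens)
```

In this model the Python class never inherits from the tokenizer; it only supplies the method the wrapper calls, which is how I read the README wording about instances being "wrapped by the Java proxies".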


On 2007-08-28, pylucene-dev-request at osafoundation.org wrote:
>Today's Topics:
>
>   1. Support or not: extension from a Java Lucene
>      Implementation(Tokenizer or CharTokenizer) (Liang Xing)
>   2. Re: Support or not: extension from a Java Lucene
>      Implementation(Tokenizer or CharTokenizer) (Andi Vajda)
>
>
>----------------------------------------------------------------------
>
>Message: 1
>Date: Mon, 27 Aug 2007 10:22:31 +0800 (CST)
>From: "Liang Xing" <gorgonking at 163.com>
>Subject: [pylucene-dev] Support or not: extension from a Java Lucene
>	Implementation(Tokenizer or CharTokenizer)
>To: pylucene-dev at osafoundation.org
>Message-ID:
>	<24643926.93631188181351845.JavaMail.coremail at bj163app23.163.com>
>Content-Type: text/plain; charset="gbk"
>
>1. Quote from the PyLucene README:
>"""Technically, the PyLucene programmer is not providing an 'extension'
>but a Python implementation of a set of methods encapsulated by a
>Python class whose instances are wrapped by the Java proxies provided
>by PyLucene."""
>    ---- http://svn.nuxeo.org/pub/vendor/PyLucene/tags/1.9rc1-1/README
>
>To me, this almost means that I can't simply extend a Java class in my
>own Python implementation, such as FunnyTokenizer(PyLucene.CharTokenizer).
>
>2. Test case
>Thanks to the link offered,
>http://svn.osafoundation.org/pylucene/trunk/samples/LuceneInAction/lia/analysis/keyword/SimpleKeywordAnalyzer.py
>I tried a test case as follows:
>
>#-------------------tester.py------------------
>from PyLucene import StringReader
>from PyLucene import CharTokenizer
>
>class SimpleKeywordAnalyzer(object):
>    def tokenStream(self, fieldName, reader):
>        class charTokenizer(object):
>            def isTokenChar(self, c):
>                return True
>
>        return CharTokenizer(charTokenizer(), reader)
>
>if __name__ == '__main__':
>    ca = SimpleKeywordAnalyzer()
>    strs = ca.tokenStream(' ', StringReader('bonne nuit Francais'))
>    print 'Merci'
>    for each in strs:
>        print each.termText(), each.type()
>#-------------------------------------------------------------
>
>Simple as it is, it didn't work.
>Message:
>Traceback:
>File 'tester.py', line 21, in <module>
>    strs = ca.tokenStream(' ', StringReader('bonne nuit Francais'))
>File 'tester.py', line 16, in <module>
>    return CharTokenizer(charTokenizer(), reader)
>NotImplementedError:('instantiating java class', <type 'PyLuceneCharTokenizer'>
>
>-----------------------------------------------
>BTW: Environment:
>Python 2.5.1, PyLucene 2.0.0-3, i686-pc-linux-gnu
>Thread model: posix
>gcj-3.4.6
>
>------------------------------
>
>Message: 2
>Date: Sun, 26 Aug 2007 19:46:51 -0700 (PDT)
>From: Andi Vajda <vajda at osafoundation.org>
>Subject: Re: [pylucene-dev] Support or not: extension from a Java
>	Lucene	Implementation(Tokenizer or CharTokenizer)
>To: pylucene-dev at osafoundation.org
>Message-ID: <Pine.OSX.4.64.0708261945560.6362 at yuzu.local>
>Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
>
>
>
>On Mon, 27 Aug 2007, Liang Xing wrote:
>
>> [...]
>
>It works fine for me. Please provide more information. What version are
>you running? What operating system? etc.
>
>     yuzu:vajda> python samples/LuceneInAction/KeywordAnalyzerTest.py
>     ....
>     ----------------------------------------------------------------------
>     Ran 4 tests in 0.006s
>
>     OK
>
>
>Andi..
>
>