[Dev] pylucene fsdirectory patch and unicode issue
Kapil Thangavelu
hazmat at objectrealms.net
Thu Apr 29 00:10:03 PDT 2004
hi folks,
attached is a patch against cvs head that adds lucene's standard
FSDirectory store to PyLucene. the swig files were regenerated with swig
1.3.21.
also attached is a unittest file with one failing test (prefixed with XXX),
which attempts to index unicode with PyLucene, using a copy of the input
stream reader from repository.utils.Streams, which does the string encoding.
does anyone have an idea what causes this error? afaics the two cases
should return the same value, because the encoding done by the input
stream reader amounts to the following:
unicode(u'sample text'*20).encode('utf-8')
unicode('sample text'*20).encode('utf-8')
and the return values are both of type str and have the same value.
i've attached the traceback from the unit test as well.
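one possible reading of the traceback, offered as a guess and not a confirmed diagnosis: "decoding Unicode is not supported" is the TypeError python raises when something that is already a unicode object is decoded again with an explicit encoding. the sketch below only reproduces that message in isolation; it assumes nothing about what PythonReader or the Streams reader actually do, and the names text_type and decode_error_for are made up for illustration.

```python
import sys

# text_type is the text (unicode) type on either major python version.
# (hypothetical helper name, not part of PyLucene or repository.utils.Streams)
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def decode_error_for(value):
    """Return the TypeError message raised when decoding value, or None."""
    try:
        text_type(value, 'utf-8')
    except TypeError as exc:
        return str(exc)
    return None


encoded = (u'sample text' * 20).encode('utf-8')  # encoded bytes: decoding works
already_text = u'sample text' * 20               # already text: decoding is refused

print(decode_error_for(encoded))       # None
# python 2 reports "decoding Unicode is not supported",
# python 3 reports "decoding str is not supported"
print(decode_error_for(already_text))
```

if this is what is happening, checking the type of what the reader's read() hands back in the failing test (str vs unicode) would confirm or rule it out.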
cheers,
-kapil
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_PyLucene.py
Type: application/x-python
Size: 4359 bytes
Desc: not available
Url : http://lists.osafoundation.org/pipermail/dev/attachments/20040429/7f933eb2/test_PyLucene.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pylucene_fsdirectory.patch
Type: text/x-diff
Size: 24583 bytes
Desc: not available
Url : http://lists.osafoundation.org/pipermail/dev/attachments/20040429/7f933eb2/pylucene_fsdirectory.bin
-------------- next part --------------
======================================================================
ERROR: test_indexDocumentWithUnicodeText (__main__.Test_PyLuceneWithFSStore)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_PyLucene.py", line 101, in test_indexDocumentWithUnicodeText
    writer.addDocument(doc)
  File "/home/hazmat/Desktop/osaf/chandler/persistence/PyLucene/build_release/PyLucene.py", line 200, in addDocument
    def addDocument(*args): return _PyLucene.IndexWriter_addDocument(*args)
ValueError: java.lang.RuntimeException: TypeError: decoding Unicode is not supported
at org.osafoundation.io.PythonReader.read(char[], int, int) (/home/hazmat/Desktop/osaf/chandler/persistence/PyLucene/build_release/_PyLucene.so)
at org.apache.lucene.analysis.standard.FastCharStream.refill() (/home/hazmat/Desktop/osaf/chandler/persistence/PyLucene/build_release/_PyLucene.so)
at org.apache.lucene.analysis.standard.FastCharStream.readChar() (/home/hazmat/Desktop/osaf/chandler/persistence/PyLucene/build_release/_PyLucene.so)
at org.apache.lucene.analysis.standard.FastCharStream.BeginToken() (/home/hazmat/Desktop/osaf/chandler/persistence/PyLucene/build_release/_PyLucene.so)
at org.apache.lucene.analysis.standard.StandardTokenizerTokenManager.getNextToken() (/home/hazmat/Desktop/osaf/chandler/persistence/PyLucene/build_release/_PyLucene.so)
at org.apache.lucene.analysis.standard.StandardTokenizer.jj_ntk() (/home/hazmat/Desktop/osaf/chandler/persistence/PyLucene/build_release/_PyLucene.so)
at org.apache.lucene.analysis.standard.StandardTokenizer.next() (/home/hazmat/Desktop/osaf/chandler/persistence/PyLucene/build_release/_PyLucene.so)
at org.apache.lucene.analysis.standard.StandardFilter.next() (/home/hazmat/Desktop/osaf/chandler/persistence/PyLucene/build_release/_PyLucene.so)
at org.apache.lucene.analysis.LowerCaseFilter.next() (/home/hazmat/Desktop/osaf/chandler/persistence/PyLucene/build_release/_PyLucene.so)
at org.apache.lucene.analysis.StopFilter.next() (/home/hazmat/Desktop/osaf/chandler/persistence/PyLucene/build_release/_PyLucene.so)
at org.apache.lucene.index.DocumentWriter.invertDocument(org.apache.lucene.document.Document) (/home/hazmat/Desktop/osaf/chandler/persistence/PyLucene/build_release/_PyLucene.so)
at org.apache.lucene.index.DocumentWriter.addDocument(java.lang.String, org.apache.lucene.document.Document) (/home/hazmat/Desktop/osaf/chandler/persistence/PyLucene/build_release/_PyLucene.so)
at org.apache.lucene.index.IndexWriter.addDocument(org.apache.lucene.document.Document, org.apache.lucene.analysis.Analyzer) (/home/hazmat/Desktop/osaf/chandler/persistence/PyLucene/build_release/_PyLucene.so)
at org.apache.lucene.index.IndexWriter.addDocument(org.apache.lucene.document.Document) (/home/hazmat/Desktop/osaf/chandler/persistence/PyLucene/build_release/_PyLucene.so)
at PyCFunction_Call (python)
at PyObject_Call (python)
at ext_do_call (../Python/ceval.c:3713)
at eval_frame (../Python/ceval.c:2152)
at PyEval_EvalCodeEx (python)
at fast_function (../Python/ceval.c:3533)
at call_function (../Python/ceval.c:3458)
at eval_frame (../Python/ceval.c:2116)
at fast_function (../Python/ceval.c:3520)
at call_function (../Python/ceval.c:3458)
at eval_frame (../Python/ceval.c:2116)
at PyEval_EvalCodeEx (python)
at function_call (../Objects/funcobject.c:504)
at PyObject_Call (python)
at instancemethod_call (../Objects/classobject.c:2434)
at PyObject_Call (python)
at slot_tp_call (../Objects/typeobject.c:4442)
at PyObject_Call (python)
at do_call (../Python/ceval.c:3644)
at call_function (../Python/ceval.c:3460)
at eval_frame (../Python/ceval.c:2116)
at PyEval_EvalCodeEx (python)
at function_call (../Objects/funcobject.c:504)
at PyObject_Call (python)
at instancemethod_call (../Objects/classobject.c:2434)
at PyObject_Call (python)
at slot_tp_call (../Objects/typeobject.c:4442)
at PyObject_Call (python)
at do_call (../Python/ceval.c:3644)
at call_function (../Python/ceval.c:3460)
at eval_frame (../Python/ceval.c:2116)
at PyEval_EvalCodeEx (python)
at function_call (../Objects/funcobject.c:504)
at PyObject_Call (python)
at instancemethod_call (../Objects/classobject.c:2434)
at PyObject_Call (python)
at slot_tp_call (../Objects/typeobject.c:4442)
at PyObject_Call (python)
at do_call (../Python/ceval.c:3644)
at call_function (../Python/ceval.c:3460)
at eval_frame (../Python/ceval.c:2116)
at fast_function (../Python/ceval.c:3520)
at call_function (../Python/ceval.c:3458)
at eval_frame (../Python/ceval.c:2116)
at fast_function (../Python/ceval.c:3520)
at call_function (../Python/ceval.c:3458)
at eval_frame (../Python/ceval.c:2116)
at PyEval_EvalCodeEx (python)
at function_call (../Objects/funcobject.c:504)
at PyObject_Call (python)
at instancemethod_call (../Objects/classobject.c:2434)
at PyObject_Call (python)
at slot_tp_init (../Objects/typeobject.c:4671)
at type_call (../Objects/typeobject.c:438)
at PyObject_Call (python)
at do_call (../Python/ceval.c:3644)
at call_function (../Python/ceval.c:3460)
at eval_frame (../Python/ceval.c:2116)
at PyEval_EvalCodeEx (python)
at PyEval_EvalCode (python)
at run_node (../Python/pythonrun.c:1240)
at run_err_node (../Python/pythonrun.c:1226)
at PyRun_FileExFlags (python)
at PyRun_SimpleFileExFlags (python)
at PyRun_AnyFileExFlags (python)
at Py_Main (python)
at main (python)
at __libc_start_main (/lib/libc-2.3.2.so)