[pylucene-dev] Downcast of TermFreqVector to TermPositionVector
Bernhard Jung
bernhard at jung.name
Mon Jul 30 03:44:34 PDT 2007
Hi everybody,
I ran into a problem using term vectors with position and offset
information in PyLucene. I index fields with
Field.TermVector.WITH_POSITIONS_OFFSETS set and call the
getTermFreqVector method of IndexReader to retrieve the term vector, but
the returned object is of type TermFreqVector and not TermPositionVector
(a sub-interface of TermFreqVector), which would provide the
getTermPositions and getOffsets methods that I want to use.
I patched lucene.cpp from the latest Subversion trunk (as of 2007-07-30)
to provide downcast methods from TermFreqVector to TermPositionVector
(isTermPositionVector and toTermPositionVector).
I'd like to share this patch, or be corrected if this is the wrong way
to get the positions and offsets of terms in a document.
Attached are the patch and an example script that uses the downcast
TermPositionVector.
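For readers without the attachments, here is a rough sketch of how the
two patch methods are meant to be used. It is not the attached script.
isTermPositionVector and toTermPositionVector are the names added by the
patch; getTerms, getTermPositions, getOffsets, getStartOffset and
getEndOffset follow the Lucene Java API, and I am assuming PyLucene
exposes them under the same names. FakeOffset and FakeVector are
hypothetical stand-ins so the sketch runs without PyLucene; with a
patched build, the object returned by getTermFreqVector would take the
place of FakeVector.

```python
class FakeOffset:
    """Hypothetical stand-in for Lucene's TermVectorOffsetInfo."""

    def __init__(self, start, end):
        self._start, self._end = start, end

    def getStartOffset(self):
        return self._start

    def getEndOffset(self):
        return self._end


class FakeVector:
    """Hypothetical stand-in for a term vector stored with positions and offsets."""

    def __init__(self, data):
        # data maps term -> (positions, [(start, end), ...])
        self._terms = sorted(data)
        self._data = data

    def getTerms(self):
        return self._terms

    # The two methods below mirror the ones added by the patch.
    def isTermPositionVector(self):
        return True

    def toTermPositionVector(self):
        return self

    def getTermPositions(self, index):
        return self._data[self._terms[index]][0]

    def getOffsets(self, index):
        pairs = self._data[self._terms[index]][1]
        return [FakeOffset(start, end) for start, end in pairs]


def positions_and_offsets(tfv):
    """Map each term to (positions, [(start, end), ...]).

    Returns None when the vector carries no position information.
    """
    if not tfv.isTermPositionVector():
        return None
    tpv = tfv.toTermPositionVector()  # the downcast provided by the patch
    result = {}
    for i, term in enumerate(tpv.getTerms()):
        positions = list(tpv.getTermPositions(i))
        offsets = [(o.getStartOffset(), o.getEndOffset())
                   for o in tpv.getOffsets(i)]
        result[term] = (positions, offsets)
    return result


vector = FakeVector({"hello": ([0, 2], [(0, 5), (12, 17)]),
                     "world": ([1], [(6, 11)])})
for term, (positions, offsets) in sorted(positions_and_offsets(vector).items()):
    print("%s %s %s" % (term, positions, offsets))
```

Checking isTermPositionVector before the cast keeps the helper usable on
fields that were indexed without positions or offsets.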
bernhard
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch-20070730-termpositionvector-downcast.diff
Type: text/x-patch
Size: 1063 bytes
Desc: not available
Url : http://lists.osafoundation.org/pipermail/pylucene-dev/attachments/20070730/b61e9c21/patch-20070730-termpositionvector-downcast.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sample_termpositionvector.py
Type: text/x-python
Size: 1128 bytes
Desc: not available
Url : http://lists.osafoundation.org/pipermail/pylucene-dev/attachments/20070730/b61e9c21/sample_termpositionvector.py