[pylucene-dev] Query.extractTerms - how to call this
Andi Vajda
vajda at osafoundation.org
Fri Dec 14 10:02:54 PST 2007
On Fri, 14 Dec 2007, Helmut Jarausch wrote:
> following a suggestion on the Java-Lucene mailing list,
>
> I tried to call Query.extractTerms
>
> In Java it has the signature
> void extractTerms(Set terms)
>
> It puts all results in the Java-set terms
>
> I tried
>
> parser= QueryParser(...)
> query = parser.parse('a string')
> Set_of_Terms= set()
> query.extractTerms(Set_of_Terms)
>
> but this fails. The same is with
>
> Set_of_Terms=[]
> query.extractTerms(Set_of_Terms)
>
> So, what Python type should be used?
JCC does not make a correspondence between a Python set and a Java Set.
In other words, when it sees a set object, it does not convert it to a
concrete Java Set implementation. This could be done, though, by writing a
Java Set implementation that wraps a Python set instance, that is, a Python
extension of the Java Set interface implemented with a Python set.
It's not too hard to do. I might just add that to JCC at some point.
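To make the wrapper idea concrete, here is a minimal pure-Python sketch of the adapter pattern described above. It is not JCC code and does not cross into the JVM. The class name PythonSetAdapter and the helper fake_extract_terms are hypothetical, invented for illustration, and only a handful of java.util.Set method names (add, contains, remove, size, isEmpty) are mirrored. A real implementation would have to be a genuine Java Set that JCC can hand to Java code.

```python
# Hypothetical sketch only: mirrors a few java.util.Set method names on top
# of a built-in Python set. This is NOT part of JCC or PyLucene.

class PythonSetAdapter(object):
    """Expose Java-Set-style methods backed by a Python set."""

    def __init__(self, backing=None):
        # Keep a reference to the caller's set so that additions made
        # through the adapter are visible in the original Python object.
        self._set = backing if backing is not None else set()

    def add(self, obj):
        # java.util.Set.add returns true only if the element was absent.
        if obj in self._set:
            return False
        self._set.add(obj)
        return True

    def contains(self, obj):
        return obj in self._set

    def remove(self, obj):
        # java.util.Set.remove returns true only if the element was present.
        if obj in self._set:
            self._set.remove(obj)
            return True
        return False

    def size(self):
        return len(self._set)

    def isEmpty(self):
        return not self._set

    def __iter__(self):
        return iter(self._set)


def fake_extract_terms(terms):
    # Stand-in for Query.extractTerms(Set): fills the set it is handed
    # and returns nothing, like the void Java method.
    terms.add("fields:foo")
    terms.add("fields:bar")


py_terms = set()
fake_extract_terms(PythonSetAdapter(py_terms))
# py_terms now holds both strings, because the adapter wrote through to it.
```

The point of the design is that the callee only sees Java-style method names, while the caller keeps working with an ordinary Python set.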
In the meantime, use a concrete Java Set subclass such as HashSet instead of
a Python set:
>>> from lucene import QueryParser, StandardAnalyzer, HashSet, Term
>>> q = QueryParser("fields", StandardAnalyzer()).parse("foo AND bar")
>>> terms = HashSet()
>>> q.extractTerms(terms)
>>> terms
<HashSet: [fields:foo, fields:bar]>
>>> for term in terms:
...     print term, type(term)
...
fields:foo <type 'Object'>
fields:bar <type 'Object'>
Notice how the type of the term instance is seen as Object by Python. This
is because a HashSet contains Java Object instances so JCC generated code to
wrap the HashSet contents as Object instances and not what they actually are
(which it doesn't know at compile time).
To cast the objects in the set to Term, use the cast_() method as in:
list(terms)[0].cast_(Term)
To know what the actual Java class of a wrapped object is, you can call
getClass() on it:
>>> list(terms)[0].getClass()
<Class: class org.apache.lucene.index.Term>
Andi..