[pylucene-dev] searching repeated and untokenized fields
Andi Vajda
vajda at osafoundation.org
Sun Apr 30 23:53:50 PDT 2006
On Sun, 30 Apr 2006, Alf Eaton wrote:
> I have a couple of questions regarding indexing and searching a document that
> has repeated values for the same field (specifically, the authors of a
> document, in this case):
>
> Firstly, I'm adding the repeated field with this code:
>
> for creator in creators:
>     doc.add(Field('creator', creator, Field.Store.YES,
>                   Field.Index.UN_TOKENIZED))
>
> but can't find a way to read those fields back out from the index. If I use
>
> for author in hits[i]["creator"]:
>     print author
I'm not sure I understand what you're trying to do in the code above.
In PyLucene 1.9.1, the way to iterate hits is:
    for i, doc in hits:
        print doc['creator']
If there is more than one field called 'creator', you might want to try:
    for i, doc in hits:
        for creator in doc.getFields('creator'):
            print creator
In PyLucene 2.0rc1, you can also say:
    for hit in hits:
        for creator in hit.getDocument().getFields('creator'):
            print creator
If this doesn't work, please send in code that illustrates the problem (that
would help in understanding and fixing the potential bug(s)).
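For illustration only, here is a minimal sketch of why a single-value lookup gives back one creator while a multi-value accessor gives back all of them. ToyDocument and its methods are hypothetical stand-ins written for this example, not PyLucene classes. The sketch assumes that doc['creator'] behaves like Java Lucene's Document.get(), which returns the first value added under a name.

```python
# Toy stand-in for a document with repeated fields; NOT the PyLucene API.
class ToyDocument(object):
    def __init__(self):
        # (name, value) pairs, kept in the order they were added.
        self._fields = []

    def add(self, name, value):
        self._fields.append((name, value))

    def get(self, name):
        # Single-value lookup: only the first value added under the name.
        for field_name, value in self._fields:
            if field_name == name:
                return value
        return None

    def get_values(self, name):
        # Multi-value lookup: every value added under the name, in order.
        return [value for field_name, value in self._fields
                if field_name == name]


doc = ToyDocument()
for creator in ['Doe J', 'Smith A']:
    doc.add('creator', creator)

first_creator = doc.get('creator')        # one value only
all_creators = doc.get_values('creator')  # the full list of values
```

The point of the sketch is only that iterating over the result of a single-value lookup walks the characters of one string, not the list of authors.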
> Secondly, it doesn't seem to be possible (in PyLucene 1.9.1) to search an
> untokenized field using a term that contains spaces. For a document that has
> a creator "Doe J", the query
> creator:"Doe J"
> doesn't return any results, and
> creator:Doe J
> doesn't match what it needs to.
Again, please send in code that reproduces the problem. If you can make sure
that what you're trying to do works in Java Lucene, that's a plus.
Ideally, your sample code would be organized as unit tests.
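Pending a reproduction, one guess at the cause: an untokenized field is indexed as a single term, spaces included, while a parsed query may be split into separate tokens before lookup. The sketch below models that mismatch with a toy inverted index. All function names here are hypothetical and none of this is PyLucene code; the whitespace split and lowercasing are assumptions standing in for whatever analyzer the query parser uses.

```python
# Toy inverted index mapping (field, term) -> set of doc ids; NOT PyLucene.
def index_untokenized(index, doc_id, field, value):
    # The whole value becomes one term, spaces included.
    index.setdefault((field, value), set()).add(doc_id)


def search_exact(index, field, text):
    # Look the text up as one term, as an exact term query would.
    return set(index.get((field, text), set()))


def search_tokenized(index, field, text):
    # Assumed analyzer behaviour: lowercase, split on whitespace, then
    # require every token to exist as its own term in the field.
    tokens = text.lower().split()
    if not tokens:
        return set()
    per_token = [index.get((field, token), set()) for token in tokens]
    return set.intersection(*per_token)


index = {}
index_untokenized(index, 1, 'creator', 'Doe J')

exact_hits = search_exact(index, 'creator', 'Doe J')          # finds doc 1
tokenized_hits = search_tokenized(index, 'creator', 'Doe J')  # finds nothing
```

If this is what is happening, building the query from the exact term rather than going through the parser would be the thing to try, and a unit test along these lines would confirm it either way.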
Thanks !
Andi..