[pylucene-dev] searching repeated and untokenized fields

Andi Vajda vajda at osafoundation.org
Mon May 1 18:46:33 PDT 2006


On Mon, 1 May 2006, Alf Eaton wrote:

>>> Secondly, it doesn't seem to be possible (in PyLucene 1.9.1) to search an 
>>> untokenized field using a term that contains spaces. For a document that 
>>> has a creator "Doe J", the query
>>> creator:"Doe J"
>>> doesn't return any results, and
>>> creator:Doe J
>>> doesn't match what it needs to.
>> 
>> Again, please send in code that reproduces the problem. If you can make 
>> sure that what you're trying to do works in Java Lucene, that's a plus.
>> Ideally, your sample code would be organized as unit tests.
>
> Good idea to do the tests: I realised that StandardAnalyzer was converting 
> the search terms to lowercase when used in QueryParser, but not when adding 
> untokenized fields to the document using IndexWriter, so the two weren't 
> matching. Fixed now, thanks (and it's presumably not a PyLucene problem).

Yeah, I don't think the StandardAnalyzer tokenizes untokenized fields.
This is definitely a question for java-user at lucene.apache.org.
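
For anyone reading this in the archive, the mismatch Alf describes can be
sketched in plain Python. This is only a toy model under stated assumptions:
index_untokenized and analyze_query_text are hypothetical helpers, not
PyLucene or Lucene calls, and the real StandardAnalyzer does more than
lowercase and split on whitespace.

```python
# Toy model only: these helpers are hypothetical and are not PyLucene calls.

def index_untokenized(value):
    """Model an untokenized field: stored as one exact term, case preserved."""
    return {value}

def analyze_query_text(text):
    """Rough stand-in for a lowercasing, whitespace-splitting analyzer."""
    return text.lower().split()

indexed_terms = index_untokenized("Doe J")   # one term: "Doe J"
query_terms = analyze_query_text("Doe J")    # two terms: "doe" and "j"

# None of the analyzed query terms equals the single indexed term.
analyzed_hit = any(term in indexed_terms for term in query_terms)

# Looking up the exact, unanalyzed value does match.
exact_hit = "Doe J" in indexed_terms

print(analyzed_hit, exact_hit)
```

The point is that both sides have to agree: either the indexed value gets the
same normalization as the query text, or the query is built from the exact
value without going through the analyzer.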

Andi..

