Re: [pylucene-dev] Unknown filter problem

David Pratt fairwinds at eastlink.ca
Tue Mar 7 09:14:51 PST 2006


Hi Trond. Which version is this?

Regards,
David

Trond Aksel Myklebust wrote:
> It looks like creating a new token works, but in the PorterStemFilter it is
> done by assigning token.termText = "new text".
> Is this a bug in PyLucene?
> 
> Works:
> 
> class integerFilter(object):
> 
>     def __init__(self, tokenStream):
>         self.input = tokenStream
> 
>     def next(self):
>         token = self.input.next()
>         if token is None:
>             return None
>         token.termText = NumberUtils.pad(int(token.termText()))
>         return Token(token.termText, token.startOffset(), token.endOffset(),
>                      token.type())
> 
> -----Original Message-----
> From: pylucene-dev-bounces at osafoundation.org
> [mailto:pylucene-dev-bounces at osafoundation.org] On behalf of Trond Aksel
> Myklebust
> Sent: 7 March 2006 15:20
> To: pylucene-dev at osafoundation.org
> Subject: [pylucene-dev] Unknown filter problem
> 
> I have an analyzer that, if the field name is "integer", adds an
> "integerFilter" to pad the number.
> The problem is that although the padding is done, I am not getting the padded
> number as output.
> 
> class myAnalyzer(object):    
>     def __init__(self):
>         pass
>     def tokenStream(self, fieldName, reader):
>         tokenStream = StandardFilter(StandardTokenizer(reader)) 
>         if fieldName == "integer":
>             tokenStream = integerFilter(tokenStream)
>             print tokenStream
>         return tokenStream
>         
> class integerFilter(object):
>     def __init__(self, tokenStream):
>         self.input = tokenStream
>     def next(self):
>         token = self.input.next()
>         if token is None:
>             return None
>         token.termText = NumberUtils.pad(int(token.termText()))
>         print token.termText
>         return token
>         
> analyzer = myAnalyzer.myAnalyzer()
> directory = FSDirectory.getDirectory(LuceneDir, False)
> searcher = IndexSearcher(directory)
> qParser = wrapAnalyzer(self.analyzer).queryParser(
>     CustomQueryParser.CustomQueryParser(), defField)
> print qParser.parseQuery("integer:200000")
> 
> Output:
> myAnalyzer: <Modules.Lucene.integerFilter.integerFilter object at 0x01AA7890>
> IntegerFilter: 0000200000
> qParser: integer:200000
> 
> As you can see, the integerFilter is added and the number is padded, but the
> parser returns the number without the padding.
> Does anyone know what is wrong?
> 
> 
> _______________________________________________
> pylucene-dev mailing list
> pylucene-dev at osafoundation.org
> http://lists.osafoundation.org/mailman/listinfo/pylucene-dev

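One possible reading of the behaviour described in this thread, offered only as an assumption and not confirmed anywhere in it: assigning to token.termText rebinds an attribute on the Python-side object and hides the termText() accessor, while whatever consumes the token still reads the original text. The sketch below imitates that with plain Python. FakeToken, consumer_view and pad are hypothetical stand-ins invented for illustration; they are not PyLucene classes or functions, and the width of 10 is chosen only to match the output quoted above.

```python
class FakeToken(object):
    """Hypothetical stand-in for a wrapped token; not the PyLucene Token."""

    def __init__(self, text, start, end):
        self._text = text    # plays the role of state held by the wrapped object
        self._start = start
        self._end = end

    def termText(self):
        return self._text

    def startOffset(self):
        return self._start

    def endOffset(self):
        return self._end


def consumer_view(token):
    """Read the term via the class accessor, ignoring instance attributes.

    This imitates a consumer that never sees attributes set from Python.
    """
    return FakeToken.termText(token)


def pad(number, width=10):
    """Hypothetical zero-padding helper standing in for NumberUtils.pad."""
    return str(number).zfill(width)


# Variant 1: assign over the accessor. Only the Python instance changes.
shadowed = FakeToken("200000", 0, 6)
shadowed.termText = pad(int(shadowed.termText()))

# Variant 2: build a fresh token that carries the padded text.
original = FakeToken("200000", 0, 6)
rebuilt = FakeToken(pad(int(original.termText())),
                    original.startOffset(), original.endOffset())

print(shadowed.termText)        # padded string, now a plain attribute
print(consumer_view(shadowed))  # text the consumer still sees
print(consumer_view(rebuilt))   # text carried by the new token
```

If this reading is right, it would be consistent with both observations in the thread: the debug print shows the padded value, yet only the variant that returns a new Token changes what the query parser sees.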
