[Chandler-dev] Chandler background full-text indexing
heikki at osafoundation.org
Tue May 23 20:29:45 PDT 2006
Andi Vajda wrote:
> The background indexer runs every minute or so. This value is hardcoded.
> At some point we need to have support for user preferences and we can
> then tie that value in with them. If you're in a real hurry to have your
> stuff indexed in the background right away, you can use the
> repository.notifyIndexer() API.
While once a minute is a good proof of concept, I believe we need a
centralized way to force indexing to happen before using functionality
that requires up-to-date indexes.
For example, suppose a user synchronizes their collections and follows
up with a search. If the indexer hasn't run yet, the search will not
find the newly synced items (and will return garbage for changed items).
It is neither scalable nor reliable to add spot checks throughout the
code to force indexing just before actions that we know need fresh
indexes (such as running the indexer before executing a search).
I am not sure where the choke point should be - in the repository itself
or some layer above it.
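
To make the choke-point idea concrete, here is a minimal sketch in Python.
Everything in it is hypothetical: IndexGate, ToyRepository, mark_dirty() and
ensure_fresh() are illustration names, not existing Chandler or repository
APIs, and the "index" is just a set of strings. The idea is that commits mark
the index dirty and any index-dependent operation goes through one gate that
runs the indexer only when needed.

```python
import threading


class IndexGate:
    """Hypothetical choke point: tracks staleness and refreshes on demand."""

    def __init__(self, run_indexer):
        # run_indexer: callable that synchronously indexes all pending changes.
        self._run_indexer = run_indexer
        self._lock = threading.Lock()
        self._dirty = False

    def mark_dirty(self):
        """Call after every commit that changes indexed content."""
        with self._lock:
            self._dirty = True

    def ensure_fresh(self):
        """Run the indexer only if something changed since the last pass.

        Returns True if an indexing pass was performed.
        """
        with self._lock:
            if not self._dirty:
                return False
            self._run_indexer()
            self._dirty = False
            return True


class ToyRepository:
    """Stand-in for a repository, for illustration only."""

    def __init__(self):
        self.items = []
        self.index = set()
        self.gate = IndexGate(self._index_pending)

    def _index_pending(self):
        # A real indexer would only process changed items.
        self.index = set(self.items)

    def commit(self, item):
        self.items.append(item)
        self.gate.mark_dirty()

    def search(self, term):
        # The single place where freshness is enforced.
        self.gate.ensure_fresh()
        return sorted(i for i in self.index if term in i)
```

With this shape, callers such as search never need their own "run the
indexer first" check. Note that the sketch holds the lock while indexing, so
a commit made from inside the indexer callback would deadlock; whether the
gate belongs in the repository or a layer above it is still the open question.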
> PyLucene indexing is also considerably faster now. I realized that the
Looking at the Tinderbox perf data, the new code more than halved the
time it takes to import a large calendar. All in all, the new code is
about 5% faster than the code was before we started indexing (on
Windows; I didn't check other platforms).
We will need new tests, and may need to modify existing ones, so that
they work with indexing in a deterministic way, measure actual indexing
performance, and so on.
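
One way such a deterministic test could look, as a rough sketch: the test
drives indexing passes explicitly instead of waiting for a timer. ManualIndexer
and its methods are hypothetical stand-ins written for this example, not the
actual background indexer or any PyLucene API.

```python
import unittest


class ManualIndexer:
    """Hypothetical indexer whose passes are triggered by the caller."""

    def __init__(self):
        self.pending = []
        self.indexed = set()

    def add(self, text):
        # Queue an item; it is not searchable until run_once() is called.
        self.pending.append(text)

    def run_once(self):
        """Index everything pending and return how many items were indexed."""
        count = len(self.pending)
        self.indexed.update(self.pending)
        self.pending.clear()
        return count

    def search(self, term):
        return sorted(t for t in self.indexed if term in t)


class IndexingTest(unittest.TestCase):
    def test_item_visible_only_after_pass(self):
        indexer = ManualIndexer()
        indexer.add("team meeting")
        # No timer involved: before the pass, the item must not be found.
        self.assertEqual(indexer.search("meeting"), [])
        self.assertEqual(indexer.run_once(), 1)
        self.assertEqual(indexer.search("meeting"), ["team meeting"])

    def test_second_pass_is_a_no_op(self):
        indexer = ManualIndexer()
        indexer.add("team meeting")
        indexer.run_once()
        self.assertEqual(indexer.run_once(), 0)
```

Because the test decides when a pass happens, the result does not depend on
the once-a-minute schedule, and the same hook would give a natural place to
time an indexing pass for perf measurements.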