[Chandler-dev] Performance, Scalability

Brian Kirsch bkirsch at osafoundation.org
Tue Oct 9 12:53:39 PDT 2007


Hi Katie,
See comments inline.


On Oct 8, 2007, at 5:59 PM, Katie Capps Parlante wrote:

> Grant Baillie wrote:
> > Performance, Scalability
> > ------------------------
> >
> > *Open Issue*: (Grant, Andi, Brian K) Unclear what the goals are w.r.t.
> > email beyond what we have today. Need measurements of how Chandler
> > performs in the presence of many items (and/or collections), as well as
> > explicit performance goals.
>
> As it stands right now, one of the possible strategic paths is to  
> handle email in a first class way. At the very least, I think we  
> want to have a plausible path for Chandler to be able to handle a  
> real mailbox. I want to have a better idea what it would take to  
> get there (from a performance and scalability perspective) so that  
> we can make reasonable decisions about that path.
>

Right now Chandler's internal performance is the major factor
inhibiting real mailbox support (10,000+ messages). The Twisted and
Python email 3.0 layers are able to download and convert messages
fast enough to meet the average user's expectations. It is when the
messages are turned into Chandler Mail Stamped Items that things
really slow down, and exponentially so as the number of downloaded
mail messages grows.
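
A minimal sketch of how that growth could be measured, assuming a
hypothetical create_item callable that stands in for whatever turns one
downloaded message into a Mail Stamped Item (it is not a real Chandler
function). It times each batch separately, so a per-batch cost that
keeps rising as the item count grows shows up directly in the numbers.

```python
import time


def time_batches(create_item, total, batch_size):
    """Call create_item(i) `total` times, timing each batch separately.

    Returns a list of (items_created_so_far, seconds_for_this_batch).
    create_item is a hypothetical stand-in for the per-message work.
    """
    results = []
    done = 0
    while done < total:
        n = min(batch_size, total - done)
        start = time.perf_counter()
        for i in range(done, done + n):
            create_item(i)
        elapsed = time.perf_counter() - start
        done += n
        results.append((done, elapsed))
    return results


# Usage with a trivial stand-in workload (appending to a list).
store = []
for count, seconds in time_batches(store.append, 2500, 1000):
    print("%6d items: %.4f s for this batch" % (count, seconds))
```

If the time per batch stays roughly flat, the cost is linear in the
number of messages; if it climbs from batch to batch, the per-item cost
depends on how many items already exist.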

Indexing, Observers, and Merging are all hurting performance terribly.
While observers work great when data changes on a single Item, they
kill performance when many observers get fired for a single downloaded
message. Indexes and Merges are taking as much as 2 minutes each to
complete once 5,000+ messages have been downloaded.
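
One general way to limit observer fan-out is to queue notifications
while an item is being built and fire each distinct observer once at
the end. The sketch below illustrates that idea only; the Notifier
class, the reindex callback, and the attribute names are all
hypothetical and are not Chandler's actual observer API.

```python
from contextlib import contextmanager


class Notifier:
    """Hypothetical observer registry; not Chandler's actual API."""

    def __init__(self):
        self._observers = {}   # attribute name -> list of callbacks
        self._queued = None    # None: fire immediately; list: defer

    def observe(self, attr, callback):
        self._observers.setdefault(attr, []).append(callback)

    def changed(self, attr):
        for callback in self._observers.get(attr, ()):
            if self._queued is None:
                callback()                      # not batching: fire now
            elif callback not in self._queued:
                self._queued.append(callback)   # batching: queue once

    @contextmanager
    def batch(self):
        # Nested batches are not supported in this sketch.
        self._queued = []
        try:
            yield
        finally:
            queued, self._queued = self._queued, None
        # Reached only if the body finished without an exception.
        for callback in queued:
            callback()


calls = []


def reindex():
    calls.append("reindex")


attrs = ("subject", "fromAddress", "body")   # hypothetical attribute names
notifier = Notifier()
for attr in attrs:
    notifier.observe(attr, reindex)

# Unbatched: the same observer fires once per changed attribute.
for attr in attrs:
    notifier.changed(attr)
unbatched = len(calls)

# Batched: the observer fires once for the whole message.
del calls[:]
with notifier.batch():
    for attr in attrs:
        notifier.changed(attr)
batched = len(calls)
```

Here the unbatched pass calls reindex three times and the batched pass
calls it once. Whether the repository can defer observers this way is
an open question, not something the sketch establishes.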

We should have a meeting / discussion about the best next steps. I
personally think the only way to get acceptable performance is to go
one layer lower and manually insert data into our Repo / Berkeley DB.
Whether this is possible or not would require some consulting with
Andi. But basically the idea is to get as low level as possible and
bypass as much of the Python abstraction as we can. This may even
require writing some C code :)

And of course we need to reduce the number of Indexes and Observers,
and thus the amount of Merging, in the Chandler architecture.


Thoughts?

-Brian



> For calendaring performance, we used Mitch's data as a reference
> point. (That is how we arrived at the original 3k event calendar).
> If we again use Mitch as a reference point, adding tasks/notes and
> mail:
>
> Tasks/notes:
> ~ 500 tasks
> ~ 1750 notes
>
> Mail messages:
> 112,549 messages total (~500MB)
> ~ 61,000 archived messages
> ~ 6400 messages in inbox
> ~ 100 folders
>
> And don't forget the calendar:
> 3000+ events
>
> I'm sure we'll find many people with fewer messages and tasks, and
> people with larger mailboxes. For a concrete example to focus on, I
> think Mitch's case is a good one to expand the
> performance/scalability ambitions for Chandler.
>
> For now, we'll stick with the same target hardware:
> http://chandlerproject.org/Projects/PerformanceProject#Target%20Hardware
>
> We're looking for reasonable end user response times on the target
> hardware (keeping in mind that the PPC Mac will be a bit slower,
> and the Intel Mac has the advantage of more memory).
>
> Next steps?
>
> Cheers,
> Katie
>
>
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>
> Open Source Applications Foundation "chandler-dev" mailing list
> http://lists.osafoundation.org/mailman/listinfo/chandler-dev


