[Chandler-dev] Performance, Scalability

Phillip J. Eby pje at telecommunity.com
Tue Oct 9 14:16:49 PDT 2007


At 09:53 AM 10/9/2007 -1000, Brian Kirsch wrote:
>We should have a meeting / discussion as to the best next steps. I
>personally think the only way to get acceptable performance metrics
>is to go one layer lower and manually insert data into our Repo /
>Berkeley DB. Whether this is possible or not would require some
>consulting with Andi. But basically the idea is to get as low level
>as possible and bypass as much of the Python abstraction as we can.
>This may even require writing some C code :)

Or better yet, reusing some:

http://pyinsci.blogspot.com/2007/07/fastest-python-database-interface.html

In fairness, these numbers (up to 100,000 inserts in 3 seconds for
SQLite) are based on smaller records than ours, and fewer indexes.
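
For a sense of what that kind of benchmark exercises, here is a rough
sketch using Python's standard sqlite3 module. The "item" table and
its two columns are invented for illustration (they are not Chandler's
schema), and the sketch makes no claim about timing; it only shows
small records, a single index (the primary key), and one transaction
around the whole batch:

```python
import sqlite3

def bulk_insert(rows):
    """Insert all rows in a single transaction; return the stored row count."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: small records, no indexes beyond the primary key.
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT)")
    with conn:  # commits once at the end instead of once per insert
        conn.executemany("INSERT INTO item (id, title) VALUES (?, ?)", rows)
    count = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]
    conn.close()
    return count

rows = [(i, "item %d" % i) for i in range(100000)]
inserted = bulk_insert(rows)
```

Larger records and additional indexes would each add work per insert,
which is why the figures above should not be assumed to carry over.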

However, this is exactly one of the key problems with our current
persistence strategy!  That is, objects aren't very good at being
database records, precisely *because* they encapsulate quite a lot of
data into a single logical object.  Databases, on the other hand, are
built around "excapsulation" -- splitting data into logically distinct chunks.

When using a database rather than an object store, you have many
options for tuning how the data is laid out, while leaving the
front-end alone.  However, if we try to do that kind of tuning inside
the repository, we end up creating yet *another* layer of abstraction
(i.e. performance drain) between us and the data.

If in the repository, for example, we store an email as a single
Item, then its email-specific data must be accessible, even if we are
just listing it in the detail view, because it is a single item.  With
database-backed storage, we could improve on this by simply putting
triage/dashboard information in one table and email-specific data in
another, without necessarily splitting the Item itself.  To do this
in the repository, on the other hand, we would actually have to have
two different items.
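
The two-table split described above could look roughly like this,
again using Python's standard sqlite3 module. All table, column, and
function names here are hypothetical, chosen only to illustrate the
idea; none of them come from Chandler:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: one logical Item, stored across two tables.
    CREATE TABLE item_summary (
        item_id       INTEGER PRIMARY KEY,
        title         TEXT,
        triage_status TEXT
    );
    CREATE TABLE email_detail (
        item_id INTEGER PRIMARY KEY REFERENCES item_summary(item_id),
        sender  TEXT,
        body    TEXT
    );
""")

with conn:
    conn.execute("INSERT INTO item_summary VALUES (1, 'Lunch?', 'now')")
    conn.execute(
        "INSERT INTO email_detail VALUES (1, 'brian@example.com', 'Noon works.')"
    )

def list_dashboard(conn):
    """Listing reads only the summary table; email bodies are never touched."""
    return conn.execute(
        "SELECT item_id, title, triage_status FROM item_summary ORDER BY item_id"
    ).fetchall()

def load_email(conn, item_id):
    """Opening one item joins in the email-specific columns on demand."""
    return conn.execute(
        "SELECT s.title, e.sender, e.body"
        " FROM item_summary s JOIN email_detail e ON e.item_id = s.item_id"
        " WHERE s.item_id = ?",
        (item_id,),
    ).fetchone()

dashboard = list_dashboard(conn)
email = load_email(conn, 1)
```

The front-end still sees one Item keyed by item_id; only the storage
layout changes, which is the kind of tuning that is hard to do when
the item itself is the unit of storage.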


