[Cosmo-dev] prototyping cosmo with jdbc instead of jcr

Brian Moseley bcm at osafoundation.org
Fri Jul 7 14:41:15 PDT 2006


a few threads over recent weeks have approached the subject of
jackrabbit's scalability and operational challenges. most agree that
jcr has a lot of promise and seems to be a good fit with a
webdav-oriented hierarchical data store, but it's clear that
jackrabbit has a long way to go before it can efficiently scale to the
numbers of users, calendars and events we're targeting with cosmo.

i propose to build a prototype of cosmo that stores its data directly
into a relational database using hibernate to map the cosmo domain
model to the relational schema. see <http://hibernate.org/> for more
info on hibernate.

the advantages to such an architecture are many. with more control
over how our data is stored, we can optimize for our most troublesome
use cases (PROPFIND against a thousand-event calendar, for example).
by replacing the state-heavy but non-clusterable, mainly non-remotable
jackrabbit with a more traditional j2ee-style webapp talking to an
rdbms, we achieve horizontal scalability at both server tiers. we
can take advantage of mature, professional-quality management tools
for the app server and db server. and we don't have a "black box"
component that few people understand, which reduces the number of
head-scratching bugs or performance problems that typically require a
large dose of experience with the software to resolve.

i do see a couple of disadvantages as well. cosmo is currently stateless,
which is the holy grail for the app server tier, but we may need to
introduce some user state in order to effectively cope with situations
like concurrent writes to the same calendar. it's unclear whether or
not this will be needed, but even if it is, app server clustering (and
cache clustering) is not only possible but is a well-known technology
area with lots of mature solutions. another potential disadvantage is
that we'll eventually have to implement resource locking, which we get
for free with jcr, but i don't think that's a particularly difficult
problem to solve.
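
to give a feel for the size of the locking problem, here is a rough
sketch of exclusive write locks keyed by resource path, with a token
and a timeout. everything in it (the ResourceLockManager class and its
lock, canWrite and unlock methods) is hypothetical and made up for
illustration; none of it is existing cosmo or jcr-server code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// hypothetical sketch: exclusive write locks keyed by resource path.
// the caller passes the current time in so the timeout logic is easy to test.
class ResourceLockManager {
    private static final class Lock {
        final String token;
        final long expiresAtMillis;

        Lock(String token, long expiresAtMillis) {
            this.token = token;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Lock> locks = new HashMap<>();

    // returns a new lock token, or null if an unexpired lock already exists
    synchronized String lock(String path, long timeoutMillis, long nowMillis) {
        Lock existing = locks.get(path);
        if (existing != null && existing.expiresAtMillis > nowMillis) {
            return null;
        }
        String token = "opaquelocktoken:" + UUID.randomUUID();
        locks.put(path, new Lock(token, nowMillis + timeoutMillis));
        return token;
    }

    // a write is allowed if the path is unlocked, the lock has expired,
    // or the caller presents the matching token
    synchronized boolean canWrite(String path, String token, long nowMillis) {
        Lock existing = locks.get(path);
        if (existing == null || existing.expiresAtMillis <= nowMillis) {
            return true;
        }
        return existing.token.equals(token);
    }

    // releases the lock only if the token matches
    synchronized boolean unlock(String path, String token) {
        Lock existing = locks.get(path);
        if (existing == null || !existing.token.equals(token)) {
            return false;
        }
        locks.remove(path);
        return true;
    }
}
```

this in-memory version only shows the token and timeout logic. with
more than one app server the lock table would presumably have to live
in the database or a clustered cache instead of a local map.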

a side benefit of this prototype would be a refactoring of the
jcr-server library that we use to implement the server side webdav
protocol handling. this would be required in order to replace the
current jcr implementation of webdav and caldav with the hibernate
implementation. the refactoring will make the protocol layer code a
lot easier to understand and explain, and even more importantly, it
will become unit testable (unit-level testing is impossible today).
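
as a rough illustration of what "unit testable" could mean here: if
the protocol code talks to a small storage interface rather than to
jcr directly, a test can plug in an in-memory fake and a hibernate
implementation can come later. the names below (ResourceStore,
InMemoryResourceStore, PropfindHandler) are hypothetical and are not
the current jcr-server api; the handler only computes which hrefs a
PROPFIND response would cover, not the real xml.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// hypothetical storage abstraction the protocol layer would depend on
interface ResourceStore {
    boolean isCollection(String path);
    List<String> childPaths(String collectionPath);
}

// in-memory fake for unit tests; a hibernate-backed class could
// implement the same interface
class InMemoryResourceStore implements ResourceStore {
    private final Map<String, List<String>> children = new TreeMap<>();

    void addCollection(String path) {
        children.putIfAbsent(path, new ArrayList<>());
    }

    void addResource(String collectionPath, String name) {
        addCollection(collectionPath);
        children.get(collectionPath).add(collectionPath + "/" + name);
    }

    public boolean isCollection(String path) {
        return children.containsKey(path);
    }

    public List<String> childPaths(String collectionPath) {
        List<String> found = children.get(collectionPath);
        if (found == null) {
            return Collections.<String>emptyList();
        }
        return new ArrayList<>(found);
    }
}

// protocol-side logic with no knowledge of how resources are stored
class PropfindHandler {
    private final ResourceStore store;

    PropfindHandler(ResourceStore store) {
        this.store = store;
    }

    // depth 0 covers the collection itself; depth 1 adds its members.
    // an unknown path yields an empty list (a real handler would answer 404).
    List<String> hrefs(String collectionPath, int depth) {
        List<String> result = new ArrayList<>();
        if (!store.isCollection(collectionPath)) {
            return result;
        }
        result.add(collectionPath);
        if (depth >= 1) {
            result.addAll(store.childPaths(collectionPath));
        }
        return result;
    }
}
```

a test would build an InMemoryResourceStore, add a calendar with a few
events, and check the handler's output without starting a repository
or a servlet container.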

my plan is to have something functional within the next couple of
weeks that we can use to make further decisions on our architectural
direction. i'm also going to continue exploring the problem areas of
our current jackrabbit usage, in case there are more tweaks or tuning
knobs that we haven't found yet.

thoughts on this proposal? concerns? think i'm nuts?

