[Cosmo-dev] Re: Cosmo Question
Brian Moseley
bcm at osafoundation.org
Wed Jul 5 15:48:31 PDT 2006
(cc'ing to cosmo-dev, because everybody will benefit from this info)
On 7/5/06, rletness at simdeskcorp.com <rletness at simdeskcorp.com> wrote:
> I'm currently evaluating the Cosmo server in the hopes that we can
> leverage its CalDAV functionality. I really like what you have put
> together and it seems to work great in the limited testing I've done.
> My only concern is performance and scalability. It looks like
> jackrabbit is the limiting piece that prevents the server from scaling.
> Is this accurate? Because there is no cluster support in jackrabbit,
> the repository access is limited to a single instance, right? You can
> partition the repository, but if any of the repositories fail you lose
> access to that partition unlike a traditional clustered deployment. I'm
> sure the jackrabbit guys will figure out clustering at some point but I
> was wondering where performance and scalability are on the Cosmo
> roadmap (are you waiting on jackrabbit improvements or are there other
> areas you are looking into). I'm sure you have thought about it a lot
> and wondered if you had any insights.
yep, there are two main issues:
1) it's not currently practical to access jackrabbit remotely, which
means that every cosmo instance has its own embedded repository which
owns its own data.
2) even if one did access jackrabbit remotely, as you pointed out,
jackrabbit isn't clusterable.
we haven't yet defined what feature set cosmo 1.0 will have. there are
still a lot of decisions to be made inside osaf. what i can tell you is
that we certainly won't make any guarantees about scalability or
performance until we are feature complete; in other words, near the
end of the release cycle.
we're most concerned with developing features right now, because our
primary task is to provide the capabilities required by the entire
"chandler ecosystem", and secondarily to implement interoperability
standards. i think this focus will eventually shift to making the
server stable and fast for more than a few users - the numbers we've
been working with for a while are 10,000 users, each with a
1,000-event calendar, per instance.
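
to put the per-instance target above in concrete terms, here is a rough back-of-the-envelope sketch. the user and event counts come from this message; the one-calendar-per-user reading and the per-event size (`ASSUMED_BYTES_PER_EVENT`) are illustrative assumptions only, not measured cosmo or jackrabbit figures.

```python
# Back-of-the-envelope sizing for the per-instance target quoted above.
USERS_PER_INSTANCE = 10_000       # from the message
EVENTS_PER_CALENDAR = 1_000       # from the message
ASSUMED_BYTES_PER_EVENT = 2_048   # assumption: illustrative size, not measured


def events_per_instance(users: int, events_per_calendar: int) -> int:
    """Total events one instance holds, assuming one calendar per user."""
    return users * events_per_calendar


def raw_storage_gib(total_events: int, bytes_per_event: int) -> float:
    """Raw payload size in GiB, ignoring indexes and repository overhead."""
    return total_events * bytes_per_event / 2**30


total_events = events_per_instance(USERS_PER_INSTANCE, EVENTS_PER_CALENDAR)
storage_gib = raw_storage_gib(total_events, ASSUMED_BYTES_PER_EVENT)

print(f"events per instance: {total_events:,}")
print(f"raw event payload:   {storage_gib:.1f} GiB (under the assumed event size)")
```

so the target works out to ten million events per instance; the storage figure moves linearly with whatever per-event size actually applies.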
> Also, I like the JCR model Cosmo uses, but I think the relative newness
> of the technology is cause for some concern. For instance, it's not
> trivial to look at your data once it's stored in the repository (custom
> binary serialization). Also, all querying is done in the same process
> instead of offloading it to a DB server. And the more child nodes
> (events) you have, the more performance degrades, because the entire node
> state (includes all child nodes) has to be persisted each time. So in Cosmo
> this translates to a performance degradation when dealing with calendars
> with lots of events. I'm sure all of this will improve in time as JCR
> becomes more accepted and super optimized commercial implementations
> become available. I understand you guys are focusing on providing a
> functional piece of software and are not too concerned with performance
> and scalability right now. Any insight you have on the issue is greatly
> appreciated.
i agree with every point you've made. additionally, jackrabbit doesn't
provide mature management tools for backups and other operational
tasks.
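
the child-node point in the quoted text can be illustrated with a toy cost model. this is only a sketch of the behavior as described above (the parent's state, including every child entry, is rewritten on each add); it is not a measurement of jackrabbit, and both function names are hypothetical.

```python
def writes_if_parent_state_rewritten(num_events: int) -> int:
    """Child entries written in total when every add re-persists the
    parent's full child list (toy model of the behavior described above)."""
    total = 0
    for children_after_add in range(1, num_events + 1):
        # Adding the k-th event rewrites all k child entries.
        total += children_after_add
    return total


def writes_if_children_stored_independently(num_events: int) -> int:
    """Child entries written when each add persists only the new child,
    e.g. one row per event in a relational table (assumed comparison)."""
    return num_events


for n in (10, 100, 1_000):
    print(n,
          writes_if_parent_state_rewritten(n),
          writes_if_children_stored_independently(n))
```

under this model, filling a 1,000-event calendar costs 500,500 child-entry writes instead of 1,000: quadratic rather than linear growth, which matches the slowdown described for large calendars.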
it's not beyond the realm of possibility that we'll choose to
re-implement cosmo using an rdbms directly. i'll be talking more about
this in the weeks to come. if anybody has thoughts on the matter, i'd
love to hear them.
i don't know if you'll be at oscon this month, but these issues are
part of the cosmo & scooby talk we're giving there.