[Cosmo-dev] Re: [service-dev] Deployment of Cosmo persistence
load distribution
Brian Moseley
bcm at osafoundation.org
Wed Jul 19 10:37:10 PDT 2006
On 7/19/06, Jared Rhine <jared at wordzoo.com> wrote:
> Indeed. While it may seem distant, the associated issues feel close to me
> for the Hosted Service. If you can't identify any Cosmo features in the
> Beta timeframe that will require cross-partition operations, we can put off
> the issue somewhat.
i can't even tell you what features will be in 0.5, much less beta :)
i hope that acls get into cosmo before beta, but given the recent
threads on the design list, i'm not holding my breath. i doubt
cross-calendar search will show up.
> If Cosmo does need it, I'd like to "think a lot more
> about" the issues now (thus this thread). I've seen tiny issues cause
> switches from distributed to centralized architectures and back again, so
> this stuff can get tricky quick. And then you prototype and find out why
> one's model doesn't work ;)
i don't want to minimize this issue, because it's important, but i
disagree that it's a gating factor in the decision-making process. if
we try to solve every imaginable scalability problem now, we're going
to paralyze ourselves and not make any progress. this is an issue of
high importance, and it's been on my radar since day one. let's
recognize that and work on the more pressing issues first.
> As a program management note, I'd like the Hosted Service to be running on
> the "final" deployment architecture by a couple months before Alpha ends.
> If you ask me to defer the planning in these areas until after a Hibernate
> solution is stabilized, then time starts to get pretty tight.
i have never seen dates associated with alpha and beta. nobody's even
defined alpha and beta for me. i certainly have no idea what
requirements we have for those milestones. so forgive me if i smile
and nod whenever anybody talks about them.
> Well wait a sec. You grant everything's happening behind a single IP
> address. If app server APP1-6 talks to database DB1 (users a-k) and APP7-12
> talks to DB2 (l-z) and an incoming Atom request comes in (for user "zebra"),
> what happens?
>
> It certainly appears to me that I need to get that routed to say APP8. How's
> that happen with a stateless, Basic-authed request?
sorry, i said the wrong thing earlier. the app server layer should not
be tied to a particular database. app servers 1-12 should be able to
talk to both database servers. each app server has the logic for
extracting user id info from the request and can ask the service
locator for a data service url for that user id.
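
A minimal sketch of that lookup, assuming HTTP Basic auth and the a-k / l-z split from the quoted example. The function names and the `db://` URLs are hypothetical and not taken from Cosmo; the locator is passed in so it can be swapped out.

```python
import base64
from typing import Callable, Mapping


def user_id_from_basic_auth(headers: Mapping[str, str]) -> str:
    """Pull the username out of an HTTP Basic Authorization header."""
    scheme, _, encoded = headers["Authorization"].partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("expected a Basic Authorization header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    username, _, _password = decoded.partition(":")
    if not username:
        raise ValueError("empty username")
    return username


def locate_by_first_letter(user_id: str) -> str:
    """Hypothetical locator: users a-k live on DB1, everyone else on DB2."""
    return "db://DB1/cosmo" if user_id[0].lower() <= "k" else "db://DB2/cosmo"


def data_service_url(
    headers: Mapping[str, str],
    locate: Callable[[str], str] = locate_by_first_letter,
) -> str:
    """What any app server can do per request, whichever one the sprayer picked."""
    return locate(user_id_from_basic_auth(headers))
```

Because nothing here depends on which app server handles the request, APP4 and APP9 would both resolve user "zebra" to the same database.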
> If the IP sprayer is configured with APP1-12 in its set, then zebra's Atom
> request may get routed to APP4 and thus DB1 where the zebra user does not exist.
>
> Here's another model: instead of "n cosmo instances per database server", we
> have "n cosmo instances for m database servers". On a stateless Atom request
> for zebra, the IP sprayer picks a random app server, say APP9. APP9 takes
> the username/password submitted, looks up the account in an LDAP server,
> confirms the password is right, and retrieves the database server to use
> from the LDAP user profile (DB2).
>
> APP9 then connects to DB2 (probably using an existing connection in a
> database pool), does its SQL via Hibernate, and away it goes. Next request
> gets sent to APP4 instead and the same process happens.
heh, yeah. what i said a few paragraphs ago, before reading these ones :)
another trick we pulled at cpth, and that i've seen done in several
other places, is to delegate the task of database connection to a
proxy. the proxy maintains pools of connections to each database
server, and the app server pools connections to the proxy (which is
probably located on localhost). this saves the app server from having
to know anything about service location - that logic can be built into
the db proxy. all app servers can continue to use a default
configuration, whereas the individual proxies can be configured
separately per hardware cluster or data center.
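
A rough sketch of the proxy idea, with everything simulated in-process: a "connection" is just a labelled string, and the class names and routing rule are assumptions made for illustration. A real proxy would be a separate process speaking the database wire protocol.

```python
import queue
from typing import Callable, Dict


class ConnectionPool:
    """A tiny fixed-size pool of stand-in connections to one backend."""

    def __init__(self, backend: str, size: int) -> None:
        self.backend = backend
        self._free = queue.Queue()
        for i in range(size):
            self._free.put(f"{backend}-conn{i}")

    def acquire(self) -> str:
        # Raises queue.Empty if every connection is checked out.
        return self._free.get_nowait()

    def release(self, conn: str) -> None:
        self._free.put(conn)


class DbProxy:
    """Holds one pool per database server and owns the routing decision."""

    def __init__(self, pools: Dict[str, ConnectionPool],
                 route: Callable[[str], str]) -> None:
        self._pools = pools
        self._route = route

    def run(self, user_id: str, work: Callable[[str], str]) -> str:
        pool = self._pools[self._route(user_id)]
        conn = pool.acquire()
        try:
            return work(conn)
        finally:
            pool.release(conn)  # always hand the connection back


# Per-cluster configuration lives here, not in the app server.
proxy = DbProxy(
    pools={"DB1": ConnectionPool("DB1", 2), "DB2": ConnectionPool("DB2", 2)},
    route=lambda user_id: "DB1" if user_id[0].lower() <= "k" else "DB2",
)
```

The app server only ever calls `proxy.run(user_id, work)`, so its own configuration stays the same across clusters while the `pools` and `route` arguments vary per deployment.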
> For web UI stuff, the IP sprayer would pick a random server, say APP11.
> Tomcat instantiates the web session state from the cluster's distributed
> session cache. The IP sprayer might direct subsequent requests to the same
> APP11, but any other server could pick up mid-stream if APP11 died or the IP
> sprayer just decided to switch app servers.
>
> Apologies if I misunderstood your response or confused the issue with
> my questions. (I'm thinking the above is pretty much what you had in mind too.)
yep.
> It should be noted that other deployment models should not be excluded by
> anything like above; I'm not saying "LDAP only" or something.
yeah, that's where having a service locator interface is handy. folks
can plug in their own implementations - ldap, dns, what have you.
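
One way such an interface could look, as a sketch only. `ServiceLocator` and both implementations are hypothetical names; the dictionary stands in for a directory lookup such as an LDAP profile attribute, and the naming-convention class stands in for a DNS-style scheme. Neither touches a real LDAP or DNS server.

```python
from typing import Dict, Protocol


class ServiceLocator(Protocol):
    """Anything that can map a user id to a data service address."""

    def locate(self, user_id: str) -> str: ...


class DirectoryLocator:
    """Stand-in for a directory lookup (e.g. a field on an LDAP user profile)."""

    def __init__(self, assignments: Dict[str, str], default: str) -> None:
        self._assignments = assignments
        self._default = default

    def locate(self, user_id: str) -> str:
        return self._assignments.get(user_id, self._default)


class NamingConventionLocator:
    """Stand-in for a DNS-style scheme: derive a host name from the user id."""

    def __init__(self, domain: str) -> None:
        self._domain = domain

    def locate(self, user_id: str) -> str:
        return f"{user_id[0].lower()}.{self._domain}"


def describe_route(locator: ServiceLocator, user_id: str) -> str:
    """Calling code depends only on the interface, not on the implementation."""
    return f"{user_id} -> {locator.locate(user_id)}"
```

A deployment would choose one implementation at startup and pass it wherever a `ServiceLocator` is expected.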
> > we can cluster tomcat so that web session state is shared among all cosmo
> > instances...
>
> APP1-12 would share session state, or just APP1-6?
1-12.
> I'd prefer to have session state migrated as needed between a cluster of
> identical app servers. It'd be helpful if someone could vouch for Tomcat's
> session state clustering good behavior in the real world. I can't
> personally testify.
i can't either, but it's been in the product for a good long while. i
can dig up some testimonials if folks require.
we can also consider switching to any other servlet container or app
server to get better clustering. there's nothing specifically tying
cosmo to tomcat.
> Granted and not disputed. Or at least we're more in sync after discussing
> stateless Cosmo a bit more. The stateful/stateless issue clearly makes a
> big difference to deployment issues and we can't yet predict all the nuances
> of trying to keep a stateless Cosmo. But if you're saying in this thread
> that to the best of your understanding, it should be possible to go
> stateless, and that it's the committer's intent to try to preserve that
the only web session state that i can imagine keeping around would be
related to ui presentation (which node of a tree control is open, what
is the last visited page, that sort of thing). stuff that can be lost
with minimal user impact.
> behavior, I'll in good faith start to design deployment around that. (While
> still identifying the risk that you might be wrong in your assessment or
> ability to implement same.) With this case, yes I would state that a
> Cosmo/Scooby merge would not be reasonably objected to on a load
> distribution basis.
cool :)