[Cosmo-dev] Longer-running test results
Jared Rhine
jared at wordzoo.com
Mon Oct 2 12:12:08 PDT 2006
The threaded tests I've been developing in the osaf.us sandbox are
yielding good results. The threaded testing uncovered race-condition
style bugs in the last-available JCR version of Cosmo, but I haven't
gotten any unexpected errors or 500s in what is now hours and hours of
highly concurrent testing. Congrats.
There are two ways to look at the resulting data: as individual
operation times, or as an overall "average mix". The average mix is my
current focus, since it will help estimate expected loads for Preview
and 1.0.
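For anyone who wants to see how the two views relate, here is a minimal
sketch of a threaded mix runner. It is not the actual sandbox agent: the
operation names, the weights in MIX, and the fake_request stub are all
placeholders I made up for illustration. It records per-operation times
and also reports one overall ops/sec figure for the mix.

```python
import random
import threading
import time
from collections import defaultdict

# Hypothetical mix: operation names and weights are placeholders,
# not the mix used in the osaf.us sandbox.
MIX = {"GET": 0.50, "PUT": 0.30, "PROPFIND": 0.15, "DELETE": 0.05}


def fake_request(op):
    """Stand-in for a real HTTP call to the server under test."""
    time.sleep(0.001)


def worker(deadline, timings, lock, rng):
    """Pick operations according to MIX until the deadline passes."""
    ops, weights = list(MIX), list(MIX.values())
    while time.perf_counter() < deadline:
        op = rng.choices(ops, weights=weights)[0]
        started = time.perf_counter()
        fake_request(op)
        elapsed = time.perf_counter() - started
        with lock:
            timings[op].append(elapsed)


def run_mix(threads=25, duration=1.0, seed=0):
    """Return (overall ops/sec, {op: mean seconds}) for one run."""
    timings, lock = defaultdict(list), threading.Lock()
    start = time.perf_counter()
    deadline = start + duration
    pool = [
        threading.Thread(
            target=worker,
            args=(deadline, timings, lock, random.Random(seed + i)),
        )
        for i in range(threads)
    ]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    wall = time.perf_counter() - start
    total = sum(len(v) for v in timings.values())
    per_op = {op: sum(v) / len(v) for op, v in timings.items()}
    return total / wall, per_op
```

Calling run_mix(threads=25, duration=60.0) with a real request function
in place of the stub would give one "average mix" number plus a mean
time per operation type from the same run.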
Bottom line, given the "average mix" I'm currently running, we see a
total average operations/second of:
Last JCR: 3.4 ops/sec
r2573: 12.0 ops/sec
So our Hibernate+MySQL version is about 3.5x faster (12.0 / 3.4) than
the last JCR version. Details here: http://osaf.us/perf-test/20061001133930
This is not on a fully loaded server. I can crank the parallelism up
from the current 25 threads to 250 threads, until the box is fully
pegged at 100%, but at that point the PUT times are in the 20+ second
range, beyond what I think we'd find acceptable.
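As a rough sanity check on those numbers, Little's law relates
concurrency, throughput, and average time per operation. The sketch
below makes two assumptions the figures above don't confirm: that each
thread issues its next request immediately (no think time), and that
the 12.0 ops/sec figure comes from the 25-thread run. The second call
treats 20 seconds as the average across all operations, which is only
a loose stand-in since the 20+s figure is for PUTs alone.

```python
def mean_latency(threads, ops_per_sec):
    """Little's law for a closed loop with no think time: R = N / X."""
    return threads / ops_per_sec


def throughput(threads, latency_sec):
    """Same law rearranged: X = N / R."""
    return threads / latency_sec


# 25 threads sustaining 12.0 ops/sec implies about 2 s per operation.
print(round(mean_latency(25, 12.0), 2))  # 2.08

# 250 threads at ~20 s per operation is still only ~12.5 ops/sec.
print(round(throughput(250, 20.0), 1))   # 12.5
```

If those assumptions hold even roughly, adding threads beyond some
point mostly adds waiting time rather than throughput.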
Among other findings, something is (probably) going on with delete
collection at higher levels of concurrency; I've opened #6896 for
investigation, though the priority is low. Surprisingly, I'm also seeing
longer PUT times (2x vs JCR), though that may just be because Cosmo is
supporting 4x as many parallel operations.
Next steps include:
+ publishing side-by-side results of loads at different levels of
concurrency to examine in detail at what load things start to go
downhill for a Cosmo server
+ refining the "expected average mix" of transactions we expect to see,
and tweaking the testing agent to match that mix until it's as
reasonable/realistic as we can make it without more real-world usage
experience (see the osaf.us sandbox for details on the current mix, or
ask)
+ reaching broad consensus on a specific target number of users doing a
specific target mix of operations for Preview, and highest acceptable
transaction times for those ops
+ and finally, working from our target load and the real-world
testing we're doing against a single server to get 1) a clear picture
of whether we *must* have multiple app server and/or multiple db server
support before Preview to support the agreed load; and 2) an estimate
of how many servers might be needed (assuming something less than 100%
scaling as more boxes are added).
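The last step above could be sketched as follows. This is only one
possible model, not an agreed method: it assumes each additional server
contributes a fixed fraction (efficiency) of what the previous one
added, and both the efficiency value and the target load in the usage
note are invented for illustration. Only the 12.0 ops/sec single-server
figure comes from the results above.

```python
def cluster_capacity(servers, per_server_ops, efficiency):
    """Total ops/sec if the i-th server adds per_server_ops * efficiency**i."""
    return sum(per_server_ops * efficiency ** i for i in range(servers))


def servers_needed(target_ops, per_server_ops, efficiency, max_servers=64):
    """Smallest server count that meets target_ops, or None if unreachable.

    With efficiency < 1 the total capacity is bounded, so some targets
    can never be met by adding boxes; max_servers caps the search.
    """
    for n in range(1, max_servers + 1):
        if cluster_capacity(n, per_server_ops, efficiency) >= target_ops:
            return n
    return None
```

For example, servers_needed(30.0, 12.0, 0.8) returns 4 under this
model, while the same target with perfect scaling (efficiency 1.0)
would need 3. Real numbers would have to come from multi-server
testing.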
--
Jared Rhine <jared at wordzoo.com>