[Cosmo-dev] Local versus remote MySQL server
Jared Rhine
jared at wordzoo.com
Wed Oct 11 17:32:44 PDT 2006
To determine if Cosmo is currently fast enough to launch for Preview on a
single machine, it was important to determine what real-world performance
change is possible when the MySQL backend is moved to a separate server.
This benchmarking has been done against Cosmo r2659. The results can be
seen here:
http://osaf.us/perf-test/20061011170115
The bottom line is that we cannot expect performance improvements by
launching the Hosted Service in a "2 box" configuration (separate app and db
servers) versus a "1 box" config (just one big box with app and db together).
Previous tests have been run against MySQL, where the MySQL server is on the
same box. One might expect that overall performance would go up if you
threw a whole second machine in the mix, and ran MySQL on a separate box
(thereby distributing the load partially).
This turns out not to be the case. For local MySQL, we get 28.4 txns/sec;
for remote/separate MySQL, 31.0 txns/sec. Pretty much a wash (other runs
came out slightly slower). So if we find about 30 txns/sec too slow for what
we need, we're likely to get the most bang for our buck by buying faster
CPUs. Doubling the CPU performance of this particular box would probably
take us from $5,500 per server to about $18k, but it could be done.
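
To make the cost trade-off concrete, here is a rough Python sketch of the
dollars-per-txn/sec arithmetic. The server prices and the measured rates come
from the numbers above; the assumption that throughput doubles when CPU
performance doubles is mine and has not been tested.

```python
def cost_per_tps(server_cost_usd, txns_per_sec):
    """Hardware dollars spent per transaction/second of capacity."""
    return server_cost_usd / txns_per_sec

# One $5,500 box at roughly 30 txns/sec (figures from this thread).
one_box = cost_per_tps(5_500, 30.0)

# Two $5,500 boxes (app + db) at the measured 31.0 txns/sec.
two_box = cost_per_tps(2 * 5_500, 31.0)

# One ~$18k box. ASSUMPTION: doubling CPU performance doubles throughput.
fast_box = cost_per_tps(18_000, 2 * 30.0)

for name, value in [("1 box", one_box), ("2 box", two_box), ("fast box", fast_box)]:
    print(f"{name}: ${value:.0f} per txn/sec")
```

Under that assumption, the faster box costs more per txn/sec than the current
single box, but less than adding a second box that buys almost no throughput.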
Depending on the mix tested, performance was between 5% faster and 10%
slower when using a remote MySQL versus "local" (MySQL on the same box).
(When slower, it's from expensive operations like ATOM GETs and webcal GETs).
The app server and database servers being tested are essentially identical:
4 Opteron cores, same fast RAID-5 disks, lots of memory. All boxes are
otherwise underused. All boxes are plugged into the same gigabit switch, so
networking latency is low and consistent. There is no swapping occurring.
The app server spends 2-15% of its time idle, so it is well-utilized. I
believe these are very fair tests.
These additional changes have been made:
+ The testing mix has been tuned quite a bit. iCal GETs, ATOM GETs,
JSON-RPC, and ICS GETs are all now included and should be much more
representative of real-world usage. In particular, the test roughly mirrors
the "load scaling model" outlined at <http://osaf.us/project/scaling-model>.
This will be used in upcoming "how many users?" discussions.
+ The tests are run with 25 processes, each with only one Python thread. In
comparing 10 procs with 5 threads each versus 50 procs with 1 thread each, I
found you needed the extra processes to saturate the CPU of the app server.
With a hybrid "let Python do some of the threading", the app server was
80%+ idle.
+ A powerful 3rd machine was used for generating the test load (also plugged
into the same switch). This means that the app server can devote 100% of
its time to processing Cosmo requests; there's no run-the-test-scripts
overhead. No significant difference measured as a result, but it solidifies
the case that Cosmo is being provided every last drop of a very fast machine
for testing.
+ I picked a concurrency level of 25 processes to keep PUTs in the range of
1.5 seconds maximum. A concurrency level of 40 processes generates average
PUT times of about 2.5 seconds. (I'll formally propose this 1.5sec PUT
threshold in another thread.)
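
For readers who want to picture the "many processes, one thread each" layout
described above, here is a minimal Python sketch. It is not the actual test
harness: fake_transaction() is a hypothetical stand-in for one request from
the test mix, and all function names here are illustrative.

```python
import multiprocessing as mp
import time

def fake_transaction():
    """Hypothetical placeholder for one request against the app server."""
    time.sleep(0.001)

def worker(duration_sec):
    """Issue transactions back to back in a single thread; return the count."""
    deadline = time.monotonic() + duration_sec
    count = 0
    while time.monotonic() < deadline:
        fake_transaction()
        count += 1
    return count

def txns_per_sec(counts, duration_sec):
    """Aggregate throughput across all worker processes."""
    return sum(counts) / duration_sec

def _worker_to_queue(duration_sec, results):
    results.put(worker(duration_sec))

def run_load(n_procs=25, duration_sec=1.0):
    """Start n_procs single-threaded processes and report combined txns/sec."""
    results = mp.Queue()
    procs = [mp.Process(target=_worker_to_queue, args=(duration_sec, results))
             for _ in range(n_procs)]
    for p in procs:
        p.start()
    counts = [results.get() for _ in procs]  # collect results before joining
    for p in procs:
        p.join()
    return txns_per_sec(counts, duration_sec)

if __name__ == "__main__":
    print(run_load(n_procs=25, duration_sec=1.0))
```

Using separate processes rather than threads sidesteps contention inside a
single Python interpreter on the load generator, which fits the observation
that the hybrid threaded setup left the app server mostly idle.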
This may be a lot of detail, but it's an important step to understanding
what overall performance we can expect out of Cosmo. The formula is
essentially "X target number users -> D txns/day -> N txns/sec. Server S
tested at M txns/sec, so is N greater than M? If so, we're undersized."
This thread defines a value for M, and I believe the pieces are in place to
answer D and N. Working backwards, let's see if we're happy with the
resulting value X.
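
As a rough illustration of that formula, here is a Python sketch. M is the
local-MySQL figure measured above; the user count, the per-user transaction
rate, and the peak_factor knob are hypothetical placeholders, not values from
the scaling model.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def required_txns_per_sec(users, txns_per_user_per_day, peak_factor=1.0):
    """X users -> D txns/day -> N txns/sec (scaled by an optional peak factor)."""
    txns_per_day = users * txns_per_user_per_day  # D
    return txns_per_day / SECONDS_PER_DAY * peak_factor  # N

def is_undersized(required_tps, measured_tps):
    """True when the required rate N exceeds the measured rate M."""
    return required_tps > measured_tps

def max_users(measured_tps, txns_per_user_per_day, peak_factor=1.0):
    """Work backwards from M to the largest X one server can carry."""
    return int(measured_tps * SECONDS_PER_DAY
               / (txns_per_user_per_day * peak_factor))

M = 28.4  # txns/sec, measured with local MySQL in this thread

# HYPOTHETICAL inputs, for illustration only.
N = required_txns_per_sec(users=10_000, txns_per_user_per_day=100)
print(f"N = {N:.1f} txns/sec, undersized: {is_undersized(N, M)}")
print(f"max users at M: {max_users(M, txns_per_user_per_day=100)}")
```

A flat daily average understates peak load, so any real sizing would want a
peak_factor greater than 1; the right value is something the scaling model
would have to supply.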
-- Jared