[Cosmo-dev] testing data migration for cosmo 0.6
Mikeal Rogers
mikeal at osafoundation.org
Mon Dec 4 13:50:15 PST 2006
> I hadn't thought about unit testing for the migration script as it's
> just a SQL script. It's a good idea though, but this would require
> a db and would probably have to be some custom test SQL.
You'll need to be careful about this. All unittests must be run
before checkin so we don't want them to take too long. Brian had a
collection of functional tests that ran inside the general JUnit
framework and took far too long to run because they had to set up
various environments.
I'm in favor of these tests but if they take 10 minutes to run then
we'll need to break them off of the regular unittests.
>
> The functional tests definitely should be run. Even though they
> don't test the migrated data, they test the schema changes. You
> could also create some scripts that verify that the correct number
> of records are in each table given the state of the data before
> migration.
>
We'll run the functional tests pre-migration, then test the migrated
data manually. We can also run the functional tests again, after
removing the users, against a migrated repository.
>
> There's a bunch of tests you could do manually, but just verifying
> that the 0.5 data is viewable/updateable using 0.6 should be a good
> start. Also, a test that is a must is verifying that the db schema
> from a clean 0.6 install is exactly the same as a migrated 0.5 db
> (table names/field names/indexes/constraints/etc). There should be
> some db tools out there that do a diff of two db instances, which
> would make this pretty easy.
There have to be tools for doing SQL diffs. This seems like a very
easy test to automate and is probably the first that we should target
since it's the simplest identifier of an imperfection in the
migration script.
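As a rough sketch of what an automated schema comparison could look
like: the code below uses SQLite from the Python standard library purely
as a stand-in, since the actual Cosmo database and any real diff tool
are not named in this thread. The function names schema_snapshot and
schema_diff are hypothetical. It compares the stored definitions of
tables and indexes textually, so two equivalent schemas written
differently would still be reported as different.

```python
import sqlite3


def schema_snapshot(conn):
    """Return {(type, name): normalized SQL} for each table and index.

    Assumes a SQLite connection; another database would need its own
    catalog query here.
    """
    rows = conn.execute("SELECT type, name, sql FROM sqlite_master").fetchall()
    snapshot = {}
    for kind, name, sql in rows:
        if name.startswith("sqlite_"):
            continue  # skip SQLite's internal objects
        # Collapse whitespace so formatting alone is not a difference.
        snapshot[(kind, name)] = " ".join((sql or "").split())
    return snapshot


def schema_diff(clean, migrated):
    """Return a sorted list of differences between two snapshots."""
    problems = []
    for key in sorted(set(clean) | set(migrated)):
        if key not in migrated:
            problems.append("missing after migration: %s %s" % key)
        elif key not in clean:
            problems.append("unexpected after migration: %s %s" % key)
        elif clean[key] != migrated[key]:
            problems.append("definition differs: %s %s" % key)
    return problems
```

An empty list from schema_diff would mean the clean 0.6 install and the
migrated 0.5 database have the same table and index definitions.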
> I talked with Mikeal extensively last Thursday about possible
> approaches to testing automation more appropriate to testing migrated
> data. Two I can remember:
>
> 1) Testing scripts which keep track of the expected state of a Cosmo
> instance. This sort of test would start with an empty instance, run a
> bunch of stuff, and at the end of tests, have an understanding of what
> state the server is in.
>
> The threaded performance test I use to validate Cosmo 0.5 and measure
> performance works this way to some extent. As it does each PUT, it
> notes that ICS in a data structure for that user. Later, when it does
> a random GET, it compares (IIRC) what it gets back with what it put in
> earlier.
>
> 2) Testing scripts which pair "actions" with "responses", but separate
> those in time. For instance, for a given PUT, you can run a GET later
> to confirm the operation succeeded. So you could run a PUT and the GET
> against server A, run the migration script (to create server B), then
> run the GET against server B. If the GET/A == GET/B, then you've
> validated something.
>
> Neither of these actually tests production data already in the osaf.us
> instance, though. For that, I envision a combination of new automated
> and defined-test-plan manual testing.
Our current scripts have little notion of state and are pretty much
hard-coded to test request/response. Tracking state is a very good
requirement for the broadsword rewrite/replacement, as it would make
writing scripts many times easier: a test writer would simply need to
define new data and some new general tests, and the tool would know
what the proper responses should be.
That said, it's not going to be possible to write that tool in this
release cycle. We need a place (a wiki page maybe?) to stick all
these feature requirements so that when I go in to fix broadsword we
have a good feature list to work on.
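To make the state-tracking requirement concrete for the feature list,
here is a minimal sketch. It is not broadsword code: the class name
ExpectedState and the fetch callable (which would wrap a real GET
against a server) are hypothetical. The same recorded state can be
verified against server A and again against the migrated server B.

```python
class ExpectedState:
    """Remembers what was PUT so that later GETs can be checked."""

    def __init__(self):
        self._resources = {}  # path -> body most recently PUT there

    def record_put(self, path, body):
        """Note the body that a test just PUT at the given path."""
        self._resources[path] = body

    def verify(self, fetch):
        """Return the sorted paths whose fetched body differs.

        fetch(path) is assumed to return the body the server holds
        for that path, or None if the resource does not exist.
        """
        return sorted(
            path
            for path, expected in self._resources.items()
            if fetch(path) != expected
        )
```

Because verify takes the fetch function as an argument, the test
writer only defines the data; which server gets checked is decided by
whoever runs the verification.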
> There are some automated tests possible, but they will require new
> code. I don't think the following would be too hard to code:
>
> * are there the same number of users before and after
> * do all the CMP info for each user match before and after
> * do users X,Y,Z have the same collections before and after
> * do collections A,B,C have the same number of items before and after
This seems fairly straightforward and easy to automate, provided we
have enough time to actually automate it.
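A sketch of the count checks from the list above, assuming the
before and after state has already been collected into a plain dict of
the form {user: {collection: item_count}}. How that dict gets built
(CMP, DAV, or direct SQL) is left open, and compare_snapshots is a
hypothetical name.

```python
def compare_snapshots(before, after):
    """Compare {user: {collection: item_count}} dicts.

    Returns a list of human-readable problems; an empty list means
    users, collections, and item counts all match.
    """
    problems = []
    if len(before) != len(after):
        problems.append(
            "user count: %d before, %d after" % (len(before), len(after))
        )
    for user in sorted(before):
        if user not in after:
            problems.append("user %s missing after migration" % user)
            continue
        old, new = before[user], after[user]
        if set(old) != set(new):
            problems.append("user %s: collections differ" % user)
        # Item counts can only be compared for collections on both sides.
        for name in sorted(set(old) & set(new)):
            if old[name] != new[name]:
                problems.append(
                    "%s/%s: %d items before, %d after"
                    % (user, name, old[name], new[name])
                )
    return problems
```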
>
> Slightly harder but still straightforward are:
>
> * do ICS files Q,R,S match before and after
> * do PROPFINDs F,G,H return the same results
Actually this wouldn't be too hard. The individual event ics files
for a calendar shouldn't change during a migration. So I can easily
imagine a script that is pointed at a calendar or list of calendars,
gets all the event ics files for the calendars, sticks them in an
object, pickles it, and then verifies that they all remain the same
after the data is migrated to another server.
If a script were responsible for constructing the ics files and
putting them into the calendar, the test tool would be much more
complicated -- but since we have a production server up that's
already populated with thousands of events we should just use that.
This seems like a possible candidate for "low hanging fruit".
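A minimal sketch of that script's core, under stated assumptions:
list_events and fetch_ics are hypothetical callables that would wrap
the real requests against a calendar, and events are keyed by a path
that is the same on both servers. The comparison is exact text
equality, which only holds if the server does not rewrite any
properties during migration.

```python
import pickle


def snapshot_events(calendars, list_events, fetch_ics):
    """Return {event_path: ics_text} for every event in the calendars."""
    snapshot = {}
    for calendar in calendars:
        for path in list_events(calendar):
            snapshot[path] = fetch_ics(path)
    return snapshot


def save_snapshot(snapshot, filename):
    """Pickle the snapshot taken before the migration."""
    with open(filename, "wb") as f:
        pickle.dump(snapshot, f)


def load_snapshot(filename):
    """Load a snapshot previously written by save_snapshot."""
    with open(filename, "rb") as f:
        return pickle.load(f)


def changed_events(before, after):
    """Return sorted paths that are missing or differ after migration."""
    return sorted(
        path for path in before if after.get(path) != before[path]
    )
```

The intended flow would be: snapshot production and save it, run the
migration, snapshot the migrated server, then pass the loaded and the
new snapshot to changed_events and expect an empty list.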
> I think this would be helpful. Mikeal suggested an IRC hour for
> coordination, which seemed to work well last time. Mikeal asked how
> long the migration would take, to see whether the migration could be
> re-done during interactive testing. I suspect it will be very fast,
> though I'm not yet sure if this would be helpful.
>
> I have some questions though, about how exactly dogfood testing would
> work. I don't think it's necessarily wise to have people update their
> existing Chandlers to point to the staging instance. Changes intended
> to be made permanent would be lost when, right before switchover, the
> migration is run for the final time.
My suggestion, for the IRC hour, would be to copy all the data in
production to another staging instance. Have everyone point their own
Chandler at that instance, add their identical calendars to Chandler,
and add some more events as they need them for testing. Everyone then
pauses while we migrate the data to 0.6 and bring up the server.
Afterwards, have everyone verify that all their events are intact and
that they can add new events, update existing ones, etc.
In terms of longer-term dogfooding beyond the time for an IRC hour, I
don't see an easy way for people to keep using the production
instance for everyday work, dogfood the migration, and move to
0.6 in any kind of smooth way -- they would either lose large
amounts of data or we would have some kind of nightmarish merge
during the last transition, which I _know_ Jared isn't signing up for.
------------------------------------------------------------------------
My main concern is with resourcing. We should have the first drop of
the migration script this week, and we have until early January to
make sure everything works.
We not only have to test this and flush out any bugs but we're also
going to be working through the pains of defining a new process,
writing new tools and automation, and managing some public IRC QA
sessions. I for one have a full plate from now until release testing,
although I'm sure some work will be shifted to make room for working
on this migration -- it just hasn't been identified yet.
I'd like to know what kind of time Jared can commit and what aspects
he would like to work on and then we can work together on the items
we prioritize for this release.
I'd like to sign myself up for the script that pulls down the event
ics files for a user's calendar and verifies they are the same against
migrated data -- if we decide we're going to get that done in this
release. To make things easier on ourselves we should incorporate all
this work into a single runnable script and not have the different
pieces all living in their own places.
-Mikeal