[Cosmo-dev] Requirements for Windmill tests running on tinderbox

Ted Leung twl at osafoundation.org
Mon Jun 4 16:42:40 PDT 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Jun 4, 2007, at 3:46 PM, Mikeal Rogers wrote:

>> 1. Developers should be running the Windmill tests before they  
>> check in major functionality changes -- this is independent of  
>> Tinderbox
>> 2. A Tinderbox run is scheduled every time a developer commits.    
>> That Tinderbox run should run the Windmill tests.  If that  
>> Windmill run fails, then something in the associated checkin is  
>> the likely culprit - tracking that kind of stuff down doesn't seem  
>> like something that ought to require managerial intervention.    
>> Besides, someday the person checking in might not even work at  
>> OSAF, at which point getting managers involved won't help much.
>
> I know how things work on Chandler but I'm not sure that the exact  
> same process will work well for Cosmo.
>
> The windmill tests are more brittle by nature than the Chandler  
> tests and when the tests break they require more test level changes  
> than the Chandler tests do. The UI layout and architecture also  
> changes much more rapidly than on the Chandler side and adding the  
> burden of fixing broken tests because of major functionality  
> changes will load up the cosmo UI development resource(s) with much  
> more work than you think. As an example, with the latest look at  
> 0.7 Adam anticipates that 30-50% of the tests will need to be  
> changed, and estimates only a day to fix them himself.

I am assuming that the 30-50% of the existing tests are targeting the  
detail view, which ought to be the majority of the changes to the  
Calendar UI.   Are there other impacts which I am not accounting  
for?   Since we don't have any dashboard tests, those tests shouldn't  
need to be rewritten ;-).   Since we are talking about 1 day to fix  
all of those tests, I am more concerned about the time it will take  
to write new tests to cover the new functionality.

>
> Because the tests change so dramatically through each release  
> ( this is in reaction to the UI code changing dramatically during  
> each release ) I don't think it's fair to put the burden of fixing  
> the tests on the developers until the end of the cycle. When  
> checking in "major functionality changes" the tests surrounding  
> that functionality need to be modified and in some cases,  
> rewritten. Adam is very efficient at fixing these tests and I don't  
> think he'll be a bottleneck for fixing them.

For functionality changes, I agree.  For areas where functionality  
didn't change, I think developers ought to be on the hook, because in  
those areas, the test ought to be correct already.

>
> I agree that we don't want managerial intervention if the tests  
> break, but expanding something like the process we have now seems  
> like the best solution.
>
> The current process is something like this;
> -mde runs the tests towards the end of our release cycle when he's  
> fixing bugs before he checks in.
> -Adam kicks off the tests every time he starts testing a new build
>
> How I see the process improving is;
> -mde, travis, and bobby kick off the tests manually toward the end  
> of the release cycle when fixing bugs before checkin.

I'd just replace mde, travis, and bobby with "anyone changing the UI  
in a substantial way"

> -Adam watches tinderbox to make sure the tests are always passing  
> during the entire cycle, fixing them when necessary.
>
> For as long as I've worked on cosmo we've destabilized the entire  
> trunk at the beginning of every release and slowly stabilized it  
> while more features and fixes drop, and QA changes many of the  
> tests to work in the new code base. With windmill, modifying and  
> debugging these tests now takes about 20% of the time it did when  
> we used Selenium but it's still a hit and it's still easier and  
> faster for QA to make the larger changes. I don't think the burden  
> of fixing the tests should be on the developers until the end of  
> the release cycle when we are in bug fix mode, at that point Adam  
> would become a bottleneck and tests are most likely failing due to  
> valid bugs and not architectural or functional changes.
>
> This is obviously a contentious issue that we'll be discussing  
> further. At this point I don't see anyone objecting to the task of  
> getting them in to tinderbox, just with the process surrounding how  
> we manage and handle the results of the tests being in tinderbox,  
> and running manually. I'm going to move forward and log a task for  
> bear to add windmill tests to tinderbox today and we can discuss  
> this process more in the coming days.

I've no problem with getting windmill into tinderbox.  The sooner the  
better, actually.   At the end, I want us to be in a world where  
tinderbox can give us trustable information about checkins that break  
the UI (as well as performance regressions).   If we need some  
intermediate steps along the way (like what you propose above) I am  
fine with that.

Ted
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iD8DBQFGZKN1YCjW/J06/U8RAs3hAJ9pVGwwbe/cY1tSg8TrUuBRATDMCACfdQJ3
JkveeNX3Ma1upYxbGqA9D/o=
=tQ4V
-----END PGP SIGNATURE-----

