[darcs-users] managing change

zooko zooko at zooko.com
Sun Feb 8 00:02:50 UTC 2009


On Feb 6, 2009, at 11:01 AM, Eric Kow wrote:

> As Petr points out, this approach has its disadvantages:
> 1. Slow: ... it takes us one whole year to get rid of anything, as  
> major darcs releases are every 6 months;
> 2. Time-consuming: it causes us to split our time between  
> maintaining the old stuff and working on the new stuff; and
> 3. Potentially bad engineering: we're increasing the number of  
> possible code paths (yuck! conditional compilation!), thereby  
> reducing the amount of time that each path is explored.
> So despite my claims, it's not even completely clear that the so- 
> called "conservative" sunset procedure is the sort of responsible  
> engineering practice that it aspires to be.

These are pretty strong criticisms.  It could be that the sunset  
approach would make new releases buggier instead of more stable.  I  
have some experience with this sort of approach, and while I don't  
know whether my experience generalizes to the darcs codebase, I can  
say that the sunset approach didn't work out well for me.

> Third, we want to make sure that we never break darcs, because  
> sometimes Life Just Happens: deadlines pile up at work, buses hit  
> people, hackers get girlfriends, babies are born.

This is a good consideration to keep in mind (and by the way the same  
thing happens in proprietary software development in a company --  
priorities change all the time).  I think the right approach to this  
is to have the policy of "trunk is always good".  You never break  
trunk at time T, intending to fix it again at time T+n.  Instead you  
do whatever work you need to make it good in a branch, and then  
commit it to trunk once it is strictly better than the current  
trunk.  A corollary is that if a patch lands in trunk and is later  
discovered to contain a regression, then that patch gets rolled back.

Obviously this is pretty much impossible without test-driven  
development.  If you don't have thorough tests, then how do you know  
if you're breaking things with your patches?

You have some good questions about test-driven development:

> 1. The kinds of things the sunset procedure aims to catch are  
> integration errors (unexpected interaction between different parts  
> of darcs), and also real-world errors (e.g. HTTP not working behind  
> proxies) that seem tricky to capture in laboratory conditions.  I  
> don't mean to say "we shouldn't do automated testing because it  
> can't cover everything".  Of course we should do more automated  
> testing.  But how should we catch the real world errors?

My experiences with Twisted, and with Brian Warner on Tahoe, have  
taught me that such issues are a lot more programmable and  
reproducible than I had thought.  Things that I used to consider  
obviously "manual", like "Write to the AIX user and ask him to  
misconfigure his network in that same way again and try again with  
this new build", are to these guys "automatable", like "Run a  
buildslave on AIX, figure out exactly which parts of our source code  
can be affected by network misconfiguration, and test how that code  
handles that effect."

This is not to deny your point -- certainly integration and the  
"real world" are always full of surprises, and some things can't be  
automated with reasonable effort, and you will always want manual  
testing after all the automated testing is done.  But what I've  
learned is that automated testing can address 90% of those cases that  
I formerly thought required manual testing.

> 2. I'm not sure how to go about testing IO-intensive stuff (I guess  
> our functional tests, i.e. the shell scripts, are a good example)

In Twisted and in Tahoe, I've seen two complementary approaches  
taken.  One is the lower-level, "unit test" sort of approach --  
figure out what functions will be called with what sort of inputs in  
response to the I/O, and thoroughly test those functions under those  
inputs.  Haskell should be *great* at this, right?  The whole *point*  
of side-effect-free programming is that you don't have to worry about  
things *other* than the arguments affecting the computation.
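
For example, here is a minimal sketch of the kind of property-based  
unit test that purity makes cheap.  The function and the property  
are made up for illustration (QuickCheck-style, not actual darcs  
code):

    import Test.QuickCheck

    -- A pure function of the sort the I/O layer might call.
    joinLines :: [String] -> String
    joinLines = unlines

    -- Purity means the result depends only on the argument, so
    -- joining and re-splitting should round-trip exactly.
    prop_roundTrip :: [String] -> Property
    prop_roundTrip ls =
        all (notElem '\n') ls ==>
        lines (joinLines ls) == ls

    main :: IO ()
    main = quickCheck prop_roundTrip

No mocking, no setup, no teardown -- QuickCheck can hammer the  
function with hundreds of generated inputs precisely because  
nothing other than the arguments affects the computation.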

The other is a more holistic "functional test" approach -- simulate  
the circumstances that the code under test is required to handle.  If  
you want to test that the code handles a user who mashes down the "n"  
key, then launch a subprocess, exec darcs in that subprocess, send a  
thousand "n" chars on its stdin, and examine how it behaves.  If  
Haskell is not already good at this sort of thing, then you can  
always write your functional tests (as currently) in bash (ugh),  
Perl (ugh), Python (yay!), or something else.  But Haskell is  
probably going to get good at this sort of thing, because Haskell  
is growing up, and this is the sort of thing that a modern,  
well-rounded, practical language needs to be good at.
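
As a concrete sketch of that "n"-masher in Haskell (assuming a  
darcs binary on the PATH and a test repository with unrecorded  
changes already set up; the expected behavior here is my guess,  
not a spec):

    import System.Process (readProcessWithExitCode)
    import System.Exit (ExitCode(..))

    main :: IO ()
    main = do
        -- Feed a thousand "n" answers to the interactive prompt.
        let noes = concat (replicate 1000 "n")
        (code, out, err) <-
            readProcessWithExitCode "darcs" ["record"] noes
        -- Examine how it behaved; presumably it should exit
        -- cleanly without having recorded anything.
        case code of
          ExitSuccess   -> putStrLn "ok: darcs survived the n-masher"
          ExitFailure n -> putStrLn ("darcs exited with " ++ show n
                              ++ "\nstdout: " ++ out
                              ++ "\nstderr: " ++ err)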

> 3. It seems that for a heavy reliance on testing to work, we are  
> going to need to have much much wider test coverage.  How do we  
> break out of this chicken and egg?  Do we put everything on hold  
> and launch a massive darcs testing initiative?

What the Twisted folks did when switching from their previous  
practices to the Ultimate Quality Development System was simply to  
mandate that any new patches had to fully satisfy the new  
requirements.  This works well, because if the current code contains  
bugs, then at least they are old bugs, and in practice it causes less  
havoc to keep old bugs than it would to replace them with new bugs.   
The result of Twisted's practice has been a near-monotonic  
improvement in code quality -- the rate at which new bugs are  
introduced by patches is now much lower than the rate at which old  
bugs are fixed by patches.

Regards,

Zooko
---
Tahoe, the Least-Authority Filesystem -- http://allmydata.org
store your data: $10/month -- http://allmydata.com/?tracking=zsig

