[darcs-users] Delta Debugging

Mon May 24 21:28:36 UTC 2004

On Mon, May 24, 2004 at 09:20:37AM -0700, Kenneth Knowles wrote:
> On Mon, May 24, 2004 at 10:09:53AM -0400, Andrew Pimlott wrote:
> > I wouldn't think about breaking up patches.  You'll get a combinatorial
> > explosion of sub-patches, most of which will probably be completely
> > broken. 
> 
> The algorithm is O(n log n) in whatever level you choose to break up, if I
> recall;

Ok, I took a look at those papers, and from what I can tell, they aren't
very promising for trying to break patches into sub-patches.  The
problem as I said is that most combinations of sub-patches will be
broken, ie they will give an indeterminate result.  Take this example,
which I don't think is too contrived:  We have 8 sub-patches.  1, 2, 3
(sub-patch set a) go together, as do 1, 2, 4, 5, 7 (sub-patch set b).
sub-patch set b works, sub-patch set a has a bug, and the code only
builds if the sub-patch sets are either entirely present or absent.  I
don't think that any variant of their algorithm will identify patch-set
a as the minimal test case.  Instead, they will give the full set of
sub-patches.

In the ddmin paper, I believe they get around this by claiming only to
find 1-minimal test cases, ie test cases where removing one change will
not cause failure.  Ok, but removing any one change in my case always
gives an indeterminate result, so that's not very helpful.  In the
dd/dd+ paper, I can't figure out how they allow for this case (well,
looking at the version of the paper that has proofs, I see an assumption
of "at least on non-interfering failure-inducing change", which does not
hold in my case).

So as far as I can tell, this algorithm works well only when there are
lots of non-broken change sets.  Fortunately, people don't tend to check
in broken code, and if they do, they fix it "soon".  This suggests to me
a possibly significant point:  Keeping patches in "order" helps ddmin,
because it will help it break the change set into non-broken subsets.
However, darcs may reorder patches, possibly reducing that usefulness of
ddmin compared to a traditional RCS which keeps them in order.  Since I
don't know when and how darcs reorders, I can't say whether this would
be a problem in practice.

Here is an example, which I tested with the DD.py implementation:  Say
patch 1 breaks the build, then patch 2 fixes it but introduces the
target bug, then patch 3 breaks the build, then patch 4 fixes it.  If
the patches stay in order, ddmin will find the test case (1, 2).  If
patch 2 is moved to the end, ddmin will find (1, 2, 3, 4).

Andrew