[darcs-users] Defining patch types modularly

Wed Mar 31 09:31:10 UTC 2004

David Roundy wrote on Sunday, 28 March, 2004 13:19:
> On Sat, Mar 27, 2004 at 04:04:39PM +0100, Marnix Klooster wrote:

> > ANATOMY OF A PATCH
> > 
> > The first observation that I make is that any patch as it currently
> > exists in darcs can conceptually be separated in the following parts:
> 
> I'm afraid this is where I get stuck...
> 
> >  - change GUID, description, and more administrative details
> 
> In the rest of this, you seem to be talking about primitive patches, and
> primitive patches don't in general have descriptions.

Ok, agreed.  This was more for completeness' sake; it is not essential
to my story.

> >  - pointer to the previous patch in the repository, and thereby
> >    indirectly to the entire context of the patch
> >  
> >  - patch procedure: type (e.g., replace) and arguments (e.g.,
> >    original, replacement)
> 
> I'm not entirely clear what you mean by arguments here.  In the case of
> 
> hunk ./foo 4
> -hello
> +world
> 
> which would be the arguments? I presume it would be the
> "-hello\n+world\n", since that is the original and replacement.

Correct.

> >  - patch scope: the part of the tree that is potentially affected by
> >    the patch
> 
> I presume this would be the ./foo 4 in the above example?

Correct.

> >  - diff: like a 'diff' output, but also capable of describing adding,
> >    renaming, and deleting files and directories.
> 
> ??? I don't see that this is part of the patch at all... either that or it
> is the patch itself.

The diff is in this case "change line 4 of file foo from hello to
world", so in this case it is essentially the same as the patch
procedure+scope.

So in the specific case of a hunk patch, the diff is effectively
stored as part of the patch procedure arguments.  But for a replace
patch, for example, this would be different:

 * patch procedure: type: replace; arguments: "abc", "xyz"

 * scope: file "foo"

Given the context of the patch (a tree) we can compute the diff in
this case by running the patch procedure, i.e., searching for the
places where "abc" occurs in file "foo".
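
A minimal sketch of that computation, in Python rather than darcs'
own code.  The name replace_diff and the representation of the tree as
a dict from file paths to lists of lines are assumptions made for
illustration, and the plain substring match is a simplification of
what a real replace patch would do:

```python
def replace_diff(tree, path, old, new):
    """Compute the diff of a replace patch against a given context.

    `tree` maps file paths to lists of lines (a hypothetical stand-in for
    the context of the patch). Returns one (line_number, old_line, new_line)
    triple, with 1-based line numbers, per line the replace would change.
    """
    changes = []
    for number, line in enumerate(tree[path], start=1):
        if old in line:
            changes.append((number, line, line.replace(old, new)))
    return changes


# Patch procedure: replace "abc" by "xyz"; scope: file "foo".
tree = {"foo": ["abc def", "nothing here", "x abc y abc"]}
diff = replace_diff(tree, "foo", "abc", "xyz")
```

The same procedure and scope give a different diff in a different
context, which is why the diff has to be computed rather than read off
the patch.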

So the diff is something that can be computed for every patch,
sometimes trivially (as for the hunk patches), sometimes not (as for
replaces and refactoring-type patches).  And that is precisely what I
tried to describe next:

> > The next observation is that the diff is theoretically redundant:
> > given a repository of patches, we can sequentially compute the diff of
> > all patches.  But in practice we don't want to do that for reasons of
> > efficiency.  And it turns out that we can use this diff effectively
> > for a number of things.
> 
> Here I get really confused.  What is the diff? Is it the
> "-hello\n+world\n" in the above example? If so, how can it be
> redundant... i.e. what would we calculate it from? Or is this diff
> section something new that you are proposing to add?
> 
> Basically, this is where I got stuck, since I don't know what you
> mean by the "diff" of a patch.

I hope this is a bit clearer now?  And also the rest of my previous
post?

The essential idea underlying my previous post is that we can use the
'relocation info' in the diff of a patch ('which position in the old
tree corresponds to which position in the new tree').  We can use this
relocation info to 'move' a patch by changing the locations in the
patch procedure and the scope.  And we can use it (or the diff itself)
to efficiently compute the diff of such a 'moved patch'.
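
One way to picture 'moving' a patch, as a rough Python sketch for the
hunk-over-hunk case only.  The Hunk type and the relocate and move
functions are hypothetical names, not darcs code, and the sketch
assumes the patch is moved forward from the tree before the other hunk
to the tree after it:

```python
from typing import NamedTuple, Optional, Tuple


class Hunk(NamedTuple):
    """Hypothetical hunk patch: scope is (path, start), arguments are (old, new)."""
    path: str
    start: int               # 1-based line number where the hunk applies
    old: Tuple[str, ...]     # lines removed
    new: Tuple[str, ...]     # lines added


def relocate(line: int, by: Hunk) -> Optional[int]:
    """Relocation info of `by`: old-tree line number -> new-tree line number.

    Returns None for lines that `by` itself rewrites, since those have no
    corresponding position in the new tree.
    """
    removed, added = len(by.old), len(by.new)
    if line < by.start:
        return line
    if line >= by.start + removed:
        return line + added - removed
    return None


def move(patch: Hunk, by: Hunk) -> Optional[Hunk]:
    """Move `patch` from the tree before `by` to the tree after `by`.

    Only the location in the scope changes. Returns None when the two hunks
    touch overlapping lines, which this sketch does not try to resolve.
    """
    if patch.path != by.path:
        return patch
    span = max(len(patch.old), 1)
    positions = [relocate(patch.start + i, by) for i in range(span)]
    if None in positions or positions[-1] - positions[0] != span - 1:
        return None
    return patch._replace(start=positions[0])


earlier = Hunk("foo", 2, ("a",), ("b", "c"))       # grows the file by one line
later = Hunk("foo", 4, ("hello",), ("world",))     # the hunk from the example
moved = move(later, earlier)
```

Here the moved hunk applies at line 5 instead of line 4, while its
arguments ("hello", "world") are untouched.  Note that move only looks
at the relocation info of the other patch, not at its type.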

So the basic idea is to have a repository that also stores the diff of
every patch, and to use that info to do modularly what can now only be
done separately for each pair of patch types.  The pairwise approach
gives a combinatorial explosion ("commute hunk patch with hunk patch",
"commute hunk patch with replace patch", etc.), which is a bad thing
when you want to add more patch types; that was the original motivator
for this research.

In my previous post I tried to show that it is possible to *commute*
patches using only patch-type-specific code (so modular patch types).
I have a hunch that the concepts of 'the diff of a patch' and 'move a
patch by a diff' also allow us to *merge* patches using only
patch-type-specific code.

Groetjes,
 <><
Marnix
-- 
Marnix Klooster 
mklooster at baan.nl