[darcs-users] Patch Theory in action

Marnix Klooster mklooster at baan.nl
Sat Nov 22 14:05:43 UTC 2003

Hello David et al.,

> Currently it's looking like I'll rewrite it either while at home for
> Christmas or during January itself (but before Jan 31, when I'll be giving
> a talk on darcs).  Since I'll be giving a 45 minute talk, I'll definitely
> need a goodly number of concrete examples of the patch theory at work.
> Of course, suggestions are always welcome!
> Kevin, your explanation is correct (which I don't imagine surprises you).
> The "solution" to the creepy situation where a patch is meaningless
> without its context is somewhat alleviated by patch bundles--see my
> response to
> your branching email.
> -- 
> David Roundy
> http://www.abridgegame.org

For what it's worth, here is a suggestion of mine.  It is my current
view of the darcs world.  (I'm using the patch/change distinction that
was introduced in an earlier thread.)

A tree is a root directory with subdirectories and text and binary
files.

A patch is an executable description of how to transform one tree into
another (e.g., "rename file F to G").  A change is the intention
behind a patch (e.g., "rename the file currently called F to G").

A tree T is always represented by a sequential list of patches, which,
when applied sequentially to the empty tree, result in tree T.  (This
corresponds to an unordered bag of changes.)

A repository stores a tree in that representation.
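
To make "a list of patches applied to the empty tree" concrete, here is
a minimal Python sketch.  It is only an illustration, not how darcs
stores anything: I assume a tree can be modelled as a dict from file
path to contents, and the names add_file, rename_file and build_tree
are made up for this example.

```python
from functools import reduce

# Assumption: a tree is modelled as a dict mapping file paths to contents.
# The empty tree is the empty dict.

def add_file(path, contents):
    """Hypothetical patch constructor: add a new file to the tree."""
    def apply(tree):
        if path in tree:
            raise ValueError(f"cannot add {path!r}: it already exists")
        return {**tree, path: contents}
    return apply

def rename_file(old, new):
    """Hypothetical patch constructor: rename file `old` to `new`."""
    def apply(tree):
        if old not in tree or new in tree:
            raise ValueError(f"cannot rename {old!r} to {new!r} in this tree")
        renamed = {p: c for p, c in tree.items() if p != old}
        renamed[new] = tree[old]
        return renamed
    return apply

def build_tree(patches):
    """Apply the patches, in order, starting from the empty tree."""
    return reduce(lambda tree, patch: patch(tree), patches, {})

# A "repository" here is nothing more than the ordered patch list.
repository = [add_file("F", "hello\n"), rename_file("F", "G")]
print(build_tree(repository))
```

The resulting tree contains the single file G.  Note that the rename
patch on its own raises an error when applied to the empty tree, since
there is no F to rename.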

Each change is uniquely identified by a human-readable GUID, which
contains at least the creation date/time, the creator (e-mail
address), and a description.

Now for the new part: I see the darcs world --all those repositories--
as a big collection of patches, which are each uniquely identified by
the following two:
 - a context, which is a tree, which is (as we saw above) represented
   as a list of (pointers to) patches; and
 - a change GUID.
Yes, this is a fairly recursive definition: to find one patch you need
to point to a list of them, each of which is identified by...  But in
practice that all works out pretty well: locally we can just point to
the patch, and remotely we can just give a flat list of change GUIDs.

For example, within a repository, the context of a patch is simply all
patches that precede it.  Therefore at any time we can identify a
patch by pointing at a place in the patch list.
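
As a sketch of that identification (again Python, again with invented
names: PatchId and patch_id_at are not darcs terms), a patch can be
modelled as a pair of a context and a change GUID, where the context is
simply the change GUIDs that precede it in the patch list.  For
simplicity the sketch compares contexts as ordered tuples.

```python
from typing import NamedTuple, Tuple

class PatchId(NamedTuple):
    """Hypothetical identity of a patch: its context plus its change GUID."""
    context: Tuple[str, ...]  # change GUIDs of the preceding patches, in order
    change: str               # GUID of the change this patch expresses

def patch_id_at(patch_list, index):
    """Identify the patch at position `index` in a repository's patch list.

    `patch_list` is assumed to be the repository's ordered list of change GUIDs.
    """
    return PatchId(context=tuple(patch_list[:index]), change=patch_list[index])

# Two repositories holding the same changes in a different order.
repo_1 = ["A", "B", "C"]
repo_2 = ["B", "A", "C"]

print(patch_id_at(repo_1, 0))  # change A in the empty context
print(patch_id_at(repo_2, 1))  # change A in the context consisting of B
```

The two results share the change GUID A but have different contexts, so
under this view they are two different patches for the same change.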

When pushing patches, that cannot be done, since the patches and their
order can change between the moment of the 'push' and that of the
'apply'.  Therefore a patch bundle describes the context tree
explicitly, by giving the change GUIDs for all 'context patches' that
make up that tree.  (It is not necessary to give the context for each
of those 'context patches'.  If we send only the change GUIDs A, B, C,
in that order, then we imply that the context for the first patch is
the empty list of patches (i.e., the empty tree), for the second it is
the first patch, and for the third it is the first and then the second
patch.)
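
The implied contexts can be spelled out in a few lines of Python.  This
is only a sketch of the rule just stated; the function name
implied_contexts is my own and the GUIDs are placeholders.

```python
def implied_contexts(change_guids):
    """Pair each change GUID in a flat list with the context it implies.

    The implied context of the i-th entry is the list of entries before it,
    so the first entry gets the empty context (the empty tree).
    """
    return [(list(change_guids[:i]), guid)
            for i, guid in enumerate(change_guids)]

for context, guid in implied_contexts(["A", "B", "C"]):
    print(guid, "has context", context)
```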

By the way, I don't see it as "creepy" that a patch is meaningless
without its context.  I mean, if I get a diff based on
bridger-1.10.3a-beta3, then I don't expect it to work automatically
for beta1 or beta4.  In fact, by making the context of a patch
explicit, in darcs we have a higher probability of success, and in
case of failure it is clearer why it failed.
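
To illustrate why an explicit context makes the failure reason clearer,
here is a small Python sketch.  It is not a description of what darcs
does when applying a bundle; missing_context is a hypothetical helper,
and I assume both the bundle's context and the receiving repository are
available as lists of change GUIDs.

```python
def missing_context(bundle_context, repository_changes):
    """Return the context change GUIDs that the repository does not have.

    An empty result means the whole context tree is available; otherwise
    the result names exactly which changes are missing.
    """
    present = set(repository_changes)
    return [guid for guid in bundle_context if guid not in present]

bundle_context = ["A", "B", "C"]

print(missing_context(bundle_context, ["C", "A", "B", "D"]))  # nothing missing
print(missing_context(bundle_context, ["B", "A"]))            # C is missing
```

With a plain diff the receiver only sees rejected hunks; here the
receiver can be told which changes it lacks.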

I don't recall whether darcs further optimizes its context description
using the explicit dependencies introduced by manually recorded
dependencies and by tag patches.  But this obviously can be done: e.g.,
it is not necessary to give all patches that precede a given tag,
since everybody who can find the tag patch can find out on which
patches it depends.
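
A sketch of that possible optimization in Python follows.  As said, I
do not know whether darcs does this, so treat it purely as an
illustration: compress_context and the tag_dependencies mapping are
invented here, and I assume each tag lists every change it covers
(no transitive lookup through other tags).

```python
def compress_context(context_guids, tag_dependencies):
    """Drop context entries that are already implied by a tag in the context.

    `tag_dependencies` maps a tag's GUID to the change GUIDs it depends on.
    """
    covered = set()
    for guid in context_guids:
        covered.update(tag_dependencies.get(guid, ()))
    return [guid for guid in context_guids if guid not in covered]

context = ["A", "B", "TAG-1.0", "C"]
tag_dependencies = {"TAG-1.0": ["A", "B"]}

print(compress_context(context, tag_dependencies))
```

Here A and B are dropped because anyone who can find TAG-1.0 can look
up that it depends on them; only the tag and C need to be sent.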

Hope this helps; I know it helps me in understanding what is going on
when pushing, pulling, and applying.
Marnix Klooster
mklooster at baan.nl
