[darcs-users] Colin Walters blogs on Arch changesets vs Darcs

Fri Nov 26 03:27:53 UTC 2004

I think most of your post has been discussed, but I want to add a couple
of things.  Please say if you think anything has been missed, because
your points are highly valuable.

On Tue, Nov 23, 2004 at 06:13:34PM -0500, Colin Walters wrote:
[snip full example]
> When they receive the patch, it will apply
> exactly to project/Makefile, because they have the same content.  But it
> will be the *wrong* Makefile!  I don't see how you can solve this
> without a notion of explicit logical file identity.

Even though others have given the idea, I think it's instructive to show
exactly how this works (to the best of my understanding).  The
repository we start out with has a bunch of patches that I'll abbreviate
"...".  Also, I'll apply patches left to right, because I had an utterly
awful professor for quantum mechanics and I'm still bitter.

So in branch A we modify project/Makefile, and call this patch P; and in
branch B we move project/Makefile elsewhere and create a new
project/Makefile (patch Q).

    A == ...P
    B == ...Q

Now, A sends P to B, and you ask why this doesn't cause confusion.  The
answer is, you're not allowed to apply P after Q.  P can only come after
"..." (this is its context, and it is included in "darcs send")!  So
how do we get P into B?  First, all patches are invertible, so if we add
the inverse of Q (Q-1) onto B, we get

    ...QQ-1

This is equivalent to ..., so we're allowed to apply P:

    ...QQ-1P

Now, the tricky part: commute P with Q-1.  The idea behind commuting is
that we can apply P and Q-1 in the opposite order, but not verbatim;
instead we fix up P to take into account the changes in Q-1, and vice
versa, getting P' and Q'-1.  P' should be logically the same change as
P.

    ...QP'Q'-1

Finally, we don't want that Q'-1, so just drop it off the end:

    ...QP'

If we believe that P' is logically the same as P, this is what we want.
So in short, what gets applied to B is the fixed-up P', which modifies
the right files.
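
To make the steps above concrete, here is a toy sketch in Python.  It is
not how darcs is implemented; the patch types (Move, AddFile, RemoveFile,
Edit), the helper commute_edit_back, and the path "legacy/Makefile"
(standing in for "elsewhere") are all made up for illustration.  Whole
files are single strings, and the new project/Makefile is given the same
content as the old one, as in the example under discussion.

```python
from dataclasses import dataclass

# Toy primitive patches over a repo modelled as {path: content}.
# All names here are hypothetical, not darcs internals.

@dataclass(frozen=True)
class Move:
    src: str
    dst: str
    def apply(self, repo):
        repo = dict(repo)
        assert self.dst not in repo
        repo[self.dst] = repo.pop(self.src)
        return repo
    def invert(self):
        return Move(self.dst, self.src)

@dataclass(frozen=True)
class AddFile:
    path: str
    content: str
    def apply(self, repo):
        assert self.path not in repo
        return {**repo, self.path: self.content}
    def invert(self):
        return RemoveFile(self.path, self.content)

@dataclass(frozen=True)
class RemoveFile:
    path: str
    content: str
    def apply(self, repo):
        assert repo[self.path] == self.content
        return {k: v for k, v in repo.items() if k != self.path}
    def invert(self):
        return AddFile(self.path, self.content)

@dataclass(frozen=True)
class Edit:
    path: str
    old: str
    new: str
    def apply(self, repo):
        assert repo[self.path] == self.old
        return {**repo, self.path: self.new}

def apply_all(patches, repo):
    for p in patches:
        repo = p.apply(repo)
    return repo

def invert_all(patches):
    # The inverse of a sequence is the inverses in reverse order.
    return [p.invert() for p in reversed(patches)]

def commute_edit_back(x, edit):
    """Turn the sequence (x, edit) into (edit', x') with the same net effect."""
    if isinstance(x, Move):
        if edit.path == x.dst:
            # Before the move, the file being edited still lived at x.src.
            return Edit(x.src, edit.old, edit.new), x
        return edit, x
    if edit.path == x.path:
        raise ValueError("edit touches the file x adds or removes")
    return edit, x

base = {"project/Makefile": "all: build"}                      # state after "..."
P = Edit("project/Makefile", "all: build", "all: build test")  # branch A
Q = [Move("project/Makefile", "legacy/Makefile"),              # branch B
     AddFile("project/Makefile", "all: build")]

# ...QQ-1P  ->  ...QP'Q'-1: walk P leftward past each piece of Q-1.
p_prime, q_inv_prime = P, []
for x in reversed(invert_all(Q)):
    p_prime, x = commute_edit_back(x, p_prime)
    q_inv_prime.insert(0, x)

merged = apply_all(Q + [p_prime], base)                        # ...QP'
```

In this toy model P' ends up editing legacy/Makefile, i.e. the file that
P originally modified, even though the new project/Makefile has
byte-identical content and a context-free patch would have applied to it
cleanly.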

> > However, you can exclude the patch files themselves from a repo (this is
> > the checkpoint and partial get features).  This makes operations that
> > need those patches fail, which usually is not a practical problem.  
> 
> The issue is "usually".

There has been discussion of grabbing old patches on-demand, which would
mitigate the problem.  However, it's really only a problem when you want
to either 1) examine old patches (obviously) or 2) push/pull with
someone whose repo is not up to the checkpoint, which is simply uncommon
(since people tend to keep their repos up to date) unless there is a
fork.

> It's not just that this is a "hard problem".  I just fundamentally don't
> see how it can scale in general.  For small projects, sure.  But
> something like gcc or Emacs or Linux?

Yes, I think it will probably be possible.  The cost will probably be at
least linear in the size of the relevant portions of the history, but I
consider this acceptable.  Even in big projects, the bulk of the history
is shared by everyone, so it can be effectively (if not conceptually)
ignored.

Andrew