[darcs-devel] Activation Patches

Wed May 30 15:49:35 PDT 2007

Hello,

After what seems like an eternity of bumbling with patch theory and a
lot of help from David, I've implemented activation patches as a new
patch type in darcs.

Below is a description of what I learned, what I did, and why.  I hope
that by describing what I learned others can begin learning more about
patch theory.

Motivation:  Why a new patch type?

As you may know, I'm part of an effort to improve conflict handling in
darcs.  We have come up with a plan that allows users to cancel and
activate patches (more on this later).  Implementing activation
patches is the first step in this direction.

What is an activation patch?

An activation patch consists of a patch, which it activates, and a
context.  The context is a set of patches which do not commute with
the activated patch (the reason for this representation will be
explained later).  For example, if A, B, C and D are all patches, we
might use the following notation to represent an activation patch
which activates A:
  (_A_BCD)
This would mean that the activation patch activates A (hence the
underlining) and that B, C, and D do not commute with A.  It should be
noted that this notation works much better on paper.  I'm still
searching for a better notation, perhaps with square brackets as this
notation makes parenthesis significant for more than simple grouping.

Notable Properties of activation patches:

Normally when two patches are commuted, say hunk patches, they modify
the representation of each other.  This is done to maintain the
starting and ending contexts of the patches.  Using lower case letters
to represent the state of the repository, we could use notation like
this to talk about the contexts of a patch sequence:
  o A a B b
Here we mean that we started in some original state o, applied patch
A, which places us in a context a, then applied patch B to get context
b.  In general, commuting A with B, yields two new patch
representations of the same changes, A' and B'.  The way that the
representations vary is in their contexts.  So, B' goes from context o
to a new context c, and similarly, A' must go from c to b (we must
have the same ending state, the state that includes both the changes
from A and the changes from B).  Above, I used context and state
interchangeably, but really a state describes a repository after patch
applications and a context tells you what state the repository must be
in for the patch to be applicable.  It turns out that patch context is
extremely useful for reasoning about patches.

1) An activation patch has no direct effect on the state of the
repository.  Several properties are derived from this.  (There is
something it has an effect on, but we'll get to that later.)

2) An activation patch should never modify other patches.  An
activation patch is not designed to modify the repository state
directly.  That means, using the above notation for context, that an
activation patch has this form:
 o (_A_) o

Notice that it starts in some context o and leaves the repository in
the same context.  The consequence of this is that the activation
patch cannot modify patches that it commutes with or else it would
require a new context and hence modify the repository state.

2)  An activation patch is its own inverse.  Since the activation
patch does not modify the repository state, it should be clear that
the patch is its own inverse.

3) The activation patch must commute with all other patches, except
for the activated patch with which it must always fail to commute.  In
patch theory, patch dependencies are expressed through commutation.
When two patches fail to commute, then they depend on each other.
When two patches commute, they are independent.  To have the
activation patch correctly activate just the patch it is associated
with it must fail to commute with exactly that patch and succeed with
all other patches.

Now that we have a representation of activation patches,  several
crucial properties, and an understanding of context we are ready to
implement the activation patch commutation rules.

Implementation:

First, we need to learn about clever_commute.  We use clever_commute
to ensure the following property (the caret means inverse, prime means
same change, different representation):

AB <--> B'A' if and only if B^A^ <--> A^'B^'

That is, whenever we can commute two patches their inverses also
commute and vice versa.  The type of clever_commute is:
clever_commute :: ((Patch, Patch) -> Perhaps (Patch, Patch)) ->
                (Patch, Patch) -> Perhaps (Patch, Patch)

The Perhaps type works similar to the Maybe data type:
data Perhaps a = Unknown | Failed | Succeeded a

clever_commute works by taking a patch commutation function and a pair
of patches.  First it tries to apply the commutation function, if it
returns Succeeded a, then clever_commute returns Succeeded a, because
the commutation was a success.  If the commute returns Unknown then
clever commute tries commuting the inverses of both patches and
returns the result.  A failure at any point means that the commutation
is not possible.

The above is important to understand for anyone looking to add
commutation operators to darcs.

The first property we implemented was to handle the dependency between
the activation patch and the activated patch.  It looks like this:
activation_commute_patch :: (Patch, Patch) -> Perhaps (Patch, Patch)
activation_commute_patch (p1, Activation p2 [])
  | eq_patches p1 p2 = Failed
activation_commute_patch _ = Unknown

It simply checks if the activation patch, (Activation p2 []), is being
commuted with the activated patch, p1.  This is only possible when the
context stored in the activation patch is empty.  This is true because
activation patches are always created in sequence adjacent to the
patch they activate.  As we'll see from the other commutation rules
the context list becomes empty when the patches are again adjacent.
First we check that p1 is in indeed the activated patch.  If it is,
the commutation must fail in order to specify the dependency.  In all
other cases, we don't know anything more about this commutation.  We
don't need to write code for the input in the other order because this
can be handled by passing activation_commute_patch to clever_commute.

The next property is quite a bit more involved.  In the next case we
don't assume anything about the context stored in the activation
patch.  An example helps illustrate the algorithm needed here.

Suppose we have patches, A, B, C, D and E which all commute with each
other.  Then suppose we have this sequence of patches:
(_A_BCD)E
and we want to commute this sequence.  How should it work?  To derive
the commutation operation we use the properties above and by also
reasoning about the contexts of the patches involved.  We begin by
writing down the context of the above sequence:
d (o _A_ a B b C c D d) d E e

 How did I arrive at this particular sequence of contexts?  I started
inside the activation patch first:
(o _A_ a B b C c D d)
Then, I remembered that the activation patch does not change the state
of the repository so it must be the same before and after the
activation patch:
d (o _A_ a B b C c D d) d

Now I can just add the state after E to the end.

We need to modify the representation of the activation patch until it
is in context e, because as is we cannot write the following:
E(_A_BCD),
because it would have this context which is non-sensical:
d E e  d (o _A_ a B b C c D d) d

Clearly we need to commute the context stored in the activation patch
with E.  Doing that in several steps (with context listed) yields
this:
o A a B b C c D d E e <--> o A a B b C c E' t1 D' e
  <--> o A a B b E'' t2 C' t1 D' e
  <--> o A a E''' t3 B' t2 C' t1 D' e
  <--> o E'''' t4 A' t3 B' t2 C' t1 D' e

Now, we can write down the answer to the commutation (again with context):
d (o _A_ a B b C c D d) d E e <--> d E e (t4 _A'_ t3 B' t2 C' t1 D' e) e

It should be clear that while the context muddies the notation a fair
bit, it is crucial for reasoning about the commutation operation.

The last significant case to consider with activation commutation is
when the activated patch (or the context stored in the activation
patch) do not commute with the other patch.  In these case we add the
patch to the context stored in the activation patch.  The tricky part
here is that every commutation operator must accept as input its own
output and do the right thing (basically, it must understand how to
invert itself).  It is left as an exercise to the reader to verify
that the operators given so far satisfy this property.  In this case,
inverting is trickier.  Going back to the example above, and assuming
that none of the patches A, B, C, D and E commute we would get this:
(_A_BCD)E <--> E(_A_BCDE)

Check the context if you like to see that it is correct.  Now if we
did a few other commutes, presumably with patches not listed above, we
could end up in a situation like this:
E(_A_BE'D'C') <--> ??

We have to be careful to permute the context stored in the activation
patch until E is at the right end again.  Then we can invert the
previous commutation.

This is the point where the current implementation is halted.  I'm
working with David on refactoring more of the code to work with GADTs
so that this last case can be implemented to work in general while
having the GADTs ensure that the context is correct.

I know a few questions are still unanswered, and I will hopefully send
another email this week or early next week describing those plot
holes.

Is this email useful in understanding patch theory?  Let me know if
you have questions.  It's certainly been useful for me as it helps me
write it all down and document what we've been doing.

I'm hoping this gets other interested parties up to speed enough that
they can help review patches or just understand patch theory better.

Thanks,
Jason