# [darcs-users] Patch Theory in action

David Roundy droundy at abridgegame.org
Sun Nov 23 18:36:10 UTC 2003

On Sat, Nov 22, 2003 at 09:29:56PM +0100, Marnix Klooster wrote:
> Kevin Smith wrote (in part):
> > While on the subject of the manual, I found a couple other things in
> > Appendix A hard to follow:
> >
> > 1. Why are the patch lists "backward"? I expect ABC to mean that patch A
> > was applied first, not last.
>
> The reason is that David is a physicist, and thinks in an analogy to
> quantum physics.  Every patch is an 'operator'.  The same would
> probably be true for mathematicians (like me), who would view a patch
> as a 'function' mapping a tree to a(nother) tree.  For example,
> applying A, B, and C to the empty tree e, in that order, can be
> written as follows:
>
>   C(B(A(e)))
>
> or (using 'o' for function composition)
>
>   (C o B o A)(e)
>
> (See also point 4, and my notation proposal below.)

Or linear algebra:

A x = b (where x and b are column vectors and A is a square matrix)
A B C x = b

But I agree that the notation gets confusing.  In quantum mechanics one
usually keeps the ket in the notation (plays the role of the "tree", or the
"context"), so it's easy to see and remember which comes first.  In the
patch notation, I always leave out the context which helps foster
confusion.  Even worse, in the implementation, the "first" patch is the
first in a list, which means it is on the left.  :(

Actually, it's interesting that in quantum mechanics this ambiguity moves
over into Feynman diagrams, in which the direction of time is ambiguous.
The direction of time doesn't affect whether a diagram is valid, but it
does determine what process it represents... and I've seen different people
(or books )using three of the four directions for time.

> > 2. The M(A, B) notation is never mapped back to the ABC notation, which
> > seemed jarring.
>
> I agree that this notation is confusing, because (I think) M(A,B) is
> used both to mean 'the change that results by merging changes A and B'
> and 'the patch that results by merging changes A and B'.  I think that
> the Theory description will be a lot clearer when the patch/change
> distinction is strictly kept.

Hmmm.  This notation is taken directly from how I work out complex merges
by hand, in which terseness is key.  Often I used to fill up a full page
doing an unwind or unravel of a reasonably complex merge, trying to figure
out where the code was going wrong, so I needed an easy notation, and when
I went to document it I just copied it over.  Since the unravel is no
longer essential, I actually plan to simplify this bit quite a lot.

> In a similar way I find the notation
>
>   P1 || P2 ===> ... <--> ...
>
> confusing.  I think that I know what is meant, but as a mathematician
> this is too semi-formal to make formal sense.

I presume you'd want something more like

if A || B then B'A <---> A'B?

This notation is also somewhat old, dating back to when a merge could
fail.

> > 3. The statement "it is possible to show that the right inverse of a
> > patch is equal to its left inverse." is never really explained. What is
> > a "left inverse" or a "right inverse"?
>
> A little group theory.  In general, assume that we have a set S (say,
> of patches), and an operator (say, o) that takes two elements of S and
> produces a(nother) element of S.  The operator is associative:
>
>   x o (y o z) = (x o y) o z
>
> Also assume that S has an identity element i for which
>
>   x o i = i
>   i o x = i

I think you mean

x o i = x
i o x = x

?

> > If ABC meant to do patch A first (instead of last), then this wording
> > could use "beginning" and "end", because they would refer both to the
> > symbolic representation, and to the chronological application of the
> > patches.
> >
> > Whew.
>
> Again for what it's worth: here is a proposal for an alternative
> notation.
>
>  e (or Greek epsilon) : the empty tree
>  T,T1,T2 : some arbitrary tree, usually used as a 'context'
>  T1 = T2 : T1 and T2 are exactly the same tree.
>  P,P1,P2,Q,Q1,Q2 : arbitrary patches (mathematically these are
>                    partial functions from trees to trees)
>      [NOTE: Personally I rather dislike primes (e.g. in P'') to
>             create new related variable names; I like numbering
>             better.  So P1 and P2 would usually denote two
>             patches representing the same change.]

Well, I like primes.  To me P1 and P2 look like two unrelated patches,
while P and P' look like two patches representing the same change.  (But
the notation shouldn't require which choice one makes in any case.)

>  P = Q : P and Q are exactly the same patch: for every tree
>          they either both are inapplicable, or they both
>          give the same result tree.
>      [NOTE: So commutation would be written as P1;Q1 = Q2;P2;
>             see the following notation.]

That's not really true.  Commutation doesn't just indicate that the
composition of the two patches is the same patch in each case, it also
indicates that the two commuted patches correspond to the same two
*changes* in reverse order.

P1;Q1 = P1;Q1, but (switching notation) P1 Q1 <-/-> P1 Q1 (inventing
not-commute operator on the fly)

It is crucial that commutation is a reversible function of two patches, not
just a relationship between to pairs of patches.

>  T;P : the tree after applying patch P to tree T
>      [NOTE: This solves the order problem that Kevin and I
>             have.]
>  def(<an expression resulting in a tree>) : the result of the
>                                             expression is
>                                             defined
>      [Example: If P is "rename file F to G", it can only be
>                applied to all trees containing a file named
>                F.  So def(T;P) is true for every T that
>                contains such a file, and false otherwise.]

An interesting feature of darcs is that "rename file F to G" is actually
defined even if file F doesn't exist.  :)  In other words, I decided to let
(my notation)

{mv F G} {add F} <---> {add G} {mv F G}

It seems a little silly, but causes no problems, as long as {rm G} {mv F G}
doesn't commute.

But your example would be correct in the case of a file modification
patch.

>  P;Q : the patch such that for every T, (T;P);Q = T;(P;Q)
>      [NOTE: This double use of ; creates a syntactic
>             ambiguity, but not a semantic one, because
>             by definition both interpretations of T;P;Q
>             are the same.]
>  -P : the inverse of patch P, i.e., for every T for which
>       def(T;P), we have T;P;-P = T.
>      [NOTE: Personally I dislike superscripts and subscripts.]

Well, I'm afraid I do like superscripts and subscripts! :) (Although I also

>  P || Q : P and Q apply to the same set trees: for
>           every T, def(T;P) == def(T;Q)
>
> I don't have time to work this out further now, or give examples. But
> I agree with Kevin that a left-to-right notation is more intuitive for
> this domain.

Yeah, I'm convinced.  I know I've caught myself making the mistake of
writing down the patches in the wrong order myself, when writing
documentation.  My current leaning is to essentially switch from "ket"
notation to "bra" notation (in terms of the Dirac bra-ket notation).

My current notation has no way of representing a tree.  This is definitely
a mistake.  I have no problem understanding that there is an implied
context, but it would be nice to make it more explicit when explaining
things.  Rather than require that trees all be called Tn, I'm thinking of
just making it a bra:

<T|

where the contents of the <| could be anything, but the empty tree would be

<0|

Composition of patches would be represented as is done currently (except in
revese order, since the tree is now on the left), or optionally with a
latex \cdot (or o in your notation).

<0|ABC

would thus indicate patches A then B then C being applied to the empty
tree.

I still think a commute operator is necesary, and like the <---> operator
for this.  But I need to be clearer about <---> being an operation (or
function?), not just a relationship.  This is definitely one of the most
confusing issues, primarily because you can understand almost everything
else correctly without really understanding that the commute is a function,
which ends up making the practical side of how the theory is applied seem
more mysterious than it actually is.

I'm now thinking that perhaps I'll start working on the theory appendix
rewrite now rather than later.  It'll have the added advantage of keeping
me busy so I won't have time to add new destabilizing features, which may
be good for darcs' stability.
--
David Roundy
http://www.abridgegame.org