[darcs-users] Explanation: What does commute mean?

Mon Oct 20 22:37:54 UTC 2008

I made a little Haskell module for playing with some of the ideas
described below.  I've attached it so that people can try it out.

Jason

On Fri, Oct 17, 2008 at 5:47 PM, Jason Dagit <dagit at codersbase.com> wrote:
> Hello,
>
> People often want to understand how commute on patches works.  Usually we
> start by explaining:
> Given two patches, A and B, if A and B commute then: AB <--> B' A', for some
> B' and A'.
>
> Naturally people ask, "But what is the relationship between A and A' or B
> and B'?"
>
> This is a very important question and I'll provide you with some insight.
>
> Suppose we have a repository with 2 files, a and b.  We could then make the
> following operations:
> mv b c
> mv a b
> mv c a
>
> You can think of each operation as a transformation on the 'state' of your
> repository.
>
> Suppose also, that we make an edit to a, and an edit to b.
>
> Let's name the above, using T for transformation:
> T_bc = mv b c
> T_ab = mv a b
> T_ca = mv c a
> T_a = edit a
> T_b = edit b
>
> You can imagine that if I gave the diff for T_a and the diff for T_b that
> you could apply those diffs in either order to your repository and get the
> same final 'state'.  Meaning, a and b are the same whether you update a
> first or b first.
>
> But, suppose instead that I performed T_bc, T_ab, and then T_ca.  This has
> the effect of swapping a and b by name.  Now suppose you applied the diffs
> T_a and T_b.  What would you want the outcome to be?
>
> It turns out, that it matters which operations were created first.  If you
> created the diffs T_a and T_b *before* you did the operations of the swap,
> then you should expect that after the swap the diff for T_a actually
> modifies b, whereas T_b should modify a.  On the other hand, if you created
> the diffs T_a and T_b *after* the swap, then you expect T_a to modify a and
> T_b to modify b.
>
> We have an intuitive idea of 'context' now.  As in, what is the context that
> T_a and T_b were created in?  Knowing this will tell us how they transform
> the repository state.
>
> Intuitively, it seems as though we need to remember the 'context' in which
> T_a and T_b were created.  So let's say that the operations performed up to
> the point where T_a is created is the context of T_a.  In other words, the
> context for T_a is sequence of transformations that existed when T_a was
> created.  Similarly, since T_a is a transformation, creating it results in a
> new context, which is the old context plus T_a.  We could say that T_b has
> this context.  Going a bit further, it seems like we should talk about how
> T_a has a pre-context and it also has a post-context.
>
> For example, if we created T_a before doing the swap, then the pre-context
> might include two transformations, one that creates a and another one that
> creates b.  The post-context would then include those two transformations
> and T_a itself.  If we created T_a after doing the swap, the pre-context and
> post-contexts of T_a would include T_bc, T_ab and T_ca also.
>
> Now a side note about commutative functions.  Consider the function created
> by composing T_a and T_b, let's write T_a . T_b.  Recall, that with function
> composition parameters start on the right and pass through the sequence to
> the left.  As discussed in the intro, T_a . T_b is equal to T_b . T_a.  This
> is because T_a and T_b are independent of each other.  Thus, we would say
> that the functions T_a and T_b are commutative functions.  This means, that
> changing their order of application does not change the result.
>
> We are saying that:
> T_a . T_b = T_b . T_a
>
> Because T_a and T_b are commutative it doesn't matter which order we compose
> them.  If we restrict our view to just the repository above with only the
> files a, b and no c, then on this restricted set of repository state how do
> these two compare?
> T_b . T_a
> T_a . T_b . (T_ca . T_ab . T_bc)
>
> In plain English, the first one edits a and then b, the second one swaps a
> and b, edits b and finally edits a.
>
> As far as the mathematics of it is concerned, the first one will edit a and
> b, while the second one will have T_a editing a different a than the first
> one and T_b editing a different b than the second one.
>
> Going a bit further, let's say that T_a and T_b were created without any of
> T_bc, T_ab or T_ca in their context.  So we could have two scenarios.
>
> We could, for example, start with T_b and T_a, swap their order and then do
> the swap of a and b afterwards.  That would give us:
> T_b . T_a
> and
> (T_ca . T_ab . T_bc) . T_a . T_b
>
> Intuitively, it seems like T_a and T_bc are commutative functions, eg., T_bc
> . T_a = T_a . T_bc.  So we could rewrite the second one as this:
> T_ca . T_ab . T_a . T_bc . T_b
>
> Now, suppose when we commute the function T_a with T_ab, that we replace T_a
> with T_a'.  T_a' is like T_a except that T_a' makes the edits of T_a to b
> instead of a.  After all, this results in T_a' editing the correct file
> after the rename. Similarly, when we commute T_b with T_bc, T_b is replaced
> with T_b' that edits c instead of b.  When we commute T_b' with T_ca we
> replace T_b' with T_b'' that edits a instead of c.
>
> So, the above goes through these steps:
> T_ca . T_a' . T_ab . T_bc . T_b (commute T_a to the left)
> T_a' . T_ca . T_ab . T_bc . T_b (commute T_a' to the left)
> T_a' . T_ca . T_ab . T_b' . T_bc (commute T_b to the left)
> T_a' . T_ca . T_b' . T_ab . T_bc (commute T_b' to the left)
> T_a' . T_b'' . T_ca . T_ab . T_bc
>
> The last one will then have T_a' and T_b'' making edits the same file
> contents as T_a and T_b respectively, even though the names of the files
> were changed by the swap.
>
> So, if you've followed me to this point, then you now have the intuition for
> what we mean when two patches A and B, commute to B' and A', as AB <--> B'
> A'.  You can think of a patch as being one of the above transformations
> along with the context of the transformation.  You might also notice that
> commute of patches must be doing something to the context of the patches.
>
> Patch commute has the potential to update the context and transformation the
> patches it swaps OR it could update the context and leave the state
> transformations equal to what they were in the input.  Patch commute can
> also fail, but we're ignoring that case for the moment.
>
> Thinking back to how we arrived at the need for context, you might notice
> that for each context, that is each sequence of operations, we get one
> unique repository state.  This is a very important property of context.
> Without it, context wouldn't really be useful.  Also, notice that the
> opposite is not true, repository state does not determine the context.
> Which makes sense, because there are lots of operations you can do that get
> the repository to a particular state, so given a state how do you know what
> was done?
>
> The next important property we want for commuting patches is that once two
> patches have been commuted, you can commute them again to undo the
> commutation.  In fact, it turns out the examples above are saying we want
> contexts to determine the same state if you commute the patches inside the
> context (again, context is a sequence of patches!).
>
> For R to be an equivalence relation, we need three things:
> 1) x R x, is true for all x
> 2) if x R y then y R x
> 3) if x R y and y R z then x R z
>
> Here, we replace x R y with "the sequencing, or order, of x can be obtained
> by commuting adjacent elements of y".  Roughly how to prove each:
> 1) either claim that 0 commutes satisfies definition of R or check that
> commute is self-inverting
> 2) relies on self-inverting nature, I think
> 3) messier but should still be provable
>
> I'm pretty sure both (2) and (3) could be done with a brute force proof that
> considered all the pairings of patch types in their general cases.  Start
> with all sequences of length 2, then 3 and I think at that point you could
> make an inductive argument to hit sequences of length n.  This would be a
> lot of work, and I'm not convinced it could be fully automated.
>
> Why would we want to show the above?  Showing that R is a relation would
> tell us that sequences of patches are equivalent under commute.  Now,
> combine this with the idea that context determines the state uniquely and
> now we know sets of patches uniquely determine your repository.
>
> Now we hit the circular bit, I think.  In the darcs implementation we want
> all the above to work out so we've defined all the pairwise commutes so that
> R should be an equivalence relation for us.  I don't know of any formal
> work, other than our QuickCheck properties that have tried to verify this.
> While it's perfectly fine and natural to define the desired properties then
> prove that some particular operations match those properties it's that proof
> by inspection that we have yet to do to make this formal.
>
> My arguments here apply to all the primitive patches that modify your
> repository but other patch types, like conflictors, do something else and I
> really don't know how to fit them into the theory.
>
> Thanks,
> Jason
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Commute.hs
Type: text/x-haskell
Size: 2582 bytes
Desc: not available
Url : http://lists.osuosl.org/pipermail/darcs-users/attachments/20081020/9069cb7c/attachment-0001.hs