[darcs-users] Write-up on "tree repositories" as an alternative to conflictors

Fri Jan 22 22:42:14 UTC 2021

Hi Ben,

Happy 2021! I let this thread drop for a while but trying to catch up now.

I think we had defined an abstract patch theory as a set of names, and
a set of possible "contexts" which are sets of names (or corners on the
hypercube), where the set of possible contexts satisfies (copying and
pasting from my earlier email):

(a) The empty set is a context.

(b) If c1 and c2 are contexts, then c1 ∩ c2 is a context.

(c) If c1, c2, c3 are contexts and c1 ⊆ c3 and c2 ⊆ c3, then c1 ∪ c2 is
    a context.

(d) For every non-empty context c there is at least one element n of c
    such that c\{n} is also a context.

This can be turned into a category: take your definition of an abstract
patch as a pair (c,n) (c is a context and n is a name) and then take
the free category over that.

Then the discussion turned to concrete representations of patches.
Long-winded (sorry) question below...

> > I'm not sure I understand. I guess a state is the content of the files
> > in the repository.
> 
> Plus the directory structure. One concrete representation of states is
> the 'Tree' type in Darcs, see
> https://hub.darcs.net/darcs/darcs-screened/browse/src/Darcs/Util/Tree.hs#59
> up to line 84.

Yes, that makes sense.

> > Any concrete patch has exactly one start state and
> > one end state.
> 
> Not at all. Unnamed prim patches can be applied to many states, indeed
> the domain is always infinite. For instance the domain of "addfile ./f"
> is the set of all trees which do not have "./f" as either file or
> directory. Its codomain is the set of all trees that do have "./f" as a
> file. Its inverse is "rmfile ./f" with domain and codomain swapped.
> 
> > Do you mean that it's a partial bijection where the
> > domain and range are both sets of size 1?
> 
> No, see above.
> 
> > E.g. suppose the concrete patch P is:
> > 
> > hunk ./file.txt 2
> > +world
> > 
> > and in the context of p, file.txt has just one line "Hello".
> > I guess it's tempting to think of p as the following partial bijection
> > f: f always inserts the line "world" after the first line of file.txt.
> 
> This is exactly how the above hunk is interpreted in Darcs.
> 
> > But that's not really a useful representation of p; e.g. it doesn't
> > work if p gets commuted to a different context.
> 
> Why not? If you commute the patch to a different context then its domain
> and codomain usually change, too. I would expect nothing less.
> 
> But perhaps what you mean is that it is not a useful /generic/
> description of "what the patch does" /independent/ of commutation. This
> is true. If we had a representation of concrete patches that is truly
> context independent, then a lot of problems could be solved quite
> easily, see Pijul. In particular, I think we would not need names for
> patches.
> 
> > So to me it makes more
> > sense to just say that concretely, p is the pair ("Hello\n",
> > "Hello\nworld\n").
> 
> It is certainly possible to define hunks in this way. In Darcs, so
> called binary patches are defined in this way: they contain the old
> content and the new content.
> 
> However, even if you do that, your hunks are still "polymorphic"
> relative to the "rest of the tree". And commuting such a hunk with a
> file rename patch may still have to modify the hunk.
> 
> > I guess this is a long-winded way to just say I'm not sure what you
> > mean by concrete patches being bijections.
> 
> As explained above, for each kind of prim patch in Darcs (hunk, binary,
> addfile, etc... but please ignore setpref patches) one can precisely
> define the set of trees that are their domain and codomain and they are
> designed so that the partial function defined by applying the patch is
> injective and thus can be inverted.
> 
> > I guess you'd want to be able to commute concrete patches without
> > access to any information other than those two patches'
> > representations. Does your definition capture that?
> 
> Absolutely. The full commute code for Darcs' prim patches is here:
> https://hub.darcs.net/darcs/darcs-screened/browse/src/Darcs/Patch/Prim/V1/Commute.hs.
> It could use a bit of streamlining but I think the overall picture of
> how commutation works on prim patches is recognizable.

I think I am closer to understanding what you mean by "concrete patch",
but I think I haven't got it completely.

Here's my understanding so far:

a) We begin with a set S of states. In the case of Darcs, that's the
   Tree type.

b) A concrete patch is a bijection between two subsets of S. A
   "concrete patch theory" gives a way of encoding some such bijections
   as data. For example, every Darcs prim patch encodes such a
   bijection, so Darcs prim patches (excluding names) are a "concrete
   patch theory" in this sense.

c) If two Darcs prim patches are different (excluding names) then they
   encode different bijections. This is probably not too hard to prove.
   If it weren't true, Darcs wouldn't satisfy point (b) above.

d) Commuting two concrete patches (assuming it succeeds) may or may not
   change the patches. I suppose the commute function is an essential
   part of the description of a concrete patch theory.

e) We can turn any concrete patch theory into a category by taking all
   subsets of S as objects and then taking the free category generated
   by all the bijections in our concrete patch theory.

f) We're interested in functors from a concrete patch theory to an
   abstract patch theory as described at the top of this email (e.g.
   objects are sets of names).

But that can't all be right. In particular I'm confused about (f).
Suppose we have just two names n1 and n2, and our contexts are {}, {n1}
and {n1, n2}. When I think of making that concrete, I imagine something
like this:

* {} is an empty Tree.

* n1 creates a file called "file.txt" with one line, "Hello". So,
  concretely, {n1} is a Tree with just "file.txt" with just "Hello".

* n2 appends "world" after "Hello", so {n1,n2} is a Tree with just
  "file.txt" containing two lines "Hello and world".

But a functor from a concrete patch theory (as defined above) to that
abstract patch theory couldn't work like that. The concrete patch
theory objects are supposed to be sets of states, but above I assigned
a single state to each context.

What went wrong? Should I have assigned a set of states to each of {},
{n1} and {n1,n2}, and if so, can you give an example assignment?

Another thought: Could there be other concrete patch theories that
violate (b) but are still interesting? E.g. maybe there could be two
patches A and B that represent the same bijection, but that behave
differently when commuted with a third patch C.

I've probably misunderstood something basic; maybe you can figure out
where I stumbled.

> > I'm not sure what rules constrain us here. E.g. I could just declare
> > that any "conflicted" context maps to the empty fileset, but would that
> > work out? I guess the hard part is figuring out what how the concrete
> > patches work (for example, how you can commute them without access to
> > anything but the two patch representations).
> 
> Indeed. This is the reason why the conflictor commute and merge code is
> highly non-trivial. A complicating factor is that a clean definition of
> conflictors cannot be based on prim patches alone, they have to be
> /named/ prim patches. How to resolve this tension is still largely unclear.
> 
> I think the constraints are the usual ones: the merge and commute laws,
> in particular permutivity. E.g. the first thing you need to ensure is
> that if you merge p\/q to q'/\p', then p;q' commutes to q;p', even if p'
> and q' are conflicted.

It would be interesting to discuss this further but I think I should
figure out what you mean by concrete patch theory first.

-- 
James