[darcs-users] How to extend a patch theory to fully commute

Tue Sep 1 20:41:03 UTC 2020

On Wed, 19 Aug 2020 at 11:52, Ben Franksen <ben.franksen at online.de> wrote:
> Am 19.08.20 um 11:44 schrieb Ben Franksen:
> > The fact that your patches have no "content" is of of course a result of
> > starting out with "enriched" contexts. As you note in 2.2
> > Interpretation, your contexts can *not* in general be understood simply
> > as "working tree states": the mapping from contexts to states is not
> > injective. It is not yet clear to me if that will have consequences when
> > it comes to implementing your theory.
>
> Let me elaborate this point a bit.
>
> Your theory assume the existence of contexts as a "set". That doesn't
> give us any hint as to how these contexts should be represented in a
> concrete implementation. If we start out with some notion of prim patch
> (such as the one in Darcs), then these exist as concrete representations
> of state changes as data. But we know that states do not correspond 1:1
> to contexts. The simplest example involves duplicate prim patches p, q,
> such that name(p)/=name(q), but repr(p)==repr(q). Assuming a starting
> context X for both p and q, the ending context is different, even though
> the end state is the same. More interesting examples involve things like
> (remove f; add g) versus (move f g).
>
> The question now is: how can we arrive at a suitable definition (i.e.
> representation) of context, given such a concrete definition of prim
> patches and states?
>
> The answer, in Darcs at least, is to not define contexts. Contexts
> aren't represented at runtime at all. Instead, in our code contexts
> exist only formally at the type level.
>
> Your definition of context address may require that we have a way to
> reify contexts as data. A natural way of arriving at a suitable
> definition is to go backwards from patches: identify a context with the
> set of all patch sequences that start at the empty state and that differ
> only by commutation. At this point I am always tempted to say "and
> elimination of inverses"; indeed your definition of a patch universe
> implies that a sequence [a;b;b^] really has the same start and end
> context as the plain [a]. But with respect to commutation this doesn't
> seem right: [a;c] may commute to some [c';a'], but [a;b;b^;c] may not,
> because e.g. [a;b] may not commute. Every time I think about this I am
> getting head-aches. I cannot reconcile these two points of view.
>
> Anyway, assuming we have sorted this out somehow, I am still having
> doubts about reifying contexts, because the definition as an equivalence
> class of sequences does not easily map to an efficient representation.
>
> Cheers
> Ben

Sorry again for the long pause.

It's not clear to me right now exactly what properties a
representation of contexts would need to have. Your example about [a b
b^; c] not commuting is troubling but I need to think about where
exactly that would cause a problem, if anywhere.

I find my definition of context address as signed name multiset kind
of tantalizing. (The one that's currently in the pdf as Definition 10,
but that I promptly disavowed in a follow-up email after sending the
link. Fix the starting context to be the empty state; then a context
is just represented by the names of patches you follow to get there.)
The lack of an actual sequence probably causes problems but I don't
yet know what those problems are. On the other hand it's an easy data
structure to work with.

I have three next steps in mind right now (hopefully I'll find the time):

1. Move things from my original hms_patchfinder writeup to the new
Latex document, and rework the proofs to use the new definition of
patch universe. The goal is to increase confidence that in theory, at
least, patch universes can be extended to fully commute in the way I
originally described.

2. Look for ways to simplify; e.g. can an extended context address
just be a signed name multiset? (I'm always tempted to skip to this
step, but I think I should make myself start with step 1.)

3. Think more about my implementation idea from a previous email,
wherein a repository is just a primitive patch sequence plus
tombstones. I think I never answered your response, so here's an
answer:

> > If I wanted to implement it, I think it would just become this:
> >
> > * A repository consists of two things:
> >   * A sequence S of primitive patches with distinct names, starting at
> > O, with no inverses (i.e. only positive names).
> >   * A set of names, called "tombstones", representing "deleted"
> > patches. These names don't appear in S.
>
> The problem I see here is that this looses the start (or ending) state
> of the "deleted" patches. And you don't even remember their content,
> just their name. But even supposing you remember a start state and the
> patch representation of every deleted patch, this will still mess up
> your commutation. Because deleting inverse pairs AA^ from the main
> sequence basically means the same as stating that such a pair commutes
> with any other patch, which is clearly wrong

Yes, I forgot that the repo needs to remember the content of those
"deleted" patches. E.g. someone might later obliterate A^, or someone
might want to pull A and not A^ to another repo; or someone might
simply want to see the content of A and A^ in the output of "darcs
changes -v". So, tombstones as names only clearly isn't enough.

I'm a bit fuzzy about the problem with commutation that you raise.
Yes, one way to delete AA^ would be to commute them all the way to the
end of the sequence and then drop them. But what would go wrong if we
simply declare that whenever you have a sequence B;A;A^;C you're free
to simply replace it with B;C (and vice versa) as an operation
distinct from commuting?

James