[darcs-users] A formalization of Darcs patch theory using inverse semigroups (apfelmus)

Tue Nov 3 11:09:27 UTC 2009

Eric Kow wrote:
>>>  - hunk movement
>>>  - some kind of tree structure editing
> 
>> What are these? But yes, there might be patch types that cannot be  
>> formulated in terms of GUIDs. (Not sure whether this is a problem,  
>> though.)
> 
> Hunk move patches are just the idea of displacing a block of text within
> a file or maybe even to some other file.  But it sounds like this could
> be done in such a scheme, and perhaps made simpler? Ganesh was
> commenting that it's got tons of cases to consider.

Looks possible to me:

You can think of the GUIDs as some kind of pointers. Then, the contents
of the line and its position in the source file correspond to record fields

   line_A->text      = "module Main where"
   line_A->previous  = nil
   line_A->next      = line_B

In other words, source files are represented as doubly linked lists, and
the observation is simply that you can edit the contents and the
structure independently.

Now, a hunk move corresponds to taking a subsequence out of a doubly
linked list and linking it in at another position. So, yeah, looks like
this can be modeled in terms of GUIDs.

Put differently, unique names just spell out our intuition of "that's
the same line, even if it's line 2 in one revision and line 3 in another".

> Tree structure editing: I don't know, but there have been people asking
> about using Darcs to edit XML files.  So I imagine that operations like
> "add a child node here; move this node there; delete this node" could be
> interesting.

Should be well possible to model trees in terms of GUIDs/unique
names/pointers:

   node_A->tag      = "html"
   node_A->children = list_A

   list_A->element  = node_B
   list_A->previous = nil
   list_A->next     = list_B

   etc.

> Now that you've tackled agnosticism, I have one more question for you,
> oh monorail salesman: what about dependencies?  I understand how Darcs
> figures out if one patch depends on another (they don't commute).  Would
> it be straightforward to have such tracking in a commute-free patch
> theory? I imagine that along with conflict marking this would become a
> mere user interface issue more than anything else.  Maybe this would
> even make it easy to implement the "breakable dependencies" that some
> people keep asking for.  (then again, in a commutation-based scheme,
> I guess breaking dependencies would just be creating a conflict and no
> more difficult).

At the moment, I'm not convinced that Darcs' dependency tracking really
works! For instance, a patch B could both depend and not depend on a
previous patch A if there are intermediate patches whose commutation
screws things up somehow when there are conflicts, and similar shenanigans.

The situation is not necessarily better for GUIDs, but I think that at
least they make it easier to develop an understanding of what exactly is
going on.

My first gut reaction would be that a change like

   change line_A->text
      from "module Main where" to "module Test where"

should depend on  line_A->text  being "module Main where" in the first
place. But it's unclear what this means when there are many patches that
have changed  line_A->text  several times already.

In any case, this is where more research is needed, both for Darcs and
GUIDs, and I'm arguing that the latter approach makes a better starting
point.

If I understand Jean-Philippe correctly, he's even saying that
dependencies/conflicts will turn out to be pretty much intractable (for
both formalisms). I'm not that pessimistic yet.

> I imagine that along with conflict marking this would become a
> mere user interface issue more than anything else.

On a minor note, I would like to remark that conflicts being a user
interface issue doesn't make them any easier. You still want sensible
and predictable behavior like that a merge of three repositories  A B C
doesn't mark conflicts exactly when the pairs  A B, B C and C A  don't
mark conflicts.

A good analogy is probably error messages for the type checker in GHC.
It's a "user interface issue", but you need a good theory to actually do
that, like Helium currently does.

Regards,
apfelmus

--
http://apfelmus.nfshost.com