[darcs-users] Make darcs force-commute patches from CLI, to learn about darcs?

Mon Jun 29 14:23:31 UTC 2020

On 29.06.20 at 13:46, James Cook wrote:
>> This in turn led me to the more general question of how to detect
>> inconsistencies when exchanging patches between repos. The problem here
>> is that darcs currently relies on global uniqueness properties that are
>> quite easy to invalidate (e.g. by manually editing patches, and as
>> hinted above also when we independently convert branches of a repo).
>> Specifically, we rely on the following global property: if two patches
>> have the same name/identity, then (a) they can always be commuted to a
>> common context and (b) they are equal (content-wise) after commuting
>> them to any such common context. (The context of a patch is defined as
>> the set of patches preceding it.) Effectively this property means that
>> patches with the same name/identity are really just commuted/merged
>> versions of one and the same original patch.
>>
>> I have an (efficient) algorithm in mind that validates these assumptions
>> whenever we exchange patches between repos. Implementing this requires a
>> pretty deep refactor though.
> 
> I was wondering if global uniqueness could be solved by borrowing from Pijul.

Me too, more than once. But note that none of these considerations are
practical for the kind of evolutionary change we are limited to if we
want to maintain compatibility with existing repos. This is why I
concluded that validation is the only practical solution for Darcs.
Instead of ignoring it (as we always did, hoping it never happens), we
should fully recognize the fact that different repos with the same patch
format may become incompatible. (Though of course we still strive to
eliminate any bugs that may cause this to happen accidentally.)
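
To make the uniqueness property quoted above a bit more concrete, here
is a minimal sketch in Python. It is not Darcs code and not the
algorithm alluded to above. It covers only the easy case where both
copies of a patch already sit in the same context, so no commutation is
needed. The names Named, contexts and find_inconsistencies are made up
for illustration, and a patch's content is reduced to a plain string.

```python
from typing import NamedTuple


class Named(NamedTuple):
    name: str     # the patch identity (hypothetical stand-in)
    content: str  # stand-in for the patch's representation


def contexts(repo):
    """Map each patch name to (context, content).

    The context is the set of names of all patches preceding it.
    """
    seen = set()
    out = {}
    for p in repo:
        out[p.name] = (frozenset(seen), p.content)
        seen.add(p.name)
    return out


def find_inconsistencies(repo_a, repo_b):
    """Names present in both repos, in the same context, whose content differs.

    Patches whose contexts differ are skipped: checking them would
    require commuting to a common context, which this sketch omits.
    """
    a, b = contexts(repo_a), contexts(repo_b)
    bad = []
    for name in sorted(a.keys() & b.keys()):
        ctx_a, content_a = a[name]
        ctx_b, content_b = b[name]
        if ctx_a == ctx_b and content_a != content_b:
            bad.append(name)
    return bad
```

Comparing contexts by name alone assumes that the preceding patches are
themselves consistent; a real check would have to establish that too.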

> If you include some metadata as part of the repository state (e.g. the
> identity of the hunk responsible for every line of every text file)
> then I think you could make it so that a primitive patch's
> representation doesn't change when it's commuted. I suspect pretty
> much any primitive patch theory could be adapted to work this way.
> (Note I'm not claiming this will make everything actually commute like
> in Pijul.)

If you extend the "repository state" with enough extra information, then
yes, I think it is possible to do that. Indeed we have a competing prim
patch theory that goes about halfway toward that goal. We never got
around to integrating it properly with the high-level Repository/UI
code, because much of what goes on in those layers has to do with the
repo state (for instance, think of generating difference patches between
the working state and the pristine state). So we need an abstraction
layer for repo states, and finding out what the common API should be
here is difficult.

Another problem is that even supposing commutation does not change prim
patches, in darcs you still have conflictors. And conflictors
/definitely/ have to change representation when we commute them: the
merge-commute law obviously requires that.

So this would mean we end up re-implementing Pijul, where conflicts are
a property of the repo state rather than of patches.

> Besides giving unique names, another nice thing about this is that you
> shouldn't need to implement n^2 different patch commuting functions
> for n types of primitive patch. (I don't know if darcs actually needs
> to do this; I'm just assuming.)

The commutation rules for prim patches are the least complicated part of
our core algorithms. The n^2 here is not a problem in practice. What
/is/ a problem in practice is that we can never change any of the prim
patch commutation rules without introducing a new incompatible patch
"format", because these rules, too, must be globally invariant. If we
can detect such inconsistencies reliably, we may be a bit more relaxed
wrt maintaining the exact commutation behavior.

> If you have access to the repo state,
> you can tell if two patches commute just by trying to apply them in
> the opposite order and seeing if they complain.

True, but that is hardly simpler or more efficient than commuting them
directly as we do in Darcs. And it means you can no longer handle
patches or sequences of patches in isolation. So everything becomes
stateful. Besides, determining /whether/ two patches commute could be
easily cached at the named patch level, even in darcs as of today. The
problem is to decide what the resulting patches look like after we have
commuted them. If the representation does not change, then this problem
of course doesn't exist...
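
As an illustration of commuting patches in isolation, here is a toy
hunk commutation in Python. It is a simplified sketch, not the actual
Darcs rules: hunks live in a single file, and hunks that touch or
overlap are simply treated as non-commuting. Hunk, apply and commute
are names chosen for this example. Note that no repo state is needed to
commute, and that the representation (the line number) of one of the
two hunks changes.

```python
from typing import NamedTuple, Tuple


class Hunk(NamedTuple):
    line: int              # 1-based line where the hunk starts
    old: Tuple[str, ...]   # lines removed
    new: Tuple[str, ...]   # lines inserted in their place


def apply(h, lines):
    """Apply hunk h to a list of lines, failing if the old text does not match."""
    start = h.line - 1
    end = start + len(h.old)
    if tuple(lines[start:end]) != h.old:
        raise ValueError("hunk does not apply")
    return lines[:start] + list(h.new) + lines[end:]


def commute(p, q):
    """Given p followed by q, return (q2, p2) with the same overall effect.

    Returns None if the hunks touch or overlap (a simplification).
    """
    if p.line + len(p.new) < q.line:
        # q lies strictly below p: undo the line shift that p caused.
        return q._replace(line=q.line - len(p.new) + len(p.old)), p
    if q.line + len(q.old) < p.line:
        # q lies strictly above p: p shifts by q's change in length.
        return q, p._replace(line=p.line + len(q.new) - len(q.old))
    return None


lines = ["a", "b", "c", "d", "e"]
p = Hunk(1, ("a",), ("a1", "a2"))
q = Hunk(5, ("d",), ("D",))        # line 5 in the state after p
q2, p2 = commute(p, q)             # q2 refers to line 4, p2 == p
```

Applying p then q gives the same lines as applying q2 then p2, which is
the defining property of commutation in this toy model.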

> In theory, you could
> even allow plugins that implement new patch types, with the basic
> principle that patches communicate with each other through
> modifications to the repo metadata.

That may be possible, especially for a completely new project that is
designed around these notions from the start.

> A slightly less radical version of this would be to keep the current
> primitive patch representation, but use this idea when generating the
> patch names. E.g. a hunk's name is a hash involving the identities of
> the hunks responsible for the lines it deletes. You'd still need to
> keep some metadata around to be able to do that, though.

There is also more than just hunks. What about file renames? A file
rename commutes with a hunk that changes the file, and in darcs this
changes the hunk patch to refer to the renamed file. You need to change
the internal representation to refer to files using UUIDs to get around
that. And as soon as you do that you get the same problems with global
uniqueness invariants that you previously had for patches. There is a
reason why git cannot track file renames: it stores just trees and
identifies them by their content hash. This is secure and quite
efficient, but properly tracking file renames is not possible in such a
model. AFAIK Pijul suffers from a similar limitation and for the same
reason, i.e. there is no primitive "rename" patch, only file-remove and
file-add.
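
The rename case can be sketched in the same toy style. Again this is an
illustrative Python model, not Darcs's implementation; Rename, Hunk and
commute_rename_hunk are hypothetical names. Commuting a rename past a
hunk on the renamed file rewrites the path stored in the hunk, so the
hunk's representation depends on which side of the rename it sits.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Rename:
    old: str
    new: str


@dataclass(frozen=True)
class Hunk:
    path: str
    line: int
    old: Tuple[str, ...]
    new: Tuple[str, ...]


def commute_rename_hunk(mv, hunk):
    """Given mv followed by hunk, return (hunk2, mv) with the same effect.

    Returns None when the hunk names the path that mv just moved away;
    this toy model treats that as a failure to commute.
    """
    if hunk.path == mv.new:
        # The hunk edits the renamed file: before the rename, that file
        # was still called mv.old, so the hunk must say so.
        return replace(hunk, path=mv.old), mv
    if hunk.path == mv.old:
        return None
    # Unrelated file: nothing changes.
    return hunk, mv


mv = Rename("a.txt", "b.txt")
h = Hunk("b.txt", 3, ("x",), ("y",))
h2, mv2 = commute_rename_hunk(mv, h)   # h2.path is "a.txt"
```

If hunks referred to files by a stable identifier instead of a path, h
would not need rewriting here, but that identifier would then need the
same kind of global uniqueness guarantee discussed above.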

Cheers
Ben