[darcs-users] some special amend-records

Sebastian Fischer mail at sebfisch.de
Tue Apr 9 10:57:55 UTC 2013


Thank you for your comments, Ganesh.

I made an account on the Wiki and will make a page summarizing the proposal
when initial ideas have settled.

Should we talk about "changesets" instead of "patch groups"? The term "set"
seems a better fit than "group" mathematically and "changeset" seems to be
an established term for the thing we are talking about.

> My impression is that problems arise because patches depend on full named
> patches rather than on the actual primitive patches that make relevant
> changes.
>
> I agree that this is a real problem, but on the other hand atomic
> changesets are important for keeping related changes together. We need to
> be careful to keep that, otherwise you could for example end up pulling the
> addition of a method to a class without the changes that implement that
> method, or vice-versa.
>


Good point. So, on one hand user-defined dependencies are important to keep
related changes together, on the other hand users make mistakes when
defining dependencies which are currently more difficult to fix than
necessary.

This is related to your later remark:


> Or (thinking out loud), perhaps nesting and dependencies are really the
> same thing: patch group G depending on H seems similar in practice to G
> containing H, even if they seem different conceptually.
>


It seems there are two kinds of dependencies.

1. A change depends on another change if one cannot be applied without the
other. For example, adding text to a file depends on creating that file and
deleting a line depends on adding that line. Such dependencies can be
computed automatically and we may call them "system dependencies".

2. A change can also depend on another change for other reasons. For
example, the result of applying one might not compile without applying the
other. As you point out, such dependencies are important to ensure
consistent repositories. They cannot be computed automatically, so users
need to specify them and can make mistakes. We may call these dependencies
"user dependencies".
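
To make the distinction concrete, here is a minimal sketch in Python (not
Darcs code). Everything in it is an assumption for illustration: the `Prim`
type, the patch names, and above all the rule "a patch system-depends on the
previous patch touching the same file", which is much cruder than real
commutation. User dependencies are simply written down by hand.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Prim:
    """Hypothetical primitive patch: a name and the file it touches."""
    name: str
    path: str


def system_deps(history: List[Prim]) -> Dict[str, Set[str]]:
    """Compute dependencies automatically from the recorded order.

    Assumption: a patch depends on the most recent earlier patch that
    touched the same file. Real Darcs decides this by commutation.
    """
    deps: Dict[str, Set[str]] = {}
    last_touch: Dict[str, str] = {}
    for p in history:
        deps[p.name] = {last_touch[p.path]} if p.path in last_touch else set()
        last_touch[p.path] = p.name
    return deps


history = [
    Prim("create Foo.hs", "Foo.hs"),
    Prim("declare method", "Foo.hs"),
    Prim("implement method", "Impl.hs"),
]

# User dependencies cannot be derived from the patches; the user states
# that declaration and implementation only make sense together.
user_deps = {
    "declare method": {"implement method"},
    "implement method": {"declare method"},
}

computed = system_deps(history)
```

Here `computed` links "declare method" to "create Foo.hs" without any user
input, while the link between declaration and implementation exists only
in `user_deps`.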

My impression is that this distinction is important to handle mistakes when
defining user dependencies. For example, when users bundle a change
"feature and mistake", publish it, and others depend on this change then it
is easy to revert the mistake if there are no system dependencies on it
despite the user dependency. Without this distinction, I think
patchgroups/changesets would not solve the original problem in the workflow
of factis research.


> My impression is that the whole repository can be viewed as a patch group
> that contains everything, so patch groups might be a generalization of
> repositories. Adding and removing patch groups from a working directory
> looks like switching branches in the same directory but more flexible. (My
> impression is that patch groups can be combined more flexibly than
> branches.)
>
> That's a very good point about adding a feature to a whole repository.
>
> I think it means patch groups can be used to subsume the history tracking
> concept that you implemented as a separate tool, which would be a nice
> result if it works out. It would be really great if it can also give us a
> good UI for multi-branch repos.
>


Which part of history tracking would be subsumed? I don't think that push
and pull would be tracked, for example. I notice that it is a bit unclear
to me how history interacts with patchgroups/changesets. Maybe you have a
clearer view?


> I have looked at changeset evolution and phases in Mercurial. If I
> understand correctly, patch groups are a means to avoid changeset
> evolution. Instead of changing patches, new patches that (partially) revert
> old patches would be recorded and patch groups would hide unnecessary
> details, for example by producing combined diffs.
>
> I don't think adding new patches to revert old ones is going to work too
> well, because (X;X^-1) doesn't really behave the same as an identity patch:
> it conflicts with anything that conflicts with X.
>


I see.



> So my feeling is that patch groups will have to support a "remove this
> patch" operation.
>


In my previous mail I was a bit unclear about this aspect. Initially I
proposed allowing patches to be added and removed, and later I assumed that
modifying or removing patches from changesets was not allowed.

In retrospect, I think it is important to *not* modify or remove arbitrary
patches from changesets (at least when they are public) in order to not
destroy system dependencies. With a concept of phases that track which
changesets are public, it seems possible to allow modification and removal
of unpublished changesets but this seems to be orthogonal.

There may be a less powerful operation on changesets that addresses your
point about conflicts of (X;X^-1) without destroying system dependencies on
public changesets. Can we provide an operation "normalize changeset" that
recomputes primitive patches from a combined diff? It seems that this will
never destroy code because it does not change the effect of a changeset on
the repository. Does this make sense?

For example, assume X^-1 is added to a changeset S containing X. Someone
who pulls this change and has their own changes that depend on X will get a
conflict that needs to be resolved before S can be normalized. Someone
without changes that depend on X will not have to do anything. This is less
intrusive than amending a patch corresponding to S, which both potential
users depend on.
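
A rough sketch of what "normalize changeset" could mean, in Python rather
than Darcs terms. The representation is an assumption made up for this
mail: a primitive patch is a tuple `(path, old, new)` that replaces the
whole contents of a file, with `None` standing for "file does not exist".
Normalizing recomputes the patches from the combined before/after
difference, so X followed by X^-1 collapses to nothing.

```python
def apply(state, patches):
    """Apply whole-file replacement patches to a {path: contents} mapping."""
    state = dict(state)
    for path, old, new in patches:
        if state.get(path) != old:
            raise ValueError("patch does not apply to " + path)
        if new is None:
            state.pop(path, None)   # the patch removes the file
        else:
            state[path] = new
    return state


def normalize(state, patches):
    """Recompute primitive patches from the combined diff of a changeset."""
    after = apply(state, patches)
    result = []
    for path in sorted(set(state) | set(after)):
        if state.get(path) != after.get(path):
            result.append((path, state.get(path), after.get(path)))
    return result


before = {"f": "A"}
x = ("f", "A", "B")
x_inv = ("f", "B", "A")

cancelled = normalize(before, [x, x_inv])              # X;X^-1
merged = normalize(before, [x, ("f", "B", "C")])       # two edits to f
```

In this toy model the effect on the repository is unchanged by
construction: applying the normalized patches to `before` gives the same
state as applying the original ones. What it does not show is the hard
part, namely how dependencies of other patches get updated.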

System dependencies may need to be recomputed, when normalizing, to depend
on the new primitive patches. Which dependencies should be updated is a bit
unclear to me.

A subtle point is nesting: if we normalize a changeset should nested
changesets be normalized too? If yes, all underlying primitive patches may
be removed and we need to recompute many dependencies. If not, it would
mean that a normalized changeset would become detached from its component
changesets which seems undesirable.

>>   - can patch groups be hierarchical?
>>
>
>  [...] I see two options:
>
>   1. When adding patch group g1 to patch group g2 add all patches of g1
> to g2 so patch groups are always flat.
>
>  2. keep a tree structure of patch groups that allows to observe which
> group was added to which.
>
>  I'm not sure if a tree structure is useful from a user perspective, as I
> expect, e.g., diffs for a patch group to be accumulated and hence identical
> with options 1 or 2.
>
> My feeling is that it is useful for the UI. Suppose I have groups  "Add
> rebase suspend" ; "Add rebase unsuspend" etc, which collectively make up
> "Add rebase". If I alter one of the smaller groups, I want that to be
> reflected in the bigger one.
>
> On the other hand it means that in the "repository as patch group" case,
> altering one of the smaller groups implicitly alters the whole repository,
> which could be surprising.
>


I think it's quite natural. If I have merged the changeset "Add rebase" I
expect my repository to change if I pull new changes to the changeset "Add
rebase". I view this as similar to being on some Git branch and pulling
changes to that branch.



> I think either way we'll need UI help to provide warnings in this kind of
> case -  e.g. "altered group X; groups Y and Z implicitly altered as a
> result".
>


Not sure if it's a warning but I agree that such output is useful. For
example:

Pulled changes to the following changesets:

  * Add rebase
      * Add rebase suspend
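
The list of implicitly altered changesets could be computed from the
nesting graph. A small Python sketch, assuming (hypothetically) that
nesting is stored as a mapping from each changeset to the changesets it
directly contains; the name "whole repository" is made up to stand for the
"repository as patch group" case:

```python
def containing_groups(nested, altered):
    """Return all changesets that contain `altered`, directly or indirectly."""
    found = set()
    frontier = {altered}
    while frontier:
        # Parents of anything in the frontier that we have not seen yet.
        frontier = {g for g, kids in nested.items() if kids & frontier} - found
        found |= frontier
    return found


nested = {
    "whole repository": {"Add rebase"},
    "Add rebase": {"Add rebase suspend", "Add rebase unsuspend"},
    "Add rebase suspend": set(),
    "Add rebase unsuspend": set(),
}

implicitly_altered = containing_groups(nested, "Add rebase suspend")
```

Because shared nested changesets are allowed, the same walk works when the
structure is a directed acyclic graph instead of a tree.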

>>   - is it allowed for them to overlap?
>>
>
>  Overlapping seems useful for the "incorporating common functionality
> into different patch groups" scenario. Overlapping does not change the
> underlying sets of primitive patches, so I think it is reasonable for patch
> groups to share patches or even nested groups. So rather than a tree of
> groups we would have a directed acyclic graph.
>
>
> Agreed. One thing I think we should ban is non-contiguous patch groups,
> i.e. having a group where the individual changes can't be commuted to be
> next to each other.
>
> For example consider the following sequence:
>
> 1: create X with contents "A"
> 2: change contents of X to "B"
> 3: change contents of X to "C"
> 4: change contents of X to "D"
>
> if it were legal to have group G1 with just changes 1 and 3, it would be
> very hard to manipulate. We could also create G2 with just changes 2 and 4,
> and end up with two groups that were implicitly dependent on each other.
>


Indeed. Should we even require that user dependencies must always subsume
all system dependencies of the bundled primitive patches? In your example,
this would mean that G1 and G2 must contain each other which means that
they are the same changeset.

This requirement is stronger than contiguousness. For example, it is no
longer possible to record X^-1 in a changeset that does not contain X. Your
other example (where lines 2 and 4 modify a different file in a different
changeset) would still be supported though.

One way to ensure that changesets contain all system dependencies seems to
be to have a "current changeset" that is reflected in the working directory
and used implicitly when recording changes. The system dependencies in the
working directory are then automatically part of the current changeset.

(Maybe, switching to a nested changeset should be possible if the changes
in the working directory (system-)depend only on changes made by the nested
changeset. Then nested changesets can be updated by first making them the
current changeset.)



> As a starting point, we'll need to work out the patch semantics of the
> operations on patch groups themselves.
>


Once we agree on an intuitive model of patchgroups/changesets, I can write
a Wiki page about what commands should be used to manage them and how
existing commands might be affected.



> Another thing to think about is whether we also want phases and whether
> they can be kept sufficiently independent of patch groups that we can
> implement them separately.
>


I consider phases (that track which changes are public, which are local,
and which are secret) orthogonal. If normalization is the only operation
that modifies changesets (and it turns out that it indeed never breaks
system dependencies) we may not even need phases.

I think of the current proposal as "changeset abstraction", as opposed to
"changeset evolution". I think users should only see changesets and their
accumulated diffs. Underlying primitive patches should be "abstracted
away". System dependencies should be computed automatically and (probably)
only among primitive patches. Additional user-defined dependencies specify
which changes belong to which others and control which changes are pulled
in the working directory of a repository.

I wonder how such an abstract view on changesets corresponds to history. A
changeset seems like a timeless bundle of changes where it is important
what changes are made but not when and in which order they have been
implemented.


The outlined proposal seems quite intrusive to Darcs and I wonder if there
is something simpler that could be done more quickly. My impression is that
the distinction of "system dependencies" and "user dependencies" is
necessary to solve factis's problem. Maybe it is also sufficient? Could we
change dependencies in Darcs accordingly without adding a new concept of
patchgroups/changesets?


Best regards,
Sebastian