[darcs-users] Coalescing patches

Nik Trevallyn-Jones nik at babel.homelinux.net
Thu Sep 24 01:23:36 UTC 2009


Hi Reinier,

Thanks for your input.
I see that I've not made my original document clear enough.
(For me, it's always a struggle to keep things brief enough without 
losing important detail *sigh*.)

My apologies, all. I hope this response is not now too long.

Reinier Lamers wrote:
> Hi Nik,
>
>   
> I see your coalesced patch type as very similar to the tag patches that we 
> have now: tags also have no content and depend on a set of patches in the same 
> repo. Pulling a tag means pulling a named subset of the patches in the repo 
> that you pull from.
>   

The main difference between my coalesced patch and the tag patch is that 
the set of original patches that the coalesced patch "contains" is *not* 
physically in the same repository (except as the contents of the 
coalesced patch).
> So your push --coalesce does more or less this:
>
>  * Push all the patches that it would push without the --coalesce option
>  * Apply an extra patch to the remote repository that depends on all of the
>    pushed patches and only on the pushed patches, so that it can be used as a 
>    name when pulling from the remote repo.
>   

Not quite: I haven't made myself clear enough, I'm sorry. My plan was 
that it did this:

* create a single patch that represents all the changes of the original 
patches (a type of merged patch)
* add sufficient metadata to this single patch to identify the patches 
that it comprises/contains
* push this single patch only into the remote repository, as a coalesced 
patch.

The result of this is two repos that will produce the same working set 
when pulled, but which contain a different number of patches.
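
To make the three steps above concrete, here is a rough Python sketch.
It is not darcs code: Patch, CoalescedPatch, coalesce and
push_coalesced are names made up for illustration, changes are opaque
strings, and commutation and dependencies are ignored entirely.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Patch:
    patch_id: str              # hypothetical unique patch id
    changes: Tuple[str, ...]   # opaque change descriptions


@dataclass(frozen=True)
class CoalescedPatch:
    name: str
    member_ids: Tuple[str, ...]  # metadata: ids of the original patches
    changes: Tuple[str, ...]     # all changes, in original order


def coalesce(name: str, patches: List[Patch]) -> CoalescedPatch:
    """Steps 1 and 2: one patch carrying every change plus member ids."""
    return CoalescedPatch(
        name=name,
        member_ids=tuple(p.patch_id for p in patches),
        changes=tuple(c for p in patches for c in p.changes),
    )


def push_coalesced(remote: list, name: str,
                   patches: List[Patch]) -> CoalescedPatch:
    """Step 3: only the single coalesced patch reaches the remote repo."""
    coalesced = coalesce(name, patches)
    remote.append(coalesced)
    return coalesced
```

With two patches in the original repo, the remote ends up holding one
entry that records both ids, which is the "same working set, different
number of patches" situation described above.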

* The original repo has all the fine-grained history of every record 
that was made during the development process.

* The remote repo has a smaller number of patches, because each 
coalesced patch it contains represents multiple patches in the original 
repo.

* If a user attempts a subsequent push from the original repo to the 
remote repo, then the push logic must recognise the original patches 
represented by the coalesced patch. In this way, duplication of the 
original patches is avoided. (In my plan, this would be done by 
comparing patch ids, presuming this is possible.)

* Similarly, if a user attempts to push a coalesced patch to a target 
repository which contains at least one of the original patches, then the 
possible duplication must also be recognised.
- There are a number of possible scenarios here:

** the target repo contains *all* the original patches contained in the 
coalesced patch;
-- In this case the coalesced patch is ignored because the target repo 
already contains all its effects.

** the target repo has only *some* of the original patches contained in 
the coalesced patch (they may have been cherry-picked, for example);
-- In this case, the coalesced patch could simply be ignored; *or* 
those patches contained in the coalesced patch and *not* already 
contained in the target repo could be pushed.
--- The second alternative would result in a coalesced patch in the 
target repo that is *different* to the coalesced patch in the original 
repo, because the coalesced patch in the target repo will contain only 
those original patches which were not already in the target repo.

To support this second alternative, a coalesced patch needs to contain 
enough information to identify which changes belong to which original 
patch. So rather than being a truly merged patch, a coalesced patch 
would be more of a patch container, holding the original patches in some 
form of collection (list, bag, dictionary, etc).
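
The duplicate-detection scenarios above could be sketched like this,
treating the coalesced patch as a container. Again this is only an
illustration in Python, not darcs code: all names are hypothetical, it
assumes patch ids can be compared directly, and it ignores whether the
remaining patches would still commute or apply cleanly.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple, Union


@dataclass(frozen=True)
class Patch:
    patch_id: str
    changes: Tuple[str, ...]


@dataclass(frozen=True)
class CoalescedPatch:
    name: str
    members: Tuple[Patch, ...]   # container of the original patches


RepoEntry = Union[Patch, CoalescedPatch]


def known_ids(repo: List[RepoEntry]) -> Set[str]:
    """Ids of every original patch in repo, plain or inside a container."""
    ids: Set[str] = set()
    for entry in repo:
        if isinstance(entry, CoalescedPatch):
            ids.update(m.patch_id for m in entry.members)
        else:
            ids.add(entry.patch_id)
    return ids


def plain_patches_to_push(local: List[Patch],
                          target: List[RepoEntry]) -> List[Patch]:
    """Pushing originals: skip those the target already has coalesced."""
    present = known_ids(target)
    return [p for p in local if p.patch_id not in present]


def coalesced_to_push(cp: CoalescedPatch,
                      target: List[RepoEntry]) -> Optional[CoalescedPatch]:
    """Pushing a coalesced patch: None if the target has all members,
    otherwise a container holding only the missing members."""
    present = known_ids(target)
    missing = tuple(m for m in cp.members if m.patch_id not in present)
    if not missing:
        return None
    return CoalescedPatch(cp.name, missing)
```

Note that in the "some members present" case the returned container is
smaller than the one in the original repo, which is exactly the
difference between the two repos mentioned above.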

* It would seem that "push --uncoalesce" would also be useful, allowing 
a coalesced patch to be pushed into a target repo with the result that 
the target repo contains all the original patches instead of the single 
coalesced patch.
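
A rough sketch of what "push --uncoalesce" might do, under the same
container assumption. This is illustrative Python only: the names are
made up, and appending members at the end of the target ignores any
ordering or commutation that real darcs would have to handle.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Patch:
    patch_id: str
    changes: Tuple[str, ...]


@dataclass(frozen=True)
class CoalescedPatch:
    name: str
    members: Tuple[Patch, ...]   # the original patches, in order


def push_uncoalesced(cp: CoalescedPatch,
                     target: List[Patch]) -> List[Patch]:
    """Add each original patch to target individually, skipping any
    the target already holds. Returns the patches actually added."""
    present = {p.patch_id for p in target}
    added: List[Patch] = []
    for member in cp.members:
        if member.patch_id not in present:
            target.append(member)
            present.add(member.patch_id)
            added.append(member)
    return added
```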

> Or is this not what you mean? In the way I sketch it, obliterating the 
> coalesced patch would not obliterate its members. Do you feel that it should 
> obliterate the members too?
>   

Yes, obliterating a coalesced patch should obliterate its members too.

To my mind, one of the main purposes of the coalesced patch is to allow 
repo users a clean way of handling associated patches. As a developer, I 
may know that a set of 15 patches represents a new feature. But users of 
a shared repo probably only recognise individual features. So for them 
to cherry-pick effectively, it would be much more helpful for them to 
simply pull/unpull a single coalesced patch that represents the entire 
feature.

> Besides, are you aware that the word "coalesce" is already used in darcs to 
> mean simplifying a sequence of patches to remove modifications that do not 
> influence the end result?
No, I'm sorry I was not aware of that. Thanks for letting me know. I 
avoided the terms "merge" and "group" because it seemed that these were 
present in darcs, in concept at least.

Other terms we could use are: composite, bound, cooperative, unified, 
collection.

>  For example, when you coalesce the insertion of a 
> line, followed by the insertion of another line, followed by the deletion of 
> the former line, the coalesced version of that sequence of operations will 
> simply be the latter insertion.
>   

Cool - the operation makes complete sense. I was unaware the term was 
already in use. :o)
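
For my own understanding, a toy Python sketch of that kind of
simplification. This is not how darcs implements it: operations are
just ("add", text) or ("del", text) pairs, line positions are ignored,
and simplify is a made-up name.

```python
from typing import List, Tuple

Op = Tuple[str, str]   # ("add", line_text) or ("del", line_text)


def simplify(ops: List[Op]) -> List[Op]:
    """Cancel each deletion against the most recent earlier insertion
    of the same line; keep every other operation in order."""
    result: List[Op] = []
    for kind, line in ops:
        if kind == "del" and ("add", line) in result:
            # Index of the last matching insertion still pending.
            idx = len(result) - 1 - result[::-1].index(("add", line))
            del result[idx]
        else:
            result.append((kind, line))
    return result
```

Given the sequence from the example (insert one line, insert another,
delete the first), only the second insertion is left.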

Thanks again for your input.

Cheers!
Nik

