[darcs-users] Coalescing patches

Wed Sep 23 16:55:39 UTC 2009

Nik writes:

 > I was focusing more on the user interface than the underlying storage.
 > I had inferred that git and darcs would be equally effective at the
 > actual coalesce operation.
 > True, git can efficiently edit the history DAG, but with Darcs there is 
 > (effectively) no history DAG to edit, so the point is moot, isn't
 > it?

No, it's not.  Search the archives for discussions of "when to tag".
Darcs does have to keep track of dependencies among patches, and those
do form a DAG, although it may be sparser than git's history DAG.
The point about tagging is that once you hit a tag, you can stop
searching for dependencies.  And editing the DAG in Darcs is not very
easy.
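
To make the tagging point concrete, here is a toy sketch in Python.  It
is not how Darcs actually stores or searches patches; the `deps` mapping,
the patch names, and the function `dependency_closure` are all made up
for illustration.  The only idea it shows is that a tag stands in for
everything recorded before it, so a dependency search need not look
past it.

```python
from typing import Dict, List, Set


def dependency_closure(patch: str,
                       deps: Dict[str, List[str]],
                       tags: Set[str]) -> Set[str]:
    """Collect everything `patch` depends on, without expanding tags.

    `deps` maps a patch name to the patches it depends on directly.
    This is a hypothetical model, not Darcs's real data structure.
    """
    seen: Set[str] = set()
    stack = list(deps.get(patch, []))
    while stack:
        p = stack.pop()
        if p in seen:
            continue
        seen.add(p)
        if p in tags:
            # A tag represents all earlier patches, so stop here.
            continue
        stack.extend(deps.get(p, []))
    return seen


# Hypothetical repository: two patches on top of a tag.
deps = {
    "fix-typo": ["add-readme"],
    "add-readme": ["TAG-1.0"],
    "TAG-1.0": ["init", "add-license"],
    "add-license": ["init"],
    "init": [],
}

with_tag = dependency_closure("fix-typo", deps, {"TAG-1.0"})
without_tag = dependency_closure("fix-typo", deps, set())
```

With the tag, the search visits only `add-readme` and `TAG-1.0`; without
it, the search walks all the way back to `init`.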

 > I assumed that the coalesce was combining a number of smaller patches 
 > into a single larger one. Darcs has the patch contents and the patch 
 > dependency information which is sufficient to ensure the coalesced patch 
 > is correct. Correct?

Well, yes and no.  What if you want to get rid of certain
dependencies?  It seems to me that the main point of coalescing in the
naive view is to get rid of the trivial dependencies *among the
coalesced patches*.  But I think an obvious and important
generalization is to get rid of spurious dependencies on other
patches.
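
The naive view can be sketched in the same toy style.  Again, this is an
assumption-laden model rather than Darcs behaviour: `coalesce_deps` and
the patch names are hypothetical.  The coalesced patch drops the
dependencies that point inside the group, but it inherits every
dependency that points outside, spurious or not.

```python
from typing import Dict, Iterable, List, Set


def coalesce_deps(members: Iterable[str],
                  deps: Dict[str, List[str]]) -> Set[str]:
    """Dependencies of the coalesced patch in the naive view.

    Internal dependencies (member on member) vanish; dependencies on
    patches outside the group are all kept.
    """
    group = set(members)
    external: Set[str] = set()
    for m in group:
        external |= set(deps.get(m, [])) - group
    return external


# Hypothetical patches: c also touches something unrelated.
deps = {
    "a": ["base"],
    "b": ["a"],
    "c": ["b", "unrelated"],
}

merged = coalesce_deps(["a", "b", "c"], deps)
```

Here `merged` contains `base` and `unrelated`.  Getting rid of the
dependency on `unrelated` is the generalization described above, and
this simple union cannot do it; that would require splitting `c` first.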

 > So you're talking more of the situation in which some of the patches to 
 > coalesce contain multiple updates, some of which should be coalesced and 
 > some not?
 > Is this also the cause of the "conflicts or spurious changes" you talk 
 > of in the point above?

Yes.

 > So, assuming that the storage model and record semantics are unchanged, 
 > you don't see any real benefit to darcs users in the suggested change?

It's hard to say.  I've experimented with a workflow where I never
commit; XEmacs has a call to "git commit"[1] on after-save-hook.
Later I would go back and edit the DAG to split the autosave branch
into topics, and coalesce related patches.
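
For readers who want to try something similar outside XEmacs, a minimal
sketch of a commit-on-save helper follows.  It is not the hook used
above: the function names and the default message are invented here, and
an editor would still need to be configured to call it after each save.
Only the ordinary `git add` and `git commit` commands are used.

```python
import subprocess
from typing import List


def autosave_commands(path: str, message: str) -> List[List[str]]:
    """Build the git invocations for committing one saved file."""
    return [
        # Stage the file so that new files are picked up too.
        ["git", "add", "--", path],
        # Commit only this path, leaving other staged work alone.
        ["git", "commit", "--quiet", "-m", message, "--", path],
    ]


def autosave_commit(path: str, message: str = "autosave") -> None:
    """Run the commands; a failed commit (nothing changed) is ignored."""
    for argv in autosave_commands(path, message):
        subprocess.run(argv, check=False)
```

Calling `autosave_commit("notes.txt", "fix typo")` from a save hook
produces one small commit per save, which is the raw material for the
later splitting and coalescing.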

What I found was
(1) I tended to save *much* more often, because "save->fix typo->save"
    was a really easy idiom for keeping extraneous changes separate
    from the main effort.
(2) Since each patch was small, I was generally willing to write a
    short log for most saves.  (This surprised me.)  It was useful to
    refer to these in creating coalesced logs.
(3) In software development (mostly maintenance-type work: fixing
    bugs, code cleanups, etc.), I found myself spending about equal
    amounts of time (a) cherry-picking trivial patches to push
    immediately, (b) rebasing the "real work", and (c) merging, fixing
    conflicts, and other cleanup (pruning dead branches, etc.).
(4) Based on time logs, I was about as productive with this workflow
    as otherwise, but (a) felt more productive, and (b) probably could
    have achieved a significant gain with better tools.
(5) In writing lectures, etc., this workflow was actually a drag because
    I didn't have very good tools.  (DAG editing was done with shell
    functions and CLI commands.)  The work was very linear and the
    burden of compacting the patches was greater than perceived
    benefits.  With better tools it probably would have been a no-op.

Based on points (3) and (5), I wonder if the automatic coalescing you
propose would have great benefits.  But this is very specific to my
personal usage, and with different workflows and tools, the benefits
and costs might be very different for you.

Footnotes: 
[1]  git was the only one that was fast enough at that time.  All the
others were slow enough that I found myself saving much less often.  I
don't know about now.