[darcs-users] Re: an interface for splitting hunks

David Roundy droundy at abridgegame.org
Thu Mar 31 13:09:35 UTC 2005


On Wed, Mar 30, 2005 at 05:21:31PM +0200, Benedikt Schmidt wrote:
> Ketil Malde <ketil.malde at bccs.uib.no> writes:
> 
> > Mark Stosberg <mark at summersault.com> writes:
> >
> >> I think we are talking about the same thing. Right now I believe that
> >> hunks equal all contiguous changed lines.
> >
> > But that's not unambiguous.  A hunk like
> >
> [...]
> >
> > but one of them is definitely nicer.
> 
> Here is how the diff code I'm working on for darcs (similar to the one in
> GNU diff) creates the diff:
...

I'm looking forward to seeing your new code! (especially if it's smart
about these sorts of issues!)

> >  \end
> >
> > +\begin
> > +foo
> > +\end
> > +
> >  \begin
> >
> > is equivalent to
> >
> >  \end
> >
> >  \begin
> > +foo
> > +\end
> > +
> > +\begin
> 
> In this case neither (a) nor (b) help to identify the first edit-script as
> the better one, because the empty line isn't special in any way. It should
> be possible to add "starts/ends with an empty line" to (a) and (b) and see if
> that helps to get the first diff in most of the cases. Does anyone have some
> sample files for testing where darcs and/or GNU diff get it wrong?

I think Ketil's example is basically the sort of situation where darcs gets
things "wrong".  When it's either removing or inserting a function, there's
no way (without additional input, such as "beginning" or "ending" regexps)
to know where best to apply the break.

Ideally, I think I'd like something like

darcs record/whatsnew --begin-hunks-with '\{$|^$' --end-hunks-with '^\}$'

which would tell darcs that I want my hunks (when ambiguous) to start with
blank lines or function definitions, and to end with a closing brace--this
would work for the style of C-code formatting that I prefer.  Other people
might prefer something like --begin-hunks-with '^[^ ]+ [^ ]+\(' to start
their functions.

It certainly would be nice to have this sort of flexibility, as it would
both reduce the danger of conflicts (by making edits of separate code
separate), and would make the hunks more readable.
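
To make the "choosing between equally valid hunks" idea concrete, here is a
rough sketch in Python (not darcs code).  It assumes a pure insertion: a block
of added lines can slide up or down over repeated lines without changing the
resulting old file, and the begin/end regexps only pick among those positions.
The function names are made up for illustration, and the two regexps stand in
for the proposed --begin-hunks-with and --end-hunks-with options, which are
only a suggestion at this point.

```python
import re

def equivalent_placements(lines, start, length):
    """Start indices at which a block of `length` (>= 1) inserted lines
    yields the same old file once the block is removed from `lines`."""
    lo = start
    # Sliding up is safe when the line above equals the block's last line.
    while lo > 0 and lines[lo - 1] == lines[lo + length - 1]:
        lo -= 1
    hi = start
    # Sliding down is safe when the block's first line equals the line below.
    while hi + length < len(lines) and lines[hi] == lines[hi + length]:
        hi += 1
    return list(range(lo, hi + 1))

def choose_placement(lines, start, length, begin_re, end_re):
    """Pick the equivalent placement whose first/last lines best match the
    (hypothetical) begin and end regexps; ties go to the earliest one."""
    begin, end = re.compile(begin_re), re.compile(end_re)

    def score(s):
        first, last = lines[s], lines[s + length - 1]
        return bool(begin.search(first)) + bool(end.search(last))

    return max(equivalent_placements(lines, start, length), key=score)

# Ketil's example: the new file, with the diff initially placed the "ugly" way
# (inserted block = lines 3..6: foo, \end, blank, \begin).
new_file = ["\\end", "", "\\begin", "foo", "\\end", "", "\\begin"]
best = choose_placement(new_file, 3, 4, r"^\\begin", r"^$")
```

With "start at \begin, end at a blank line" as the preference, the block slides
up by one line, which is the nicer of Ketil's two diffs.  Nothing here splits a
hunk; it only selects among placements that are already equivalent.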

The idea of *splitting* hunks based on markers (as opposed to just choosing
between equally valid hunks) seems more than a bit awkward to me.  It's
fine if all you're doing is adding lines, but when you modify lines, darcs
wouldn't know where to split the old code.  Take, for example, the hunk

-A
-B
-C
-D
+a
+b
+c
+SPLIT HERE!
+d
+e

Would this be split into

-A
-B
-C
+a
+b
+c

-D
+d
+e

or into

-A
-B
+a
+b
+c

-C
-D
+d
+e

There's no way for darcs to know.  Interactive hunk editing might be
easier, but that seems like a potential user interface nightmare.
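
The ambiguity can be spelled out with a small Python sketch (again, just an
illustration with a made-up function name, not anything in darcs).  Given a
marker position in the new lines, every cut point in the old lines produces a
pair of hunks that together make the same change, so the marker alone does not
determine the split:

```python
def candidate_splits(old, new, new_split):
    """All ways to split a replacement hunk (old -> new) in two, given only
    where the new side is cut.  Each candidate is a pair of (removed, added)
    hunks; every cut point k in the old side is equally valid."""
    return [
        ((old[:k], new[:new_split]), (old[k:], new[new_split:]))
        for k in range(len(old) + 1)
    ]

old = ["A", "B", "C", "D"]
new = ["a", "b", "c", "d", "e"]      # marker sits between "c" and "d"
candidates = candidate_splits(old, new, 3)
```

For four old lines there are five candidates, and the two splits shown above
are the ones with k = 3 and k = 2.
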
-- 
David Roundy
http://www.darcs.net



