[darcs-devel] [issue525] amend-record => darcs patches show duplicate additions

Ben Franksen bugs at darcs.net
Mon Aug 27 15:02:18 UTC 2018


Ben Franksen <ben.franksen at online.de> added the comment:

In msg7021 Eric Kow asked:
> Do you think you could provide some examples of what a realistic
> non-canonical representation of a patch would be (compared to the
> canonical one) and also an explanation of why running sort_coalesceFL
> on them can result in their decanonicalisation?

Ignoring the issue of different diff algorithms, a non-canonical patch
is a hunk that has lines in common between its 'old' and 'new'
arguments. The canonize function splits such a hunk into smaller ones
with no common lines, using the supplied diff algorithm. OTOH, coalesce
joins adjacent or overlapping hunks; this may produce non-canonical
hunks when the second hunk puts back (a part of) what the first hunk
removes. The simplest example is the sequence (of file states)

  a -> b -> b
            a

corresponding to hunks

  hunk 1 [a] [b]
  hunk 2 [] [a]

which (currently) coalesce to

  hunk 1 [a] [b,a]
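
To make the example concrete, here is a minimal sketch in Python (not
the actual darcs Haskell code). The names Hunk, coalesce and
is_canonical are hypothetical, and coalesce only handles the simplified
case where the second hunk touches nothing outside the lines the first
hunk wrote. It reproduces the two hunks above and shows that the
coalesced result has a line in common between 'old' and 'new'.

```python
from typing import List, NamedTuple, Optional


class Hunk(NamedTuple):
    line: int        # 1-based line number where the hunk applies
    old: List[str]   # lines removed
    new: List[str]   # lines added


def coalesce(h1: Hunk, h2: Hunk) -> Optional[Hunk]:
    """Join h2 into h1 (simplified: h2 must lie within or directly
    after the lines written by h1; otherwise give up)."""
    start = h2.line - h1.line      # offset of h2 inside h1.new
    end = start + len(h2.old)
    if start < 0 or end > len(h1.new):
        return None                # case not covered by this sketch
    if h1.new[start:end] != h2.old:
        return None                # h2 does not apply to h1's output
    return Hunk(h1.line, h1.old, h1.new[:start] + h2.new + h1.new[end:])


def is_canonical(h: Hunk) -> bool:
    """Canonical in the sense used above: no line common to old and new."""
    return not (set(h.old) & set(h.new))


h1 = Hunk(1, ["a"], ["b"])   # hunk 1 [a] [b]
h2 = Hunk(2, [], ["a"])      # hunk 2 [] [a]
joined = coalesce(h1, h2)
print(joined)                # old=['a'], new=['b', 'a']
print(is_canonical(joined))  # False: 'a' is both removed and added
```

Both input hunks are canonical on their own; only their coalesced form
is not.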

While this behavior could be fixed for the simple example above, we
cannot easily fix it in general: the second hunk might add many lines
interspersed with arbitrarily many lines from what the first hunk
removes. Finding the list of minimal hunks is exactly what the diff
algorithm is supposed to do.

This means we /first/ have to coalesce and /then/ canonize whenever we
concatenate lists of primitive patches.
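
As an illustration of the canonize step, the following Python sketch
(again not darcs's implementation) splits one hunk into smaller hunks
by diffing its 'old' lines against its 'new' lines. It uses difflib
from the Python standard library as a stand-in for "the supplied diff
algorithm"; the function name canonize and the tuple representation
(line, old, new) are assumptions made for this example.

```python
import difflib
from typing import List, Tuple

HunkT = Tuple[int, List[str], List[str]]  # (1-based line, old, new)


def canonize(line: int, old: List[str], new: List[str]) -> List[HunkT]:
    """Split the hunk (line, old, new) into hunks whose old and new
    parts share no lines, to be applied in the order returned."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    hunks = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # common lines are dropped from the patch
        # After the earlier hunks are applied, this region starts at
        # offset j1, its position in the 'new' text.
        hunks.append((line + j1, old[i1:i2], new[j1:j2]))
    return hunks


# The coalesced hunk from the example: 'a' replaced by 'b', 'a'.
print(canonize(1, ["a"], ["b", "a"]))   # [(1, [], ['b'])]
```

Canonizing the coalesced hunk leaves a single insertion of 'b' at
line 1, which is the minimal description of going from the first file
state to the last.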

Note that the hunks produced by canonizing a single hunk should already
be in coalesced form, since the diff algorithm takes care of that. Thus,
David's solution does unnecessary work: it first canonizes, then
coalesces, then canonizes again. The first part is not needed and can be
removed.

----------
status: resolved -> unknown

__________________________________
Darcs bug tracker <bugs at darcs.net>
<http://bugs.darcs.net/issue525>
__________________________________

