[darcs-users] locking

Jamie Webb j at jmawebb.cjb.net
Mon Mar 6 19:30:30 UTC 2006


On Mon, Mar 06, 2006 at 01:12:23PM -0500, Max Battcher wrote:
> Marnix Klooster wrote:
> >The solution seems very simple to me: have a new patch type that contains 
> >a *character-based* diff, instead of the currently used (and 
> >traditional) line-based diff.
> >
> >This requires no knowledge of the structure of the file contents, and is 
> >fairly robust (i.e., leads to sensible diffs) when only small changes 
> >are made, such as wording changes, layout changes, etc.
> 
> Well, then you have to worry about character sizes and encodings a lot 
> more.  Characters may be anywhere between 7-bits (non-extended ASCII, 
> among others) to 16-bits (Unicode) to 32-bit ("Wide" Unicode), and in 
> some languages their "characters" actually contain sequences of several 
> Unicode characters.

A character-based diff would in practice just be a byte-level (binary)
diff, so it should be safe whatever the encoding. To be sure, you might
want a heuristic that, e.g., treats two hunks as conflicting if they
are within 10 bytes of each other.
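
A rough sketch of that proximity heuristic, in Python rather than
anything Darcs actually contains. The Hunk type, its fields and the
function name are made up for illustration; the only thing taken from
the text above is the 10-byte margin.

```python
from typing import NamedTuple


class Hunk(NamedTuple):
    """Hypothetical byte-level hunk, measured against the common ancestor."""
    offset: int  # byte offset where the hunk starts
    length: int  # number of bytes it replaces (0 for a pure insertion)


def hunks_conflict(a: Hunk, b: Hunk, margin: int = 10) -> bool:
    """Treat two hunks as conflicting if they overlap or lie within
    `margin` bytes of each other."""
    a_end = a.offset + a.length
    b_end = b.offset + b.length
    # Distance between the two byte ranges; zero or negative means
    # they touch or overlap.
    gap = max(a.offset, b.offset) - min(a_end, b_end)
    return gap <= margin
```

With these definitions, a hunk covering bytes 0-3 and one starting at
byte 14 are exactly 10 bytes apart and so count as conflicting, while
one starting at byte 15 does not.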

There could be issues with conflict markers getting placed in the
middle of characters [1], but that just depends on the marking scheme,
e.g. we can always insert new lines rather than adding characters to
existing ones, perhaps like this:

          v--v
This is a test sentence.
----
This is a conflicting sentence.
          ^---------^
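
As a sketch of how such a marking scheme could be produced (the
function names are hypothetical, and it assumes one display column per
character, which only holds for single-width text in a monospace
font), the marker lines can be built from the start and end columns of
each conflicting span:

```python
def marker_line(start: int, end: int, cap: str) -> str:
    """Build a line that brackets columns [start, end) with `cap`
    characters, e.g. 'v--v' or '^---^', indented to column `start`."""
    width = end - start
    if width < 1:
        raise ValueError("span must cover at least one column")
    body = cap if width == 1 else cap + "-" * (width - 2) + cap
    return " " * start + body


def render_conflict(ours: str, ours_span: tuple,
                    theirs: str, theirs_span: tuple) -> str:
    """Render both versions with marker lines added above and below,
    so that no characters are inserted into the existing lines."""
    return "\n".join([
        marker_line(ours_span[0], ours_span[1], "v"),
        ours,
        "----",
        theirs,
        marker_line(theirs_span[0], theirs_span[1], "^"),
    ])
```

Calling render_conflict("This is a test sentence.", (10, 14),
"This is a conflicting sentence.", (10, 21)) reproduces the example
above.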

However, character diffs on their own don't solve the problem: they
actually make things worse for flowed paragraphs, because re-wrapping
after a small edit turns every moved line break into its own little
newline-change hunk. Users would also presumably need to be forced to
use some WYSIWYG-type editor that displays paragraphs flowed but saves
each one on a single line.
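
To illustrate the newline-change hunks, here is a small sketch using
Python's difflib as a stand-in character differ. This is not Darcs's
diff algorithm, and the helper names are invented for the example. It
compares a tiny paragraph before and after a one-word insertion, first
re-wrapped at 14 columns and then stored on a single line.

```python
import difflib


def char_hunks(old: str, new: str):
    """Return the non-equal (tag, removed, added) pieces of a
    character-level diff between old and new."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    return [
        (tag, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def is_newline_change(hunk) -> bool:
    """True if the hunk only swaps a space for a line break or back."""
    _, removed, added = hunk
    return {removed, added} == {" ", "\n"}


def unflow(paragraph: str) -> str:
    """Store a paragraph on one line, as the hypothetical editor would."""
    return " ".join(paragraph.split())


# The same paragraph wrapped at 14 columns, before and after adding "xx".
old = "aaaa bbbb cccc\ndddd eeee ffff"
new = "aaaa xx bbbb\ncccc dddd eeee\nffff"

flowed = char_hunks(old, new)
unflowed = char_hunks(unflow(old), unflow(new))
```

In the flowed case most of the hunks are pure space/newline swaps
caused by re-wrapping; in the unflowed case only the inserted word
remains.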

-- Jamie Webb

[1] There are already similar problems here. Try introducing
conflicts in a DOS-format file with a Unix darcs. Does Darcs currently
handle UCS-2/UCS-4? I didn't think it did (or rather, it would just
assume the file was binary). UTF-8 was carefully designed not to
confuse programs like Darcs.
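
One property behind that last remark: in UTF-8 every continuation byte
has the bit pattern 10xxxxxx, so a byte-oriented tool can tell whether
an offset falls in the middle of a character without decoding
anything. A minimal sketch, with made-up function names and assuming
the input is valid UTF-8:

```python
def is_utf8_boundary(data: bytes, offset: int) -> bool:
    """True if `offset` does not split a multi-byte UTF-8 character."""
    if offset < 0 or offset > len(data):
        raise ValueError("offset out of range")
    if offset == 0 or offset == len(data):
        return True
    # Continuation bytes look like 10xxxxxx; anything else starts a
    # new character.
    return (data[offset] & 0xC0) != 0x80


def snap_to_boundary(data: bytes, offset: int) -> int:
    """Move `offset` back to the nearest character boundary, e.g.
    before placing a conflict marker there."""
    while not is_utf8_boundary(data, offset):
        offset -= 1
    return offset
```

For the bytes of "na\u00efve" encoded as UTF-8, offset 3 lands inside
the two-byte character and snaps back to offset 2. The same check does
not work for UCS-2 or UCS-4, where any byte value can occur in any
position of a code unit.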



