[darcs-users] More thoughts on line endings, meta tags, patch types, etc

Marnix Klooster marnix.klooster at ssaglobal.com
Fri Dec 10 14:45:41 UTC 2004


Michael Conrad [mailto:conradme at email.uc.edu] wrote on Friday, December 10,
2004 01:45:

> So I've been playing with ideas to have my earlier meta-data tag idea fix
> the recurring line endings problem.  However, I think I'm going to give up
> on this idea.
> 
> First of all, the idea of having a script convert undesirable line endings
> generated by darcs before the user sees them, and then converting back
> before darcs sees them is just infeasible.  (*1)
> 
> Even if darcs handles the line endings with a special option, it would be
> nearly impossible to commute across the patch that changed the type of the
> file.  (*2)
> 
> I got thinking, this is a lot like the compatibility problem between binary
> and text: the two file types require a different processing, and thus it
> isn't feasible to convert between them.  If someone wants to change the
> type, they basically have to remove the file and add as the new type.
> 
> So, from this line of thought, if we wanted to have the ability to do
> logical lines it would really require a new patch-type (as David already
> said, I believe - I can be stubborn sometimes, sorry)

[...snip...]

> I'm getting kind of carried away here, as usual, so I'll stop.
> Thoughts? Comments? Ideas?
> -Mike

[...snip footnotes...]

[This is not directly about line endings, but it might be relevant for that
as well...]

I've thought a bit about the binary/text thing as well, and I thought of it
as a *generalization* of the existing 'hunk' patch type.

I find it a bit hard to explain this in text, so I'll try ASCII art.

One can imagine a text file divided into 'chunks' like this:

   |This is a \n|small text file\n|of three lines|

Here the | and \n are not part of the file; | starts/ends a chunk, and \n is
of course a newline character.  Now imagine we change this into

   |This is a \n|slightly larger text file of two lines|

Then a hunk diff will be generated that replaces 'chunks' 2 and 3 of the
original version by chunk 2 of the new version.
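
As a rough sketch of the example above (plain Python, not darcs code; the
names text_chunks and changed_chunks are made up for illustration, and
difflib stands in for whatever diff algorithm darcs really uses):

```python
import difflib
import re

def text_chunks(data):
    """Split text into chunks; each \\r, \\n or \\r\\n ends a chunk."""
    return re.findall(r"[^\r\n]*(?:\r\n|\r|\n)|[^\r\n]+", data)

def changed_chunks(old, new):
    """Return (start, removed, added) for every run of chunks that differs.

    'start' is a 0-based chunk index into the old version.
    """
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    return [
        (i1, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

old = text_chunks("This is a \nsmall text file\nof three lines")
new = text_chunks("This is a \nslightly larger text file of two lines")
hunks = changed_chunks(old, new)
```

Here hunks holds a single entry: starting at index 1 (chunk 2 when counting
from one), two old chunks are replaced by one new chunk.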

Now take a binary file

   |^@^@^R^E^L^A^X^@^@It's all just ones and zeroes^@^Z|

What we really mean by 'binary' is: please view this as one big 'chunk';
diffs aren't meaningful inside it.

So the 'type' of a file is really how we divide it into parts for diffing.
This type is of course not a property of the file, but using darcs'
preferences we indicate how we want darcs to view the file, and what kind
of diff (i.e., hunk patch) it produces.  Therefore I see this as a
generalization of the 'hunk' patch type.
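
To make the generalization a bit more concrete, here is a minimal sketch in
Python.  It is only an illustration of the idea, not how darcs works: the
names binary_chunker, text_chunker and generalized_hunk are hypothetical,
and the diff is a naive one that only strips the common chunks at the start
and end.

```python
import re

def binary_chunker(data):
    """'binary': the whole file is one big chunk."""
    return [data] if data else []

def text_chunker(data):
    """'text': each \\r, \\n or \\r\\n ends a chunk."""
    return re.findall(rb"[^\r\n]*(?:\r\n|\r|\n)|[^\r\n]+", data)

def generalized_hunk(old, new, chunker):
    """Return (start, removed_chunks, added_chunks), or None if unchanged.

    The chunker decides how the file is divided, i.e. the file's 'type'.
    """
    a, b = chunker(old), chunker(new)
    # Skip the chunks both versions share at the start.
    start = 0
    while start < len(a) and start < len(b) and a[start] == b[start]:
        start += 1
    if start == len(a) == len(b):
        return None
    # Skip the chunks both versions share at the end.
    end = 0
    while (end < len(a) - start and end < len(b) - start
           and a[len(a) - 1 - end] == b[len(b) - 1 - end]):
        end += 1
    return start, a[start:len(a) - end], b[start:len(b) - end]

old = b"one\ntwo\nthree\n"
new = b"one\n2\nthree\n"
as_text = generalized_hunk(old, new, text_chunker)
as_binary = generalized_hunk(old, new, binary_chunker)
```

Viewed as text, only the middle chunk is replaced; viewed as binary, the same
change replaces the single chunk that is the whole file.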

How to use this in practice, whether it helps for the line-endings problem,
whether there are more useful types than 'binary' (one chunk) and 'text'
(each \r and/or \n ends a chunk), and whether this is too much solution for
too small a problem, I don't know yet.  I just wanted to drop this view
here; it might help.

Groetjes,
 <><
Marnix



