[darcs-users] petition for '\0' to be removed from binary auto-detection code
droundy at abridgegame.org
Tue Nov 16 11:47:40 UTC 2004
On Mon, Nov 15, 2004 at 06:06:21PM +0000, Mark Stosberg wrote:
> While I like the idea of auto-detecting binary files, I realized that
> '\0' (aka NUL) is not a good test.
> It is sometimes used (at least) in Perl to put a bunch of things into a
> string that you may want to separate back out later. The character is
> used precisely because it doesn't occur in text.
> In particular, it's still used in the modern "CGI.pm" library, to provide
> compatibility with the ancient 'cgi-lib.pl' library.
Sounds like a reasonable argument to me. The only trouble is that this
pretty well guts the check for binary files, since we currently only check
for '\0' and '\26' (EOF). And I imagine that it is usually the '\0' check
that correctly identifies binary files.
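
For readers who want to picture the check described above, here is a minimal
sketch. It is not the code from fpstring.c: the function name `looks_binary`
is hypothetical, and it assumes that '\26' means byte value 26 decimal
(Ctrl-Z, the DOS end-of-file marker).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the darcs source.
 * Returns 1 if the buffer contains a NUL byte or byte value 26
 * (assumed here to be the "EOF" character, Ctrl-Z), 0 otherwise. */
int looks_binary(const char *buf, size_t len)
{
    /* memchr scans raw bytes, so an embedded NUL does not end the search. */
    if (memchr(buf, 0, len) != NULL)
        return 1;
    if (memchr(buf, 26, len) != NULL)
        return 1;
    return 0;
}
```

Dropping the first `memchr` call is what the petition asks for; the sketch
shows that only the byte-26 test would then remain.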
> I tried to create this patch myself, but I couldn't figure out where
> this logic was located. :)
It's in fpstring.c, actually written in C for blinding speed (well,
blinding may be an overstatement...).
Another option would be to add a set of regexps that indicate files that
are *always* text. This would be an ugly option, but it would let us keep
\0 as a binary test while exempting .pl files from the check.
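
A rough sketch of how such an always-text list might sit in front of the
byte check, using POSIX regexps. Everything here is an assumption for
illustration: the pattern list, the function names `is_always_text` and
`should_treat_as_binary`, and the reading of '\26' as byte value 26 decimal.
None of it is taken from the darcs source.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pattern list: file names that are always treated as text. */
static const char *const always_text_patterns[] = {
    "\\.pl$",
    "\\.pm$",
};

/* Returns 1 if name matches any always-text pattern, 0 otherwise. */
int is_always_text(const char *name)
{
    size_t n = sizeof always_text_patterns / sizeof always_text_patterns[0];
    for (size_t i = 0; i < n; i++) {
        regex_t re;
        if (regcomp(&re, always_text_patterns[i],
                    REG_EXTENDED | REG_NOSUB) != 0)
            continue; /* skip a pattern that fails to compile */
        int matched = (regexec(&re, name, 0, NULL, 0) == 0);
        regfree(&re);
        if (matched)
            return 1;
    }
    return 0;
}

/* Binary only if the name is not on the always-text list and the
 * contents contain a NUL byte or byte value 26. */
int should_treat_as_binary(const char *name, const char *buf, size_t len)
{
    if (is_always_text(name))
        return 0;
    return memchr(buf, 0, len) != NULL || memchr(buf, 26, len) != NULL;
}
```

With this arrangement a Perl file containing \0 is left alone, while any
other file containing \0 is still flagged as binary.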