[darcs-users] Debugging issue1739-escape-multibyte-chars-correctly.sh on tn23

naur at post11.tele.dk naur at post11.tele.dk
Sun Apr 4 13:45:51 UTC 2010


The earlier investigation indicated that darcs uses "the console's encoding" to decode the contents of files that it reads with readLocaleFile, such as the file given with the --logfile option. To me, this is a debatable choice.
I seem to recall a fairly recent discussion, related to darcs or perhaps GHC, that raised questions like this, so perhaps there is some agreement that I am simply unaware of. If so, I would be grateful for a reference to such a conclusion. Otherwise, I would like to hear some answers to the question: Is decoding the contents of files using "the console's encoding" really suitable? Or should some other mechanism be used, perhaps controlled by settings, options or parameters?

----- Original message -----
> From: Reinier Lamers <tux_rocker at reinier.de>
> To: darcs-users at darcs.net
> Date: Sat, 03 Apr 2010 14:19
> Subject: Re: [darcs-users] Debugging issue1739-escape-multibyte-chars-correctly.sh on tn23
> ...
> It looks like there's actually a bug in the script so that it depends on the
> locale (i.e., it works with LC_ALL=da_DK.UTF-8 but fails with LC_ALL='').
> Perhaps the script should try to detect what the locale encoding is and bomb
> out if it's not UTF-8 or if it can't be detected.
> Reinier

The answer depends: if decoding the contents of a file using "the console's encoding" is considered proper, then, yes, a script that critically depends on exactly how the contents of a file are decoded needs to ensure that "the console's encoding" is set properly or, alternatively, fail. But if decoding the contents of a file using "the console's encoding" is considered improper, the field is more open.
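
For the first case, the check Reinier suggests could be sketched roughly as follows. This is only a sketch under assumptions: it presumes a POSIX shell and that the `locale` utility is available on the machine running the test, and the helper names is_utf8_charmap and require_utf8_locale are hypothetical, not taken from the darcs test suite.

```shell
#!/bin/sh
# Hypothetical helpers; neither name comes from the darcs test suite.

# Succeed if the given charmap name denotes UTF-8.
# The spelling (case, hyphen) can differ between platforms.
is_utf8_charmap() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeed only if the current locale's encoding can be detected and is UTF-8.
# An empty result (e.g. no `locale` utility) is treated as "cannot be detected".
require_utf8_locale() {
    charmap=$(locale charmap 2>/dev/null) || charmap=""
    if is_utf8_charmap "$charmap"; then
        return 0
    fi
    echo "locale encoding is '${charmap:-undetected}', not UTF-8" >&2
    return 1
}
```

A test script could then start with something like `require_utf8_locale || exit 1`. Which exit status the darcs test harness would treat as "skipped" rather than "failed" is not something this sketch assumes.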

As a practical matter, just to get the buildbot lights green, it would, of course, be easy to make the tn23 buildbot slave set LC_ALL=da_DK.UTF-8.

Best regards
