[darcs-users] Debugging issue1739-escape-multibyte-chars-correctly.sh on tn23
tux_rocker at reinier.de
Mon Apr 5 19:59:07 UTC 2010
On Sunday 04 April 2010 at 15:45, you wrote:
> The earlier investigation indicated that darcs decodes the contents of files
> that it reads with readLocaleFile, such as the file read when specifying the
> --logfile option, using "the console's encoding". To me, this is a debatable
At least on Linux, "the console's encoding" is the locale encoding, which is
configured by environment variables and data files and implemented by libc. It
is also used by the C library in its multibyte character string functions.
"less", for instance, also uses it, and indicates an error when it fails to
decode its input. It's also the encoding in which your text editor will save
your new files if you haven't specified an encoding.
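
To make the point concrete, here is a minimal sketch in Python (not darcs's
Haskell code) of what "decode with the locale encoding" amounts to. The
function names are made up for illustration; only the standard-library calls
are real.

```python
import locale


def locale_encoding() -> str:
    """Return the name of the encoding implied by the user's locale settings."""
    try:
        # Adopt the LANG / LC_* settings from the environment.
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # The environment names a locale that is not installed; keep the current one.
        pass
    # Passing False asks for the encoding without touching the locale again.
    return locale.getpreferredencoding(False)


def decode_with(data: bytes, encoding: str):
    """Decode data with the given encoding, returning None if the bytes are invalid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    enc = locale_encoding()
    # The byte 0xE9 is "é" in ISO-8859-15 but is not valid UTF-8 on its own,
    # so the result depends on the locale the program runs under.
    print(enc, decode_with(b"\xe9", enc))
```

The same bytes decode to different text, or fail to decode at all, depending
on the locale, which is why a test that feeds UTF-8 bytes to darcs can only
be expected to pass under a UTF-8 locale.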
Maybe on Windows, it is not as clear that the console's encoding is also the
encoding we should assume for files of which we don't know the encoding. But
then I think it's up to a Windows user to submit a patch to do something more
I'm now going to submit a patch that makes issue1739 skip on non-UTF-8
locales. This should be safe because darcs's behavior on non-UTF-8 locales is
tested in the utf8.sh test (it will skip on pretty much all stock Linux
systems because it requires an ISO-8859-15 locale to be available, but at least
it runs and still passes on my laptop :)).
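
The skip decision itself is simple. The sketch below shows the idea in
Python; it is not the patch, and the real test is a shell script. The name
is_utf8_codeset is hypothetical, and the codeset strings are the kind of
values a locale reports.

```python
import codecs


def is_utf8_codeset(codeset: str) -> bool:
    """Return True if the codeset name refers to UTF-8, whatever its spelling."""
    try:
        # codecs.lookup normalises aliases such as "UTF-8", "utf8" and "U8".
        return codecs.lookup(codeset).name == "utf-8"
    except LookupError:
        # An encoding name Python does not know is treated as "not UTF-8".
        return False


if __name__ == "__main__":
    for name in ("UTF-8", "utf8", "ISO-8859-15", "ANSI_X3.4-1968"):
        action = "run" if is_utf8_codeset(name) else "skip"
        print(f"{name}: {action} issue1739")
```

Comparing normalised names rather than raw strings avoids skipping the test
just because a system spells the codeset "utf8" instead of "UTF-8".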