[darcs-users] encoding problems in darcs 2.5

Wolfgang Jeltsch g9ks157k at acme.softbase.org
Thu Dec 16 22:37:49 UTC 2010


I had a discussion with kowey on IRC today about character encoding
issues in darcs 2.5. This resulted in the following bug report:

    <http://bugs.darcs.net/issue2018>

The local encoding is respected for input from the terminal but not for
output to the terminal. For each character, darcs outputs a single byte
that contains the low 8 bits of the character’s Unicode code point.
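
To illustrate the symptom, here is a minimal sketch that is not taken from
darcs. The helper name truncateToByte is made up for this example. It
mimics what happens when a character is reduced to the low 8 bits of its
code point, which is how GHC documents the char8 encoding used by
binary-mode handles (code point modulo 256).

```haskell
import Data.Bits ((.&.))
import Data.Char (chr, ord)

-- Hypothetical helper: keep only the low 8 bits of a character's code point.
truncateToByte :: Char -> Char
truncateToByte c = chr (ord c .&. 0xFF)

main :: IO ()
main = do
  -- U+0159 (r with caron), U+20AC (euro sign), U+00E4 (a with diaeresis)
  let sample = "\x0159\x20AC\x00E4"
  -- Print the resulting byte values as numbers, so the output of this
  -- example does not itself depend on the terminal encoding.
  print (map (ord . truncateToByte) sample)  -- prints [89,172,228]
```

So U+0159 comes out as byte 0x59 (an ASCII "Y"), and even a Latin-1
character such as U+00E4 is sent as the single byte 0xE4, which is not
valid on its own in a UTF-8 terminal.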

I’ve now found the root of this problem. In src/darcs.hs there is this line:

    forM_ [stdout, stdin, stderr] $ \h -> hSetBinaryMode h True

After commenting it out, the problem no longer occurred.

However, I’m not sure whether commenting this line out can cause new
problems. A darcs that is accessed via SSH might then send data to the
client incorrectly if it uses something like hPutStr on stdout.
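
One conceivable middle ground, purely as a sketch and not what darcs
currently does, would be to keep binary mode for handles that are not
attached to a terminal (for example a pipe to an SSH client) and to use
the locale encoding only for terminals. The helper name configureHandle
is hypothetical. Whether this is safe for all of darcs’ uses of stdout is
exactly the open question above.

```haskell
import Control.Monad (forM_)
import System.IO

-- Hypothetical helper, not part of darcs: choose the mode per handle.
configureHandle :: Handle -> IO ()
configureHandle h = do
  isTerm <- hIsTerminalDevice h
  if isTerm
    then hSetEncoding h localeEncoding  -- terminal: encode/decode per locale
    else hSetBinaryMode h True          -- pipe or file: raw bytes, as before

main :: IO ()
main = do
  forM_ [stdout, stdin, stderr] configureHandle
  -- With stdout on a terminal this goes through the locale encoding;
  -- when piped, each character is still truncated to one byte.
  putStrLn "Gr\252\223e"
```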

During debugging, I noticed something that seemed strange to me:
Printer.hPrintPrintable is apparently only called with values of type
Printable that represent a single-character string. So when outputting
to the terminal, hPrintPrintable is called once for every character.

There is another encoding issue. The file _darcs/prefs/author is written
in the local encoding, not UTF-8. Therefore, it isn’t portable. Changing
the locale’s character set or copying a repo to another machine that uses
a different character set will produce wrong author names.
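
As a sketch of what a locale-independent author file could look like,
the following writes and reads a file with a fixed UTF-8 encoding. The
helper names writeUtf8File and readUtf8File, the file name
author.example and the author string are made up for illustration. This
is not darcs code.

```haskell
import System.IO

-- Hypothetical helper: write a file as UTF-8 regardless of the locale.
writeUtf8File :: FilePath -> String -> IO ()
writeUtf8File path contents =
  withFile path WriteMode $ \h -> do
    hSetEncoding h utf8
    hPutStr h contents

-- Hypothetical helper: read a UTF-8 file regardless of the locale.
readUtf8File :: FilePath -> IO String
readUtf8File path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h utf8
    s <- hGetContents h
    -- Force the whole string before withFile closes the handle.
    length s `seq` return s

main :: IO ()
main = do
  let path   = "author.example"  -- stand-in for _darcs/prefs/author
      author = "J\252rgen Example <juergen@example.org>"
  writeUtf8File path author
  readBack <- readUtf8File path
  print (readBack == author)  -- prints True
```

Because the encoding is set on the handle explicitly, the bytes on disk
are the same under any locale, so the file survives a change of
character set or a copy to another machine.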

Best wishes,

