[darcs-users] encoding problems in darcs 2.5

Reinier Lamers tux_rocker at reinier.de
Fri Dec 31 14:32:25 UTC 2010


Hi,

(Petr: there is a question for you at the end)

On Friday 17 December 2010 11:04, Eric Kow wrote:
> Reinier, I think we may have found one general direction to look into
> for issue1693.  I've updated the ticket to point to this message.
> 
> On Thu, Dec 16, 2010 at 23:37:49 +0100, Wolfgang Jeltsch wrote:
> > Now, I’ve found the root of this problem. In src/darcs.hs there is this
> > line:
> > 
> >     forM_ [stdout, stdin, stderr] $ \h -> hSetBinaryMode h True
> 
> Ah ha! This reminds me of this IO audit I did in May
>   http://lists.osuosl.org/pipermail/darcs-users/2010-May/024023.html
> 
> Looks like my assumption yesterday was wrong when I said
> 
>   when it prints String, it uses hPutStr h
>   ...
>   I *assume* hPutStr stdout uses the locale...
> 
> How embarrassing, should have been more awake.
> 
> > However, I’m not sure whether commenting this line out can cause new
> > problems. Maybe, a darcs that is accessed via SSH could now send data to
> > the client in a wrong way if it uses something like hPutStr on stdout.
> 
> Hmm, so the client end at least still sets hSetBinaryMode for its IO and for
> server end, darcs transfer-mode seems to output bytestrings only.  But this
> could use a bit more research.
> 
> Since I know what I'm looking for, I know that I can search through the
> history using
> 
>    darcs changes --match 'hunk hSetBinaryMode' --summary 
> 
> I learned that the hSetBinaryMode was introduced actually very early in
> Darcs' code (not during 2.4.4; that was just for calling externals like
> ssh)

On http://bugs.darcs.net/issue1693, Petr says "You really need to use binary 
handles for everything", warning that not doing so will result in crashes with 
GHC 6.12.

This means that the only way to make non-ASCII output work is by converting 
all text we output to byte streams *ourselves*. But that's seriously ugly. 
Every time someone just writes:

putStrLn ("Processing patch " ++ pi_name p)

the reviewer has to tell him that he can't do that and that it should be:

putStrLn ("Processing patch " ++ encodeLocale (pi_name p))

Yuck. I actually gave the encodeLocale and decodeLocale functions pure types, 
but they aren't really pure of course, because their result depends on the 
locale the process runs in. If I wanted to correct that, we'd even get:

patch_name_locale <- encodeLocale (pi_name p)
putStrLn ("Processing patch " ++ patch_name_locale)

Double yuck. So I sort of ignored the problem and left the issue open.
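
To illustrate the trade-off, here is a minimal sketch (not the actual darcs 
code) of how the encoding step could be hidden behind a single wrapper, so 
that call sites stay as short as a plain putStrLn. It assumes a recent GHC 
whose base library provides GHC.Foreign.withCStringLen, and that the handle 
is in binary mode as in src/darcs.hs. The names encodeWith, putStrLnLocale 
and demo are hypothetical and do not exist in darcs.

```haskell
import Data.Char (chr)
import Data.Word (Word8)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (castPtr)
import qualified GHC.Foreign as GF
import GHC.IO.Encoding (TextEncoding, getLocaleEncoding, utf8)
import System.IO (hPutStrLn, hSetBinaryMode, stdout)

-- | Hypothetical helper: encode a String with the given encoding and return
-- the resulting bytes as a String in which every Char is below 256. Such a
-- String passes through a binary-mode handle unchanged, one byte per Char.
encodeWith :: TextEncoding -> String -> IO String
encodeWith enc s =
  GF.withCStringLen enc s $ \(ptr, len) -> do
    -- Read the encoded buffer back as raw bytes.
    bytes <- peekArray len (castPtr ptr) :: IO [Word8]
    return (map (chr . fromIntegral) bytes)

-- | Hypothetical wrapper: look up the locale encoding (an IO action, which
-- is why this cannot honestly have a pure type) and write the encoded line.
putStrLnLocale :: String -> IO ()
putStrLnLocale s = do
  enc <- getLocaleEncoding
  encoded <- encodeWith enc s
  hPutStrLn stdout encoded

-- | Small demonstration, assuming the locale can represent the text.
demo :: IO ()
demo = do
  hSetBinaryMode stdout True
  putStrLnLocale "Processing patch caf\233"
  -- With a fixed encoding the result is predictable: U+00E9 becomes the
  -- two UTF-8 bytes 0xC3 0xA9.
  fixed <- encodeWith utf8 "\233"
  print (map fromEnum fixed)
```

With something like this, a reviewer would only have to insist on 
putStrLnLocale instead of putStrLn. The cost is that every output function 
in use would need such a wrapper, and in a locale that cannot represent a 
character the encoder may raise an exception that a real implementation 
would have to handle.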

A bit of Googling for "ghc", "6.12", "non-ascii" and "special characters" in 
various arrangements does not turn up any bug reports, however. Petr, perhaps 
you can point us to the GHC bugs that force us to set binary mode on standard 
in, out and err?

Regards,
Reinier