[darcs-devel] [issue2389] non-Latin (e.g. Cyrillic) letters not printed in the output of whatsnew

Eric Kow bugs at darcs.net
Mon May 19 15:16:39 UTC 2014


Eric Kow <kowey at darcs.net> added the comment:

I notice there are a couple of other (otherwise seemingly unrelated) 
encoding-related issues on the trackers (regarding patch names and 
filenames).

Thanks to Stephen for noticing that the sequence of 3 Unicode code points 
actually corresponds to what would be a single char encoded in UTF-8 (3 
octets for that char).

I don't have a full diagnosis myself, but I'll note that Darcs mostly 
treats text files as bytestrings (with the exception that it assumes some 
sort of 8-bit encoding when looking for the "\n" char).  So it's not 
entirely surprising that you see individual bytes in the output 
representation.
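
As a rough illustration of the byte-oriented view described above, here is a small Python sketch. It is not Darcs code: the sample contents and the helper name `split_lines` are made up for this example, and the character U+20AC was picked only because it happens to take 3 octets in UTF-8.

```python
# Hypothetical file contents: two lines, the first ending in one
# non-ASCII character (U+20AC, which UTF-8 encodes as 3 octets).
contents = "price: \u20ac\nend\n".encode("utf-8")


def split_lines(data):
    """Split on the byte 0x0A without decoding anything (illustrative helper)."""
    return data.split(b"\n")


lines = split_lines(contents)

# The lines are still bytes, so the single character shows up as
# three separate octets.
print(lines[0])               # b'price: \xe2\x82\xac'
print(list(lines[0][-3:]))    # [226, 130, 172]
```

Splitting on the byte 0x0A is safe for UTF-8 input, since that byte value never occurs inside a multi-byte sequence; the trouble only starts when the resulting bytes are turned into text for display.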

Of course, it's quite wrong on Darcs' part to be treating these as 
individual Unicode code points, so something isn't quite going right on 
the way from its internal representation of the file contents (bytes) to 
the display on the screen (to text and back to bytes again).
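
One way such a bytes-to-text-to-bytes round trip could go wrong is sketched below in Python. This is only an assumption about the failure mode, not a diagnosis of what Darcs actually does; the sample character and all variable names are hypothetical.

```python
# One character that takes 3 octets in UTF-8 (chosen only for illustration).
original = "\u20ac"
raw = original.encode("utf-8")            # internal representation: bytes

# Assumed failure mode: each octet becomes its own code point
# (equivalent to decoding the bytes as Latin-1) ...
as_text = raw.decode("latin-1")
# ... and that text is then encoded again for display.
displayed_wrong = as_text.encode("utf-8")

# Decoding the octets as UTF-8 instead keeps the character intact.
displayed_right = raw.decode("utf-8").encode("utf-8")

print(len(raw), len(as_text))             # 3 3
print(displayed_wrong == raw)             # False
print(displayed_right == raw)             # True
```

In the wrong path, one character turns into a sequence of 3 code points, which matches the symptom Stephen pointed out; in the right path the bytes written to the screen are the same as the bytes in the file.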

----------
priority:  -> bug
status: unknown -> needs-diagnosis/design
title: non-Latin (e.g. Cyrillic) letters not printed in the	output of whatsnew -> non-Latin (e.g. Cyrillic) letters not printed in the output of whatsnew
topic:  -Darcs2, ProbablyEasy

__________________________________
Darcs bug tracker <bugs at darcs.net>
<http://bugs.darcs.net/issue2389>
__________________________________

