[darcs-devel] [issue2389] non-Latin (e.g. Cyrillic) letters not printed in the output of whatsnew
Eric Kow
bugs at darcs.net
Mon May 19 15:16:39 UTC 2014
Eric Kow <kowey at darcs.net> added the comment:
I notice there are a couple of other (otherwise seemingly unrelated)
encoding-related issues on the tracker (regarding patch names and
filenames).
Thanks to Stephen for noticing that the sequence of 3 Unicode code points
actually corresponds to what would be a single char encoded in UTF-8 (3
octets for that char).
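
To make Stephen's observation concrete, here is a minimal Python sketch of how one character can show up as three code points. The character "€" (U+20AC) is only a stand-in for any character whose UTF-8 encoding is three octets; it is not the character from the report, and the Latin-1 decode is an assumption about the mechanism, not a confirmed diagnosis.

```python
# Illustrative only: U+20AC stands in for any 3-octet UTF-8 character.
ch = "\u20ac"

# One character, three octets in UTF-8.
octets = ch.encode("utf-8")

# Treating every octet as its own code point (which is what a Latin-1
# decode does) turns the single character into three code points.
per_byte = octets.decode("latin-1")
code_points = [f"U+{ord(c):04X}" for c in per_byte]

print(len(ch), len(octets), code_points)
```

Reversing the step (encode as Latin-1, decode as UTF-8) recovers the original character, which is why the three code points can be recognised as one UTF-8 char.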
I don't have a full diagnosis myself, but I'll note that Darcs mostly
treats text files as bytestrings (with the exception that it assumes some
sort of 8-bit encoding when looking for the "\n" char). So it's not
entirely surprising that you see individual bytes in the output
representation.
Of course, it's quite wrong on Darcs' part to be treating these as
individual Unicode code points, so something isn't quite going right on
the way from its internal representation of the file contents (bytes) to
the display on the screen (bytes to text and back to bytes again).
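
As a rough illustration of that bytes-to-text-to-bytes path, the Python sketch below contrasts a correct conversion with one that maps each byte to its own code point. The function `display_bytes`, its `per_byte` flag, and the assumption of a UTF-8 terminal are all hypothetical; this is not Darcs code and not a confirmed account of what Darcs does.

```python
def display_bytes(content: bytes, per_byte: bool) -> bytes:
    """Hypothetical display path: file bytes -> text -> terminal bytes."""
    # per_byte=True models the suspected fault: each byte becomes one
    # code point instead of the bytes being decoded as UTF-8.
    text = content.decode("latin-1" if per_byte else "utf-8")
    # Assumption: the terminal expects UTF-8 for the text -> bytes step.
    return text.encode("utf-8")


# File contents as stored on disk (UTF-8 encoded Cyrillic plus a newline).
line = "привет\n".encode("utf-8")

good = display_bytes(line, per_byte=False)  # round-trips unchanged
bad = display_bytes(line, per_byte=True)    # every non-ASCII byte is re-encoded

print(good == line, len(line), len(bad))
```

In the faulty path the output is still valid UTF-8, but it no longer spells the original letters: each non-ASCII byte has been re-encoded as a separate two-byte character.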
----------
priority: -> bug
status: unknown -> needs-diagnosis/design
title: non-Latin (e.g. Cyrillic) letters not printed in the output of whatsnew -> non-Latin (e.g. Cyrillic) letters not printed in the output of whatsnew
topic: -Darcs2, ProbablyEasy
__________________________________
Darcs bug tracker <bugs at darcs.net>
<http://bugs.darcs.net/issue2389>
__________________________________