[darcs-users] [darcs-devel] [patch639] Use utf-8 charset for darcs send in case of non-ascii ...

Stephen J. Turnbull stephen at xemacs.org
Mon Jul 11 03:19:11 UTC 2011


Juliusz Chroboczek writes:
 > > The big problem that you face is short sequences of extended Shift
 > > JIS, Big 5, and Windows-125x that are mostly ASCII.  That sounds a lot
 > > like a typical email message with correctly spelled name and/or .sig
 > > to me.
 > 
 > Please exhibit a sentence in a natural language encoded in one of these
 > encodings that decodes as proper UTF-8, or forever keep your peace.

That may not be possible.  The examples I've seen involved multiple
languages, English + something else.  I don't have one to hand, and
don't have time to look for or generate one.  They're rare -- except
from the point of view of a person who happens to have such a name or
use such a .sig.

In any case, I'm not asking that non-UTF-8 encodings be decoded *at
all* (that's Eric's suggestion, and code is already available in Darcs,
it would seem), and certainly not that Darcs try to detect non-UTF-8
encodings that masquerade as UTF-8.  I ask only that when something
does not decode as proper UTF-8 (including the "uniquely encoded with
the minimum number of octets" condition), the user be warned.
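
To make the request concrete, here is a minimal sketch of such a check.
It is Python for illustration only, not Darcs code, and the function
name utf8_warning is hypothetical.  It relies on Python's strict UTF-8
decoder, which rejects overlong (non-shortest-form) sequences as well
as stray lead or continuation bytes.

```python
from typing import Optional


def utf8_warning(data: bytes) -> Optional[str]:
    """Return a warning message if data is not well-formed UTF-8, else None.

    Hypothetical helper: it only validates and reports. It does not try
    to guess or decode any other encoding.
    """
    try:
        # Strict decoding fails on any ill-formed sequence, including
        # overlong forms such as C0 AF (a two-octet encoding of "/").
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as err:
        bad = data[err.start]
        return (
            f"not valid UTF-8: byte 0x{bad:02x} at offset {err.start} "
            f"({err.reason})"
        )
    return None


if __name__ == "__main__":
    samples = [
        "Juliusz Chroboczek".encode("utf-8"),  # plain ASCII
        "na\u00efve".encode("utf-8"),          # well-formed UTF-8
        "na\u00efve".encode("latin-1"),        # Latin-1 bytes, not UTF-8
        b"\xc0\xaf",                           # overlong encoding of "/"
    ]
    for raw in samples:
        message = utf8_warning(raw)
        print(raw, "->", message if message else "ok")
```

The caller decides what to do with the message; the point is only that
ill-formed input is reported rather than passed along silently.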

Anything less *is* "punishing the users," and will hurt Darcs.


