[darcs-users] [patch37] Store textual patch metadata encoded in UTF-8

Eric Kow kowey at darcs.net
Mon Nov 2 17:51:21 UTC 2009


Hi everybody,

For more context behind this patch, please see
  http://bugs.darcs.net/issue64

On Sun, Nov 01, 2009 at 17:55:50 +0000, Reinier Lamers wrote:
> Here is the patch that makes darcs store patch name, author and log as UTF-8 encoded strings.
>
> By itself, this has no visible advantage. However, it makes it possible to:
> 
> * display non-ASCII patch and author names consistently in different locales
> * produce closer-to-valid XML if the --xml-output option is given
> * use --match with non-ASCII patch and author names on patches that were
>   recorded in another locale
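
As a rough illustration of the points quoted above, here is a small Python
sketch. It is not darcs code and is not taken from the patch; the author
name and the `matches` helper are hypothetical. It shows that the same text
becomes different bytes under Latin-1 and UTF-8, so a byte-wise match across
locales only works when the metadata is stored in one agreed encoding.

```python
# Illustrative only: hypothetical metadata, not darcs's actual storage format.
author = "J\u00fcrgen M\u00fcller"  # hypothetical author name containing u-umlaut

# Bytes as they might be written by clients running in two different locales.
stored_latin1 = author.encode("latin-1")
stored_utf8 = author.encode("utf-8")


def matches(query: str, stored: bytes, query_encoding: str) -> bool:
    """Byte-wise substring match; the query is encoded in the user's locale."""
    return query.encode(query_encoding) in stored


# A user in a UTF-8 locale searching metadata recorded under Latin-1 misses it.
cross_locale = matches("M\u00fcller", stored_latin1, "utf-8")

# If metadata is always stored as UTF-8, the same query finds the author.
same_encoding = matches("M\u00fcller", stored_utf8, "utf-8")

# Latin-1 bytes are also not valid UTF-8 here, so decoding them as UTF-8 fails.
try:
    stored_latin1.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False

print(cross_locale, same_encoding, latin1_is_valid_utf8)
```

The sketch only covers the matching and display side; how existing
repositories with locale-encoded metadata should be read is a separate
question that it does not address.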
 
To be honest, I'm a little bit scared of this patch, in the sense that I'm not
100% sure what to do with it.

I agree in principle that this is the right thing to do, and the fact that it
comes from somebody who takes an almost perverse pleasure in fixing weird
Unicode problems is reassuring.

Anyway, here's my plan on how to go forward.  I think two or three layers of
review may be appropriate here.  Reinier, please shout if you think I'm
overreacting or if this is not the right time to be reaching for a guru.

1. The base review: what does the patch bundle say it does;
   does it do what it says?
   
2. The Unicode guru review: are the things the patch bundle
   says it does the right things to do?  It sounds like this is not just
   a matter of understanding the specs, but having real life experience
   and flamewar scorch marks to go along with it.

3. The overall user experience, wisdom and foresight review (probably
   the same as #2).  Given how darcs users actually use computers in the
   real world, do all the things we do here make sense in practice?

I don't have a good grip on the distinction between 2 and 3, but I feel
like there is some sort of distinction.

Perhaps a good example of 2/3 would be the kind of stuff Eric Sink talks
about in <http://www.ericsink.com/entries/quirky.html>.

As for the Unicode guru, I can think of some candidates.  Trent has
mentioned John Cowan.  There's also Juliusz Chroboczek, who apparently
has quite a bit of experience with these issues, but whom I believe to
be busy.  Another candidate, equally likely to be busy, would be Bryan
O'Sullivan, who besides working on the Haskell text library has
experience from the Mercurial project (I think).

I think it may be good to get in touch with them to see if, in principle,
they'd be available to advise us.  I've BCC'ed them on this message,
but will try to contact them personally too.

Meanwhile, any volunteers for the base review, while we continue our
search for a Unicode guru?  Part of the job would be to percolate up
concrete questions for said guru to address so that they don't have to
wade through a lot of boring stuff.

Thanks!

-- 
Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
PGP Key ID: 08AC04F9