[darcs-users] blue color bug

Peter "Firefly" Lund firefly at diku.dk
Mon Jul 5 09:58:18 UTC 2004


On Mon, 5 Jul 2004, Alex Shinn wrote:

> This is a very good idea to determine whether or not the data is text,
> but if it is you still need to handle text encoding so it's more of a
> pre-step to the other options.

Which is not what we want.  We have _darcs/prefs/binaries for that.

> > 7. Try to determine if it's text in the current locale.  If it's not,
> > treat it as binary.
> >
> > If the user is working with a repository in UTF-8 while living in an
> > ISO-2022-JP locale, he's got other problems.

Some editors handle things like this just fine, so no, the locale doesn't
have to match the actual encoding/charset used for text files in a darcs
repository.

> Projects will very frequently have mixed encodings, especially if they
> are internationalized (e.g. generally every .po file will be in a
> different encoding that has nothing to do with the user's locale).

Mostly due to small mistakes and/or an incomplete transition to UTF-8, I
gather...

> Also right now many projects will be in a transitioning phase from a
> native encoding to UTF-8, or have any other number of valid reasons
> for working with multiple encodings.

...but, yes, mixed encodings happen.

It sounds more and more like the right thing to do right now is to print
all bytes outside of [32..126] in an escaped form.
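
A minimal sketch of that escaping rule, in Python only for illustration.
The function name and the \xNN escape syntax are assumptions; nothing here
is taken from the darcs source, and the coloring of the escapes is left out.

```python
def escape_bytes(data: bytes) -> str:
    """Render bytes as text, escaping everything outside [32..126]."""
    out = []
    for b in data:
        if 32 <= b <= 126:
            # Printable 7-bit ASCII passes through unchanged.
            out.append(chr(b))
        else:
            # Control characters and all 8-bit bytes become \xNN.
            # Note: a literal backslash (92) is printable and is NOT escaped,
            # so without coloring the output is ambiguous.
            out.append(f"\\x{b:02x}")
    return "".join(out)


if __name__ == "__main__":
    print(escape_bytes(b"caf\xc3\xa9\tau lait"))
```

Since the result is plain 7-bit ASCII, it displays the same way regardless
of the user's locale or the encoding of the file.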

Later on (post 1.0) a single setting for the encoding/charset will
probably be good enough.  If iconv reports a conversion error we can
always fall back to the blue hexescapes locally.
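
Roughly what that could look like, again as a Python sketch rather than
anything darcs does today. Python's codec machinery stands in for iconv
here, and the `encoding` parameter is a hypothetical stand-in for the single
setting mentioned above.

```python
def render_line(data: bytes, encoding: str = "utf-8") -> str:
    """Decode with the configured encoding, escaping only what fails."""
    # Bytes that decode cleanly come through as text. Each byte that the
    # codec rejects is replaced by a \xNN escape at that position, so one
    # bad byte does not force the whole line into escaped form.
    return data.decode(encoding, errors="backslashreplace")


if __name__ == "__main__":
    # b"\xe9" is valid Latin-1 but not valid UTF-8.
    print(render_line(b"caf\xe9", "latin-1"))
    print(render_line(b"caf\xe9", "utf-8"))
```

The fallback is local in the sense that only the offending bytes are
escaped; the rest of the line is still shown as decoded text.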

Btw., printing the hunks is not the only thing that needs to know how to
represent bytes in text files that are outside the printable 7-bit ASCII
charset -- the web script(s) need(s) it too.

-Peter



