[darcs-users] [patch167] Reintroduce UTF-8 tagging (and 3 more)

Eric Kow kowey at darcs.net
Fri Mar 26 14:38:56 UTC 2010


On Thu, Mar 25, 2010 at 15:58:44 +0100, Petr Rockai wrote:
> > Just plain correctness. If someone once recorded in latin1 a patch
> > with name "fix encoding handling so we never see garbage like 'é'",
> > he will see "fix encoding handling so we never see garbage like 'é'"
> > if he views the patch with newer darcs.

> But that's not true -- this is a problem that is not solved by
> ignore-this. You still have to guess an encoding to use for patches that
> don't have the ignore-this utf8 tag.  There are always going to be cases where
> this fails, no matter how you choose the encoding, so it doesn't make
> sense to talk about correctness. We can only think of more or less
> successful heuristics. (In your case, if the user has a utf8 locale,
> they still get é in that patch name.)

Do we all agree on the following two points?

- We cannot guarantee correctness in case of untagged patches.
- But we can guarantee it on tagged patches.

I think you (Petr) are saying that this guarantee is actually not very
important in practice.

For the next couple of years or so (old software dies out slowly),
there will be an overlap period in which we will not know whether patches
were created with new Darcs or with old Darcs.  During this transition
period, would it not be desirable to at least guarantee correctness for
the patches that are created with a new Darcs?
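
To make the tagged/untagged distinction concrete, here is a rough sketch
(in Python, purely for illustration; this is not Darcs code).  The
function name, the tagged_utf8 flag and the latin-1 fallback are all
assumptions of mine, not anything in the patch under review:

```python
def decode_metadata(raw: bytes, tagged_utf8: bool, fallback: str = "latin-1") -> str:
    """Decode patch metadata bytes (hypothetical helper, not Darcs API)."""
    if tagged_utf8:
        # Tagged patch: the encoding is known, so decoding is exact.
        return raw.decode("utf-8")
    # Untagged patch: we can only guess.  Try UTF-8 first, then fall
    # back to an assumed legacy encoding if the bytes are not valid UTF-8.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


# Tagged UTF-8 metadata always round-trips.
print(decode_metadata("é".encode("utf-8"), tagged_utf8=True))

# Untagged latin-1 bytes that are invalid UTF-8: the fallback recovers them.
print(decode_metadata("é".encode("latin-1"), tagged_utf8=False))

# Untagged latin-1 bytes that happen to be valid UTF-8: the guess is wrong.
# The author typed the two characters "Ã©", but we silently read them as "é".
print(decode_metadata("Ã©".encode("latin-1"), tagged_utf8=False))
```

The last call is the "happily decode as utf8 coincidentally" case: no
heuristic can tell the two readings apart, which is why only the tagged
branch can offer a guarantee.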

> Since many distributions are defaulting to utf8 for a while now, we can
> expect most repositories to come with utf8 patches.  Overall, I think
> that the ignore-this tagging only benefits people with non-utf8
> repositories on matching non-utf8 locales (*and* only in the case that
> their patches happily decode as utf8 coincidentally), and such people
> will only become rarer.

So I can understand that an argument based on rarity would apply
to questions of effort-allocation and prioritisation (eg. we don't
prioritise automated nightly builds because we think the intersection of
people who want bleeding edge Darcs but are unwilling to build it is too
small).  But I think it's better to avoid applying it when it comes to
questions of trying to get the right behaviour.

I agree that the Ignore-this tagging is ugly :-(.  I think we're in
arguing-about-tradeoffs territory here.  If it's not too early for me to
put in my vote, I'll fly the conservatism flag and say that we should pay
this price of widespread ugliness to gain that marginal improvement in
correctness.  I say this (i) because we seem to be in one of those cases
where when Darcs does the wrong thing, you wouldn't really know about it
because it kinda looks right and (ii) because while we have reasons to
make educated guesses about our users [Petr's guess is a very good one],
I don't think we actually know that much about them.

But if I could back off a little bit from my own conservatism, I'd say
that what is at stake here is "only" the patch metadata (not the patches
themselves), so it probably does not matter that much.

-- 
Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
PGP Key ID: 08AC04F9