[darcs-users] [patch37] Store textual patch metadata encoded in UTF-8

Judah Jacobson judah.jacobson at gmail.com
Wed Nov 11 04:45:33 UTC 2009


A handful of comments:

On Tue, Nov 10, 2009 at 10:12 AM, Eric Kow <kowey at darcs.net> wrote:
> On Sun, Nov 01, 2009 at 17:55:50 +0000, Reinier Lamers wrote:
>> The patches make darcs treat old-style patches as locale-encoded.
>
> Move locale conversion out of IO monad
> --------------------------------------
>>  This is motivated by the circumstance that GHC 6.12 will offer locale
>>  conversion with a pure type.
>

I'm not sure about that; the ghc-6.12 Unicode conversion support
(currently) only takes place for file reading and writing, and thus
always implicitly uses the IO monad.

> Uh, OK.  I'll trust the GHC folks on that one and don't have the imagination
> to see why that would be a bad thing.  I guess that makes things nicer to read.

Yeah, it seems slightly hacky to me but the gain in read/writeability
probably makes it worth it, as long as the choice by the Darcs project
is explicitly noted somewhere.

>> +-- unsafePerformIO in the locale function is ratified by the fact that GHC 6.12
>> +-- and above also supply locale conversion with functions with a pure type.
>> +decodeLocale :: B.ByteString -> String
>> +decodeLocale = unsafePerformIO . runInputT defaultSettings . decode
>
>>  -- | Encode a String to a ByteString according to the current locale
>> -encodeLocale :: String -> IO B.ByteString
>> -encodeLocale = runInputT defaultSettings . encode
>> +encodeLocale :: String -> B.ByteString
>> +encodeLocale = unsafePerformIO . runInputT defaultSettings . encode
>

I would caution against this code.  Each call to runInputT initializes
the entire Haskeline runtime, so I'd expect it to be somewhat
inefficient if you're using it to convert the metadata for a bunch of
saved patches.  (Though I haven't actually done any timing tests
myself.)

In general, it seems much better to use a package like iconv or
text-icu for the conversion instead of piggy-backing off of Haskeline.
I'm also curious about the problem that you alluded to with text-icu
on Linux distributions...

>> -askUser :: String -> IO String
>> +-- | Ask the user for a line of input.
>> +askUser :: String    -- ^ The prompt to display
>> +        -> IO String -- ^ The string the user entered.
>>  askUser prompt = withoutProgress $ runInputT defaultSettings $
>>                      getInputLine prompt
>>                          >>= maybe (error "askUser: unexpected end of input") return
>> hunk ./src/Darcs/Utils.hs 159
>> -            -- Return the input as encoded, 8-bit Chars (same as the
>> -            -- non-Haskeline backend).
>> -                        >>= fmap B.unpack . encode
>
> TODO: think about implications of this change.
> Will this cause any problems for http://bugs.darcs.net/issue1292 ?

That change looks good to me.  The call to encode was always intended
as a temporary workaround until Darcs properly supported Unicode
metadata.

-Judah


More information about the darcs-users mailing list