[darcs-devel] [patch1832] do white space en/decoding directly on the ByteStrings
Ganesh Sittampalam
bugs at darcs.net
Sat Jul 27 19:44:12 UTC 2019
Ganesh Sittampalam <ganesh at earth.li> added the comment:
Partial review of what I've looked at so far:
> Author: Ben Franksen <ben.franksen at online.de>
> Date: Sun Feb 17 21:50:01 CET 2019
> * do white space en/decoding directly on the ByteStrings
encodeWhitePS/decodeWhitePS seem like obvious candidates for
QuickCheck testing against encodeWhite/decodeWhite, particularly if
we want to optimise them further.
> hunk ./src/Darcs/Util/Path.hs 157
> +encodeWhitePS :: B.ByteString -> B.ByteString
> +encodeWhitePS x =
> + case B.findIndex (\c -> isSpace (w2c c) || c == backslash) x of
> + Nothing -> x
> + Just i -> B.concat
> + [ B.take i x
> + , B.singleton backslash
> + , BC.pack (show (B.index x i))
> + , B.singleton backslash
> + , encodeWhitePS (B.drop (i+1) x)
> + ]
> + where
> + backslash = 0x5C
> +
> hunk ./src/Darcs/Util/Path.hs 604
> -encodeWhiteName = encodeLocale . encodeWhite . decodeLocale .
unName
> +encodeWhiteName = encodeWhitePS . unName
[existing definition of encodeWhite]
> encodeWhite :: FilePath -> String
> encodeWhite (c:cs) | isSpace c || c == '\\' =
> '\\' : show (ord c) ++ "\\" ++ encodeWhite cs
> encodeWhite (c:cs) = c : encodeWhite cs
> encodeWhite [] = []
Is the new code actually equivalent to the old code? In the old case
we properly translate from locale code into Unicode characters and
then look for any Unicode space. In the new code we just look for
any byte that looks like an ASCII space in the encoded sequence of
bytes.
The repeated concats in this code strike me as quite inefficient,
I'd have thought it'd at least make sense to make a single list
[B.ByteString] then concatenate at the end.
----------
status: needs-review -> review-in-progress
__________________________________
Darcs bug tracker <bugs at darcs.net>
<http://bugs.darcs.net/patch1832>
__________________________________
More information about the darcs-devel
mailing list