[darcs-users] darcs patch: Use index-based diffing in Record. (and 32 more)

Petr Rockai me at mornfall.net
Sat Jul 25 10:05:40 UTC 2009


Hi,

Port the replay (check/repair) functionality to hashed-storage.
---------------------------------------------------------------
>  This removes a few unsafeDiff users. It also simplifies the replay code by not
>  threading the Slurpy all over the place (instead placing applyAndFix in the
>  TreeIO monad). There is a slight risk of regressions (and a moderate risk of
>  space leaks).

This patch, with the latest hashed-storage (as yet unpublished, but I'll fix
that in due time), does not introduce any space leaks (hooray!). It does,
however, introduce a performance regression that is not insignificant. I am
investigating what is going on. It seems that a fair chunk of time goes into
filepath manipulation, since hashed-storage uses a different data type for
paths: floatPath alone accounts for over 90% of the extra ticks in the new
profile compared to the old (186 of the 193 extra ticks).

              darcs-2.3   darcs-hs
----------------------------------
floatPath             0        186
total               775        968
----------------------------------
difference          775        782

Now the question is how to resolve this. The same problem is going to pop up
in darcs pull (although pull does not currently use the buffering
optimisations that check/repair do, so it will get those for free with
hashed-storage). The whole issue is a little tricky.

I have tried to change the internal representation of FileName to use
AnchoredPath, but this fails all the tests, since there are a number of
subtleties in the FileName implementation (I mean, it's completely
undocumented, sort-of-spaghetti code). Oh, and encode/decode_white have
incorrect haddocks (see
http://repos.mornfall.net/hashed-storage/dist/doc/html/hashed-storage/Storage-Hashed-Darcs.html#v%253AdarcsDecodeWhite
for a correct version).

Moreover, because I have to call encode/decode_white, I end up doing more
unpacking/repacking than with the String-based FileName anyway. The right
thing to do is probably to re-implement encode/decode_white in terms of
bytestrings and try it that way (so we would avoid a lot of un/packing).
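
To make the idea concrete, here is a minimal sketch of what a
ByteString-based pair could look like. It is not darcs or hashed-storage
code: the names encodeWhiteBS and decodeWhiteBS are hypothetical, and the
escape format (whitespace and backslash written as a backslash, the decimal
character code, and another backslash) is an assumption that should be
checked against the haddock linked above. The decoder returns Maybe rather
than calling error on malformed input, which is a design choice of the
sketch.

```haskell
import qualified Data.ByteString.Char8 as BC
import Data.Char (chr, isDigit, isSpace, ord)

-- Hypothetical name. Escapes each whitespace character and each backslash
-- as \<decimal code>\ (assumed format); all other bytes are copied as-is.
encodeWhiteBS :: BC.ByteString -> BC.ByteString
encodeWhiteBS = BC.concatMap esc
  where
    esc c
      | isSpace c || c == '\\' = BC.pack ('\\' : show (ord c) ++ "\\")
      | otherwise              = BC.singleton c

-- Hypothetical name. Inverse of encodeWhiteBS. Returns Nothing when an
-- escape is unterminated, empty, non-numeric, or outside the byte range.
decodeWhiteBS :: BC.ByteString -> Maybe BC.ByteString
decodeWhiteBS = fmap BC.concat . go
  where
    go bs =
      case BC.break (== '\\') bs of
        (plain, rest)
          | BC.null rest -> Just [plain]          -- no more escapes
          | otherwise ->
              -- Skip the opening backslash, then split at the closing one.
              let (digits, after) = BC.break (== '\\') (BC.drop 1 rest)
              in case BC.uncons after of
                   Just ('\\', more)
                     | not (BC.null digits)
                     , BC.all isDigit digits
                     , Just (n, _) <- BC.readInt digits
                     , n >= 0, n < 256 ->
                         fmap (\chunks -> plain : BC.singleton (chr n) : chunks)
                              (go more)
                   _ -> Nothing                   -- malformed escape
```

Loaded in GHCi, encodeWhiteBS (BC.pack "a b") should give the five bytes
a\32\b, and decodeWhiteBS applied to that result should give back
Just "a b". Whether this actually recovers the lost ticks would have to be
measured with the profiler.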

For now, I'll pass over the issue, since check/repair is still in fairly good
shape. When I have more time, I'll play a bit with file path representations.

Yours,
   Petr.

