[darcs-users] no more checkpoints, but hashed repos for GHC in Darcs 2.4?

Petr Rockai me at mornfall.net
Sun Sep 20 08:48:35 UTC 2009


Ganesh Sittampalam <ganesh at earth.li> writes:
>> Actually, the compatible change that removes the common prefix of zeroes
>> could already help with the problem.
>
> That's a good point actually, is removing the prefix really backwards
> compatible? Old darcs binaries won't be able to read the unprefixed files, will
> they?
Actually, they will: pre-2.0 darcs even used to write those. The prefixes
were added quite late in the 2.0 development to aid diffing performance (the
mtime optimisation issue). I think darcs only cares that the hash has one of
the lengths it knows: sha1, sha256, or sha256 with a size prefix. I can
definitely read prefix-less repositories with old darcs just fine.
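
To illustrate the "one of the lengths it knows" check, here is a minimal
sketch in Python (not darcs' actual code). The exact layouts are assumptions
on my part: 40 hex digits for sha1, 64 for sha256, and a 10-digit size
followed by a dash for the prefixed form. The name classify_hash is
hypothetical.

```python
import re

# Assumed file-name layouts (not spelled out in this thread):
#   sha1          -> 40 lowercase hex digits
#   sha256        -> 64 lowercase hex digits
#   size+sha256   -> 10 decimal digits, '-', then 64 hex digits
_PATTERNS = [
    ("sha1", re.compile(r"[0-9a-f]{40}")),
    ("sha256", re.compile(r"[0-9a-f]{64}")),
    ("size+sha256", re.compile(r"[0-9]{10}-[0-9a-f]{64}")),
]


def classify_hash(name):
    """Return the kind of hash a file name looks like, or None."""
    for kind, pattern in _PATTERNS:
        # fullmatch means the whole name, and hence its length, must fit.
        if pattern.fullmatch(name):
            return kind
    return None
```

Under these assumptions, a reader that accepts all three shapes would not care
whether the size prefix is present, which is why dropping it could stay
compatible.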

Unfortunately, we found out yesterday that this doesn't help at all. It
seems that lookup performance is rather sensitive to lookup order, but with
random access, the prefixed and unprefixed layouts give roughly the same
performance. For directory order, the prefixed variation is faster by around
a factor of 2, but I cannot easily reproduce the ordering for a non-prefixed
list. Anyway, there's something fishy about the implementation of ext3/vfs
lookups in Linux... (You'd expect that an evenly distributed directory list
would perform no worse, or actually better, than a very poorly distributed
one; even git makes that assumption in its format.)
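
For anyone who wants to poke at the ordering effect, here is a rough sketch
of the kind of comparison I mean, in Python. It is not the benchmark we
actually ran: the file count, the zero-padded names, and the helper names
(time_lookups, compare_orders) are all made up for illustration. It also
runs against a freshly created directory, so the caches are warm and the
numbers will not reflect cold-cache behaviour on ext3.

```python
import os
import random
import tempfile
import time


def time_lookups(directory, names):
    """Time one stat() call per name, in the order given."""
    start = time.perf_counter()
    for name in names:
        os.stat(os.path.join(directory, name))
    return time.perf_counter() - start


def compare_orders(n_files=1000, seed=0):
    """Compare stat() timings in directory order and in shuffled order."""
    with tempfile.TemporaryDirectory() as d:
        for i in range(n_files):
            # Zero-padded names stand in for size-prefixed hash file names.
            open(os.path.join(d, f"{i:010d}"), "w").close()
        listed = os.listdir(d)  # whatever order the filesystem returns
        shuffled = listed[:]
        random.Random(seed).shuffle(shuffled)
        return {
            "directory": time_lookups(d, listed),
            "random": time_lookups(d, shuffled),
        }


if __name__ == "__main__":
    print(compare_orders())
```

To get closer to the real situation you would need a much larger directory,
hash-like names, and dropped caches between runs.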

Yours,
   Petr.

