[darcs-users] optimising darcs annotate

Benedikt Schmidt beschmi at gmail.com
Sat Oct 25 22:34:39 UTC 2008


"David Roundy" <daveroundy at gmail.com> writes:

> On Fri, Oct 24, 2008 at 11:23 PM, Petr Rockai <me at mornfall.net> wrote:

>> Ganesh Sittampalam <ganesh at earth.li> writes:
>>>> I'll repeat what I mentioned above:  it's faster and better to refer
>>>> to patches by hash than by number.  It takes a bit more space, but
>>>> that shouldn't be a significant downside, and the upside is that you
>>>> have easy O(1) lookup of patch name and contents, and potentially a
>>>> human-readable database (assuming the humans don't mind grubbing
>>>> around in _darcs/patches/).
>>> The hashes are far bigger than a 32-bit int, so it'll be a lot more space
>>> (proportionately), which I think would be enough to matter on large repos
>>> with large patches. I would also expect it to be significantly slower to
>>> read the larger file, which I would expect to outweigh the time cost of
>>> looking up the numbers later.
>> I'd just like to point out that the HashedIO hashes of patches *will* change
>> upon commutation, so in case you use them, you have to arrange for writeout of
>> a commuted patch to invalidate the cache, or you will miss relevant patches
>> upon a later annotate.
> Yes, as with all caches, the cache will need to be kept coherent.
>  This is independent of how we refer to patches.

Just to be sure: does Petr's comment apply only to the hash of the
patch content (as used for the HashedIO identifiers)?

I'm using the hashes of the patch-info (Darcs.Patch.Info.make_filename),
and it would be nice if I could rely on them being invariant under
changes to the patch representation, such as commutation (I'm not sure
yet whether I really need that).
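
To make the distinction concrete, here is a minimal Python sketch of the
two kinds of identifier. It is only an illustration, not darcs code: the
PatchInfo fields, info_hash and content_hash are hypothetical names, SHA-1
is an arbitrary choice, and the real format produced by
Darcs.Patch.Info.make_filename is not reproduced here. The assumption
being illustrated is that a hash over the patch metadata alone cannot
change when the stored patch body is rewritten, whereas a hash that covers
the body does.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchInfo:
    """Hypothetical stand-in for patch metadata (name, author, date)."""
    name: str
    author: str
    date: str


def info_hash(info: PatchInfo) -> str:
    """Hash the metadata only; the patch body plays no part."""
    payload = "\n".join([info.name, info.author, info.date])
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def content_hash(info: PatchInfo, body: str) -> str:
    """Hash the metadata together with the stored patch body."""
    payload = info_hash(info) + "\n" + body
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


info = PatchInfo("fix typo", "alice@example.org", "20081025223439")

# The same logical change, stored before and after a commutation that
# shifted the hunk's line number (made-up patch text).
body_before = "hunk ./README 3\n-teh\n+the\n"
body_after = "hunk ./README 5\n-teh\n+the\n"

# A cache keyed by the metadata hash still finds the patch afterwards.
cache = {info_hash(info): ["README"]}
still_cached = info_hash(info) in cache

# A key derived from the stored body changes, so entries keyed that way
# would have to be invalidated when the commuted patch is written out.
content_key_changed = (
    content_hash(info, body_before) != content_hash(info, body_after)
)
```

Under these assumptions, a cache keyed by the metadata hash survives the
rewrite, while one keyed by the content hash needs the invalidation step
Petr describes.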

