[darcs-users] Signing patches

Florent Becker florent.becker at ens-lyon.org
Fri Apr 10 19:43:36 UTC 2009


Daniel Carrera <daniel.carrera at theingots.org> writes:

> Ganesh Sittampalam wrote:
>> On Fri, 10 Apr 2009, Daniel Carrera wrote:
>>> So the order used for the hash is independent of how the patches
>>> are arranged in the repository. All that we need is that we agree
>>> on which patches we are looking at. So my suggestion only works if
>>> the set of dependencies of U is unique up to order.
>>
>> The dependencies of a patch are unique up to order, and the general
>> idea of hashing just U with its dependencies has been discussed
>> before (search for phrases like "minimal context"). I can't remember
>> what the conclusion was, if any, though.
>>
>> If you track down the past discussion, it would be worth writing up
>> as a wishlist item for the bug tracker if not already there.
>
> I found a discussion from 2005, starting here:
> http://lists.osuosl.org/pipermail/darcs-users/2005-November/008745.html
>
> Summary:
> * David R said it was a good idea. Ideally we'd use some canonical
> representation so it'll survive whenever Darcs changes format.
> * The cost of verifying a patch would be at worst O(N^2) assuming no
> exponentially expensive commutes. If commutation and patch-parsing was
> fast, it could probably work fine.
>
>
> I have an optimization idea: The patch could store not just its own
> hash, but also the hashes of its immediate dependencies. So Darcs can
> skip all of the patches that are not directly relevant. I'll
> elaborate:
>
> Def: By "direct dependency" I mean that patch B depends on A and there
> is no patch X such that B depends on X depends on A.
>
> Imagine we have the following tree: ABCDEFG
>
> We want to add patch H, which depends on F and B. In turn, F depends
> on E and B depends on A, but that's it. Then patch H would say:
>
> [hash: <hash-of-H>]
> [Added feature H.
> depends: <hash-of-F>
> depends: <hash-of-B>
> user at example.com**20090331192735]
>
>
> When Darcs verifies the patch, it skips straight to F and B. When it
> gets to F it sees:
>
> [hash: <hash-of-F>]
> [Added feature F.
> depends: <hash-of-E>
> user at example.com**20090221192735]
>
> So it goes get patch E as well. Likewise, when it reaches B it finds a
> depends:<hash-of-A>' so it grabs A. In this way, Darcs goes straight
> to the patches that it needs and doesn't have to spend as much time
> figuring out the minimal-context for H.
>
> The hash for a patch would be the SHA1 sum of the entire patch file,
> except for the [hash: ... ] line of course. This means that the hash
> of H depends on the hash of F and B, and the hash of F in turn depends
> on the hash of E and so on. In this way, the hash for any patch
> depends on the entire history of patches needed to produce that
> patch's minimal context.
>
> Once we have a unique hash, we can get digital signatures. So a patch
> might actually look like this:
>
> [hash: <hash-of-H>
> sig: <sign-hash-of-H>]
> [Added feature H.
> depends: <hash-of-F>
> depends: <hash-of-B>
> user at example.com**20090331192735]
>
>
> What do you think?

I think your scheme works, except that it makes both recording and
applying quadratic in the context. Extracting the minimal context takes
quadratic time (imagine a patch depending on every other previous patch).
 Since applying (actually, hash-checking) is going to be quadractic anyway
(because we have to reorder the dependences), it might be smarter to compute
the minimal context at apply time. [Disclaimer: this is all worst-case
complexity, one would have to look at realistic mean-case complexity…]

Florent



More information about the darcs-users mailing list