[darcs-users] Signing patches

Fri Apr 10 19:22:11 UTC 2009

Ganesh Sittampalam wrote:
> On Fri, 10 Apr 2009, Daniel Carrera wrote:
>> So the order used for the hash is independent of how the patches are 
>> arranged in the repository. All that we need is that we agree on which 
>> patches we are looking at. So my suggestion only works if the set of 
>> dependencies of U is unique up to order.
> 
> The dependencies of a patch are unique up to order, and the general idea 
> of hashing just U with its dependencies has been discussed before 
> (search for phrases like "minimal context"). I can't remember what the 
> conclusion was, if any, though.
> 
> If you track down the past discussion, it would be worth writing up as a 
> wishlist item for the bug tracker if not already there.

I found a discussion from 2005, starting here:
http://lists.osuosl.org/pipermail/darcs-users/2005-November/008745.html

Summary:
* David R said it was a good idea. Ideally we'd use some canonical 
representation so it'll survive whenever Darcs changes format.
* The cost of verifying a patch would be at worst O(N^2) assuming no 
exponentially expensive commutes. If commutation and patch-parsing was 
fast, it could probably work fine.

I have an optimization idea: The patch could store not just its own 
hash, but also the hashes of its immediate dependencies. So Darcs can 
skip all of the patches that are not directly relevant. I'll elaborate:

Def: By "direct dependency" I mean that patch B depends on A and there 
is no patch X such that B depends on X depends on A.

Imagine we have the following tree: ABCDEFG

We want to add patch H, which depends on F and B. In turn, F depends on 
E and B depends on A, but that's it. Then patch H would say:

[hash: <hash-of-H>]
[Added feature H.
depends: <hash-of-F>
depends: <hash-of-B>
user at example.com**20090331192735]

When Darcs verifies the patch, it skips straight to F and B. When it 
gets to F it sees:

[hash: <hash-of-F>]
[Added feature F.
depends: <hash-of-E>
user at example.com**20090221192735]

So it goes get patch E as well. Likewise, when it reaches B it finds a 
'depends:<hash-of-A>' so it grabs A. In this way, Darcs goes straight to 
the patches that it needs and doesn't have to spend as much time 
figuring out the minimal-context for H.

The hash for a patch would be the SHA1 sum of the entire patch file, 
except for the [hash: ... ] line of course. This means that the hash of 
H depends on the hash of F and B, and the hash of F in turn depends on 
the hash of E and so on. In this way, the hash for any patch depends on 
the entire history of patches needed to produce that patch's minimal 
context.

Once we have a unique hash, we can get digital signatures. So a patch 
might actually look like this:

[hash: <hash-of-H>
sig: <sign-hash-of-H>]
[Added feature H.
depends: <hash-of-F>
depends: <hash-of-B>
user at example.com**20090331192735]

What do you think?