[darcs-users] Origin of sha1 hash value

zooko at zooko.com zooko at zooko.com
Tue Feb 28 01:41:41 UTC 2006


> > The patch id isn't a hash of the patch, but a hash of the patch
> > metadata.  As far as I know, this gains none of the benefits of
> > hash-based identification but suffers all of the costs and then
> > some.
> 
> Interesting, and a bit peculiar ;)  Though I suppose since the 
> datetime and the author are hashed as part of the metadata, that 
> pretty much guarantees a unique identifier in normal usage.

I wish.  If you are using darcs in an automated fashion, driven by scripts or
events, it is not hard to produce colliding patch ids.  I've done this myself
twice.  If you get a colliding patch id then (if I recall correctly) you get
a corrupted repo.
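
A minimal sketch of how such a collision can arise, assuming an id that is a
SHA-1 over author, timestamp and patch name only.  The function name, the
field layout and the one-second timestamp resolution are assumptions made for
illustration; this is not darcs's actual patch-id format.

```python
import hashlib


def metadata_patch_id(author, timestamp, name):
    """Hypothetical patch id: SHA-1 over metadata only (not darcs's real format)."""
    meta = "\n".join([author, timestamp, name]).encode("utf-8")
    return hashlib.sha1(meta).hexdigest()


# Two patches recorded by a script within the same second: same author,
# same timestamp, same name, but different contents.
patch_a = {
    "author": "bot@example.org",
    "timestamp": "20060228014141",
    "name": "automated update",
    "contents": "hunk ./a.txt 1\n+foo\n",
}
patch_b = dict(patch_a, contents="hunk ./b.txt 1\n+bar\n")

id_a = metadata_patch_id(patch_a["author"], patch_a["timestamp"], patch_a["name"])
id_b = metadata_patch_id(patch_b["author"], patch_b["timestamp"], patch_b["name"])

# The contents never enter the hash, so the two ids are equal.
ids_collide = id_a == id_b
```

Under these assumptions the two patches differ in content but share an id,
which is the situation a script firing twice in the same second can produce.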

This is a pet peeve of mine.  I've argued on mailing lists about it for years.
I've persuaded David Roundy to agree that it would be good to add some bytes of
real randomness in there somewhere (he suggested in the least significant bits
of the timestamp, e.g. in the femtoseconds).  However, nobody has actually gone
and done it.  Adding randomness would make "convergence by means of generating
bitwise identical patches" harder.  (Not impossible: one could use a PRNG.)
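
A rough sketch of that trade-off, under the same assumed metadata-only id as
before.  Here the random bytes are simply appended to the hashed metadata
rather than placed in the low bits of the timestamp, and the helper names,
the 8-byte salt length and the idea of an agreed seed are all assumptions
made for illustration.

```python
import hashlib
import os
import random


def salted_patch_id(author, timestamp, name, salt):
    """Hypothetical patch id: SHA-1 over metadata plus some salt bytes."""
    meta = "\n".join([author, timestamp, name]).encode("utf-8")
    return hashlib.sha1(meta + salt).hexdigest()


def prng_salt(seed, nbytes=8):
    """Deterministic salt from a seeded PRNG (not cryptographically secure)."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(nbytes))


meta = ("bot@example.org", "20060228014141", "automated update")

# Real randomness: identical metadata no longer yields identical ids.
id_1 = salted_patch_id(*meta, os.urandom(8))
id_2 = salted_patch_id(*meta, os.urandom(8))

# Convergence: two parties that agree on a seed regenerate the same salt,
# and therefore the same id, independently.
id_x = salted_patch_id(*meta, prng_salt(42))
id_y = salted_patch_id(*meta, prng_salt(42))
```

With real randomness the two ids differ (barring a one-in-2^64 coincidence),
while the seeded variant reproduces the same id on both sides.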

Alternately, one could hash the actual contents of the patch, which raises
different issues...

Regards,

Zooko



