[darcs-users] switching all references by patches to be by hash only

Nathaniel W Filardo nwf at cs.jhu.edu
Mon Aug 24 05:50:11 UTC 2009


On Mon, Aug 24, 2009 at 01:34:36AM +0100, Eric Kow wrote:
> > While we're talking about revving the storage, would it be feasible to
> > switch _all_ references to patches to be by hash only?
> 
> Note the file hash ties us to a particular instance of a patch. If we
> commute the patch, it gets a different file hash so I don't think this
> would give us enough information to work with.

Correct.  However, context files always contain a given linearized
representation of the dependency graph and therefore may get away with
referring only to one particular commuted form of a patch.

If the patch files themselves (continue to?) contain the patch information,
then there is no loss of information in switching representations.
 
> The problem with using just the file hash is that the object you want
> may just not be on the repository you're reading from.  You may be able
> to reconstruct it through commutation, but then you'll need the regular
> patchinfo stuff to do that.

First off, it's probably rude of me to send you a context file and a pointer
into my repository for which I am unable to provide the backing store.
Equivalently, I should ensure that every context file I touch either has its
dependencies satisfied locally or that I emit a modified version which does.

Secondly, there's no check currently that two patches which are decorated
with the same patch info are actually different commuted forms of each
other.  This is Zooko's point about the failure of security in darcs -- it's
possible to provide you a bogus patch that looks right.  Switching to hashes
as identifiers closes the bogus-return hole at the expense of requiring
either more time (to generate a locally relevant context file) or space (to
store all the different commuted forms of a patch).
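
To make the bogus-patch hole concrete, here is a minimal Python sketch of the two lookup styles. Everything in it is hypothetical: the `PatchInfo` fields, the `file_hash` helper and the choice of SHA-256 are illustrative assumptions, not darcs's actual data structures or on-disk format.

```python
import hashlib
from typing import NamedTuple


class PatchInfo(NamedTuple):
    # Hypothetical stand-in for the metadata that currently names a patch.
    name: str
    author: str
    date: str


def file_hash(body: bytes) -> str:
    # Hash of one particular (commuted) form of a patch body.
    return hashlib.sha256(body).hexdigest()


def accept_by_info(wanted: PatchInfo, got_info: PatchInfo, got_body: bytes) -> bool:
    # Lookup by patch info: only the label is compared, the body is never checked.
    return wanted == got_info


def accept_by_hash(wanted_hash: str, got_body: bytes) -> bool:
    # Lookup by hash: the body must match byte for byte.
    return file_hash(got_body) == wanted_hash


info = PatchInfo("fix typo", "alice@example.org", "20090824055011")
genuine = b"hunk ./README 3\n-teh\n+the\n"
bogus = b"hunk ./README 3\n-teh\n+evil\n"

# A bogus body carrying the right label passes the info check ...
info_accepts_bogus = accept_by_info(info, info, bogus)
# ... but fails the hash check, while the genuine body still passes.
hash_accepts_bogus = accept_by_hash(file_hash(genuine), bogus)
hash_accepts_genuine = accept_by_hash(file_hash(genuine), genuine)
```

The cost mentioned above shows up here as well: `accept_by_hash` also rejects a legitimately commuted form of the same patch, so the requester must either name the exact form it wants or the provider must keep every form around.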

As an alternative proposal, since commutation is going to alter a patch
only in its positional data, rather than in its real "content", we could
instead give each file a unique identifier (other than its name; cf.
arch/tla) and discard positional data of hunks when hashing the patch.  Such
an identifier should be unmolested by commutation and may therefore stably
name a patch; at least one set of location data would have to be provided in
order to commute such a thing, but that's probably OK.  This feels similar
to http://web.mornfall.net/blog/patch_formats.html .
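
The position-independent identifier could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Hunk` record, the `file_id` field and the `stable_id` function are hypothetical names, SHA-256 is an arbitrary choice, and the sketch assumes commutation changes nothing but line numbers.

```python
import hashlib
from typing import NamedTuple, Sequence, Tuple


class Hunk(NamedTuple):
    file_id: str              # hypothetical unique file identifier, independent of the file name
    line: int                 # positional data; commutation may change this
    removed: Tuple[str, ...]  # lines taken out
    added: Tuple[str, ...]    # lines put in


def stable_id(hunks: Sequence[Hunk]) -> str:
    # Hash every hunk's file identifier and content, deliberately skipping `line`.
    h = hashlib.sha256()
    for hunk in hunks:
        h.update(repr((hunk.file_id, hunk.removed, hunk.added)).encode("utf-8"))
    return h.hexdigest()


original = [Hunk("file-0001", 10, ("old line",), ("new line",))]
# The same change after commuting past a patch that inserted two lines above it.
commuted = [Hunk("file-0001", 12, ("old line",), ("new line",))]
# A change at the same position with different content.
different = [Hunk("file-0001", 10, ("old line",), ("other line",))]

same_after_commute = stable_id(original) == stable_id(commuted)
differs_on_content = stable_id(original) != stable_id(different)
```

Since the identifier no longer pins down where the hunks apply, at least one set of line numbers still has to travel with the patch, as noted above.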

--nwf;