[darcs-users] darcs patch: Use index-based diffing in Record. (and 57 more)

Petr Rockai me at mornfall.net
Mon Sep 14 18:39:24 UTC 2009


Eric Kow <kowey at darcs.net> writes:
>> In my preferred world, hashed-storage would be a library that users can  
>> instantiate with whatever specific hash type and format they choose. 
>> Darcs would then be a client with a specific instantiation that takes 
>> account of the general weirdnesses of the darcs hashed format (it uses 
>> both SHA1 and SHA256 hashes, the hashed filenames in existing repos are 
>> prefixed by a 10 digit size specifier, and the way that directories 
>> listings are hashed is also somewhat specific to darcs).
>
> This sounds like the right thing to do in principle, but due to geography,
> I've been talking to Ganesh more, so I'm biased.  In any case, I'm not really
> entitled to have an opinion.

Well, most applications will likely never need any hash type other than the
one hashed-storage provides. However, with the current state of affairs, the
change mostly boils down to replacing all occurrences of Hash with (Hash h) =>
h. The class would contain the encoding/decoding bits and the computation of
hashes.
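
To make the shape of that change concrete, here is a minimal sketch of what
such a class might look like. The method names encodeHash and decodeHash, the
Toy type and the sameContent helper are all made up for illustration (only
computeHash is a name used in this mail), and the toy checksum merely stands in
for a real digest; this is not the hashed-storage API.

```haskell
import qualified Data.ByteString.Char8 as BC
import Data.Char (ord)
import Numeric (readHex, showHex)

-- Hypothetical class: encoding/decoding plus hash computation.
class Eq h => Hash h where
  computeHash :: BC.ByteString -> h
  encodeHash  :: h -> String        -- textual form (assumed name)
  decodeHash  :: String -> Maybe h  -- inverse of encodeHash (assumed name)

-- Toy instance: a 16-bit checksum standing in for a real digest.
newtype Toy = Toy Int deriving (Eq, Show)

instance Hash Toy where
  computeHash = Toy . BC.foldl' (\acc c -> (acc * 31 + ord c) `mod` 65536) 0
  -- Always four hex digits, zero-padded.
  encodeHash (Toy n) = let s = showHex n "" in replicate (4 - length s) '0' ++ s
  decodeHash s = case readHex s of
    [(n, "")] | length s == 4 -> Just (Toy n)
    _                         -> Nothing

-- Code that used to mention the concrete Hash type becomes polymorphic in h.
sameContent :: Hash h => h -> BC.ByteString -> Bool
sameContent h bytes = computeHash bytes == h
```

Every function touching hashes would pick up a constraint like the one on
sameContent, which is the bulk of the mechanical work.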

There are two options for replacing NoHash -- either provide a class
function, noHash, and give up pattern matching (or use view patterns), or wrap
everything in Maybe again. I find the Maybe wrapping really ugly though
-- it complicates a lot of signatures (and you can't even escape with a type
synonym anymore, now that "h" is of some unknown type), and I had specifically
refactored hashed-storage to get rid of it. In my opinion, the
NoHash constructor is a natural part of the Hash type -- it is common and valid
for some bits in hashed-storage to not have any hash. That's where the
ubiquitous Maybe would come from: I don't think I can identify any
significant use of plain 'h', without a Maybe, anywhere in the library, apart
from computeHash :: ByteString -> h.
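
The two options can be sketched side by side. Everything below is
illustrative: the Hash type is a simplified stand-in rather than the library's
real definition, and HashLike, HasNoHash, showHash and the describe* functions
are hypothetical names.

```haskell
-- Simplified stand-in for a fixed hash type; not the library's real definition.
data Hash = Hash String | NoHash deriving (Eq, Show)

-- With a concrete type, "no hash" is an ordinary pattern match.
describeFixed :: Hash -> String
describeFixed NoHash   = "unhashed"
describeFixed (Hash s) = s

-- Hypothetical class for an overloaded hash type.
class Eq h => HashLike h where
  showHash :: h -> String

-- Option 1: a class-level noHash value. Callers cannot pattern match on an
-- unknown h, so they compare against noHash instead.
class HashLike h => HasNoHash h where
  noHash :: h

instance HashLike Hash where
  showHash = describeFixed

instance HasNoHash Hash where
  noHash = NoHash

describeWithNoHash :: HasNoHash h => h -> String
describeWithNoHash h
  | h == noHash = "unhashed"
  | otherwise   = showHash h

-- Option 2: Maybe wrapping. Pattern matching is back, but every signature
-- that may lack a hash now has to say Maybe h.
describeWithMaybe :: HashLike h => Maybe h -> String
describeWithMaybe Nothing  = "unhashed"
describeWithMaybe (Just h) = showHash h
```

Under option 1 the equality guard replaces the NoHash pattern; under option 2
the Maybe shows up in the type of anything that may lack a hash.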

The upside would be that people who really don't like SHA256 for whatever
reason could still use hashed-storage with their own hash function (but
at a cost: their hashed stores would be incompatible with all other
hashed-storage client code). For darcs, SHA256 is perfectly OK and there's
really no use for an overloaded Hash type on this front. Neither the 10-digit
prefix nor the directory listings are related to the Hash type, since both
of these are handled by the "format" -- the Storage.Hashed.Darcs or
Storage.Hashed.Plain module -- and neither leaks into hashed-storage
internals. Treating the 10-digit prefix as part of the hash is a historical
mistake (both in darcs and in earlier hashed-storage).
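
As a rough illustration of keeping the size prefix on the format side, the
sketch below builds and parses file names outside of any hash type. It assumes
the darcs layout is ten decimal digits, a dash, then the hex digest, and that a
plain format would use the digest alone; darcsName, plainName and
parseDarcsName are hypothetical helpers, not functions from
Storage.Hashed.Darcs or Storage.Hashed.Plain.

```haskell
import Data.Char (isDigit)
import Text.Printf (printf)

-- Format-level naming (assumed layout): zero-padded 10-digit size, dash, digest.
darcsName :: Int -> String -> FilePath
darcsName size hex = printf "%010d-%s" size hex

-- A plain format (assumption) would use the digest alone and ignore the size.
plainName :: Int -> String -> FilePath
plainName _ hex = hex

-- Split a darcs-style name back into size and digest; the hash itself never
-- needs to know the prefix exists.
parseDarcsName :: FilePath -> Maybe (Int, String)
parseDarcsName name = case splitAt 10 name of
  (digits, '-' : hex)
    | length digits == 10, all isDigit digits, not (null hex)
      -> Just (read digits, hex)
  _ -> Nothing
```

The digest is passed around as a plain string here, so swapping the naming
scheme does not touch whatever type represents the hash.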

So I still think that the Hash overloading is mostly of theoretical interest
and would have little practical impact on either darcs or other hashed-storage
users, and I still lean toward keeping things simple and stupid. If this turns
out to be a huge problem in the future, we can always do the refactor
then. Trying to foresee the future usually doesn't work out very well anyway,
and for code design this is doubly so (YAGNI!).

Yours,
   Petr.

 PS: It may be worth stressing that with a fixed Hash type in
 hashed-storage, we can guarantee that all hashed-storage apps will be able to
 read all other hashed-storage apps' stores, when using formats implemented in
 hashed-storage. Generalising Hash would in theory lose this property (but in
 practice, I guess it wouldn't matter, since everyone would just stick to the
 default Hash anyway).

