[darcs-devel] darcs patch: Include validators in slurpies. (and 1 more)

Fri Oct 28 05:48:19 PDT 2005

On Thu, Oct 27, 2005 at 09:29:01PM +0200, Juliusz Chroboczek wrote:
> This is just a proposal.  If you see a better way of doing that, I'll be
> glad to recode it.  (I'm waiting to hear from you before continuing this
> line of work.)

So the idea is to optionally include the git hash of each file in the
slurpy so that, where possible, we can compare those when diffing?

What I'm not so clear on is when this will be useful.

It seems like when running whatsnew/record, we'll only have a hash for one
of the two slurpies involved. So is the plan (or perhaps even the existing
implementation; I only skimmed it) to generate the hash for non-git
repositories, figuring that this'll be faster than reading the entire file
from the pristine cache? We might still have to read the entire file if
the hashes differ, of course.

Is there some other diffing scenario where we'll already have hashes on
both sides? I guess maybe this is it: when we're reading a git repository
we have to diff all the versions, don't we?
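
To make the question concrete, here is a minimal sketch of the comparison
being discussed. It assumes a validator is just an optional content hash;
the Validator type, the Comparison type and compareValidators are
hypothetical names for illustration, not what the patch actually defines.

```haskell
-- Assumption: a validator is an optional content hash. The real
-- Validator type in the patch may look quite different.
type Validator = Maybe String

-- Outcome of comparing two files by validator alone.
data Comparison
  = Same             -- both hashes known and equal: no need to read anything
  | Different        -- both hashes known and unequal: read contents to diff
  | MustReadContents -- at least one hash missing: fall back to the contents
  deriving (Eq, Show)

-- Hypothetical helper: decide as much as possible without file contents.
compareValidators :: Validator -> Validator -> Comparison
compareValidators (Just a) (Just b)
  | a == b    = Same
  | otherwise = Different
compareValidators _ _ = MustReadContents
```

Only the Same case saves any reading, which is why it matters how often
both sides actually carry a hash.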

Will the presence of this validator allow us to eliminate the GitSlurpy?
That would seem like a noble goal--it's so similar to Slurpy that using
shared code would be great.

Did I misread the patch, or do you really not attach validators to
directories? If not, why not? It seems like it would make lots of sense
(unless I'm misunderstanding--but the Linux kernel has such huge
directories that just avoiding parsing the list of files in a few
subdirectories could be a huge performance gain).
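
A rough sketch of why directory validators could pay off, assuming a
directory's validator covers everything beneath it (as a git tree hash
does). The Tree type and the changed function are hypothetical and far
simpler than a real Slurpy; removed entries are ignored to keep it short.

```haskell
-- Assumption: a validator is an optional hash covering the whole entry.
type Validator = Maybe String

-- Hypothetical, stripped-down tree: names and validators only.
data Tree
  = Dir String Validator [Tree]
  | File String Validator

name :: Tree -> String
name (Dir n _ _) = n
name (File n _)  = n

validator :: Tree -> Validator
validator (Dir _ v _) = v
validator (File _ v)  = v

-- True only when both validators are present and equal.
matches :: Validator -> Validator -> Bool
matches (Just a) (Just b) = a == b
matches _ _               = False

-- Paths in the new tree that are added or possibly modified.
-- A subtree whose validators match is skipped without being traversed.
changed :: Tree -> Tree -> [FilePath]
changed old new
  | matches (validator old) (validator new) = []
changed (Dir n _ olds) (Dir _ _ news) =
  [ n ++ "/" ++ p
  | new <- news
  , p <- case lookup (name new) [(name o, o) | o <- olds] of
           Just old -> changed old new
           Nothing  -> [name new]
  ]
changed _ new = [name new]
```

With matching validators on a large directory, the first equation returns
immediately and the directory's file list is never examined.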

This patch makes me think that perhaps it would be nice to switch Slurpy to
use record accessors so we could avoid many of these stupid changes that
are needed because of pattern matching based on position.  I think
(although I'm not sure I've ever written code like this) that if we write

data Slurpy = SlurpDir {
                slurpname :: FileName,
                slurpvalidator :: Validator,
                slurpsubslurps :: [Slurpy]
              }
            | SlurpFile {
                slurpname :: FileName,
                slurptime :: EpochTime,
                slurplength :: FileOffset,
                slurpvalidator :: Validator,
                slurpcontents :: FileContents
              }

then we can pattern match like

  case aslurpy of
    SlurpFile { slurplength = l, slurptime = t } -> do something with t and l

which would mean we wouldn't have the silliness that you had to deal with
where every time that stupid tuple was used it had to be modified because
you added a new field.

Which is not to say that you need to make this change, just that your
patch made me think that it would be nicer to have our code be more
flexible with respect to changes in data layout.  I didn't know before that
one could pattern match on named fields--this seems considerably more
robust than position-based pattern matching.
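
For reference, a self-contained sketch of the record-based Slurpy
described above. FileName, Validator and FileContents are stand-in
definitions (assumptions, not the real darcs types), and fileInfo and
totalSize are hypothetical helpers that exist only to show the matching.

```haskell
import System.Posix.Types (EpochTime, FileOffset)

-- Stand-in types, assumed for this sketch; the real darcs types differ.
type FileName     = String
type Validator    = Maybe String
type FileContents = String

data Slurpy
  = SlurpDir
      { slurpname      :: FileName
      , slurpvalidator :: Validator
      , slurpsubslurps :: [Slurpy]
      }
  | SlurpFile
      { slurpname      :: FileName
      , slurptime      :: EpochTime
      , slurplength    :: FileOffset
      , slurpvalidator :: Validator
      , slurpcontents  :: FileContents
      }

-- Hypothetical helper: the pattern names only the fields it uses, so
-- adding another field to SlurpFile leaves this code untouched.
fileInfo :: Slurpy -> Maybe (FileOffset, EpochTime)
fileInfo aslurpy =
  case aslurpy of
    SlurpFile { slurplength = l, slurptime = t } -> Just (l, t)
    SlurpDir {}                                  -> Nothing

-- Hypothetical helper: total length of all files below a slurpy.
totalSize :: Slurpy -> Integer
totalSize SlurpFile { slurplength = l }  = toInteger l
totalSize SlurpDir { slurpsubslurps = s } = sum (map totalSize s)
```

Neither helper mentions slurpvalidator, so a field like it could be added
or removed without touching them.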

Back on-topic, so far your validators look like a good idea to me (modulo
on-topic comments and questions above).  I imagine with care they can be
used to implement a hashed pristine cache--perhaps even using
git-compatible hashes.
-- 
David Roundy
http://www.darcs.net