[darcs-users] darcs patch: Add Repository.checkPristineAgainstSlurpy. (and 11 more)

David Roundy droundy at darcs.net
Tue Sep 2 14:27:48 UTC 2008


On Tue, Sep 02, 2008 at 05:05:05AM +0200, me at mornfall.net wrote:
> Hi again.
>
> This bundle has the new, shiny and all that fix for issue971. There are two
> caveats:

In general, the principle looks good, and I expect to apply, providing
tests will pass.

> 1) We still introduce a failing test, pending_has_conflicts.pl: the new repair
> manages to somehow fail when an incorrect (conflicted) pending is present in
> the tree. Interestingly, this only happens in oldfashioned repositories. I'm
> currently still at loss as to why this happens.

A quick look shows that this was a real bug that we've fixed.  I've
translated this test into shell (always the first step in debugging a
perl test), and it's clear that the repository really is buggy, and
your patch causes repair to catch this.  This is a bug in record,
though, not repair.  Darcs record ought to either fail when someone
has corrupted the pending file, or it should ignore a corrupt
pending.  It definitely shouldn't record a corrupt patch containing
the contents of a corrupt pending!

Given the limited time I have to work on darcs, I think I'll skim over
the review of this patch, since we've discussed most of what you did
beforehand (and because this incided shows that you've already
revealed a bug that neither of us knew existed, which proves the
usefulness of this change.

> 2) I'm not exactly proud of readHashedPristineRoot. However, I, at this moment,
> lack the creativity to come up with anything better.

I'll take a look (and comment below).

> Also note, that this bundle includes the previous one, since there are some
> dependencies on that. Now, some notes on implementation:
>
> I have left most of the previous incarnation of Repair.lhs intact, since I have
> figured, that the code, under given setting, already has the intended effect:
> although a direct hash comparison might be more reliable and more efficient
> than smart_diff, I haven't done that yet, partly on the grounds of disliking my
> new readHashedPristineRoot.

That sounds like a good starting point, best to make just one change
at a time.  Comparing hashes directly does seem better (for the
reasons you mention), so hopefully we'll see a clear way to move in
that direction in the future.

> Now, replacePristineFromSlurpy, as it's been written before, does the job of
> updating the root pointer in the hashed case, and doing a rollout of new
> pristine for the old-fashioned case. Note that we currently, as the code goes,
> can't just update the root pointer, as the slurpy might contain unwritten
> changes (update_slurpy is not guaranteed to have been called after the last
> patch has been applied). Moreover, I think the code is simpler like this, and
> about just as efficient.

That makes sense, I think.

> That's it for now I think.
>
> Yours,
>    Petr.
>
> Tue Aug 12 02:50:39 CEST 2008  me at mornfall.net
>   * Add Repository.checkPristineAgainstSlurpy.
>
> Tue Aug 12 02:54:23 CEST 2008  me at mornfall.net
>   * First working (albeit slow) version of repair that uses hashed newpristine.
>
> Tue Aug 12 03:06:03 CEST 2008  me at mornfall.net
>   * Only "update" (sync to disk) the slurpy every 100 patches.
>
> Tue Aug 12 03:27:20 CEST 2008  me at mornfall.net
>   * Fix checkPristineAgainst{Cwd,Slurpy}: we ignored files missing in pristine.
>
>   Add LookForAdds to smart_diff options to fix that and also throw in IgnoreTimes
>   for a good measure and extra paranoia.

I think there's a problem in smart_diff, and that it should always
behave as if LookForAdds was passed.  I don't think this should change
the behavior of darcs (since LookForAdds is handled elsewhere), and it
would simplify the smart_diff code and eliminate the possibility of
this particular bug.  But this is a job for another day... (just
wanted to mention it, while it was on my mind).

> Mon Sep  1 17:03:24 CEST 2008  me at mornfall.net
>   * Resolve conflicts.
>
> Tue Sep  2 02:44:02 CEST 2008  me at mornfall.net
>   * Rename hashSlurped, slurpHashed and syncHashed to writeHashedPristine, slurpHashedPristine and syncHashedPristine, respectively.
>
> Tue Sep  2 03:04:05 CEST 2008  me at mornfall.net
>   * Change type of subdir parameter in Cache/HashedIO functions from String to HashedDir.
>
>   This refactor should make calling the Cache and HashedIO functions safer: you
>   should be no longer able to swap hash and subdir accidentally in the call site,
>   or mistype the subdirectory name.
>
> Tue Sep  2 04:05:32 CEST 2008  me at mornfall.net
>   * Haddock the {slurp,write,sync}HashedPristine functions in HashedIO.
>
> Tue Sep  2 04:42:54 CEST 2008  me at mornfall.net
>   * Add Repository.replacePristineFromSlurpy.
>
> Tue Sep  2 04:46:20 CEST 2008  me at mornfall.net
>   * Make the "hashed" repair use pristine.hashed for its work.
>
>   We rely on HashedIO reliability to simplify repair to work inside the existing
>   pristine.hashed. When running on old-fashioned (darcs 1) repositories, we
>   temporarily create pristine.hashed, use it to check (and possibly replace) the
>   old-fashioned pristine and finally we remove it again.
>
> Tue Sep  2 04:49:43 CEST 2008  me at mornfall.net
>   * Add HashedRepo.readHashedPristineRoot.
>
> Tue Sep  2 04:50:40 CEST 2008  me at mornfall.net
>   * Make clean_hashdir take a list of root hashes and use it in repair.
>
>   We use this functionality to keep two possibly distinct pristine trees while
>   repairing, both living in a single pristine.hashed directory.


> hunk ./src/Darcs/Repository.lhs 28
>                            withRepository, withRepositoryDirectory, withGutsOf,
>                            makePatchLazy, writePatchSet,
>                      findRepository, amInRepository, amNotInRepository,
> -                    slurp_pending, replacePristine,
> +                    slurp_pending, replacePristine, replacePristineFromSlurpy,
>                      slurp_recorded, slurp_recorded_and_unrecorded,
>                      withRecorded,

I'm not sure what interface we will ultimately want, but I'd like to
consider moving much of Repair's guts into Repository.Internals,
because code like replacePristineFromSlurpy provides a pretty
dangerous interface: if there is any bug in the code that calls it,
you've got a corrupt repository.  Ideally (and this really will be
true only after the type witness work is complete) code that only
calls the Repository API shouldn't be able to corrupt the repository.

(again, this is for a future day)

> [Add HashedRepo.readHashedPristineRoot.
> me at mornfall.net**20080902024943] hunk ./src/Darcs/Repository/HashedRepo.lhs 33
>                                       replacePristineFromSlurpy,
>                                       add_to_tentative_inventory, remove_from_tentative_inventory,
>                                       read_repo, read_tentative_repo, write_and_read_patch,
> -                                     write_tentative_inventory, copy_repo, slurp_all_but_darcs
> +                                     write_tentative_inventory, copy_repo, slurp_all_but_darcs,
> +                                     readHashedPristineRoot
>                                     ) where
>
>  import System.Directory ( doesFileExist )
> hunk ./src/Darcs/Repository/HashedRepo.lhs 101
>         -- note: in general we can't clean the pristine cache, because a
>         -- simultaneous get might be in progress
>
> +readHashedPristineRoot :: Repository p C(r u t) -> IO (Maybe String)
> +readHashedPristineRoot (Repo d _ _ _) =
> +    withCurrentDirectory d $ do
> +      i <- (Just `fmap` gzReadFilePS (darcsdir++"/hashed_inventory")) `catch` (\_ -> return Nothing)
> +      return $ inv2pris `fmap` i

Hmmm.  I'm not sure why you've moved all IO errors into a Maybe
output.  Why not just leave errors uncaught, so that the caller can
either fail or display the error appropriately to the user?

> hunk ./src/Darcs/Commands/Repair.lhs 100
>        Left err -> fail err
>        Right x -> return x
>
> -update_slurpy :: Cache -> [DarcsFlag] -> Slurpy -> IO Slurpy
> -update_slurpy c opts s = do
> +update_slurpy :: Repository p -> Cache -> [DarcsFlag] -> Slurpy -> IO Slurpy
> +update_slurpy r c opts s = do
> +  current <- readHashedPristineRoot r
>    h <- writeHashedPristine c opts s
>    s' <- slurpHashedPristine c opts h
> hunk ./src/Darcs/Commands/Repair.lhs 105
> -  clean_hashdir c opts HashedPristineDir h
> +  clean_hashdir c opts HashedPristineDir $ catMaybes [Just h, current]
>    return s'

Ah, I think I see:  it's because this code needs to work with either
hashed or non-hashed repositories? I can see now why you did it this
way--and also why you're not happy with it.

One problem with this code (not a fatal problem, but something that
would be nice to fix eventually) is that we're leaking a lot of the
internals of the repository format into Commands.Repair.  That's not
a new problem, but it's worth fixing (the mythical "some day").  And
it's alwo what's ugly about readHashedPristineRoot and using
clean_hashdir in the Commands hierarchy.

For now, I think this is close to as good as we're going to get.
Eventually, we can move much of this repair code into
Repository.Internal, where we can have access to all the gory
internals.  It's a little bit of a stupid distinction (moving all the
code into a different module), but I think it's an important
distinction, as it will allow just a bit more of the Commands code to
be written in a high-level manner that doesn't depend on the internal
details of repository format.

Applied (locally).  I'll deal with the failing test.  Thanks!  Also,
my work computers are still down (along with the power in the
building), so pushing will continue to be delayed.

David


More information about the darcs-users mailing list