[darcs-devel] darcs-git merge

David Roundy droundy at darcs.net
Sat Jul 9 07:05:57 PDT 2005


On Fri, Jul 01, 2005 at 10:13:24PM +0200, Juliusz Chroboczek wrote:
> > I imagine that ideally none of the "command" modules would import
> > DarcsRepo at all, once the transition is complete.
> 
> Hmm...  I'm not so sure about that; there are a few commands that we
> might not want to go through an abstract interface.  Check and repair
> come to mind.
> 
> Of course, one solution is to make ``checkRepository'' and
> ``repairRepository'' be the abstract interface, but I think that's
> cheating.

You're right.  Also there are the cases Ian mentioned, get, put and init,
which create a repository from scratch.  But the majority of the darcs
commands shouldn't need to import DarcsRepo.

> > Most notably, I'd like to make GitSlurpy be a
> > {Writeable,Readable}Directory,
> 
> Could you explain that?

I'd like to have something like the SlurpMonad (in SlurpDirectory.lhs):

data SlurpMonad a = SM ((Either String Slurpy)
                        -> Either String (Slurpy, a))

which allows you do write monadic code modifying a slurpy using

withSlurpy :: Slurpy -> SlurpMonad a -> Either String (Slurpy, a)
withSlurpy s (SM f) = f (Right s)

This monad is an instance of {Writeable,Readable}Directory, which means
that... we ought to be able to write

apply_to_slurpy :: Patch -> Slurpy -> Maybe Slurpy
apply_to_slurpy p s = case withSlurpy s (apply p) of
                      Left _ -> Nothing
                      Right (s', ()) -> Just s'

I see that we haven't done this, but it was in the plans and now that I see
that it hasn't been done I'll do it.  It was waiting on a cleanup that Ian
has already done, which handled the setpref patch type.

The advantage is that we then need only one implementation of "apply",
which is obviously a very nice thing.  The withSlury/SlurpMonad scheme
also presents a nice imperative interface so we can modify Slurpies as if
it were a real filesystem.

Making GitSlurpy work this way would also eliminate the export of
applyHunkLines and applyBinary that you were forced to add to PatchApply,
which would be a Good Thing.

> > Also, the prefsDirectory export from RepoPrefs is ugly.
> 
> That's an understatement.  It's a horrible hack to get send to work.

I've fixed this, I believe (see patch I just sent).

> > It's only imported into Send, and even there is used incorrectly.
> 
> Ah?

The problem was that you were using the current directory's repository type
to determine where to look for prefs/email in the remote repository, which
might not be the same repository type.

> > I don't care for the exports of Slurpy constructors that Juliusz added.
> 
> Neither do I.  It is my understanding that Slurpies are mostly going
> away anyhow, so I'd like to understand Ian's work before thinking
> about redesigning that.

They aren't entirely going away.  Their use is being reduced, but they'll
stay around for diffing purposes at the minimum, and also for checking if
patches can be applied without actually modifying the repository.  But
you're right that confering with Ian would be a good idea.

> > Shouldn't identifyPristine return NoPristine if it can't find a
> > pristine directory cache?
> 
> What do you mean by that?  That it should return NoPristine if there's
> not even a ``pristine.none'' file, or that it should work on a Git
> repository?

Yes, I think I meant both.

> If the former, I disagree, as it would prevent us from having new
> formats of pristine caches in the future (we want to reliably fail in
> that case).

Ah, that's a good point.  The RepoFormat bit could handle this, but a bit
of redundancy like this would be a good idea.

> If the latter, than I think we're seeing the abstractions differently.
> The Pristine layer lives below the level of Repository.  Commands live
> above Repository.  Having commands delve to the Pristine level makes
> for a much more complicated layering than the one I have in mind.
>
> But I'm open to your vision if you can explain why you think it's better.

Okay.  Hmmm.  I've just added a sort of intermediate layer, which holds the
RepoForamt information and the URL of the repository, so we'd now have a
place to stick it if we wanted to always have a pristine cache.

I think this relates to the question of how you want to handle pending and
darcs add.  On one extreme, we could have both a pristine cache and a
pending file in darcs-enabled git repositories.  On the other extreme
(which I think is what you're imagining/implementing), we could have
neither, and rely on the normal git cache to store both the "pristine
state" and the "pending change" (meaning whatever information we need to
get the needed functionality).  In between, we could have a pending but no
pristine cache.

I think we ought to always support a pending.  With no pending, replace
would be impossible to implement.  We may not *yet* support replace in git
repositories, but it'd be nice to have the option of doing so (adding the
relevant information to the commit information section).  Having support
for pending (provided we abstract it a bit) would mean that
replace/add/remove/mv could all be implemented in a
repository-format-agnostic manner.

I think regarding the pristine cache, I agree with you (not to imply that
we disagree with regard to pending, since you haven't said anything about
that).  Reading the pristine cache can already be handled in a
repository-agnostic manner through slurp_unrecorded.  Writing to the
pristine cache should only happen when adding patches to the repository,
and in that case we *ought* to have an abstraction that looks like

add_patches_to_repository :: [Patch] -> Repository -> IO ()

which would handle (for DarcsRepos) writing the patch files themselves,
updating the inventory, updating the pristine cache, updating the pending
file, and doing this all in as atomic a manner as possible.  This would
both simplify the command code itself, and would greatly reduce the danger
of inconsistent handling of errors in modifying a repository (witness the
recent bug discovered in pull, which was in precisely this code--and might
also show up in apply).

> > I don't like many of the uses of unsafePerformIO in Git... but I've
> > been focusing on the non-git portion of the code.  I'm afraid that even
> > if these are "safe" in the sense that they read objects that are
> > immutable, they could lead to confusing error messages if the objects
> > have disappeared.
> 
> The initial implementation of Git used IO throughout.  Then I got to
> understand Linus' design, which happens to be purely functional.
>
> [...]

Okay, this makes sense.  I guess maybe a comment somewhere in there
explaining this would be nice, to avoid scaring people like me?

I'm still thinking that moving some of the unsafePerformIOs to different
locations might be a good idea.  In particular, if GitRepo.read_repo is
pased a relative path to a repository, there will be potential bugs if we
ever change directories, which is very scary.

As long as GitRepo.read_repo is only ever called by Repository.read_repo,
this shouldn't be a problem, since Repository.read_repo makes the path
absolute, but it's a scary situation.

I think this is the only danger I see (now that I've had your explanation)
in your unsafePerformIOs, that they aren't safe unless their argument is an
absolute path.  Perhaps this danger is easily alleviated if the Git code is
only ever called by GitRepo code, if we just add some absolute_dir calls to
the GitRepo module.

> Thanks again, David, for the merge,

You're welcome, and thanks for the contribution!  :)
-- 
David Roundy
http://www.darcs.net




More information about the darcs-devel mailing list