[darcs-users] darcs patch: Is this Monadish?

David Roundy droundy at darcs.net
Thu Aug 28 14:58:40 UTC 2008


On Wed, Aug 27, 2008 at 09:17:36PM -0700, Jason Dagit wrote:
> David,
>
> Is this a type safe Monadish that you would accept?

It's very close, but it seems to be considerably more invasive than
I'd hope we could achieve.  In particular, I don't see that we need to
add any new types other than an RIO (or IORepo, but I'd rather keep it
short) type.

> Note that RepoInState is broken in the same way that Context was and maybeToFL
> is.  But, what else can we do?  I don't have any other bright ideas other
> than maybe you'll feel like we don't need it at all?  As in, we could make it:
> IORepo p r u tr a

You're right that I don't see any reason why we'd need it.

> and drop RepoInState like a bad habbit.  I think that presents new problems
> in the type of (>>>=).
>
> I can think far enough ahead to see how we'll instantiate the RepoInStates,
> but maybe we could export,
> newRepoState :: Sealed3 RepoInState
> newRepoState = Sealed3 RepoInState
>
> And then let it trickle down from pattern matches?  Of course, we probably
> also want to add, unsafeCoerceR :: r x y z -> r a b c
> In the event that we need to coerce a repository or a RepoInState.
>
> Probably one function that is missing is
> lowerIORepo :: IORepo p i j a -> IO a
>
> I also may have export evil goblins from Repositoy.InternalTypes.  It's even
> possible that by using newtype deriving that I've open backed doors.  Perhaps
> there is a way to derive lowerIORepo already?  Deriving MonadIO is
> nice because it lets us use liftIO (check the patch for why that's nice).
> Deriving Monad is nice because when then as long as the type witnesses are
> not changing we can just use the monad instance.

I'd definitely prefer not to use newtype deriving, and don't see any
reason to do so.  I don't see why MonadIO is nice, I'm afraid.  Is it
any different from the one-liner io function I define below?

I can see how a Monad instance would be handy, and hadn't thought of
that myself.  In fact, with that instance, there's no real reason to
keep the use of RIO to a minimum.  The trouble with newtype deriving
to get the Monad instance is that it breaks the system (as I'll
explain later), at least as far as I can tell.

> Ganesh was helping me with it and he recomended making it a ReaderT so that
> the Repository is hiding inside.  In the refactoring I've done so far, I didn't
> take advantage of that, but in principle it should work.  It almost seems like
> we could just throw away RepoInState and replace it with Repository p.  In fact,
> that's probably what I should do.  But, then how do you declare IORepo in
> a way that allows (>>>=) to have the right type?

I'd definitely make RIO be a Reader-like monad (as you'll see below),
and I agree about throwing away RepoInState and replacing it with
Repository p.  I'll give you a hint on defining RIO (your IORepo, but
I prefer to think of it as "repository IO", meaning IO performed on a
repository).

> Oh right, so here was the problem.  We don't want to use State because you
> could fetch a previous state and use it later.  But, we wanted to store the
> (Repository p), so a Reader works for storing (Repository p) since that part
> doesn't change.  That's why we farmed out the part that can change to
> the type RepoInState.  I think not storing the Repository in a Reader
> still presents the same issue of needing RepoInState.  I think without
> RepoInState we need all the permutations of type changes implemented as
> functions which is yucky :(

Right, but the changing type of Repository can very easily be hidden
in the reader-like monad.

> Please advise.

Below is a sketch of how I'd do things.

> Wed Aug 27 20:38:22 PDT 2008  Jason Dagit <dagit at codersbase.com>
>   * Is this Monadish?

Something like this (it's equivalent to a Reader Monad, but it's nicer
in my opinion to avoid those nasty transformers, and make behavior
explicit.  Monads aren't actually hard to define.

data RIO p C(r u t t') a = UnsafeRIO (Repository p C(r u t) -> IO a)

(of course you won't be able to use t' because of the C preprocessor,
but I'll use it anyhow in this email)

The type here reflects the fact that the RIO (stands for Repository
IO) can only modify the tentative state.  That's how we manage
atomicity in darcs, is by setting up a "tentative" state and then
calling finalizeRepositoryChanges.  Of course, this constructor cannot
be safely exported, which is why it's marked UnsafeRIO.

First off, we should change the type of

withRepository :: [DarcsFlag] -> (forall p C(r u). RepoPatch p
               => RIO p C(r u r r) a -> IO a

to reflect the fact that we can't modify repositories unless we've got
a lock, and the fact that the tentative state always starts out as the
recorded state.

We can now modify withRepoLock to be of a type something like

withRepoLock :: [DarcsFlag] -> (forall p C(r u). RepoPatch p
             => RIO p C(r u r t) a -> IO a

Where the type system will now allow us to make modifications to the
repository.  Note that we could do better than this, and could ensure
that any modifications are protected by withGutsOf, but that can be a
later refactor.

So now we need our monad-like functions (and I agree that we don't
need or want a class for these).

unsafeUnRIO :: RIO p C(r u t t') a -> Repository C(r t u) -> IO a
unsafeUnRIO (RIO f) = f

This is your lowerIORepo function, but marked (properly) as unsafe.

(>>>=) :: RIO p C(r u t t') a -> (a -> RIO p C(r u t' t'') b)
       -> RIO p C(r u t t'') b
(RIO f) >>>= g = RIO (\ (Repository ...) ->
                      do a <- f (Repository ...)
                         unsafeUnRIO (g a) (Repository ...))

(>>>) :: RIO p C(r u t t') a -> RIO p C(r u t' t'') b
      -> RIO p C(r u t t'') b
(RIO f) >>> (RIO g) = RIO $ \ (Repository ...) ->
                            do f (Repository ...)
                               g (Repository ...)

returnR :: a -> RIO p C(r u t t) a
returnR = io . return

io :: IO a -> RIO p C(r u t t) a
io = RIO . const

(You could also use the MonadIO class, I suppose.)

Note that these definitions themselves are entirely un-typechecked,
but we can audit them pretty carefully--since they're pretty small.
Also note that the type definitions of RIO functions can only be fully
checked if they do not use the RIO constructor.  So maybe we should
rename the RIO constructor as UnsafeRIO.  Any primitives that use that
constructor need to be carefully audited.

We'd also want an

instance Functor (RIO p C(r u t t')) where
  fmap f (RIO g) = RIO (\repo -> fmap f (g repo))

And as you suggest, an

instance Monad (RIO p C(r u t t)) where
  (>>=) = (>>>=)
  (>>) = (>>>)
  return = returnR
  fail = io fail

But we definitely don't want this monad instance derived from newtype
deriving, because that would break the type system (since you could
sequence two (RIO p C(r u t t')) actions to get a single (RIO p C(r u
t t') action, which strongly violates the type witnesses.  By writing
the Monad instance using the safe RIO operators (>>>=) and (>>>), we
ensure that this instance is safe.

I'm a bit afraid this monad instance won't work after all, since I
don't think that we can define an instance with 't t' repeated like
this.  :( In which case we may need to switch to a different approach
that I wrote about and then deleted (when I saw your idea of a Monad
instance).

A little checking shows that instances like

instance Monad (RIO p C(r u t t))

are allowed if we turn on FlexibleInstances.  I'm not sure how safe
that is, though.

Anyhow, these are my thoughts.  The backup plan--if we can't define a
safe Monad instance--would be to use RIO only in the bits of the code
that modify the repository, in which case withRepository and
withRepoLock would both look like:

withRepository :: [DarcsFlag] -> (forall p C(r u). RepoPatch p
               => Repository p C(r u r) -> IO a) -> IO a

which itself is an improvement over the current type (without changing
anything else), and we'd just change withGutsOf to run RIO:

withGutsOf :: Repository p C(r u r) -> RIO p C(r u r t)
           -> IO (Repository p C(t u t)

and then we'd be safe so long as withGutsOf is only called once (which
we could check in haskell_policy.sh).  This lesser approach might be a
reasonable stepping stone, as it'd give us almost-complete safety, and
would involve modifying far less code.  Oh, and note that withGutsOf
would ideally to be modified to call finalizeRepositoryChanges.  In
this latter approach, only repository-modifying command would need to
be moved into the RIO.  This has an additional advantage in that it
would help motivate us to keep the repo-modifying commands together in
a tight sequence--which is important on darcs-1 repositories, where we
don't have atomicity, and want to keep the critical section of the
code (where a crash would corrupt the repository) as small as
possible.

David


More information about the darcs-users mailing list