[darcs-devel] darcs patch: add context to HashedRepo.copy_repo (and 2 more)

Wed Mar 12 13:48:59 UTC 2008

On Tue, Mar 11, 2008 at 02:27:18PM -0700, Jason Dagit wrote:
> On Tue, Mar 11, 2008 at 1:07 PM, David Roundy <droundy at darcs.net> wrote:
> > The problem is that you expose a constructor for Context, and you
> > unseal the output from createHashed and friends (I believe).  Which is
> > effectively equivalent to simply unsealing the output of createHashed
> > and friends.
> 
> If I just unseal createHashed and friends then I end up with existential
> types which are not equal to any other type.  To then put the freshly
> created patches into a sequence I would have to unsafeCoerceP them.  This
> use of unsafeCoerceP seemed unconstrained and susceptible to misuse.

I don't think you should ever need an unsafeCoerceP just to read patches
into a sequence using createHashed.  That's why a sealed patch has only one
of its two contexts sealed.

Oh, I see the trouble:  it's that you want to constrain the final state.
This is an inherently unsafe operation, so it makes sense to use
unsafeCoerceP.  Why not just use it, rather than making the patch reading
itself unsafe?  The existing unsafeCoerceP's in read_repo and
read_tentative_repo are in exactly the place where we need an unsafe
operation.  What is wrong with this? It *is* unsafe, and needs to be
unsafe, and it's at the exact bit of code where we know that this unsafe
operation needs to be performed.  I like that a lot more than sprinkling
the unsafe operations all over the place.

Were there more unsafeCoerceP's that you needed to add? I'm really feeling
doubtful that this Context approach is the way to go.  You can't remove the
use of unsafe operations from HashedRepo, because that is where high-level
invariant properties of repositories are defined.

> The only way I could think of to eliminate this was to pass more context
> information to those functions.  As I said before, it seemed best to do
> this incrementally instead of completely changing the PatchInfo type.
> For the Context type to be useful at all I need a way to assign context
> to objects and conversely a way to remove it again.  Even if I hide the
> constructor of Context I still have to expose:
> unsafeContext :: a -> Context a C(x y)
> unsafeUncontext :: Context a C(x y) -> a

But unsafeUncontext is actually safe, only unsafeContext is unsafe.  Thus
by hiding the constructor we're able to distinguish between safe and unsafe
uses.

> Each corresponds to exactly to the constructor of Context.  These can still
> be used together in the same module to arbitrarily change the context of
> something.  That's why I didn't bother to create those two functions
> initially (I have now and I'm recording the patch as I type this).  I wasn't
> sure we gained anything.  I guess making people use the unsafe versions
> forces us to think about why and where we are adding/removing context.  I
> know this isn't as safe as our previous approaches, but without completely
> rewriting all the code that uses PatchInfo I'm not sure how else to approach
> this problem.  I don't think the Monadish approach applies here at all since
> we are doing normal IO in createHashed/parse and not patch operations.  And
> as I understand Monadish, the point is to make monadic patch operations
> safe.

> >  We could do that, but it would mean that those are unsafe functions.
> > They're already marginally dangerous, so maybe that isn't the end of
> > the world.  I'm not clear that Context gains us any real safety over
> > merely unsealing these functions.  I guess the only safety we'd have
> > would be gained by auditing the use of the Context constructor.
> 
> Someone could lie and create an invalid context using Context, but I think
> it's harder now.  And I think it's easier to create a sequence, associate a
> Context with the elements and then process that sequence correctly than it
> is to use unsafeCoerceP directly inside of read_repo_private.
> 
> Something I've learned from working pragmatically with our type witnesses is
> that it's unrealistic to think we'll ever make it impossible for people to
> introduce patch manipulation bugs.  I think the best we can get is that it's
> harder to do things dangerously than it is to do it safely.
>
> I agree that it is still important to audit the code because someone could
> always make use of unsafeCoerc#, unsafeCorceP, unsafeMap_l2f, etc.  Given
> that sometimes it is very hard to express what you mean in the type system I
> think these functions have a purpose and we just have to be cautious about
> using them.  The same will now apply to unsafeContext and unsafeUncontext
> (if they get accepted).

The key point of the type witnesses is to ensure that we can do a "grep
audit" of code: if grep unsafe Foo.lhs returns no hits, we should know that
a module is free of the classes of bugs prevented by typewitnesses.  This
allows us to reduce by an order of magnitude (or more) the amount of code
that need be carefully audited for this sort of bugs.

Thus any introduction of a mechanism for subverting the type witnesses
*needs* to be marked as unsafe.  If we can't guarantee grep-level safety,
we've gained very close to nothing but inconvenience.

> > > I think we should try to avoid it's use when we have a 'native' solution
> > > (like having contexts on patches), but I think even if we decide it
> > should
> > > be UnsafeContext that it's useful as an incremental trick.
> > >
> > > Does my explanation help?  Do you understand what I was trying to do
> > now?
> > > Do you have ideas how to improve it?
> >
> > Yes, that makes sense.  I think that either the constructor shouldn't be
> > exported, or it should be renamed to be "unsafe".  If we don't export it,
> > we'd create a function for adding contexts, which itself would be labelled
> > unsafe.  The problem is that Context (the constructor) + createHashed is
> > actually equivalent in unsafety to unsafeCoerce#, which is pretty darn
> > unsafe (i.e. we don't just lose type witness safety, we could introduce
> > segfaults).
> 
> I know that with unsafeCoerce#  you could do evil things like lie to the
> compiler and claim an Int is a String and your program would segfault (if
> you're lucky).  But, in this case, Context just manipulates phantom types
> and similarly for createHashed wrt Sealed (or not Sealed).  My understanding
> of phantom types is that at a low level it's okay to coerce them (eg, you
> won't segfault directly from that, but you may confuse the compiler and
> still get results you didn't want).  It's unclear to me how we would cause a
> segfault in this case.  I agree type safety of our witness types is
> important, but it's not clear to me how we could construct an example using
> Context and createHashed that would cause darcs to segfault.  The only
> possibility I'm thinking of is that we could write a version of
> read_repo_private that used Context to create a PatchInfo sequence that
> isn't really sequential and darcs could die when applying that sequence.  I
> believe this is equivalent to tampering with the inventory, but I don't
> think my approach to fixing read_repo_private improves or degrades our
> handling of tampered inventories.

No, there's no guarantee that Context just manipulates phantom types.
unsafeContext can create concrete types, so you could write (this is
pseudocode):

unsafeCoerce# :: a -> b
unsafeCoerce# x = fromJust $ do let myid = identity
                                    h = unsafeContext (hash myid)
                                    p = createHashed h (parse $ showPatch myid)
                                IsEq <- sloppy_identity p
                                return x

and you've got an inplementation of unsafeCoerce# (which never returns
Nothing).

> > The boring talk's over now, and I still haven't carefully read your patch.
> > But if you resend a version with an unsafe label (and be sure to think
> > about where you put the label, and don't just trust me), I'll apply it.
> 
> My patch to do that was sent in a different email.  Please let me know what
> you think.  I've put the unsafe labels where I think they belong.

I'm sorry to go back on what I've said, but I'm not going to apply this
without further discussion, since I'm increasingly doubtful of its
usefulness.

The network connection here decays in the afternoon, so I'll probably not
be online again today.
-- 
David Roundy
http://www.darcs.net