[darcs-users] darcs patch: Begin using RIO (and 39 more)

Wed Sep 10 20:06:03 UTC 2008

On Wed, Sep 10, 2008 at 7:48 AM, David Roundy <daveroundy at gmail.com> wrote:
> On Wed, Sep 10, 2008 at 10:46 AM, David Roundy <daveroundy at gmail.com> wrote:
>> The bug *here* is basically that _foo is a non-type-safe function.
>> It's exporting values with arbitrary type witnesses, which means that
>> it is equivalent to unsafeCoerceP.
>>
>> unsafeCoerceP :: forall a b. a -> b
>> unsafeCoerceP x = case foo1 =\/= foo2 of
>>                                IsEq -> x
>>                                _ -> impossible
>>   where foo1 :: Maybe (FL Patch C(() a))
>>             foo1 = _foo "input" -- where I give it any input that
>> leads to a Just value.
>>             foo2 :: Maybe (FL Patch C(() b))
>>             foo2 = _foo "input" -- where I give it any input that
>> leads to a Just value.
>
> Sorry, this should have been:
>
> unsafeCoerceP :: forall a b. a -> b
> unsafeCoerceP x = case foo1 =\/= foo2 of
>                               IsEq -> x
>                               _ -> impossible
>  where foo1 :: FL Patch C(() a)
>            Just foo1 = _foo "input"
>            foo2 :: FL Patch C(() b)
>            Just foo2 = _foo "input"

Well, the exact definition of _foo that I gave above would always
error at run-time because it only returns Nothing, but we could
imagine making a version that always resulted in a sequence with the
same ending context given the same input.  In that case, a and b
really would be equal.  In that case I would argue that one of the
flaws in our encoding is that you can use types besides () and
existential types as contexts.

This also leads me to the observation that one of the problems here is
that we essentially export access to unsafeCoerce through (=\/=).  We
do so because that's a useful thing to do given our encoding of
contexts into Haskell's type system.  Granted, we've exported it in
such a way that it is heavily constrained but as the examples have
shown there are multiple ways to get back to unsafeCoerce by starting
with (=\/=) or by instantiating phantom types in an undisciplined way.
 This seems to boil down to cases where people could instantiate the
types using concrete types.  In the context that _foo came up, we
could return a patch sequence from _foo because it calls functions
that create patches.  Instead of creating patches with sealed types we
have chosen to make functions, like smart_diff and hence
get_force_replace, that return patches with phantoms that are
completely unconstrained and non-sealed.  As you mentioned, having a
context on Slurpies could help but we don't have that and it would
take a lot of time.

I know we discussed this sort of thing in the past and as I recall the
conclusion was that if we returned fully sealed types for functions
like smart_diff that it becomes problematic for the caller.  The
caller would have to unsafeCoerceP the patches to a specific context.
So, why have we chosen to return arbitrary contexts that allow for
unsafeCoerce implementations when we could go with Sealed types and
judicious use of unsafeCoerceP?  My understanding is by returning
arbitrary contexts the caller doesn't have to actually call
unsafeCoerceP to get the patches to a specific context.  This of
course makes such functions unsafe because we've exported unsafeCoerce
as (=\/=).  In my opinion it's harder to tell though as we don't have
the explicit unsafeCoerceP at the call site.  This reminds me of the
way that getContents is really an unsafe function because of it's lazy
nature, but you don't immediately see the problem when you first use
getContents because nothing about it's type or name tells you that
it's unsafe and using unsafeInterleaveIO internally.

For people other than David following along, the reason sealed types
work around this is because they enforce the thing I mentioned in the
first paragraph by making the available phantoms into existential
types.  As this example shows:

_foo :: String -> Sealed (FL a C(x))

unsafeCoerce2 :: forall a b. a -> b
unsafeCoerce2 x = case foo1 =\/= foo2 of
                              IsEq -> x
                              _ -> impossible
  where foo1 :: FL Patch C(() a)
           Sealed foo1 = _foo "input"
           foo2 :: FL Patch C(() b)
           Sealed foo2 = _foo "input"

Ignoring the error we'd get about GHC's brain exploding (which we
could avoid by introducing some do-notation instead of binding foo1
and foo2 in a where-clause), the above wouldn't type check because the
type checker wouldn't let me unify the existential types with the
rigid types in the forall.  So this implementation of unsafeCoerce2
wouldn't work.

Now that I understand the problem better, I feel even more strongly
that sealing things like smart_diff is the right thing to do, even if
it requires some manual proofs or unsafeCoercePs when doing things
like reading a sequence of patches from the inventory.  My reasoning
being that by sealing everything we have a pessimistic system.  We're
forced to explicitly state in the code when the sealed types should be
equal instead of having some places where we're optimistic and allow
the types to unify freely.  This freedom to unify with whatever leads
people like me to define unsafe functions without realizing it.  The
downside to the pessimistic approach is that you could incorrectly
apply unsafeCoerceP, but at least I know I'm potentially making a
mistake and where explicitly.  It's even a greppable offense.

This seems to be the place where you argue that needing an
unsafeCoerceP in the high level code is a bug and hence sealing
smart_diff leads to buggy high level code.  Whereas I feel that
functions like smart_diff have the same "bug" but hide it more subtly.
 One thing I can't tell is when you'll say something is unsafe and
acceptable and when something is unsafe and unacceptable.  For
example, we could implement unsafeCoerce above with smart_diff and yet
we allow smart_diff and try to remember that it's unsafe and only use
it when we have to.  Maybe this has something to do with where in the
source we've defined the unsafe functions?  I don't really know.  This
is also why I need you to comment on changes that I'm proposing; I
can't read your mind about these issues.  Also, the unsafeCoerce
implementations seem to depend on us importing something that uses
unsafeCoerce internally, so your example unsafeCoerce wouldn't be
possible in Replace.lhs currently.  Additionally, sequenceRepls and
friends are defined in a where-clause and used only once as you argue
as a reason to allow them to return a freely unifying type.

Jason