[darcs-users] darcs patch: Add LANGUAGE pragmas for explicit langua... (and 3 more)

David Roundy droundy at darcs.net
Wed Oct 29 14:46:08 UTC 2008


On Tue, Oct 28, 2008 at 09:12:53PM -0700, Jason Dagit wrote:
> On Tue, Oct 28, 2008 at 7:39 PM, David Roundy <droundy at darcs.net> wrote:
> > On Tue, Oct 28, 2008 at 03:36:32PM -0700, Eric Kow wrote:
> >> Salvo 6!
> >>
> >> Sun Oct 26 00:01:40 BST 2008  Don Stewart <dons at galois.com>
> >>   * Add LANGUAGE pragmas for explicit language extensions
> >>
> >> Sun Oct 26 16:40:09 GMT 2008  tux_rocker at reinier.de
> >>   * add a get_unrecorded_in_files to check for unrecorded changes in a subset of working directory
> >>
> >> Sun Oct 26 19:06:12 GMT 2008  tux_rocker at reinier.de
> >>   * make get_unrecorded_private work with type witnesses again
> >>
> >> Sun Oct 26 19:46:36 GMT 2008  tux_rocker at reinier.de
> >>   * make whatsnew use the lstat-saving functions to scan the working copy
> >
> > These look very appealing, and have some nice ideas, but are also buggy.  :(
> >
> >> ] hunk ./src/Darcs/Repository/Internal.lhs 417
> >>      ftf <- filetype_function
> >>      Sealed pend <- read_pending repository
> >>      debugMessage "diffing dir..."
> >> +    -- the unsafeCoerceP below is necessary to be able to concatenate
> >> +    -- pend with NilFL to form dif. See http://hpaste.org/11480
> >>      let diffs = if null files
> >> hunk ./src/Darcs/Repository/Internal.lhs 420
> >> -                  then [smart_diff opts ftf cur work]
> >> -                  else catMaybes (map (diff_at_path opts ftf cur work) (map fn2fp files))
> >> -    let workdiff = foldl (+>+) NilFL diffs
> >> +                  then smart_diff opts ftf cur work
> >> +                  else let diffsPerFile = catMaybes (map (diff_at_path opts ftf cur work) (map fn2fp files))
> >> +                       in foldr (+>+) (unsafeCoerceP NilFL) diffsPerFile
> >>          dif = if AnyOrder `elem` opts
> >
> > There may be other bugs, but this one is sufficient to cause trouble.  This
> > unsafeCoerceP really is unsafe.  Combining patches together in this way has
> > serious potential for repository corruption! In particular, if a file is
> > referenced more than once in the paths, you can get multiple copies of the
> > same change.
> 
> It's not actually the unsafeCoerceP that makes this case unsafe, and
> I'll explain why I think that.  The unsafeCoerceP is just there to
> deal with the problem in this psuedo code:
> foo :: a -> FL Prim C(x y)
> foo _ = NilFL
> foo a = -- do something legit with 'a' to create a sequence

No, the unsafeCoerceP doen't *make* it unsafe, it is however necessary
because the patch is unsafe.

> The problem being that this won't actually type check unless x = y.
> Basically, our type signature prevents foo from returning a list, such
> as the empty one, that claims to not modify context.  If we add
> unsafeCoerceP on the NilFL the problem 'goes away'.  I'm calling this
> the 'foo idiom' for now.

Right, the type signature of foo indicates that it is unsafe.

> You might argue that adding unsafeCoerceP to foo is unsafe in terms of
> Haskell's types but this argument depends on two reasons: 1) We have
> unsafeCoerce inside (=\/=) and (=/\=) and with it you can use foo to
> reimplement unsafeCoerce. 2) x and y are unconstrained by a
> relationship to the input of foo.
>
> Now, while we shouldn't encourage functions that make it possible to
> implement unsafeCoerce the above foo idiom isn't really a problem by
> itself.  The usefulness of functions like foo is evidenced by unsafe
> functions like smart_diff.  Being able to create a patch, with its
> context, 'out of thin air' is useful and we just have to be careful
> about it because of the next point.
> 
> The next level of unsafety is the one we assign to x and y.  We are
> embedding our patch theory constraints in Haskell's types and
> sometimes we can't do this flawlessly.  Sometimes what we want to do
> is type safe in Haskell's type system but does not respect the
> invariants of patch theory.  In this case the flaw is that smart_diff
> is really unsafe_smart_diff.  Our type signature for smart_diff is
> equivalent to replacing 'smart_diff' with '(unsafeCoerceP .
> smart_diff)'.

Right, smart_diff is usafe.

> Again, the problem here isn't with the unsafeCoerceP on NilFL.  Which
> is my way of saying, the problem isn't the fact that we are doing
> something that can be used to implement unsafeCoerce (while admitting
> that it is undesirable).  The real problem is that darcs has an
> invariant that our type encoding is ignorant of.  That invariant, the
> one you point out, seems to be about multiple copies of the same
> change.  I agree that we need to fix the code above to respect that
> invariant.

As I mention above, the unsafeCoerceP isn't the problem, it's only
symptomatic of the problem.  Nevertheless, paying attention to that
symptom enables us to catch problems in the code.

> > The get_unrecorded_in_files idea is great, and looks like the right sort of
> > interface, but concatenating patches from separate diffs is way too
> > fragile.  A better approach would be to pass all the interesting files into
> > a diff_at_paths.
> 
> Okay, so that pushes the invariant down a level.  I can't say for
> sure, but I think you're saying  that makes it easier to deal with.  I
> certainly can't think of a way off the top of my head to make a
> context threaded version of smart_diff.  Well, with the possible
> exception of returning Sealed (FL Prim C(x)) like we do with
> readPatch, which is flawed but harder to misuse.  And one more
> exception for dependent types; I'm pretty sure Agda could represent
> this even though I don't have an implementation to prove it.

Yes, having smart_diff and diff_at_path return Sealed sequences would
also have eliminated this possible bug--but would also have made this
feature impossible to implement.

> Generalizing your suggestion I think this is what you're saying: To
> make our flawed type encoding easier to work with, pass all the
> information necessary to construct the sequence to just one function.
> So that instead of O(n) calls to a (patch theory) unsafe function
> (smart_diff or diff_at_path) which can be threaded/concatenated in a
> (patch theory) unsafe way we make O(1) (patch theory) unsafe call(s).
> The code above is already sound in Haskell's type system (but could be
> made unsound by introducing (=\/=) or (=/\=)).
>
> Note: I'm hence forth making a distinction between things that are
> patch-theory unsafe and things that are haskell-type-system unsafe.
> Where the latter is sufficient for the former.  It would also be handy
> to make a special case for the foo idiom in the places where it is
> audited and known to be used safely.

I don't think we should ever use the foo idiom.  So far as I know,
it's never necessary, and never wise.  But perhaps you have a
counter-example.

> > I'm right now pushing a test that demonstrates this bug in whatsnew.
> > (Except it passes, since I'm not applying these patches just yet.)
> 
> Ah, tests are good.  They help us document these invariants.  Patch
> review is good, but I would certainly feel more comfortable with test
> centric patch acceptance over patch-review centric.  I certainly miss
> a lot of things when I review patches.  Also the research I read in my
> first year of graduate school, about psychology of programming,
> agrees.  If I recall correctly Nancy Pennington has some great studies
> on the subject.

I'm not sure what you mean by test-centric patch acceptance, do you
mean that we should accept buggy patches that aren't caught by the
test suite? I'd estimate that we most likely a couple of orders of
magnitude more tests before we're close to the point where the test
suite can compete with patch review in terms of safety.

> We could of course have the best of both worlds when we get the code
> more factored so that we have a small kernel that must be correct at
> all costs and libraries built on top by contributors that just want
> feature X where X is tolerant to bugs, eg., doesn't corrupt data (like
> interactive change viewing, which is buggy by the way but it won't
> cause corruption, mild confusion at best, I made a ticket for this but
> I can't find it).
> 
> > P.P.S. It's nice to see a case where the type witnesses really did catch us
> > a bug... Note that this code is right at the border between where the type
> > witnesses protect us and where they don't, since Diff itself isn't
> > protected by type witnesses.  Fortunately, it was on the safe side of the
> > border, or I might not have caught it... also fortunately, the bit on the
> > diff side of the border is sufficiently simple so as to be easily
> > reviewable.
> 
> I disagree that we are on the useful side of the type witnesses here.
> I've made my case for why above.  It is good that you saw the
> unsafeCoerceP and started investigating, but it's really the
> diff_at_path and smart_diff that should have caused the audit in this
> case.  Here I think it's unfair to say that unsafeCoerceP is the
> culprit, given it's usage.  It's solving a different problem, namely
> the foo idiom.

It's true that diff_at_path is itself unsafe, but it's also bug-free
by itself.  And as I'll point out, unsafeCoerceP isn't the culprit,
it's the symptom.  When you have to use unsafeCoerceP, the odds are
good that you're doing something wrong, which was true in this case.

> I'd argue that since I also reviewed this during the sprint but
> couldn't spot the bug that this is evidence that an invariant is
> missing from the type encoding.

True, we ought to generate sealed types with smart_diff and
diff_at_path, that would come closer to enforcing this invariant.  But
at the essence, the invariant that's violated in this code is the one
invariant that the type witnesses enforce, which is that patches
cannot be put in sequence unless they are known to be sequential.

> I'm not arguing that the type witness stuff is without use.  It's
> incredibly handy, I just think we have to be careful about attribution
> here :)

Well, I'd say that when code is made explictly unsafe because of the
type witnesses, it's done us a favor.

David


More information about the darcs-users mailing list