[darcs-users] the adventure branch

Jason Dagit dagitj at gmail.com
Fri Aug 27 04:45:59 UTC 2010


Petr Rockai <me <at> mornfall.net> writes:

> 
> Hi,

My reply is very long and I know you're busy.  I'm not actually
subscribed to darcs-users.  The fact that I have come up with a way to
reply should tell you that this thread matters a lot to me.
Therefore, I do and will appreciate the time you spend reading it and
any time you spend giving me a thoughtful reply.

> 
> it seems that we have reached a point in darcs development where it
> would make a lot of sense to start improving our general codebase and
> the libdarcs interface.

So far I'm with you.  Code quality is something I care about too.
Passionately.  I discovered darcs in part because CVS kept segfaulting
when I was importing data into a repo.  David rewrote the original
darcs (in C++) into Haskell because it was just too buggy.  My MS
project was aimed at making darcs more maintainable and less buggy.
So, not only do I care about it abstractly, I've invested a lot of
personal time and energy into that goal with respect to the darcs
code.  I hope that's something we can all say.

In other words, I'm questioning the details of what you propose very
carefully because I care very deeply.
 
> Also, there are some rewrites in store for parts of darcs functionality
> (e.g. annotate) and a port to hashed-storage 0.6 (which itself is a work
> in progress).

Oh?  Are these plans written down anywhere?  If so, I missed them.  I
was planning to rework annotate myself.  I'd hate to duplicate effort.

> We have discussed how to manage these kinds of changes review- and
> merge- wise, and our current conclusion is that it would likely be best
> to create a separate branch.

Who is "we" and where/when was this discussed?  I heard this may have
been discussed on IRC.  Could you please provide links to the relevant
discussion or summaries?

> I'm proposing for the branch to live on
> http://darcs.net/adventure. The current review team should stay
> responsible for this branch, in addition to mainline (http://darcs.net).

Okay.  You're asking the review team to split their attention between
the two?  This sounds unpleasant given that, as a volunteer-run project,
we're perpetually understaffed and human attention is one of our most
valuable resources.
 
> The main differences in the process for the adventure branch would be:
> - for mainline, we are trying hard to keep everything fully functional
>   and nearly-releasable... this requirement would not hold for
>   adventure, making it possible to merge work in progress that
>   temporarily breaks darcs

Why is it useful to allow darcs to break?  Can't you do that broken
refactor on a local feature branch?  Why does that breakage need to
propagate to an official and shared repository?

> - the kind of review we do on mainline is quite detailed -- on the
>   adventure branch, we would probably drop line-to-line reviews and
>   instead favour higher-level, "design" view

What is a higher-level design view?  Can you give me a concrete
example of how this review is different from the current reviews we
do?  By the way, I don't feel like code review is good at catching
bugs.  It's more about code quality than correctness in my opinion.
As far as I can recall, I've been saying that since we started using
code review, although not very loudly because I want to encourage code
review.
 
> The changes landing in adventure are likely to be very
> invasive. Therefore, we do not intend to ever merge adventure and the
> current mainline. Instead, applicable fixes that are merged into
> mainline should be also done on adventure. At some point, when adventure
> is no longer very adventurous and we are satisfied with its shape, we
> will swap it in for mainline.

Okay.  Now wait a second.  I'm not okay with some of the *possible*
implications of this.  There are ways to do this that I would be happy
with, but given what is described here I feel my anxiety rising and
red flags going up.
 
> I hope that this is going to be a one-off venture: one of the main goals
> of the adventure branch will be to get the darcs codebase into a state
> where further refactorings and far-reaching changes will be manageable
> within mainline.

What is the plan to achieve this goal?  The details matter a lot to
me.  Let me explain.

Every codebase has at least two types of bugs: (a) bugs you know
about, and (b) bugs you don't know about.

The process of quality assurance helps us expand (a) (and then
hopefully you fix them), while minimizing (b).  If we throw out the
current code and swap in new code, written from scratch from the
sounds of it, then we are ideally doing quality assurance with the
goal I just mentioned.  Unfortunately, it's not obvious from your
explanation that we're doing QA of the type I just described, or that
QA will happen at all on the adventure branch.  In fact, it sounds
like you're okay with things being broken there.

We might end up with cleaner, more modern code that is overall
prettier, but what effect will that have on (a) and (b)?

Let me make that more concrete.  The type-witnesses we use help to
reduce (b) by statically enforcing a particular invariant.  Picking
good libraries gives us assurance (but not guarantees) that (b) is
reduced for the functionality we draw on.  The tests we have (unit
and shell) provide even more confidence that the code is correct,
fleshing out (a) so we can reduce it.  The Coq/Isabelle work that Ian
and others have done provides assurances at the specification level.
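
To illustrate the type-witness point, here is a minimal sketch of the
idea.  It is a toy, not the darcs definitions: the patch constructors,
the state tags S0/S1/S2 and the helper names are all hypothetical.  Each
patch carries the state it applies to and the state it produces as type
parameters, so a sequence whose contexts do not line up is rejected by
the type checker instead of being found at run time.

```haskell
{-# LANGUAGE GADTs #-}

-- Toy illustration only: these are NOT the darcs definitions.
-- A patch is indexed by the state it applies to (x) and the state it
-- produces (y).
data Patch x y where
  AddFile :: FilePath -> Patch x y
  RmFile  :: FilePath -> Patch x y

-- A forward list of patches whose contexts must line up end to end.
data FL p x z where
  NilFL :: FL p x x
  (:>:) :: p x y -> FL p y z -> FL p x z
infixr 5 :>:

-- Appending only typechecks when the first sequence ends in the state
-- where the second one begins.
(+>+) :: FL p x y -> FL p y z -> FL p x z
NilFL      +>+ ys = ys
(p :>: ps) +>+ ys = p :>: (ps +>+ ys)

lengthFL :: FL p x z -> Int
lengthFL NilFL      = 0
lengthFL (_ :>: ps) = 1 + lengthFL ps

-- Hypothetical tags standing in for repository states.
data S0
data S1
data S2

p1 :: Patch S0 S1
p1 = AddFile "README"

p2 :: Patch S1 S2
p2 = RmFile "README"

-- Accepted: p1 ends in S1, which is where p2 starts.
ok :: FL Patch S0 S2
ok = p1 :>: p2 :>: NilFL

-- Rejected at compile time if uncommented: p2 ends in S2, p1 starts in S0.
-- bad = p2 :>: p1 :>: NilFL
```

The invariant costs nothing at run time, since the witnesses exist only
in the types.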

Will the adventure branch include rigorous and disciplined use of
testing?  For example, will it follow a variant of test-driven
development?  Will we write unit tests (or QC properties) for all the
code we write?  Will it require criterion micro-benchmarks?  Tests for
all the pathological things like our error/exception handlers?  Bad
input for our parsers? etc.
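
As a rough sketch of the kind of property I have in mind, here is a toy
model (hypothetical names, not darcs' patch types) in which a patch
inserts one character into a String.  The two properties say that
commuting a pair preserves its combined effect and that commuting twice
gives back the original pair.  To stay dependency-free the sketch checks
them exhaustively over a small input space; with QuickCheck the same
functions could be handed to quickCheck, given a generator that only
produces valid indices.

```haskell
-- Toy model only: hypothetical names, not darcs' patch types.
-- A patch inserts one character at a given index of a String.
data Ins = Ins Int Char deriving (Eq, Show)

apply :: Ins -> String -> String
apply (Ins i c) xs = let (pre, post) = splitAt i xs in pre ++ c : post

-- commute (p1, p2) turns "p1 then p2" into an equivalent "p2' then p1'".
-- In this toy model commutation never fails.
commute :: (Ins, Ins) -> (Ins, Ins)
commute (Ins i a, Ins j b)
  | j <= i    = (Ins j b, Ins (i + 1) a)
  | otherwise = (Ins (j - 1) b, Ins i a)

-- Property 1: commuting does not change the combined effect.
prop_sameEffect :: String -> Ins -> Ins -> Bool
prop_sameEffect xs p1 p2 =
  let (p2', p1') = commute (p1, p2)
  in apply p2 (apply p1 xs) == apply p1' (apply p2' xs)

-- Property 2: commuting twice gives back the original pair.
prop_involution :: Ins -> Ins -> Bool
prop_involution p1 p2 = commute (commute (p1, p2)) == (p1, p2)

-- Exhaustive check over small inputs, restricted to indices that are
-- valid for the string each patch is applied to.
checkSmall :: Bool
checkSmall = and
  [ prop_sameEffect xs p1 p2 && prop_involution p1 p2
  | xs <- ["", "x", "xy", "xyz"]
  , i  <- [0 .. length xs]
  , j  <- [0 .. length xs + 1]
  , let p1 = Ins i 'a'
  , let p2 = Ins j 'b'
  ]
```

The real commute is partial and far more intricate, which is exactly
why properties like these are worth stating up front.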

I assume the final swap won't happen unless the adventure branch can
pass the shell tests that we have at the time in the current
mainline. I think that's a low bar and we should intentionally plan
from the start to have a MUCH higher bar.

What about formal methods?  Haskabelle is not ready for production use
so we can't use it.  Dependently typed programming in Haskell is a bit
of a pain (see our type-witnesses for an example), but maybe we could
automate it using SHE [1]?  We are keeping the type witnesses, right?
Agda isn't too dissimilar from Haskell.  We could write the algorithms
in Agda and translate the implementation to Haskell.

I was working on a type-threaded version of iteratee to see if we
could use it for resource-controlled processing of patch streams.  I'm
not sure we need the type threading on iteratees, but I thought maybe
if I had that library I could see where it makes sense to use it.  The
problem I had with it was that chunking doesn't fit well with the type
witnesses, which meant I couldn't write the generic stream
transformer that appears in John Lato's iteratee library.  I could
probably fix this in my iteratee library by fixing the chunk size to a
single element.  I just need to reinvestigate it.

I was also working on a type-threaded zipper that uses a monad on the
ends (say, IO).  The goal would be a monadic, cursor-based traversal of
patch sequences that exposes commute at the cursor location.  Then you
could do interesting things like write the commuted patches to disk in
a temp location instead of needing to hold them in memory, perhaps
using weak pointers to reduce disk IO.  It might look something like
this:

data ZipperFL a d where
  ZFL :: IO (FL Patch C(a b))    -- ^ head of the stream
      -> (Patch :> Patch) C(b c) -- ^ Focus
      -> IO (FL Patch C(c d))    -- ^ tail of the stream
      -> ZipperFL a d

Then you'd have functions like:
commute :: ZipperFL a e -> Maybe (ZipperFL a e)
forward :: ZipperFL a e -> ZipperFL a e

Which act on the patches at the focus of the zipper.  We could easily
parameterize this over any monad if we want to make testing easier.
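
A minimal, self-contained sketch of that parameterization follows.
Everything in it is an assumption made for illustration: the Pair type
stands in for (Patch :> Patch), the Commute class and the Toy patch are
hypothetical, and the toy commute rule is arbitrary.  The point is only
that with the monad as a type parameter, a test can instantiate it with
Identity and never touch IO.

```haskell
{-# LANGUAGE GADTs #-}
import Data.Functor.Identity (Identity (..))

-- Forward list of patches with matching contexts (toy version).
data FL p x z where
  NilFL :: FL p x x
  (:>:) :: p x y -> FL p y z -> FL p x z
infixr 5 :>:

-- Two adjacent patches; stands in for (Patch :> Patch) in the sketch above.
data Pair p x z where
  Pair :: p x y -> p y z -> Pair p x z

-- The zipper, now abstract in both the monad m and the patch type p.
data ZipperFL m p a d where
  ZFL :: m (FL p a b)  -- head of the stream
      -> Pair p b c    -- focus
      -> m (FL p c d)  -- tail of the stream
      -> ZipperFL m p a d

-- Hypothetical class: patch types that may commute pairwise.
class Commute p where
  commute :: Pair p x z -> Maybe (Pair p x z)

-- Commute the two patches at the focus; the ends are left untouched,
-- so no monadic action is run here.
commuteFocus :: Commute p => ZipperFL m p a d -> Maybe (ZipperFL m p a d)
commuteFocus (ZFL before focus after) =
  fmap (\focus' -> ZFL before focus' after) (commute focus)

-- Toy patch for tests: a name with phantom contexts.  As an arbitrary
-- rule, two patches commute unless they carry the same name.
data Toy x y = Toy String

instance Commute Toy where
  commute (Pair (Toy s) (Toy t))
    | s == t    = Nothing
    | otherwise = Just (Pair (Toy t) (Toy s))

focusNames :: ZipperFL m Toy a d -> (String, String)
focusNames (ZFL _ (Pair (Toy s) (Toy t)) _) = (s, t)

-- A pure zipper for testing: Identity replaces IO at both ends.
example :: ZipperFL Identity Toy a d
example = ZFL (Identity NilFL) (Pair (Toy "p") (Toy "q")) (Identity NilFL)
```

A forward step would presumably have to run the tail action to fetch
the next patch, so its result would live in m; that part is left out
here.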

There is also the type-indexed monad (RIO) stuff that David and I
started but put on hold.  I think recently Ganesh was interested in it
again.

How would those design experiments fit into your adventure branch?

Another thing to consider is that there are lots of arguments
available, if you search for them on Google, explaining why it's bad
to throw away your code and start over.  I really hope that's not our
plan.  The type-witnesses were not easy to get integrated and yet
we came up with a plan that let us put them in incrementally.  What I
don't know, because I missed out on the previous discussions, is why
it is not feasible to refactor the current code to include lots of QC
properties and other QA, and then refactor mercilessly.

In conclusion:

If we make correctness a priority from the BEGINNING of the adventure
branch and have test-driven development (or formal-methods-driven
development) requirements on all code that goes in, I will feel quite
happy about the branch.  I'll be telling everyone to use it once it's
merged back in.  I'll be singing the songs of darcs praise.  Happy
lispy will be happy :)

If we instead just do some testing at the end of the adventure branch
using our current test suite, then I'll be dragging my feet.  I'll be
quite afraid of the new branch and probably stick with its predecessor
for a few stable release cycles.  I'll be telling my friends to stay
away from it too.  Basically, I'll be sad.  Don't make lispy sad :(

Therefore, I really hope we can use the adventure branch as a chance
to make a cultural shift to evidence-based correctness in all the
patches that we accept to darcs.

Thanks,
Jason

[1] http://personal.cis.strath.ac.uk/~conor/pub/she/


