[darcs-users] Getting ready to sprint, camp team

Wed Apr 15 17:01:35 UTC 2009

On Wed, Apr 15, 2009 at 03:30:02PM +0200, Florent Becker wrote:
> 
> A/ camp mailing-list versus darcs-users mailing list.
> 
> I discovered yesterday the existence of camp's own mailing list, which
>  sees very few usage. darcs-user's traffic is also not that big, and given
>  that it would be nice for everyone here to know in what shape the future
> darcs-3 is. I propose to use darcs-user for both camp and darcs for 
> normal discussion, and respectively the camp and darcs-devel lists for
> automated traffic. Is it ok with everybody?

Some people have been saying that d-u is too busy for them, and I think
that many people on d-u won't be interested in the camp talk, so I'd say
it's best on the camp list. I'll happily reply here if that's what other
people want, though.

> Side question:  does camp want a space on darcs-wiki, what'd we do with
> it?

Sounds like you have found a solution and are looking for a problem  :-)

I disabled edits on the
    http://trac.haskell.org/camp/
wiki so I wouldn't have to deal with spam.

> B/ camp as darcs-3 / hashed-storage
> 
> This leads us into the second point: do we have an estimation for the maturity
>  of camp-core (and additionally hashed-storage)? Is it the right time to
> marry them?
> Do we have a plan for the various transitions? 
> When do we start coding new features
> for camp rather than darcs? 

I can't answer all your questions, but the big things blocking camp are:
 * Completion of the correctness proof
 * Algorithm for and implementation of conflict marking
There's also a lot of other stuff that has to be written of course, but
that's all just a SMOP, and I'm not interested in spending time on that
before the above 2 are solved.

> I don't have an opinion on these questions, but I think
> the crucial points to decide are:
> -does camp become darcs-3, or do camp-ideas get ported to darcs'(s?)
> code?

Regardless of the final decision, camp-ideas should not be ported to
darcs's code at the moment.

> -does camp adopt hashed-storage? is it a goal for the sprint?

hashed-storage is currently a distraction as far as camp is concerned.
Once camp's main goals are achieved we can think about whether using
hashed storage (as a replacement or as an option) is the right way to
go.

> C/ camp-paper and camp-coq.
> 
> This situation, with 4 different sources of "truth" about camp must be resolved,

We should certainly try to merge the coq's. Actually formally linking
the Haskell to the coq seems infeasible in the short term.

> 3/ sensible sequences versus states
> 
> camp-coq does not seem to care about that issue (ie, at the moment, it
> does not encode this constraint).

Yes; I think this is the most "obviously correct" bit, so I'm leaving it
until last.

> My view on this point is:
> -camp-paper's axiom are incomplete. We need at least one more axioms
> that says that if sp and sq are sensible sequences, then so is spp^q.
> Without this axiom, nothing says that p^ undoes p, we only know its
> behaviour with respect to commutation. 

Yes. (Things like overlooking this axiom is the reason I want a coq
proof: It's too easy to miss a detail when making human proofs. More
eyes will spot more of the details, as you have done here, but nothing
beats a machine proof).

> I also think that at some point, we'll need to introduce ungrounded sensible 
> sequences, if only as a device in proofs, with the axiom that if sp and ps'
> are sensible, so is sps'.

This is only true if you have some sort of context on your sequences.

Otherwise,
    s  = addFile "foo"
    p  = addFile "bar"
    s' = addFile "foo"
is a counterexample.

> If we do so, then I think that one can show that
> there is a set of states such that each patch can be associated with a from
> and a to state, and every ungrounded sequence is sensible if and only if the
> from and to states match.
> 
> -camp-code's (and dic's) stateful patches (where states are an
> _abstract_ type) are much easier to work with in coq (and in code) 
> than a bunch of axioms, as the states prevent most mistakes.
> Note that using states makes proving that every sequence you produce is
> good automatic, whereas you need a proof with each operation when
> using sensibility.

Yes, I don't object to using states. The only reason I haven't is to
keep it simpler for now.

> 4/ Unnamed marked sequences
> 
> before introducing names (they don't exist yet in dic), dic relies on a
> concept which is not present in camp-paper, nor in camp-coq.
> 
> That concept is that of marked sequences (shuffle_mark_[1,2] in
> dic). The idea is that if s (with p marked) can be shuffled into t (with q marked),
> then when shuffling s into t, p "lands" on q. (There is also a version with
> two marks). Additionally, one will need a lossy version, with
> s (with p marked) can be turned into t (with q marked) if by shuffling
> patches in s, adding and removing patches, p "lands" on q.

This feels like an ugly kludge to me. The "obvious" human proofs for
some of the lemmas required a similar kludge, but I managed to find a
way to do the proofs that didn't need it in the end. I hope we can avoid
it with machine proofs too.

> 5/ Zigzag sequences (and their absence)
> 
> This is a point that's absent in all sources alike, but which we will need
> to take into account somehow. We can sum it up by saying that the
> subsequence pp^p is evil, and should probably not appear in a repository 

Unfortunately, you can trace patch sequences like this within a catch
repo. If you have

    A + B + AC

then you get

A             [A^, :B]                        [, A:C]
(the patch A) (The patch B, conflicts with A) (The patch C, with context A)

So to get the context of C you start at the left, and have A A^ A.

But you are right that you can't have catches
    P P^ P
in camp; or even
    P P^
for the moment.

> 6/ names as a function or a component of patches
> 
> Second, Ian's proofs that 'c' holds are difficult

We've talked about this OOB, and I think we're agreed that we need to be
able to do these proofs, because they're simpler versions of the catch
proofs that we're going to have to do.

I still think that my way, i.e. constructing named patches from unnamed
patches and names, is the best way. But even if I didn't, I think that
doing these simple proofs will help me understand how best to write the
big proofs, so I plan to continue with this.

> 7/ Where to introduce names?
> 
> This is the biggest point of divergence between my approach and
> Ian's. What I propose is quite a
> radical change, but i feel that:
> -My approach is more elegant, both for paper and computer proofs.
> -Given my experience of manipulating functions and equality in coq, it's
> good practice to try to avoid them.
> 
> Currently, we (camp-paper) only define the merge _algorithm_ and catches
> on top of named patches.
> 
> This means that we have a lot of places where we say "if the names are
> equal, do this, else do that". This is a very good way to define algorithms,
> but horrible for when we need to prove them: it does not carry what we
> gain from knowing that the two patches have
> the same name or not.
> 
> The best witness of this is the existence of patch
> universes. Universe. In the plural! That's a sign
> that something is not going right. They are not quite convincing on paper
> (try getting a paper reviewed with "the allowed operations are…"), but on
> coq, it's going to be horrible to deal with them. The simple thought that i
> might have to invert those guys makes me shiver. And them we'd have to
> reprove the axioms a third time for patches in patch universes! 

I think you may have missed the idea behind patch universes.

With unnamed patches we have some axioms, say A1 and A2.
With named patches we prove equivalent lemmas N1, N2.
With catches we prove equivalent lemmas C1, C2.

Then the patch universe says: Given a patch type P, and lemmas L1 and L2
on P, there is a patch universe in which L3, L4, L5 are true.

Roughly speaking, L1 and L2 are how triples of patches are related by
commutation, and L3+ are things like "patches in a sequence (or, indeed,
theentire universe) have a minimal context".

So the point of patch universes is that we don't have to prove that
named patches /and/ catches have minimal context, we only have to prove
that patch-like things have minimal context.

As for "operations", you can represent them as relations, in the same
way that we already represent the commute operation as a relation.

> Also, if the patch universe appears in the type of patches

I can't talk about this yet. I haven't looked into how this will best be
represented in coq.

> My proposal is to end the madness. I propose to add an axiom that says
> that we can decide whether two patches are related, and draw the consequences.

I don't want to add axioms about named patches. Apart from anything
else, if you need to make it an axiom for named patches then I can't see
how you're going to do it for catches. Or to phrase it differently: If
you can prove it for catches, I'm sure you could prove it for named
patches, so it has no business being an axiom.

Thanks
Ian