[darcs-devel] announcing darcs 2.0.0pre1, the first prerelease for darcs 2

David Roundy droundy at darcs.net
Fri Dec 14 17:18:29 UTC 2007


Thanks, Peter, for making this investigation!

On Fri, Dec 14, 2007 at 01:42:14PM +0100, Petr Rockai wrote:
> Hi,
> 
> first of all, hats off for the progress you have made!
> 
> David Roundy <droundy at darcs.net> writes:
> > The future of darcs is in the darcs-2 repository format, which features a
> > new merge algorithm that introduces two major user-visible changes
> >
> >  1. It should no longer be possible to confuse darcs or freeze it
> >  indefinitely by merging conflicting changes.  However, this feature
> >  '''needs to be tested''', so please, do your worst, and let us know how
> >  darcs can handle it!
> 
> Hmm, I have just tested the nested conflict issue. Now, the behaviour
> is *much* better in darcs-2 than it has been in darcs one. I have
> modified Pekka Pessi's misery.sh (from [darcs-users] unique features +
> exponential time issue) to compare darcs with darcs2. In current
> stable, it gets stuck for good around depth 7 or so. With --darcs-2
> repos, it manages to get through to 19 but slows down considerably
> there.
> 
> 15: 0m10.757s
> 16: 0m24.504s
> 17: 0m48.363s
> 18: 1m46.241s
> 19: 3m52.970s
> (I have killed it at this point.)

Hmmm.  I'll do some profiling and see what turns up.  Darcs is certainly
going to slow down for deeply nested conflicts, but we may be able to do
better.  Towards the end, I was pretty much exclusively focusing on getting
the semantics right, with little or no thought of efficiency (apart from
trying to ensure no worse than quadratic space use for a single conflicted
patch).

> Much better than darcs 1:
> 1, 2, 3, 4 all take around a second.
> 5: 0m9.520s
> 6: 5m11.640s (~300s)
> 7: (killed)
> 
> Here go the (rough) conflictor sizes:
> 12:06:08 | morn at eri:/tmp/misery-4005/R2 -> for i in `seq 0 19`; do darcs changes -v -p p$i$ | wc -l; done 
> 11
...
> So how it appears, the conflictor size is ~quadratic in conflict
> nesting depth?

That should be about right.  In certain easy cases we aren't quadratic, but
in the general case our storage is quadratic.  This could be improved, but
at the cost of code complexity, and possibly also time complexity.  But
it's the code complexity that has me scared, it's really hard to get this
sort of thing right (at least for me).

> Time is another matter, as from depth 15 on, it seems to be somewhere
> around 2^{depth}. However, the darcs 1 timings much more resemble double
> exponential than a simple one (ie. 2^2^{depth}).
> 
> Great!

Hmmmm.  I'm a bit disappointed in that exponential cost, I had hoped to
acheive polynomial cost.  :( But in the end, handling conflict resolutions
was trickier than I'd anticipated, and I had to make a change that I
suspect accounts for the exponential behavior.  Still, I'll do some
profiling and see if I can improve this (using the misery.sh script).

> Now the question is, what does this mean for darcs in practice. If
> people are headstrong enough, they will eventually reach the nesting
> depth of 15 or more and start to get significant slowdowns. How much
> is this a theoretical problem and how much it is a practical one, I do
> not know. However, the progress indicator work I have read somewhere
> about may further mitigate this problem.

Indeed, this is the question.  Hopefully darcs 2 will be "good enough",
although we can guarantee that for certain pathological scenarios darcs 2
will *not* be good enough.  The question is how common or important those
scenarios are.

> Another one could be printing a warning when one is reaching a
> "dangerous" size and number of conflictor patches. However, in fact,
> what I do not know is, whether an alternative conflict resolution
> scheme (one that could gather more information than just diffs) will
> further reduce the size (and possibly amount of) conflictors generated
> here.

The ultimate solution would be to enable a "unilateral" conflictresolution,
which is to say that the repository that has the conflicts in it should be
able to resolve those conflicts such that future patches from the other
repository won't be involved in the conflicts.  Unfortunately, every scheme
that I tried for this failed (most of which involved resolution by an
inverse patch), so I ended up reverting to semantics much closer to the
darcs 1 semantics, which don't allow for unilateral conflict resolution.
:( But these are semantics that are much simpler and easier to implement
(correctly).

The good news (for an "ultimate" solution) is that I've now refactored the
code such that it is very easy to pop in a new patch type, so if someone
had ideas to take this further, that would definitely be doable.
-- 
David Roundy
Department of Physics
Oregon State University


More information about the darcs-devel mailing list