[darcs-devel] announcing darcs 2.0.0pre1, the first prerelease for darcs 2

David Roundy droundy at darcs.net
Fri Dec 14 21:56:03 UTC 2007


On Fri, Dec 14, 2007 at 10:15:13PM +0100, Alexander Staubo wrote:
> On 12/14/07, David Roundy <droundy at darcs.net> wrote:
> > On Fri, Dec 14, 2007 at 01:33:33PM +0000, Simon Marlow wrote:
> > > I guess I don't understand why optimize is exposed to the user at all. If
> > > there's an optimal state for the repository, why can't it be maintained in
> > > that state?
> >
> > It's because it could cost O(N*T) (where N is the number of patches since
> > the last already-identified in-order-tag repository and T is the number of
> > out-of-order tags in this set) to find out if there is a more optimal state
> > than the current state.  We *could* make every darcs operation O(N) in the
> > hope of making N smaller (where many of them are now O(1)), but that
> > doesn't seem like a good direction to go.  On the other hand, maybe the
> > additional simplicity would be worth the performance penalty.  Perhaps we
> > should optimize whenever we perform an O(N) command.  As it is, this
> > optimization is only performed when actually creating a tag.
> 
> The problem with the current system is that (1) you have to create a
> tag, and (2) you have to manually optimize -- and as I understand it, you
> have to do this on both the remote and the local repository to speed
> up pulls and pushes.

You don't need to call optimize on the repository in which the tag is
created, since creating the tag already performs that optimization, and you
shouldn't need to run optimize very often anywhere else.

> In my experience, Darcs pulls/pushes speed up dramatically if you
> optimize, but performance starts to trail off once you start adding
> patches, and quickly becomes painfully slow. Eric Kow's advice is to
> optimize every 30 or so patches, which seems to match my experience.

With hashed repositories, this shouldn't be a big issue for pushes and
pulls, at least if you're dominated by network IO costs (which is often the
case), because patches that you've already seen won't need to be downloaded
again.
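
Roughly, that works because patches are addressed by hash.  Here is a toy
sketch (hypothetical names, not the actual darcs code) of why repeated
pulls stay cheap:

  import qualified Data.Set as Set

  type Hash = String

  -- 'have' is the set of patch hashes already cached locally; 'want'
  -- is the remote repository's inventory.  Only the difference ever
  -- crosses the network.
  toDownload :: Set.Set Hash -> [Hash] -> [Hash]
  toDownload have want = filter (`Set.notMember` have) want

  -- Given some download action, the network cost of a pull scales
  -- with the number of new patches, not the size of the repository.
  pull :: (Hash -> IO ()) -> Set.Set Hash -> [Hash] -> IO ()
  pull fetch have want = mapM_ fetch (toDownload have want)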

> That's a maintenance annoyance for a team of 5 developers working
> full steam on a single repository. So I toyed with the idea of setting
> up a cron job to tag and optimize the centralized trunk repository
> when the number of patches since the last tag exceeded 30, but that
> doesn't solve the need to optimize locally.
> 
> The second issue is that the tags on which to hook the optimize are
> completely artificial. We have to call it something like
> "repository_optimization". It's just something more to clutter the
> change history.

I agree that adding artificial tags is a bad idea, and I don't think a tag
every 30 changes is really needed.  It will partly depend on how large your
changes are.  But I absolutely agree that only tags that "mean something"
should be created.  It's not a bad idea, however, to occasionally create
tags like "this version compiles and passes tests", which convey useful
information but aren't necessarily product releases.

> Anything to make Darcs auto-optimize would make me happy. It could be
> a user preference setting for all I care, as long as I can turn it on
> permanently. I don't see why Darcs has to optimize before *every* O(N)
> command -- it can surely optimize now and again when it seems there
> would be a performance gain.

The problem is that optimizing is hardly any more expensive than finding
out whether there would be a performance gain, at least in the simple cases
where there is no gain to be found.
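
To make that concrete, here is a toy sketch in Haskell (hypothetical types,
nothing like our actual inventory code) of what the check involves:

  import Data.List (tails)
  import qualified Data.Set as Set

  type PatchId = String

  -- Toy model: an inventory entry is a plain patch or a tag that
  -- records the set of patches it depends on.
  data Entry = Patch PatchId | Tag PatchId (Set.Set PatchId)

  entryId :: Entry -> PatchId
  entryId (Patch i) = i
  entryId (Tag i _) = i

  -- A tag is "in order" (clean) when it depends on every entry that
  -- precedes it, so the inventory could be cut at that point.
  cleanAt :: [Entry] -> Bool
  cleanAt (Tag _ deps : earlier) =
      all (\e -> entryId e `Set.member` deps) earlier
  cleanAt _ = False

  -- Entries are given oldest-first; we scan from the newest end.
  -- Checking each of the T tags against up to N earlier entries is
  -- O(N*T), and it is the same traversal that writing the shortened
  -- inventory would do anyway.
  latestCleanTag :: [Entry] -> Maybe PatchId
  latestCleanTag entries =
      case filter cleanAt (tails (reverse entries)) of
        ((t:_):_) -> Just (entryId t)
        _         -> Nothing

Finding the newest clean tag walks the same entries that the optimization
itself would, so there is no cheap "should we optimize?" test to hide
behind.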
-- 
David Roundy
Department of Physics
Oregon State University

