[darcs-users] add/record time

David Roundy droundy at abridgegame.org
Sat Jul 26 12:50:09 UTC 2003


On Fri, Jul 25, 2003 at 03:57:40PM -0700, Trevor Talbot wrote:
> I'm just starting out with darcs (0.9.11), and I'm trying to get a 
> handle on expected performance.
> 
> Is the sequence
> 
>   darcs add -r ... [~7000 files]

One of the biggest remaining issues with darcs is certainly its speed when
scaling to large repos.  I very much want to make this work, but getting
good performance out of Haskell is hard (which is not to say impossible,
just a lot of work).

I believe the relevant number (as far as time goes) is likely to be the
total size of the files being added rather than the number of files.  A
record of more than about 2MB will take a "long time."
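
To check which side of that rough threshold a tree falls on, one can add up
the file sizes before running the add.  The following is a minimal Python
sketch, not part of darcs; the function name and the exact byte value used
for "about 2MB" are assumptions made for illustration.

```python
import os

# "About 2MB" from the message above; the exact byte count is an assumption.
ROUGH_THRESHOLD_BYTES = 2 * 1024 * 1024


def total_size_bytes(root):
    """Sum the sizes of all regular files under root (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def likely_slow(root):
    """True if the tree is larger than the rough threshold."""
    return total_size_bytes(root) > ROUGH_THRESHOLD_BYTES
```

Calling `total_size_bytes(".")` in the directory to be added gives the number
to compare against; the file count is ignored on purpose.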

>   darcs record -va
> 
> expected to take "a long time"?

I definitely do expect it to take a long time (although it shouldn't).  I'd
estimate on the order of hours, depending on how big your files are.  I've
had records smaller than this take 12 hours, but I've improved things a lot
since then, so I wouldn't guess more than a few hours.

> What is it supposed to be doing after 
> "I've gotten unrecorded."?  It doesn't appear to be doing much of 
> anything besides using CPU cycles, is why I ask.

It's a bit hard to say what exactly is being done when, since Haskell is a
lazy language.  I think it is reading in all the files and parsing them
(breaking them into lines) and getting ready to format them as patches to
output.  I'm going to start running some tests to see what's holding things
up.  I went through this once a month or two ago and made lots of fixes,
but at the end it still took maybe three or four hours to record the add of
a large directory.  Since then I've made a number of improvements in
efficiency, so it's worth going back to see what the limiting factor is
now.

> Also, the default expressions in _darcs/prefs/boring don't work with a 
> recursive add very well.  I just changed them to things like 
> (^|/)CVS(/|$) etc.

Ah yes.  I think I must have changed the algorithm that uses these regexps
and forgotten to modify the regexps themselves.  Originally they were
intended to match not the whole filepath but just each segment thereof.
Thanks for pointing this out!  (It's fixed now in the latest darcs.)
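
The mismatch between per-segment and whole-path matching can be illustrated
with a minimal Python sketch.  This is not darcs code: the segment-style
pattern `^CVS$` is a hypothetical stand-in (the actual defaults are not
quoted in this thread), while `(^|/)CVS(/|$)` is the pattern from Trevor's
message.

```python
import re

# Hypothetical segment-style pattern: written to be tested against one
# path segment at a time.  The real darcs defaults are not quoted here.
SEGMENT_STYLE = re.compile(r"^CVS$")

# Pattern from the quoted message: written to be tested against a whole path.
PATH_STYLE = re.compile(r"(^|/)CVS(/|$)")


def boring_by_segment(path, pattern):
    """Test the pattern against each '/'-separated segment of path."""
    return any(pattern.search(segment) for segment in path.split("/"))


def boring_by_path(path, pattern):
    """Test the pattern once against the whole path."""
    return pattern.search(path) is not None


if __name__ == "__main__":
    for p in ["src/CVS/Entries", "CVS", "src/CVSROOT.txt", "docs/readme"]:
        print(p,
              boring_by_segment(p, SEGMENT_STYLE),
              boring_by_path(p, SEGMENT_STYLE),
              boring_by_path(p, PATH_STYLE))
```

A segment-style pattern applied to a whole path misses `src/CVS/Entries`,
which is the kind of failure a recursive add would expose; the `(^|/)...(/|$)`
form recovers the segment boundaries inside a full path.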

> It might be a good idea to avoid recursing into boring directories.

That's something that's already fixed in the latest darcs.  :)
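
For readers curious what skipping boring directories looks like in general,
here is a minimal Python sketch of a pruned recursive walk.  It is only an
illustration of the idea and says nothing about how darcs implements it; the
`BORING` list and the function names are assumptions.

```python
import os
import re

# Illustrative boring list, using the whole-path pattern from the thread.
BORING = [re.compile(r"(^|/)CVS(/|$)")]


def is_boring(relpath):
    """True if any boring pattern matches the '/'-separated relative path."""
    return any(pattern.search(relpath) for pattern in BORING)


def _rel(rel_dir, name):
    """Join a directory (relative to the root) and a name using '/'."""
    joined = name if rel_dir == "." else os.path.join(rel_dir, name)
    return joined.replace(os.sep, "/")


def interesting_files(root):
    """List non-boring files under root without descending into boring dirs."""
    result = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        # Editing dirnames in place makes os.walk skip the removed
        # directories, so their contents are never listed or read.
        dirnames[:] = [d for d in dirnames if not is_boring(_rel(rel_dir, d))]
        result.extend(_rel(rel_dir, f) for f in filenames
                      if not is_boring(_rel(rel_dir, f)))
    return sorted(result)
```

Pruning at the directory level means the cost of a boring subtree is a single
pattern check rather than one check per file inside it.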
-- 
David Roundy
http://www.abridgegame.org
