[darcs-users] darcs2 much slower than darcs1 on big repo

Alan Bram alan.bram at oracle.com
Thu Aug 14 18:41:06 UTC 2008


Hi Jason and Max,

Thanks for your responses.

From Jason:

> > As an experiment, I created a new repo with "darcs init --darcs-2",
> > and then simply imported the latest versions of all of the files and
> > directories.  In other words, no history: this new repo has exactly
> > one patch in it.
> 
> Presumably you want the history though right?  Did you do it this way just
> as a benchmark test?

Yes.  I first tried (1) running darcs2 directly on my old repo, and
(2) converting my old repo.  But my repos have a lot of history, and a
couple of the patches are huge, so I thought the simpler experiment
might be less demanding on darcs.

> We've identified an issue with some versions of GHC producing darcs binaries
> that are really slow at whatsnew.  It seems that ghc 6.6.x and ghc 6.8.3 are
> not affected.  Specifically, ghc 6.8.2 seems to generate the slow darcs on
> several platforms.  It would be good if we could rule out this possibility.

Hmm, that may well be it.  I'm using the darcs2 that is packaged with
Debian Lenny.  I don't know exactly how it was built (is there some
sort of "darcs --version" command that can tell?).  But I do also know
that the version of GHC that ships with Lenny is 6.8.2 -- so that
sounds like a pretty good guess.

I guess I could try downloading and building GHC 6.8.3 first, and then
building darcs with it.  If I do that, should I use the darcs 2.0.2
release, or get the very latest darcs from its repo?  (Except that
building darcs from its repo rather than a tarball is probably harder,
right?)

> I think the recommended way to get a darcs2 repository would be to use the
> 'darcs convert' command.  Have you tried this, and if so, is it better?
> worse? about the same?

From what I could tell, it was a little worse.


From Max:

> 3500 files in 300 directories?  Wow.
> 
> Does this repository happen to be publicly shared or could this
> repository be publicly shared?  Darcs is looking for "benchmark"
> repositories for performance testing and particularly performance
> regression testing.

The repo isn't public.  But if you wanted to try the simple initial
import experiment (see attached listing), you could get the published
released version of the tarball:

    http://download.oracle.com/berkeley-db/db-4.7.25.tar.gz

Note that I'm not complaining about the time it takes to do the
initial import.

> A better experiment is to try:
> 
> darcs get --hashed current-repository hashed-repository
> 
> A deeper experiment would be to try:
> 
> darcs convert current-repository darcs2-repository

Neither of those resulted in any improvement, as far as I could tell.
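
For anyone who wants to repeat the comparison, here is a minimal sketch
of how the same command could be timed in each repository.  The helper
name time_in_repo is hypothetical (it is not part of darcs), and it
only measures whole seconds via "date +%s", which should be enough for
run times of around ten seconds.

```shell
# time_in_repo DIR CMD [ARGS...]
# Hypothetical helper: run CMD inside DIR, discard its output, and print
# "DIR: Ns (exit STATUS)" with the elapsed wall-clock time in whole seconds.
time_in_repo() {
    dir=$1
    shift
    start=$(date +%s)
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$dir" && "$@" >/dev/null 2>&1 )
    status=$?
    end=$(date +%s)
    echo "$dir: $((end - start))s (exit $status)"
    return "$status"
}
```

Assuming the directory names from Max's examples, it might be used as:

    time_in_repo current-repository darcs whatsnew -q
    time_in_repo hashed-repository darcs whatsnew -q
    time_in_repo darcs2-repository darcs whatsnew -q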


Cheers,
 - arb
-------------- next part --------------
$ wget -q http://download.oracle.com/berkeley-db/db-4.7.25.tar.gz
$ tar xzf db-4.7.25.tar.gz 
$ cd db-4.7.25
$ darcs init --darcs-2
$ #
$ # our docs source files look like *.so, and we publish tags file
$ #
$ sed -e 's/|so|/|/' -e '/tags/d' _darcs/prefs/boring >/tmp/myboring
$ mv /tmp/myboring _darcs/prefs/boring
$ darcs add --recursive *
Skipping boring file _darcs
Skipping boring file _darcs/format
Skipping boring file _darcs/hashed_inventory
Skipping boring file _darcs/lock
Skipping boring file _darcs/patches
Skipping boring file _darcs/patches/pending.tentative
Skipping boring file _darcs/prefs
Skipping boring file _darcs/prefs/binaries
Skipping boring file _darcs/prefs/boring
Skipping boring file _darcs/prefs/motd
Skipping boring file _darcs/pristine.hashed
Skipping boring file _darcs/pristine.hashed/da39a3ee5e6b4b0d3255bfef95601890afd80709
Skipping boring file _darcs/tentative_hashed_inventory
Skipping boring file _darcs/tentative_pristine
$ darcs record -a -m 'initial import' --skip-long-comment -q -A me
Finished recording patch 'initial import'
$ time darcs changes
Thu Aug 14 11:27:45 PDT 2008  me
  * initial import

real	0m9.855s
user	0m9.985s
sys	0m1.032s
$ #
$ #   ... (add a couple of lines to repmgr/repmgr_net.c)
$ #
$ time darcs whatsnew -q
hunk ./repmgr/repmgr_net.c 23
+ *      4 bytes         - size of other
hunk ./repmgr/repmgr_net.c 26
+ *      ? bytes         - other

real	0m9.791s
user	0m9.905s
sys	0m1.116s
$ uname -a
Linux pistil 2.6.24-1-686 #1 SMP Thu May 8 02:16:39 UTC 2008 i686 GNU/Linux
$ darcs --exact-version
darcs compiled on Jul 26 2008, at 14:32:06
# configured Sat Jul 26 14:09:28 CEST 2008
./configure /usr/local/share/config.site /usr/local/etc/config.site

Context:

[TAG debian: 2.0.2-2
me at mornfall.net**20080726120856] 

$ cat /etc/apt/sources.list
# 
# deb cdrom:[Debian GNU/Linux LennyBeta2 _Lenny_ - Official Beta i386 NETINST Binary-1 20080608-11:24]/ lenny main

#deb cdrom:[Debian GNU/Linux LennyBeta2 _Lenny_ - Official Beta i386 NETINST Binary-1 20080608-11:24]/ lenny main

deb http://ftp.us.debian.org/debian/ lenny main contrib non-free
deb-src http://ftp.us.debian.org/debian/ lenny main contrib non-free

deb http://security.debian.org/ lenny/updates main contrib non-free
deb-src http://security.debian.org/ lenny/updates main contrib non-free


