[darcs-users] darcs-2 performs really well for the "darcs get" use case

zooko zooko at zooko.com
Tue Apr 22 23:00:11 UTC 2008


Folks:

We use darcs to manage our source code in the http://allmydata.org  
project (it is an open source, secure, decentralized file system).   
Our trunk repository [1] currently has 2,484 patches in it.  The  
current version of the source code has 269 files totalling 13 MiB
(some of the files are binaries) and around 48,000 lines in
the non-binaries.  The _darcs/patches directory (which contains all  
of the patches, gzipped) takes 40 MiB of disk space.

We have automated builds and tests using buildbot, so this was an  
opportunity for me to benchmark different versions of darcs.

The buildslave that I used is a Celeron Coppermine at 564 MHz running  
Ubuntu Dapper [exhibit 2].  Originally it was running darcs-1.0.5  
(that's what comes with Ubuntu Dapper), and a "darcs get --partial"  
of our source code over HTTP took 286 seconds [exhibit 3].

Next I installed darcs-1.0.9 -- the final release in the darcs-1  
line.  A "darcs get --partial" took 308 seconds [exhibit 4].  (I  
didn't try this experiment enough times to determine if the  
difference between darcs-1.0.5 and darcs-1.0.9 was merely jitter in  
the network or the machine load.)

Next I installed darcs-2.0.0.  A "darcs get --partial" took 93  
seconds [exhibit 5].

Next I configured it to do its darcs get from a hashed-format  
repository instead of an old darcs-1-format repository as described  
in the darcs manual [6].  A "darcs get --partial" took 6.47 seconds  
[exhibit 7].

Next I configured it to use a "global cache" as described in the  
darcs manual [8].  The global cache was not populated yet, of course,  
so the next "darcs get --partial" did not benefit from it, and indeed  
took 7.19 seconds to run and to populate the global cache [exhibit 9].

Finally, I ran it again with the global cache having been populated  
in the previous run.  This time "darcs get --partial" took 3.85  
seconds [exhibit 10].
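
To put the numbers above side by side, here is a small sketch that
computes each run's speedup relative to the darcs-1.0.5 baseline. The
timings are the ones reported in this message; the labels and the
speedups() helper are only illustrative names, not part of darcs or
buildbot. Each figure comes from a single run, so treat the ratios as
rough.

```python
# Wall-clock times in seconds for "darcs get --partial", as reported above.
# Each value is from a single buildbot run, so the ratios are approximate.
timings = [
    ("darcs-1.0.5, darcs-1-format repo", 286.0),
    ("darcs-1.0.9, darcs-1-format repo", 308.0),
    ("darcs-2.0.0, darcs-1-format repo", 93.0),
    ("darcs-2.0.0, hashed repo", 6.47),
    ("darcs-2.0.0, hashed repo, empty global cache", 7.19),
    ("darcs-2.0.0, hashed repo, populated global cache", 3.85),
]


def speedups(rows):
    """Return (label, seconds, speedup relative to the first row)."""
    baseline = rows[0][1]
    return [(label, secs, baseline / secs) for label, secs in rows]


if __name__ == "__main__":
    for label, secs, factor in speedups(timings):
        print(f"{label:50s} {secs:8.2f} s  {factor:6.1f}x")
```

By this arithmetic, darcs-2.0.0 alone is roughly 3 times faster than
darcs-1.0.5 on this workload, the hashed repository brings that to
roughly 44 times, and the populated global cache to roughly 74 times.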

Morals of the story:

1.  Upgrade from darcs-1 to darcs-2.

2.  Start using hashed-format repositories.

3.  If you don't mind having only a partial copy of history, in order  
to have faster "darcs get", then use "darcs get --lazy" (which is the  
preferred spelling for "darcs get --partial" in darcs-2).

4.  Whether or not you are using --lazy, enable a global cache.  A  
global cache can speed up other operations in addition to "get",  
including working on different branches.

5.  If you have a workload that is important to you other than "darcs  
get", then try an experiment like this one on your workload and  
report your results to darcs-users at darcs.net.  :-)
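
For moral 5, a minimal timing harness might look like the sketch
below. It is not what buildbot does and not part of darcs; the
time_command() helper and the placeholder command are assumptions for
illustration. Running the command several times also helps separate
real differences from network or load jitter, which the single runs
above could not do.

```python
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the wall-clock seconds of each run."""
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command fails,
        # so a failed run is never counted as a timing.
        subprocess.run(cmd, check=True)
        elapsed.append(time.perf_counter() - start)
    return elapsed


if __name__ == "__main__":
    # Placeholder workload: a Python process that does nothing.
    # Replace it with the command you actually care about.
    cmd = [sys.executable, "-c", "pass"]
    times = time_command(cmd)
    print("runs:", ", ".join(f"{t:.2f} s" for t in times))
    print(f"min: {min(times):.2f} s   max: {max(times):.2f} s")
```

If the workload is a "darcs get", each run needs a fresh target
directory, since the command will not overwrite an existing one.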

Regards,

Zooko

[1] http://allmydata.org
[2] http://allmydata.org/buildbot/waterfall?builder=dapper&last_time=1208899391
[3] http://allmydata.org/buildbot/builders/dapper/builds/1464/steps/darcs/logs/stdio
[4] http://allmydata.org/buildbot/builders/dapper/builds/1466/steps/darcs/logs/stdio
[5] http://allmydata.org/buildbot/builders/dapper/builds/1467/steps/darcs/logs/stdio
[6] http://darcs.net/manual/node7.html#SECTION00740000000000000000
[7] http://allmydata.org/buildbot/builders/dapper/builds/1468/steps/darcs/logs/stdio
[8] http://darcs.net/manual/node5.html#SECTION00510000000000000000
[9] http://allmydata.org/buildbot/builders/dapper/builds/1469/steps/darcs/logs/stdio
[10] http://allmydata.org/buildbot/builders/dapper/builds/1470/steps/darcs/logs/stdio


