[darcs-users] darcs-2 performs really well for the "darcs get" use case
zooko
zooko at zooko.com
Tue Apr 22 23:00:11 UTC 2008
Folks:
We use darcs to manage our source code in the http://allmydata.org
project (it is an open source, secure, decentralized file system).
Our trunk repository [1] currently has 2,484 patches in it. The
current version of the source code has 269 files, at a total of 13
MiB (some of the files are binaries) and around 48,000 lines in
the non-binaries. The _darcs/patches directory (which contains all
of the patches, gzipped) takes 40 MiB of disk space.
We have automated builds and tests using buildbot, so this was an
opportunity for me to benchmark different versions of darcs.
The buildslave that I used is a Celeron Coppermine at 564 MHz running
Ubuntu Dapper [exhibit 2]. Originally it was running darcs-1.0.5
(that's what comes with Ubuntu Dapper), and a "darcs get --partial"
of our source code over HTTP took 286 seconds [exhibit 3].
Next I installed darcs-1.0.9 -- the final release in the darcs-1
line. A "darcs get --partial" took 308 seconds [exhibit 4]. (I
didn't try this experiment enough times to determine if the
difference between darcs-1.0.5 and darcs-1.0.9 was merely jitter in
the network or the machine load.)
Next I installed darcs-2.0.0. A "darcs get --partial" took 93
seconds [exhibit 5].
Next I configured it to do its darcs get from a hashed-format
repository instead of an old darcs-1-format repository as described
in the darcs manual [6]. A "darcs get --partial" took 6.47 seconds
[exhibit 7].
Next I configured it to use a "global cache" as described in the
darcs manual [8]. The global cache was not populated yet, of course,
so the next "darcs get --partial" did not benefit from it, and indeed
took 7.19 seconds to run and to populate the global cache [exhibit 9].
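For readers who want to script the global-cache setup, here is a minimal sketch. It assumes the layout I recall from the darcs manual [8]: a cache directory at ~/.darcs/cache and a "cache:" line in ~/.darcs/sources. Please check both against the manual for your darcs version. The function name enable_global_cache is made up for this example, and the demo writes into a temporary directory rather than your real home directory.

```python
import tempfile
from pathlib import Path


def enable_global_cache(home):
    """Create <home>/.darcs/cache and list it in <home>/.darcs/sources.

    ASSUMPTION: the "cache:<dir>" line format and both paths follow the
    darcs manual as recalled; verify them for your darcs version.
    """
    home = Path(home)
    cache_dir = home / ".darcs" / "cache"
    sources = home / ".darcs" / "sources"
    cache_dir.mkdir(parents=True, exist_ok=True)

    entry = f"cache:{cache_dir}"
    existing = sources.read_text().splitlines() if sources.exists() else []
    if entry not in existing:
        # Rewrite the whole file so the new entry always sits on its own line.
        sources.write_text("\n".join(existing + [entry]) + "\n")
    return sources


if __name__ == "__main__":
    # Demo against a throwaway directory instead of the real home directory.
    with tempfile.TemporaryDirectory() as fake_home:
        path = enable_global_cache(fake_home)
        print(path.read_text(), end="")
```

To apply it for real you would pass Path.home(). Calling it twice is harmless, because the entry is only added when it is missing.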
Finally, I ran it again with the global cache having been populated
in the previous run. This time "darcs get --partial" took 3.85
seconds [exhibit 10].
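To make the relative gains easier to compare, the timings above can be turned into speedup factors against the darcs-1.0.5 baseline. This is only arithmetic on the numbers already reported. Each number is from a single run, so treat small differences (such as 286 vs. 308 seconds) as possibly noise. The labels are shorthand for this example.

```python
# Wall-clock times in seconds, copied from the measurements above.
timings = [
    ("darcs-1.0.5, old-format repo", 286.0),
    ("darcs-1.0.9, old-format repo", 308.0),
    ("darcs-2.0.0, old-format repo", 93.0),
    ("darcs-2.0.0, hashed repo", 6.47),
    ("darcs-2.0.0, hashed repo, empty global cache", 7.19),
    ("darcs-2.0.0, hashed repo, populated global cache", 3.85),
]


def speedups(rows):
    """Return (label, seconds, speedup relative to the first row)."""
    baseline = rows[0][1]
    return [(label, secs, baseline / secs) for label, secs in rows]


for label, secs, factor in speedups(timings):
    print(f"{label:50s} {secs:7.2f} s  {factor:6.1f}x")
```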
Morals of the story:
1. Upgrade from darcs-1 to darcs-2.
2. Start using hashed-format repositories.
3. If you don't mind having only a partial copy of history, in order
to have faster "darcs get", then use "darcs get --lazy" (which is the
preferred spelling for "darcs get --partial" in darcs-2).
4. Whether or not you are using --lazy, enable a global cache. A
global cache can speed up other operations in addition to "get",
including working on different branches.
5. If you have a workload that is important to you other than "darcs
get", then try an experiment like this one on your workload and
report your results to darcs-users at darcs.net. :-)
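If you want to try moral 5, a small timing harness along these lines may help. It is a sketch, not the buildbot setup used above: the function name time_command is made up, and the default workload is a do-nothing Python process so that the script runs even without darcs installed. Pass your own command on the command line to time it instead. Note that a command such as "darcs get" needs a fresh target directory for every run, which this sketch does not manage for you.

```python
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run argv `runs` times and return the wall-clock seconds of each run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so failed runs are not timed.
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return samples


if __name__ == "__main__":
    # Placeholder workload; replace it by passing a command as arguments.
    argv = sys.argv[1:] or [sys.executable, "-c", "pass"]
    samples = time_command(argv)
    print("runs (s):", ", ".join(f"{s:.2f}" for s in samples))
    print(f"min {min(samples):.2f} s, max {max(samples):.2f} s")
```

Reporting the minimum and maximum over several runs gives some idea of the jitter from network and machine load that a single measurement cannot show.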
Regards,
Zooko
[1] http://allmydata.org
[2] http://allmydata.org/buildbot/waterfall?builder=dapper&last_time=1208899391
[3] http://allmydata.org/buildbot/builders/dapper/builds/1464/steps/darcs/logs/stdio
[4] http://allmydata.org/buildbot/builders/dapper/builds/1466/steps/darcs/logs/stdio
[5] http://allmydata.org/buildbot/builders/dapper/builds/1467/steps/darcs/logs/stdio
[6] http://darcs.net/manual/node7.html#SECTION00740000000000000000
[7] http://allmydata.org/buildbot/builders/dapper/builds/1468/steps/darcs/logs/stdio
[8] http://darcs.net/manual/node5.html#SECTION00510000000000000000
[9] http://allmydata.org/buildbot/builders/dapper/builds/1469/steps/darcs/logs/stdio
[10] http://allmydata.org/buildbot/builders/dapper/builds/1470/steps/darcs/logs/stdio