[darcs-users] Cheap repositories -- some results

Juliusz Chroboczek jch at pps.jussieu.fr
Sat Jan 22 00:45:06 UTC 2005


Hi,

As some might know, I've been hacking at ``cheap'' repositories.  I've
got some roughly working but completely unoptimised code right now.

What I've got right now:

  - the ability to run with no _darcs/current directory;
  - the ability to relink files under _darcs/patches and
    _darcs/current to a given ``sibling'' repository.
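
For readers who want to see what relinking amounts to, here is a minimal
sketch in Python.  It is not the darcs code: the function name `relink`,
the temporary-file suffix and the byte-for-byte comparison are all
assumptions made for illustration.  The idea is simply to replace each
file under _darcs/patches and _darcs/current with a hard link to the
identical file in the sibling, when there is one.

```python
import filecmp
import os


def relink(repo, sibling, subdirs=("_darcs/patches", "_darcs/current")):
    """Hard-link files in `repo` to identical files in `sibling`.

    Hypothetical helper, not part of darcs.  Returns the number of files
    that were replaced by hard links.
    """
    relinked = 0
    for sub in subdirs:
        # os.walk yields nothing for a missing directory, so a repository
        # with no _darcs/current is handled without special-casing.
        for dirpath, _dirs, files in os.walk(os.path.join(repo, sub)):
            for name in files:
                mine = os.path.join(dirpath, name)
                theirs = os.path.join(sibling, os.path.relpath(mine, repo))
                if not os.path.isfile(theirs):
                    continue  # sibling has no such file
                if os.path.samefile(mine, theirs):
                    continue  # already the same inode
                if not filecmp.cmp(mine, theirs, shallow=False):
                    continue  # contents differ, leave it alone
                # Link under a temporary name, then rename over the
                # original so the file never goes missing.
                tmp = mine + ".relink-tmp"
                os.link(theirs, tmp)
                os.replace(tmp, mine)
                relinked += 1
    return relinked
```

Hard links only work when both repositories live on the same
filesystem; the sketch makes no attempt to detect or handle anything
else.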

What I'm planning:

  - make get and pull do ``the right thing'' when there's a sibling;
  - another format for current, ``current.hashed'', even slower ;-)

Here are a few results on the current unstable polipo tree.  This
consists of 854 kB of sources and 600 kB of patches; there are no
checkpoints.

1. size of the repository

  plain: 2516 kB
  current.none: 1648 kB

  2 x plain: 5036 kB
  2 x plain, relinked: 3460 kB
  2 x current.none: 3300 kB
  2 x current.none, relinked: 2592 kB

Obviously, you still need to pay the price of the extra working dir,
but other than that, there is no duplication (3460 - 854 = 2606, which
is 2516 up to relativistic effects -- this is a fast machine).
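
The arithmetic behind these figures can be checked in a few lines of
Python.  The numbers are the ones measured above; only the variable
names are made up for the example.

```python
# Sizes in kB, as measured on the polipo tree.
sources = 854
plain = 2516
current_none = 1648
two_plain = 5036
two_plain_relinked = 3460
two_none = 3300
two_none_relinked = 2592

# Dropping _darcs/current saves roughly one copy of the sources.
saved_by_no_current = plain - current_none

# Relinking two repositories removes the duplicated files.
saved_by_relinking_plain = two_plain - two_plain_relinked
saved_by_relinking_none = two_none - two_none_relinked

# Two relinked plain repositories, minus the second working dir,
# should cost about the same as a single plain repository.
relinked_minus_working_dir = two_plain_relinked - sources
overhead = relinked_minus_working_dir - plain

print(saved_by_no_current, saved_by_relinking_plain,
      saved_by_relinking_none, relinked_minus_working_dir, overhead)
```

The last two values come out as 2606 and 90, i.e. the second relinked
repository costs about 90 kB beyond its working dir.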

2. speed

  darcs get, plain: 0.14 s
  darcs get, current.none: 1.3 s

  darcs pull, plain: 5.7 s
  darcs pull, current.none: 8.3 s

  darcs whatsnew, plain: 0.03 s
  darcs whatsnew, plain, ignore-times: 0.1 s
  darcs whatsnew, current.none: 1.2 s

Obviously, there's something wrong: we're slurping current during get
and pull, which is extremely expensive when there's no current to
slurp (everything needs to be recomputed on the fly).  It looks like
we're slurping current once when doing a get, and twice when doing a
pull.  I'll look into it after my next sleep period.

The slowness of whatsnew is expected, as there's just no way to find
out what's new without recomputing everything.  (The ``current.hashed''
format is designed to fix that, but it will need some infrastructure
changes.)
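
To illustrate why keeping hashes around could help here, below is a
rough sketch in Python.  It is only a guess at the general idea, not the
``current.hashed'' layout (which this message does not describe): the
manifest structure, the function names and the choice of SHA-256 are all
assumptions.  The point is that with one recorded hash per file, finding
out which files changed takes one pass over the working dir and no
recomputation of current.

```python
import hashlib
import os


def file_hash(path):
    """Return the SHA-256 hex digest of a file (hash choice is an assumption)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(working):
    """Map each relative path under `working` to its content hash."""
    manifest = {}
    for dirpath, dirs, files in os.walk(working):
        # Skip repository metadata; only the working files matter here.
        dirs[:] = [d for d in dirs if d != "_darcs"]
        for name in files:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, working)] = file_hash(path)
    return manifest


def changed_paths(working, recorded):
    """List paths whose hash differs from the recorded manifest.

    Covers modified, added and removed files.
    """
    now = build_manifest(working)
    return sorted(p for p in set(now) | set(recorded)
                  if now.get(p) != recorded.get(p))
```

This only says which files differ; producing the actual diff would still
need the old contents from somewhere.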

                                        Juliusz



