[darcs-users] Benchmarking "get"

Lele Gaifax lele at nautilus.homeip.net
Tue Mar 17 23:39:38 UTC 2009

On Tue, 17 Mar 2009 14:49:52 -0600
zooko <zooko at zooko.com> wrote:

> Lele Gaifax converted the Tahoe repo [1] from darcs (hashed-format)

Actually, the conversion was done by this script

[1] http://progetti.arstecnica.it/tailor/attachment/ticket/178/testtahoe.sh

using current tailor.

For completeness:

    $ time darcs get --lazy /tmp/testtahoe/rootdir/darcs-side lazy
    Finished getting.

    real    0m1.180s
    user    0m0.932s
    sys     0m0.240s
    $ du -sb *
    57285206        darcs-side
    10199601        lazy
    $ cd darcs-side
    $ darcs query repo
              Type: darcs
            Format: hashed
              Root: /tmp/d/darcs-side
          Pristine: HashedPristine
             Cache: thisrepo:/tmp/d/darcs-side, cache:/home/lele/.darcs/cache, repo:/tmp/testtahoe/rootdir/darcs-side
    boringfile Pref: .darcs-boringfile
    Default Remote: /tmp/testtahoe/rootdir/darcs-side
       Num Patches: 3746

Dunno if/how git can be made lazy...

> Subjectively, I don't have a problem with the performance of "darcs  
> get".  I do have a problem with command-lines like these:
> darcs query contents --quiet --match "hash
> 20080925213457-92b7f-4cf7b6d41cb114e2598f10b666d8e8e97d6ffb8f.gz"
> "docs/proposed/mutsemi.svg"

    $ time darcs query contents --quiet --match \
      "hash 20080925213457-92b7f-4cf7b6d41cb114e2598f10b666d8e8e97d6ffb8f.gz" \
      "docs/proposed/mutsemi.svg" | wc -l

    real    0m2.913s
    user    0m2.096s
    sys     0m0.764s

Does not seem particularly slow to me...

> and
> darcs annotate --xml-output --match
> "hash 20081124204046-e01fd-b4d82d92f5fc8d9af65af44ccce8c29279fb73af.gz"
> src/allmydata/interfaces.py

    $ time darcs annotate --xml-output --match \
      "hash 20081124204046-e01fd-b4d82d92f5fc8d9af65af44ccce8c29279fb73af.gz" \
      src/allmydata/interfaces.py | wc -l

    real    0m24.679s
    user    0m23.397s
    sys     0m1.276s

Sigh, we know, annotate is a different story!

> Whenever you hit this URL: http://allmydata.org/trac/tahoe/browser  
> trac issues commands like those to darcs, and currently it can often  
> take tens of seconds for darcs to generate the answer, which causes  
> trac to time-out and return an error to the user.

Uhm, are you sure that's the actual command executed *whenever* you
hit the browser? With darcs v2, trac-darcs should really use

    $ time darcs query contents --match \
      "hash 20081124204046-e01fd-b4d82d92f5fc8d9af65af44ccce8c29279fb73af.gz" \
      src/allmydata/interfaces.py | wc -l

    real    0m2.150s
    user    0m1.576s
    sys     0m0.556s
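
For a rough sense of the gap, the two wall-clock times shown above can be compared directly. This is only arithmetic on numbers already in this message, each taken from a single run on one machine, so treat the ratio as indicative rather than as a benchmark. The variable names are made up for illustration.

```shell
# Wall-clock ("real") seconds copied from the annotate and query contents runs.
annotate_real=24.679
contents_real=2.150

# awk does the floating-point division that plain sh arithmetic lacks.
ratio=$(awk -v a="$annotate_real" -v c="$contents_real" 'BEGIN { printf "%.1f", a / c }')
echo "annotate took about ${ratio}x as long as query contents"
```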

Of course, these days most of the time "the user" is actually some
spider, and it's only a matter of time before one of the requests gets
fulfilled and cached after that. Just a workaround, at best.
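
To make the caching idea concrete, here is a minimal sketch of a wrapper that stores the output of a slow command on disk and replays it on later calls. It is not how trac-darcs does its caching. `cached_run` and `CACHE_DIR` are hypothetical names, and the sketch assumes the output for a given patch hash and path never changes, which may not hold if the repository history is rewritten or reordered.

```shell
# Hypothetical helper, not part of darcs or trac-darcs: replay the stored
# output of a slow command when the exact same command line was seen before.
CACHE_DIR="${CACHE_DIR:-${TMPDIR:-/tmp}/darcs-output-cache}"

cached_run() {
    # Key the entry on the full command line. cksum is only a CRC, so a
    # real setup would want a stronger hash to avoid collisions.
    key=$(printf '%s\n' "$@" | cksum | tr ' ' '-')
    entry="$CACHE_DIR/$key"
    mkdir -p "$CACHE_DIR" || return 1

    if [ ! -f "$entry" ]; then
        # Write to a temporary file first, so a failed or interrupted run
        # never leaves a truncated cache entry behind.
        if "$@" > "$entry.tmp.$$"; then
            mv "$entry.tmp.$$" "$entry"
        else
            rc=$?
            rm -f "$entry.tmp.$$"
            return "$rc"
        fi
    fi
    cat "$entry"
}
```

With this in place, the first `cached_run darcs query contents --match "hash ..." src/allmydata/interfaces.py` pays the full cost, and later identical calls only read a file. A failing command is never cached, so a run that errors out will simply be retried next time.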

ciao, lele.
nickname: Lele Gaifax    | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas    | comincerò ad aver paura di chi mi copia.
lele at nautilus.homeip.net |                 -- Fortunato Depero, 1929.
