[darcs-users] report on using darcs- (performance of "darcs query contents")

Petr Rockai me at mornfall.net
Sun Jul 12 08:45:18 UTC 2009

Hi again,

Zooko Wilcox-O'Hearn <zooko at zooko.com> writes:
> darcs query contents --quiet --match "hash
> 20080721164936-4233b-426bdab5eb2247f275419cc7744a76d0a5fd6aa6.gz"
> "docs/proposed/mutable-DSA.txt"
The question arises, why are you so interested in a copy of the file from a
year ago? Is that just because this was the last revision to modify the file?
If so, you would be *much* better off asking for the last revision, which is
identical anyway.

> Of course, this hardly matters anyway to my tracdarcs use case; what I really
> need for tracdarcs is for "darcs query contents" to return the same answers in
> about 1/100 of the time it currently takes (i.e. about 30 milliseconds would
> be an improvement), or perhaps to allow queries on multiple files in a single
> call so that the tracdarcs plugin doesn't need to invoke it dozens of times in
> order to render a directory full of dozens of files. See
> http://bugs.darcs.net/issue1477 for details.
I am starting to suspect that tracdarcs is doing something fundamentally
broken, if it issues queries like the above. Unless someone is browsing history
from 2008, nothing like this should be happening. And in case they are
(browsing back in 2008), there's little that can be done anyway: we need to
examine all the patches between now and mid-2008 to see if they modified
the files we are interested in. In that case, tracdarcs would be much better
off doing a get --to-match (or similar), since that pays the cost just once:
all of the queries for contents would then take on the order of 10ms each.
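
A minimal shell sketch of that idea, for illustration only. The helper names
snapshot_get and snapshot_cat and the notion of a per-revision snapshot
directory are assumptions made here, not something darcs or tracdarcs
provides; the only darcs feature relied on is get --to-match.

```shell
# Sketch: pay the history-walking cost once per revision, then serve file
# contents straight from the checked-out tree.
# snapshot_get and snapshot_cat are hypothetical helper names.

# snapshot_get REPO PATTERN DEST
# Check out REPO up to the patch matching PATTERN into DEST.
# If DEST already exists (from an earlier call), do nothing.
snapshot_get() {
    repo=$1 pattern=$2 dest=$3
    if [ ! -d "$dest" ]; then
        darcs get --to-match "$pattern" "$repo" "$dest" >/dev/null || return 1
    fi
}

# snapshot_cat DEST PATH
# Print one file from an existing snapshot; this is a plain file read,
# so no patches are examined.
snapshot_cat() {
    cat "$1/$2"
}
```

A caller would run snapshot_get once with the 'hash ...' pattern of the
revision being browsed, then call snapshot_cat for every file it needs to
render from that revision.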

Anyway, doing --match 'hash <latest patch>' is still inefficient in current
darcs, and I have fixed that in darcs-hs, bringing the time down to roughly the
same order as with darcs show contents (without match). (See my previous mail,
written while I was thinking you just needed to show contents of recent
revisions.)

With my darcs-hs (not public yet, since it has some error reporting issues that
need fixing first):

(; for i in `seq 1 1000`; do; darcs show contents --match  README > /dev/null; done; )  3,00s user 2,86s system 91% cpu 6,426 total
(; for i in `seq 1 1000`; do; darcs show contents README > /dev/null; done; )  2,64s user 2,93s system 93% cpu 5,939 total

(i.e. you need about 6-7 seconds for 1000 show contents queries, *of a recent revision*)
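
For anyone wanting to repeat this kind of measurement, here is a rough sketch
of the loop as reusable shell functions. The names bench_loop and
per_query_ms are made up for this example, and the per-query figure is just
the total wall-clock time divided by the number of runs.

```shell
# bench_loop COUNT CMD [ARGS...]
# Run CMD COUNT times, discarding its output; stop on the first failure.
# (Hypothetical helper, equivalent in spirit to the for-loop above.)
bench_loop() {
    count=$1; shift
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" >/dev/null || return 1
        i=$((i + 1))
    done
}

# per_query_ms TOTAL_SECONDS COUNT
# Average wall-clock cost per query in milliseconds.
# TOTAL_SECONDS must use a decimal point (6.426, not 6,426).
per_query_ms() {
    awk -v t="$1" -v n="$2" 'BEGIN { printf "%.1f\n", t * 1000 / n }'
}
```

In bash or zsh this could be run as "time bench_loop 1000 darcs show contents
README". Feeding the totals above into per_query_ms (6.426 and 5.939 seconds
over 1000 runs) works out to roughly 6 ms per query.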
