[darcs-devel] strace timing logs Re: a couple of quick timings

zooko zooko at zooko.com
Thu Feb 7 13:47:08 UTC 2008


On Feb 7, 2008, at 12:38 AM, Petr Rockai wrote:

> Not sure what OS are you using, but at least Linux will still work if
> you are very tight with memory and have no swap. It will just work
> extremely slowly, since all it can swap is code pages (which are in
> the executables so they don't need extra swap space). It's usual that
> when you have no swap and run out of memory on Linux, trivial
> operations can take extremely long time.

Are you sure about this?  It seems like this could be a problem only  
when total amount of memory required by your applications is epsilon  
less than the physical memory, and it remains at just that level for  
a long period of time.

I've been running without swap for about four years now, and I've  
never observed trivial operations to take an extremely long time.   
(Indeed, the reason that I turned off swap is that now, without swap,  
all operations finish quickly -- there is no "swap thrash" state to  
cause operations to drag on for minutes or hours.)

Have you ever observed this phenomenon or is it just something you've  
heard about?


> Please make sure this is not
> a memory problem, since otherwise the timing is fairly worthless (and
> the issue is really in memory usage)... Also, checking what kind of
> load is on the machine (with say top) could help -- is it
> CPU-intensive or IO-intensive workload? Or is it mostly idling?


Wait, why would the timing be worthless?

The timing is a measure of how long it takes darcs and hg to do a  
task which I need to do (import ciphercycles tarball) on a machine  
that I need to use to do it.  Hg works fine in this situation.

Anyway, I'm re-running it now, with "strace -r -tt" for detailed  
timing information.  top shows that darcs is using 94 M virt, 88 M  
res, and consistently using about 40% to 50% CPU (it is sharing CPU  
with a ghc compile).  The darcs status-update-line output (you know,  
the nice new keep-the-user-informed feature of darcs-2) currently  
says "Writing pristine 2049/2058 :  
e07cd73fa2a129ece7aa452f7fe29b6c6185e8e1", and has said this same  
thing for at least ten minutes.


time strace -o tar.strace.log.txt -r -tt tar xjvf  
ciphercycles-20070205.tar.bz2

0m2.7s

time strace -o darcsinit.log.txt -r -tt darcs init --hashed

0m0.3s

time strace -o darcsadd.log.txt -r -tt darcs add -r  
ciphercycles-20070205

0m7.8s

time strace -o darcsrecord.log.txt -r -tt darcs record --all -- 
author=zooko at zooko.com -m'init import of ciphercycles-20070205'

34m20.0s

(Hm, perhaps sharing the CPU with ghc compile made it take twice as  
long.  Or perhaps doing the strace logging slowed it down.)


And now for Mercurial:

time strace -o tar.strace.log.txt -r -tt tar xjvf  
ciphercycles-20070205.tar.bz2

0m2.3s

time strace -o hginit.strace.log.txt -r -tt hg init

0m0.2s

time strace -o hgaddremove.strace.log.txt -r -tt hg addremove

0m1.9s

time strace -o hgcommit.strace.log.txt -r -tt hg commit -m'init  
import of ciphercycles-20070205'

0m5.5s


Here is a .tar.bz2 containing all of the strace logs produced --  
https://zooko.com/stracelogs.tar.bz2 .  They contain fine-grained  
(microsecond) timing of each system call, so they should be a good  
help diagnosing why darcs is thousands of times slower than hg for  
this.  Here's the first clue:

wc -l straceit*/*.strace.log.txt

     12662 straceit/darcsadd.strace.log.txt
       323 straceit/darcsinit.strace.log.txt
   1912200 straceit/darcsrec.strace.log.txt
     14150 straceithg/hgaddremove.strace.log.txt
     59911 straceithg/hgcommit.strace.log.txt
      3614 straceithg/hginit.strace.log.txt
     15736 straceithg/tar.strace.log.txt
     15448 straceit/tar.strace.log.txt
   2034044 total


So darcs made 1.9 million system calls to do darcs record while hg  
made 60,000 system calls to do hg commit.


Regards,

Zooko



More information about the darcs-devel mailing list