[darcs-devel] [issue1536] break caches up into subdirectories

Simon Marlow bugs at darcs.net
Sun Aug 30 13:26:53 UTC 2009

Simon Marlow <simonmar at microsoft.com> added the comment:

I have some figures for how much this may be affecting performance on Linux with
a recent ext3 (kernel 2.6.28).

For a GHC repository with 21000 files in _darcs/patches, I used a program that
opens and closes every file in the directory.

  - cold cache: 14.70s real   0.15s user   0.62s system 
  - warm cache:  0.26s real   0.09s user   0.14s system

(to flush the cache before running the test, I used "echo 3

After making 16 subdirectories 0/ 1/ ... e/ f/ and splitting the patches amongst
the subdirectories:

  - cold cache: 4.70s real   0.09s user   0.74s system
  - warm cache: 0.24s real   0.11s user   0.12s system 

Conclusion: with a warm cache, there's no meaningful difference (0.26s vs
0.24s real); presumably Linux's name lookup cache is big enough to hold all 21k
lookups.  Without anything cached, the subdirectory version is about 3x faster
(14.70s vs 4.70s real), but the difference is all in real time, not system
time, which implies that it is due to reading less data from disk rather than
poor algorithms in the kernel's lookup code.
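
The arithmetic behind that reading of the cold-cache numbers can be spelled out
as below. The Timing type and the name waiting are made up for this sketch, and
treating "real minus user minus system" as time spent waiting on disk is an
approximation that assumes a single-threaded test on an otherwise idle machine:

```haskell
-- | One row of timing output, in seconds.
data Timing = Timing { real :: Double, user :: Double, sys :: Double }

-- | Wall-clock time not accounted for by CPU time (user + system).
waiting :: Timing -> Double
waiting t = real t - user t - sys t

-- Cold-cache figures quoted in the message.
flatCold, splitCold :: Timing
flatCold  = Timing 14.70 0.15 0.62   -- single flat directory
splitCold = Timing  4.70 0.09 0.74   -- 16 subdirectories

-- | Ratio of wall-clock times, roughly 3.1.
speedup :: Double
speedup = real flatCold / real splitCold

-- waiting flatCold  is roughly 13.93 s
-- waiting splitCold is roughly  3.87 s
```

System time is slightly higher in the split case (0.74s vs 0.62s), so nearly
all of the 10 seconds saved comes out of the non-CPU portion.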

Program I used to measure this:

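-- Opens and closes every file named in a list file (one path per line).
-- Note: this four-argument openFd is the API of older versions of the
-- unix package; from unix 2.8 the 'Maybe FileMode' argument is gone.
import Control.Monad
import System.Posix
import System.Environment
import qualified Data.ByteString.Char8 as B

main :: IO ()
main = do
  [listFile] <- getArgs
  paths <- B.split '\n' `fmap` B.readFile listFile
  forM_ paths $ \path -> do
    let str = B.unpack path
    -- Skip empty entries, e.g. the one after a trailing newline.
    when (not (null str)) $ do
      fd <- openFd str ReadOnly Nothing defaultFileFlags
      closeFd fd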

nosy: +simonmar

Darcs bug tracker <bugs at darcs.net>
