[darcs-devel] Why is darcs get so slow?

Thu Dec 30 06:49:42 PST 2004

Got to go on nephew-duty, so I'll just send this somewhat incomplete
response.  It looks like write_checkpoint_patch may indeed be the problem
(saw that after responding mostly...)

On Thu, Dec 30, 2004 at 04:58:49PM +1000, Anthony Towns wrote:
> How come "darcs get --partial linux" is so slow? AFAICS, there aren't 
> even any mergers involved, which is the usual scapegoat? And it's not 
> /that/ big -- the checkpoint is 200MB uncompressed, and the patches 
> after that are only another 8MB. Yet darcs seems to chew up well over a 
> gig of RAM trying to process it, and seems to just sit and spin for ages 
> after getting the patches and before actually creating the working 
> directory or current/ (at which point I wander off, come back, decide 
> I'm bored and hit Ctrl-C).

In this case it's a simple case of darcs being slow--not a scaling issue in
the sense of worse than O(N) scaling, but rather that the prefactor is too
big.

> Trying the profiler, I get:
> 
> COST CENTRE                    MODULE               %time %alloc
> hPutPS                         FastPackedString      39.7   36.8
> 
> taking up most of the time, invoked from write_checkpoint_patch in 
> get_cmd. But I'm not sure if that's accurate, or a profiling sampling 
> error - since the profiler seems to only notice about 7 minutes out of 
> 20 minutes runtime.
> 
> And this is just a simple get -- it's just applying patches, no 
> commutation and evidently no mergers, so there's no real reason for it 
> to be much slower than just using tar/patch would be. Yet it is.
>
> I don't get what's going on. Anyone know?
>
> Is darcs trying to create current/ in RAM, and wasting lots of time 
> converting between string representations for no particular reason? Is 
> there some other time sink somewhere?

I think most of the time is spent parsing the big patch and then applying
it.  You could tell better if you use the -v option, since probably most of
the time is spent applying the checkpoint patch--although I don't recall if
darcs informs you when it starts applying the checkpoint patch.

Certainly that is where the memory goes.  Darcs unfortunately parses the
entire patch before applying it, which means the entire parsed patch goes
into memory.  Using antimemoize can reduce the memory perhaps, but only at
the cost of reparsing later.

The advantage (which isn't trivial) of parsing the patch strictly
(i.e. before using it) is that we know immediately whether a patch is
malformed, so we can create nice error messages, and won't end up partially
applying a patch before finding out that it's been messed up somehow in
transit.  We could create a "lazy" patch parser, which would allow us to
interleave the patch reading with the use of the patch, which would benefit
both memory and time usage (the time benefit would come from interleaving
with IO time, and better memory cache behavior--plus better disk cache
behavior.  The catch is that we'd then have to maintain two separate
parsers, unless someone (Ian?) could come up with a tricky way to implement
a lazy and a strict parser with the same code.  Since we definitely want to
strictly parse patches that are to be applied to an existing non-corrupt
repository.

It could be that we'd also have to be careful to increase the laziness of
apply_to_slurpy in order to benefit from lazy patch parsing.  Which brings
up another possibility/idea.  I've been thinking about the possibility of
implementing an abstraction over the filesystem, sort of like
MissingH.IO.HVFS, only ideally implemented as a class of monads

http://gopher.quux.org:70/devel/missingh/html/MissingH.IO.HVFS.html

With an interface like this for Slurpies (and yes, AJ, at this I know I'm
getting deeper into darcs internals than you're familiar with), we could go
from

apply_to_slurpy :: Patch -> Slurpy -> Maybe Slurpy

to

apply_patch :: FileSystemMonad a => Patch -> a ()

where (I don't remember the syntax for class declarations, so I'm making
something up)

class FileSystemMonad a
  fsRenameFile :: FilePath -> FilePath -> a ()
  fsReadFilePS :: FilePath -> a PackedString
  etc...

********

Of course, it could be that hPutPS *is* taking up most of the time.  Darcs
does write the files one line at a time, which could be slow if there isn't
enough buffering going on.  On the other hand, it *shouldn't* be that slow,
as far as I can tell.  On the other hand, the total number of lines
involved is indeed quite high.

A comparison with patch could be gotten by creating a massive patch with a
recursive diff, which creates the entire linux tree out of nothing, and
then seeing how long it takes to apply this patch to an empty directory.
That's roughly how long the --partial get should take, if darcs were as
well-optimized as patch (which basically means perfectly-optimized, I'd
guess).  To do better than that (i.e. to be competetive with tar) we'd need
to store the checkpoint in a non-text format--that is with each file stored
verbatim, preceded by the file size.
-- 
David Roundy
http://www.darcs.net