[darcs-devel] optimization of darcs memory and time usage

Ian Lynagh igloo at earth.li
Sun Apr 10 20:19:29 PDT 2005


On Sat, Apr 09, 2005 at 09:58:52AM -0400, David Roundy wrote:
> 
> Just for perspective, in a linux kernel tarball:
> 
> $ time ../darcs_unstable record --all -m 'Initial' --look-for-adds
> Finished recording patch 'Initial'
> 
> real    15m41.784s
> user    10m12.240s
> sys     0m36.140s
> 
> takes about 1G of memory.  This is definitely a command that could use some
> improvement.

I think there are a few things at work here.

In SelectChanges.with_any_selected_changes we return

       get_first_choice consider_pc
    ++ get_middle_choice consider_pc

and then in the job we pass in we start off by seeing if this is null
(=> we have no work to do), but get_first_choice goes right to the end
of the list (forcing all the reads) and returns [], so then
get_middle_choice gets called and everything is in memory.

Instead, I think we want something like this:

-----------------------------
get_first_middle__last_choice :: PatchChoices -> ([TaggedPatch], [TaggedPatch])
get_first_middle__last_choice (PCs e) = pull_firsts_middles e

pull_firsts_middles :: EasyPC -> ([TaggedPatch], [TaggedPatch])
pull_firsts_middles [] = ([], [])
pull_firsts_middles (PC tp (Just True):e)
 = let (fms, ls) = pull_firsts_middles e
   in (tp:fms, ls)
pull_firsts_middles (PC tp Nothing:e)
 = let (fms, ls) = pull_firsts_middles e
   in (tp:fms, ls)
pull_firsts_middles (PC (TP t p) (Just False):e)
 = let (fms, ls) = pull_firsts_middles e
       (fms', p') = commute_past_list p fms
   in (fms', TP t p':ls)

commute_past_list :: Patch -> [TaggedPatch] -> ([TaggedPatch], Patch)
commute_past_list p [] = ([], p)
commute_past_list pat (TP t p:tps)
 = case commute (p, pat) of
       Just (pat', p') -> let (tps', pat'') = commute_past_list pat' tps
                          in (TP t p':tps', pat'')
       Nothing -> error "Aaack fixme! commute_past_list"
-----------------------------

(and similarly a get_first__middle_last_choice for other cases)
(the tests pass with this, but another pair of eyeballs to make sure it
looks sane would be good)

We then call apply_to_slurpy and have to wait for it to return Just,
which again requires reading everything to see if it fails to apply.
The Nothing case is impossible, though, so we probably want this to be
parameterised by the monad so this can be lazy when suitable. Also, this
is only used by the test code, so we can push it inside the block for
testing (although this still happens even if there is no test unless
--no-test is given).

This gets us clear down to
    write_patch opts $ adddeps mypatch deps
but we're hanging on to the patch to apply it to current.

Instead, read the file we're writing and apply that to current.
Unfortunately it's a single file to be read, but worse than that we're
needlessly waiting for a Just when applying to a slurpy again, so memory
usage skyrockets. My first attempt at fixing this doesn't seem to be
enough, so I'll send what I've got up to this point.

Memory usage still seems on the high side, but it's better at least.
Quicker, too. Testing with

darcs init
darcs add -r .
darcs record -a -m foo --no-test


BTW, what is the logic behind withSignalsBlocked returning () on
SIGPIPE?


Thanks
Ian





More information about the darcs-devel mailing list