[darcs-users] darcs record and huge patches

Aggelos Economopoulos aoiko at cc.ece.ntua.gr
Wed Jan 28 16:04:48 UTC 2004

On Wed, 28 Jan 2004 06:51:40 -0500
David Roundy <droundy at abridgegame.org> wrote:

> On Wed, Jan 28, 2004 at 12:40:17AM +0200, Aggelos Economopoulos wrote:
> > On Tue, 27 Jan 2004 22:36:58 +0200
> > Aggelos Economopoulos <aoiko at cc.ece.ntua.gr> wrote:
> > 
> > > So, has anyone tried running record on a large tree with many
> > > local changes? If it isn't supposed to work I'll just kill the
> > > process, but if it is, how long should it take?
> > 
> > Well, it took about five hours, but seems to have worked (produced a
> > 27M patch (6M compressed)). Still, it doesn't seem normal that it
> > should take that long or consume so much memory - can't you force
> > garbage collection at some point? Would disabling use of mmap() help
> > in such extreme cases?
>
> You said the CPU usage stayed at 90% pretty much the whole time? If
> that's the case, then it seems unlikely that reducing memory usage
> would speed things up much, as it sounds like it wasn't thrashing too
> much.

No, it wasn't thrashing. Reducing memory usage obviously wouldn't help
performance much (well, it would free up page tables when running
under Linux); all the CPU cycles were spent in userland.

> What version of darcs are you running? The latest version in the
> repository has a few patches that seem to reduce swapping, at least
> when running darcs check on large repositories.  I don't think I'll
> be able to significantly improve the peak memory usage, so records
> probably won't be helped much.

I'm running 0.9.15. Sorry I forgot to mention it.

Any idea what these memory areas are for?
(this is from /proc/.../map; fields are: start, end, resident, private
resident, object, protection, reference count, shadow count, flags,
cow, needs cow, type. Type 'default' means MAP_ANON)

0x81e4000 0xe637000 4212 0 0xd3d3d960 rwx 2 0 0x2184 NCOW NNC default
0x2ab00000 0x36f00000 49965 0 0xd495b2a0 rwx 1 0 0x2184 NCOW NNC default

Most of the pages are resident, but I doubt darcs needs them all;
that's why I asked whether there's any way to reclaim them.
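
To put numbers on the two map lines above, here is a small sketch that
turns each line into a region size and a resident count. It assumes
4 KiB pages (typical for i386, but an assumption here) and only reads
the first three fields; parse_map_line and PAGE_SIZE are names made up
for this example.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, not stated in the map output


def parse_map_line(line, page_size=PAGE_SIZE):
    """Parse one map line: start, end, resident, ... (remaining fields ignored)."""
    fields = line.split()
    start = int(fields[0], 16)   # region start address (hex)
    end = int(fields[1], 16)     # region end address (hex)
    resident = int(fields[2])    # resident pages (decimal)
    total_pages = (end - start) // page_size
    return {
        "size_bytes": end - start,
        "total_pages": total_pages,
        "resident_pages": resident,
        "resident_fraction": resident / total_pages,
    }


if __name__ == "__main__":
    lines = [
        "0x81e4000 0xe637000 4212 0 0xd3d3d960 rwx 2 0 0x2184 NCOW NNC default",
        "0x2ab00000 0x36f00000 49965 0 0xd495b2a0 rwx 1 0 0x2184 NCOW NNC default",
    ]
    for line in lines:
        r = parse_map_line(line)
        print("%d pages, %d resident (%.0f%%)" % (
            r["total_pages"], r["resident_pages"], 100 * r["resident_fraction"]))
```

Under that page-size assumption the second region spans 50176 pages
(196 MiB) with 49965 resident, while the first spans 25683 pages with
4212 resident.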

> I'd also be interested in your cvsps conversion script, as I've done
> the same thing myself, and am optimistic that you may have done at
> least part of it better than I did.  :) I've been playing with
> converting the bkcvs linux kernel repository to darcs (so far it has
> taken, I think, about four days).  None of the patches took that
> long--the biggest is 24M compressed, but it was all simple additions
> rather than file modifications.  And I think most of the time is being
> spent simply reading the listings of all the directories and checking
> the file modification times on all the files. But that's because most
> of the changesets only affect a few files.

Yeah, small changesets take almost constant time, equal to the 'scan
the whole tree' overhead. If only I could let darcs record know where to
look. Consider this a feature request :)
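
To illustrate what I mean, here is a rough sketch of the difference
between scanning the whole tree and being told where to look. This is
not how darcs does it; changed_files and its `only` parameter are
hypothetical names for this example, and it only compares file
modification times against a reference timestamp.

```python
import os


def changed_files(root, since, only=None):
    """Return files under `root` with mtime later than `since` (epoch seconds).

    If `only` is given (a list of paths relative to `root`), stat just
    those files instead of walking the whole tree.
    """
    if only is not None:
        # Hinted case: cost grows with the number of named files.
        candidates = [os.path.join(root, p) for p in only]
    else:
        # Full scan: cost grows with the size of the tree,
        # no matter how few files actually changed.
        candidates = (
            os.path.join(dirpath, name)
            for dirpath, _dirs, files in os.walk(root)
            for name in files
        )
    return sorted(
        p for p in candidates
        if os.path.isfile(p) and os.stat(p).st_mtime > since
    )
```

Calling changed_files(root, since) walks everything, whereas
changed_files(root, since, only=["some/file.c"]) stats a single file,
which is the kind of shortcut a small changeset could use.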

I'll send you the script privately.

