[darcs-users] Developer machine spec for Linux kernel w/ darcs

David Roundy droundy at abridgegame.org
Tue Nov 30 13:23:16 UTC 2004


On Tue, Nov 30, 2004 at 12:24:31PM +0100, Tim O'Callaghan wrote:
> Same problem here, but with the GCC mainline tree.  I thought it might
> have been something to do with the tailor.py script, but after reading
> your report I see we have the same problem. I left a tailor/darcs
> session syncing with the GCC mainline, came back 8 hours later and my
> box was dying, with no response from it. So I thought, ok, it will get
> over this. 14 hours later it's dead and needs rebooting.
> 
> Quite a feat for a revision control system!
> 
> I think it may be time to learn Haskell and join the devel list. Is
> there somewhere in the codebase that can be pointed to as the culprit
> for this memory usage? This kind of memory usage smacks of
> malloc(100);fork(); or some kind of recursion gone mad.
> 
> I mean it's using a *scary* amount of memory. The GCC mainline
> codebase is about 190 meg, about 24,600 files. What kind of state
> information is darcs keeping that it needs over 750 meg of RAM?

It's storing the contents of those files, plus the locations of the
beginning and end of every line in those files in the form of linked
lists, which probably accounts for most of the rest of the memory.
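
To give a rough picture of where that per-line overhead comes from
(this is just an illustrative Haskell sketch, not darcs' actual internal
types): holding a file as a list of lines means a list node plus a small
per-line string object for every line, on top of the raw bytes
themselves.

  -- Illustrative sketch only; these are not darcs' real data types.
  import qualified Data.ByteString.Char8 as B

  -- One list node per line, each pointing at its own slice of the file.
  type FileContents = [B.ByteString]

  -- Splitting every file in a tree into lines up front keeps both the
  -- raw contents and all of this per-line structure alive at once.
  toLines :: B.ByteString -> FileContents
  toLines = B.lines

  main :: IO ()
  main = do
    contents <- B.readFile "some-file"   -- hypothetical path
    -- length forces the whole spine of the list, i.e. every line break
    print (length (toLines contents))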

The antimemoize option allows it to drop (and if necessary recompute) the
line breaks, but I haven't worked out whether there might be a case where
darcs still holds everything in memory at once (although there shouldn't
be; it's just hard to track these things down).

Darcs is often quite swap-friendly, so if your machine is dying, then
adding more swap is quite likely a sufficient fix.  Certainly darcs'
working set is usually only a third to a half of its total memory.

> Sorry for the rant, but my questions still stand:
> If this is a known problem, where is the problem?
> What kind of state information does darcs need to keep that it eats so
> much memory?

The problem is that darcs is (most likely) creating a single patch that
creates the entire source tree, so that patch must be parsed and held in
memory.  (Or when recording, that patch must be created, and the
information to create it must be held in memory.)
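
As a hedged sketch of why that forces everything into memory at once
(assumed types here, not darcs' real patch representation): if the
initial import is a single patch value that adds every file together
with its contents, then building or parsing that value keeps the whole
tree's contents reachable from one structure until it has been
processed.

  -- Sketch of an "initial import" patch; assumed types, not darcs code.
  import qualified Data.ByteString.Char8 as B

  data Prim
    = AddFile FilePath B.ByteString   -- new file plus its full contents
    | Hunk FilePath Int [B.ByteString] [B.ByteString]  -- line-based change
    deriving Show

  -- One patch covering a whole tree: every file's contents hangs off a
  -- single value, so nothing can be freed until the patch is dealt with.
  initialImport :: [(FilePath, B.ByteString)] -> [Prim]
  initialImport files = [ AddFile path contents | (path, contents) <- files ]

  main :: IO ()
  main = print (length (initialImport [("a.txt", B.pack "hello\n")]))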

> On Tue, 30 Nov 2004 00:00:44 +0000, Jim Hague
> <jim at fluffy.bear-cave.org.uk> wrote:
> > I'm a complete newcomer to darcs. Reading the manual and tutorial it looks
> > just like what I want for doing occasional driver work on the Linux kernel.
> >
> > The problem is that when I try and get the kernel repository linked
> > from the Darcs home page, darcs chews memory until nuked by the out of
> > memory killer. I realise it is memory-hungry; what I want to know is
> > some idea of how much memory I actually need. I've searched the mailing
> > list archives, and also found the Wiki performance page where David
> > Roundy (serious respect, sir) guesses it could require 1Gb, but I have
> > seen another message where someone suggests they succeeded with 256Mb +
> > 660Mb swap.
> >
> > My development machine is currently 256Mb + 2 x 490Mb swap areas. Would
> > upgrading it to 512Mb be likely to work or do I really need 1Gb? Can
> > anyone report a working 256Mb or 512Mb configuration that does
> > successfully get the Linux kernel, and if so are you building darcs
> > with the -enable-antimemoize option? Basically I'm keen to get this
> > working, but need a solid data point on just how much I'm going to
> > need to beef up my configuration.

The best thing you can do (I think) is to add more swap.  This will at the
minimum keep darcs from killing your machine.  Much of darcs is *very*
swap-friendly (i.e. working set is much smaller than total memory usage),
so you can get by with more swap than you might otherwise consider sane.

And definitely let us know how it works out, as it'll be useful to others
to hear how much memory/swap you require, plus it'll help me know how far
we have to go in improving darcs' memory usage.  The trouble is that darcs
is slow as well as memory-intensive, and often there's a tradeoff between
one and the other.  :( Not always... I could switch from linked lists to
arrays, which would improve both, but that would make a lot of code a lot
dirtier.
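
For what it's worth, roughly the sort of change I mean (assumed types
again, not actual darcs code): an array of lines gives contiguous
storage and O(1) access to any line, where a list costs a heap node per
line and O(n) traversal, but edits then mean rebuilding the array rather
than just splicing the list.

  -- Sketch of the list-vs-array tradeoff; assumed types, not darcs code.
  import qualified Data.ByteString.Char8 as B
  import Data.Array (Array, listArray, bounds, (!))

  type LinesList  = [B.ByteString]         -- one heap node per line, O(n) access
  type LinesArray = Array Int B.ByteString -- contiguous, O(1) access by index

  toArray :: LinesList -> LinesArray
  toArray ls = listArray (0, length ls - 1) ls

  main :: IO ()
  main = do
    let arr = toArray (B.lines (B.pack "one\ntwo\nthree\n"))
    print (bounds arr)      -- (0,2)
    B.putStrLn (arr ! 2)    -- "three", without walking a list
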
-- 
David Roundy
http://www.darcs.net



