[darcs-users] Re: Limits of Darcs (the whole Linux kernel?)

David Roundy droundy at abridgegame.org
Sat Nov 6 12:56:44 UTC 2004

On Fri, Nov 05, 2004 at 06:19:12PM +0000, Mark Stosberg wrote:
> On 2004-11-05, Samuel Tardieu <sam at rfc1149.net> wrote:
> >>>>>> "David" == David Roundy <droundy at abridgegame.org> writes:
> >
> >David> You could either make more memory available for darcs, or
> >David> compile darcs with the --enable-antimemoize option.  The
> >David> initial record is the command which most stresses darcs'
> >David> memory, as it requires holding the entire tree in parsed form in
> >David> memory.
> >
> > So this brings an immediate question: why does darcs need to hold the
> > entire tree in memory at any time?

Darcs needs to be able to hold parsed patches in memory, basically for
efficiency reasons.  We can't afford to be reading parts of patches again
and again, and I don't want to write my own virtual memory system to make
this "automatic".  Actually, that's not quite true: the antimemoization
trick allows us to *not* hold the entire parsed results of a patch, but the
cost is that you may have to reparse it sometimes.
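
As an illustration of that trade-off, here is a minimal sketch in Python
(darcs itself is written in Haskell, and this is not its actual code).
The names parse_patch, MemoizedPatch and AntimemoizedPatch are
hypothetical, and the "parser" is only a stand-in that splits lines.

```python
def parse_patch(raw):
    """Stand-in parser (hypothetical): one 'hunk' per non-empty line."""
    return [line for line in raw.splitlines() if line.strip()]


class MemoizedPatch:
    """Parses once and keeps the parsed result for the object's lifetime."""

    def __init__(self, raw):
        self.raw = raw
        self._parsed = None
        self.parse_count = 0

    def hunks(self):
        if self._parsed is None:
            self.parse_count += 1
            self._parsed = parse_patch(self.raw)
        return self._parsed


class AntimemoizedPatch:
    """Keeps only the raw text; every access reparses it."""

    def __init__(self, raw):
        self.raw = raw
        self.parse_count = 0

    def hunks(self):
        # Nothing parsed is cached, so the result can be freed after use.
        self.parse_count += 1
        return parse_patch(self.raw)
```

The memoized version pays for parsing once but keeps the parsed hunks
alive; the antimemoized version holds only the raw text, at the price of
a reparse on each access.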

> Another related question is:
> Can the initial import be "special cased" to improve performance? It
> seems to me that it should be possible...

Yes, it could be special-cased.  However, I think it would be better to
improve the code so it didn't need to be special-cased.  By treating the
initial record as a special case, the most we could gain is probably less
than a factor of about two in memory use, in the worst-case scenario.  This
is because the initial *get* will then have to hold the parsed patch in
memory, so even if we made the record free, we'd still have to hold the
entire patch in memory.

The antimemoize trick helps *all* such situations, albeit to a limited
extent, and at the potential cost of extra CPU time.  The trouble is that
antimemoization hasn't been well tested, so it's not the default (which is
why it isn't well-tested).  Perhaps I should make it default on the 1.1
branch of darcs, just to get more testing.

There may also be parts of the code that need to be made lazier or stricter
to give better memory behavior.  Most likely we need increased laziness in
places, which can allow us to hold less data in memory.
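
A rough sketch of why laziness can reduce what is held in memory, in
Python, with generators standing in for Haskell's lazy evaluation (only
an analogy, and not darcs code).  The function names and the read
callback are hypothetical.

```python
def load_eager(paths, read):
    """Read every file up front: all contents are in memory at once."""
    return [read(p) for p in paths]


def load_lazy(paths, read):
    """Yield contents on demand: each one can be freed once consumed."""
    for p in paths:
        yield read(p)


def total_size(contents):
    """Consume contents one at a time, keeping only a running total."""
    return sum(len(c) for c in contents)
```

Both loaders give the same total, but with load_lazy nothing is read
until the consumer asks for it, so only one file's contents needs to be
live at a time.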

> From a marketing or "new user" perspective, this is rather important to
> address, because it's one of the first impressions people have with
> darcs.

Indeed, it would be great to improve the initial record experience, but I'd
prefer to do so in a way that would also improve darcs' general behavior.

David Roundy
