[darcs-users] Need help getting off darcs

David Roundy droundy at darcs.net
Thu Jan 3 17:19:44 UTC 2008


On Thu, Jan 03, 2008 at 08:40:52AM +1100, James D Sadler wrote:
> A 150 MB patch is far from ideal, yes, but I disagree that it is an
> error on our part.  We can perform gets and pulls just fine at work,
> and that means that darcs is handling that patch just fine in that
> situation.   It doesn't handle that patch when trying a pull that
> attempts to pull that one patch *only*.  Darcs consumes inordinate
> amounts of RAM in order to do this - my guess is that Darcs is
> scanning other patches in order to check for dependency relationships
> with the patch I need to pull.

No, it's just that darcs is optimized for the common case.  In the common
case of pulling, the remote repository has only a relatively small number
of patches that we do not have locally, and reading them all into memory
makes sense.  When pulling into an empty repository, this means that one
always reads (and parses) the entire remote repository into memory.  Darcs
doesn't need to check any dependency relationships, it just grabs
everything into memory so it can nicely prompt you interactively to see
which changes you want.

One slow but cheap (in terms of memory use) approach would be to start with
a current repository and unpull patches one at a time (possibly optimizing
in between).  Each individual step would be fast and use little memory,
since each operation is a "normal" operation; it is only the total that is
slow.

Also more efficient than pulling into an empty repository would be darcs
get --to-patch, or the like.

> While writing a tool a couple of months ago to extract the content of
> a darcs repo without invoking darcs itself, I got quite familiar with
> the patch format.  Something that stood out immediately was that
> patches do not contain references to the patches that they depend on.
> Essentially that means that darcs has to do a *lot* of work in order
> to figure out the dependency relationships - I think it is here where
> it loads all the patches into RAM and uses the 'patch algebra' to make
> its conclusions.  This is the error, IMHO. (I am trying to tread
> carefully here, it's not my intention to start a flame war :0)   )

The alternative is making darcs record O(N) where N is the size of the
repository.  That's unacceptable.  Also, it would make the disk use of a
repository O(N^2) in the (uncommon) worst-case scenario.
-- 
David Roundy
Department of Physics
Oregon State University

