[darcs-users] Need help getting off darcs

James D Sadler james at jamesdsadler.com
Thu Jan 3 23:44:12 UTC 2008


On 04/01/2008, David Roundy <droundy at darcs.net> wrote:
> On Thu, Jan 03, 2008 at 08:40:52AM +1100, James D Sadler wrote:
> > A 150 MB patch is far from ideal, yes, but I disagree that it is an
> > error on our part.  We can perform gets and pulls just fine at work,
> > and that means that darcs is handling that patch just fine in that
> > situation.   It doesn't handle that patch when trying a pull that
> > attempts to pull that one patch *only*.  Darcs consumes inordinate
> > amounts of RAM in order to do this - my guess is that Darcs is
> > scanning other patches in order to check for dependency relationships
> > with the patch I need to pull.
>
> No, it's just that darcs is optimized for the common case.  In the common
> case of pulling, the remote repository has only a relatively small number
> of patches that we do not have locally, and reading them all into memory
> makes sense.  When pulling into an empty repository, this means that one
> always reads (and parses) the entire remote repository into memory.  Darcs
> doesn't need to check any dependency relationships, it just grabs
> everything into memory so it can nicely prompt you interactively to see
> which changes you want.
>
> One slow but cheap (in terms of memory use) approach would be to start with
> a current repository and unpull patches one at a time (possibly optimizing
> in between).  Each step would be fast and use little memory, since each
> operation is a "normal" operation.
>
> Also more efficient than pulling into an empty repository would be darcs
> get --to-patch, or the like.
>
> > While writing a tool a couple of months ago to extract the content of
> > a darcs repo without invoking darcs itself, I got quite familiar with
> > the patch format.  Something that stood out immediately was that
> > patches do not contain references to the patches that they depend on.
> > Essentially that means that darcs has to do a *lot* of work in order
> > to figure out the dependency relationships - I think it is here where
> > it loads all the patches into RAM and uses the 'patch algebra' to make
> > its conclusions.  This is the error, IMHO. (I am trying to tread
> > carefully here, it's not my intention to start a flame war :0)   )
>
> The alternative is making darcs record O(N) where N is the size of the
> repository.  That's unacceptable.  Also, it would make the disk use of a
> repository O(N^2) in the (uncommon) worst-case scenario.

I don't understand how patches having back references to their
dependencies leads you to conclude that record would be O(N) and size
of the repository O(N^2).  This is not a challenge, I am just missing
the 'glue' that links my statements to your conclusion!
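
To make the question concrete, here is a minimal sketch of one possible
reading of the disk-use figure.  It is only a guess at the reasoning, not
a description of how darcs stores anything.  The assumption (mine, not
stated in the thread) is that each patch would store one explicit
reference per dependency, and that in the worst case every patch depends
on all patches recorded before it.  The function name is hypothetical.

```python
def total_dependency_refs(n_patches: int) -> int:
    """Count stored back references in the assumed worst case.

    Assumption (illustrative only): patch number i (0-based) depends on
    all i patches recorded before it and stores one reference for each.
    """
    return sum(i for i in range(n_patches))


for n in (10, 100, 1000):
    refs = total_dependency_refs(n)
    # The sum 0 + 1 + ... + (n - 1) has the closed form n * (n - 1) / 2,
    # which grows quadratically in the number of patches.
    assert refs == n * (n - 1) // 2
    print(n, refs)
```

Under the same assumption, recording the newest patch alone would have to
write up to N - 1 references, which may be where the O(N) cost for record
comes from.  Whether that worst case is the one you had in mind is exactly
what I am unsure about.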

-- 
James

