[darcs-users] Need help getting off darcs
James D Sadler
james at jamesdsadler.com
Thu Jan 3 23:44:12 UTC 2008
On 04/01/2008, David Roundy <droundy at darcs.net> wrote:
> On Thu, Jan 03, 2008 at 08:40:52AM +1100, James D Sadler wrote:
> > A 150 MB patch is far from ideal, yes, but I disagree that it is an
> > error on our part. We can perform gets and pulls just fine at work,
> > and that means that darcs is handling that patch just fine in that
> > situation. It doesn't handle that patch when trying a pull that
> > attempts to pull that one patch *only*. Darcs consumes inordinate
> > amounts of RAM in order to do this - my guess is that Darcs is
> > scanning other patches in order to check for dependency relationships
> > with the patch I need to pull.
>
> No, it's just that darcs is optimized for the common case. In the common
> case of pulling, the remote repository has only a relatively small number
> of patches that we do not have locally, and reading them all into memory
> makes sense. When pulling into an empty repository, this means that one
> always reads (and parses) the entire remote repository into memory. Darcs
> doesn't need to check any dependency relationships, it just grabs
> everything into memory so it can nicely prompt you interactively to see
> which changes you want.
>
> One slow but cheap (in terms of memory use) approach would be to start with
> a current repository and unpull patches one at a time (possibly optimizing
> in between). This would be fast and use little memory, since each
> operation is a "normal" operation.
>
> Also more efficient than pulling into an empty repository would be darcs
> get --to-patch, or the like.
>
> > While writing a tool a couple of months ago to extract the content of
> > a darcs repo without invoking darcs itself, I got quite familiar with
> > the patch format. Something that stood out immediately was that
> > patches do not contain references to the patches that they depend on.
> > Essentially that means that darcs has to do a *lot* of work in order
> > to figure out the dependency relationships - I think it is here where
> > it loads all the patches into RAM and uses the 'patch algebra' to make
> > its conclusions. This is the error, IMHO. (I am trying to tread
> > carefully here, it's not my intention to start a flame war :0) )
>
> The alternative is making darcs record O(N) where N is the size of the
> repository. That's unacceptable. Also, it would make the disk use of a
> repository O(N^2) in the (uncommon) worst-case scenario.
I don't understand how patches having back references to their
dependencies leads you to conclude that record would be O(N) and size
of the repository O(N^2). This is not a challenge, I am just missing
the 'glue' that links my statements to your conclusion!
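
One possible reading of the claim, sketched in Python under assumptions
that are not from this thread: suppose each patch stored an explicit
reference to every earlier patch it depends on. In the worst case, where
every patch depends on all of its predecessors, recording patch number N
would write N references, and the repository as a whole would hold
0 + 1 + ... + (N-1) = N(N-1)/2 of them. The function names are
hypothetical, and this models neither darcs's real patch format nor
necessarily the argument David has in mind.

```python
# Hypothetical model: every patch carries explicit back references to the
# earlier patches it depends on. This is NOT darcs's actual on-disk format.


def backrefs_for_new_patch(num_existing: int) -> int:
    """Worst case: the new patch depends on every existing patch,
    so recording it writes one reference per existing patch."""
    return num_existing


def total_backrefs(num_patches: int) -> int:
    """Worst case over the whole repository: patch k (counting from 0)
    references all k patches before it, giving 0 + 1 + ... + (N-1)."""
    return sum(range(num_patches))


if __name__ == "__main__":
    # Print repository size, references written by the next record,
    # and references stored in total.
    for n in (10, 100, 1000):
        print(n, backrefs_for_new_patch(n), total_backrefs(n))
```

In this model the per-record cost grows linearly with N and the total
grows quadratically, which would match the O(N) and O(N^2) figures, but
only if the worst-case assumption above is the intended one.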
--
James