[darcs-users] Large initial sets

Ivan Stankovic pokemon at fly.srk.fer.hr
Thu Aug 26 14:16:41 UTC 2004


On Thu, Aug 26, 2004 at 06:56:09AM -0400, David Roundy wrote:
> On Wed, Aug 25, 2004 at 01:08:35PM +0200, Ivan Stankovic wrote:
> > This makes me wonder, why couldn't darcs do it automatically, in one
> > (initial) record?  Do _all_ patches really need to be held in memory at
> > the same time?  What's wrong with darcs recursively adding
> > subdirectories?
> 
> Perhaps a --adds-only flag to record would be possible, which would allow
> record to optimize its memory consumption because it would know that it
> wouldn't need to commute patches around in order to "order" the patch.

Sorry for my ignorance, but does this mean that --adds-only would only
avoid commuting patches, while all of them would still be held in memory?
If so, I don't see a great benefit: it would certainly help, but having to
load the entire (potentially huge) repo into memory would still be a pain.

One more thing: since --adds-only would be used only for the initial record
(right?), why couldn't darcs recognize that case automatically and behave
accordingly, eliminating the need for yet another flag?

> Breaking an initial patch into multiple patches, while possible, is an
> awfully ugly hack that I'd rather leave to the user.  It should be possible
> to reduce memory use on the initial record to a reasonable level--that is
> to say, to less than or equal to the memory used by darcs get on the same
> patch.  Currently it's something like 50% higher than the memory used by
> darcs get.

That sounds good.  I think an ideal initial darcs record should take no
longer than the time needed to copy the whole repo, plus some relatively
small constant.

> One area where we may be able to make some memory improvements is in the
> handling of filenames.  I can switch to a packed representation.  The catch
> is that then I'd have to unpack the filenames every time I want to pass
> them to the standard library routines, and last time I tried this it gave a
> slight performance hit (although that was a long time ago now).  Filenames
> in darcs are well encapsulated, so it wouldn't be too hard to test this
> out.  Probably it wouldn't make much gain in the memory usage.  

Yes; in fact, on large repos with thousands of files I'd expect the memory
gains to be outweighed by the overhead of the packing/unpacking routines.
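
To make the trade-off concrete, here is a rough sketch in Python (not darcs
code; PackedPath and the example path are made up for illustration).  It
assumes the unpacked form is a per-character list, with a Python list
standing in for that, and shows that the packed form is smaller but has to
be unpacked before each call into a standard library routine.

```python
import os
import sys


class PackedPath:
    """Hypothetical filename wrapper that stores the name as packed bytes."""

    __slots__ = ("_raw",)

    def __init__(self, path: str) -> None:
        # Packed form: one contiguous UTF-8 buffer.
        self._raw = path.encode("utf-8")

    def unpack(self) -> str:
        # This decode is the cost paid on every library call.
        return self._raw.decode("utf-8")

    def basename(self) -> str:
        # The library routine wants the unpacked form, so unpack first.
        return os.path.basename(self.unpack())


path = "src/some/deep/directory/Example.lhs"
as_list = list(path)          # stand-in for a per-character representation
packed = PackedPath(path)

# getsizeof on the list counts only its pointer array, so this comparison
# is, if anything, generous to the unpacked form.
print(sys.getsizeof(as_list), sys.getsizeof(packed._raw))
```

The saving per filename is real, but every basename() call above pays for a
full decode, which is the performance hit David describes.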

> Probably the only thing that will help is increasing the laziness of the
> record so it won't need to create the entire patch at once in memory.

Definitely.
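
To illustrate what I mean, here is a minimal sketch in Python (not darcs
code; iter_add_patches, record_lazily and the one-line patch text are
hypothetical).  It assumes an adds-only record, so no commuting is needed,
and it ignores file contents entirely.  Each "add" is produced while walking
the tree and written out immediately, so only one patch is in memory at a
time.

```python
import os
from typing import Iterator


def iter_add_patches(root: str) -> Iterator[str]:
    """Yield one illustrative 'add' patch at a time while walking root."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        rel_dir = os.path.relpath(dirpath, root)
        if rel_dir != ".":
            yield f"adddir ./{rel_dir}"
        for name in sorted(filenames):
            rel = os.path.normpath(os.path.join(rel_dir, name))
            yield f"addfile ./{rel}"


def record_lazily(root: str, out_path: str) -> int:
    """Write each patch as soon as it is produced; return how many."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for patch in iter_add_patches(root):
            out.write(patch + "\n")  # nothing is accumulated in a list
            count += 1
    return count
```

Here out_path should live outside root, otherwise the walk would pick up the
output file itself.  With this shape, peak memory depends on the largest
single patch rather than on the size of the whole tree.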
-- 
Ivan Stankovic, pokemon at fly.srk.fer.hr



