[darcs-users] Possibly a very simplistic solution

David Roundy droundy at abridgegame.org
Wed May 19 11:20:34 UTC 2004


On Wed, May 19, 2004 at 08:29:15AM +1000, Nigel Rowe wrote:
> A possibly overly simplistic solution occurred to me, to the problem of find 
> or grep -r finding files in _darcs/current.
> 
> Is there any reason why darcs could not store them compressed?  After all 
> patches are compressed.  For that matter (to carry the idea further) they 
> could be named in the same way, using the sha1 of date/creator/name.
> 
> Am I missing anything here?

Files in _darcs/current could be compressed, but that would have a couple
of downsides.  One is that we could no longer check whether the two files
are the same size, which is a nice thing to check.  When recording, it is
assumed (unless --ignore-times is specified) that files which have the same
modification time as their counterpart in _darcs/current *and* have the
same length were not modified.  So removing this check would make it easier
to modify the file right after recording and have darcs not notice.
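
To make the check concrete, here is a rough sketch in Python (not darcs's
actual Haskell code; the function name and arguments are made up for
illustration).  It only decides whether a file can be skipped without
reading it:

```python
import os

def looks_unmodified(working_path, pristine_path, ignore_times=False):
    """Guess whether a working file is unchanged without reading it.

    Hypothetical helper: mirrors the heuristic described above, where a
    file is assumed unmodified if both its modification time and its
    length match the copy kept in _darcs/current.
    """
    if ignore_times:
        # With --ignore-times the shortcut is disabled, so the caller
        # has to compare the contents itself.
        return False
    working = os.stat(working_path)
    pristine = os.stat(pristine_path)
    return (working.st_mtime_ns == pristine.st_mtime_ns
            and working.st_size == pristine.st_size)
```

If the pristine copy were compressed, its on-disk size would no longer be
comparable, and only the modification time would be left to go on.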

The other downside is speed and memory usage.  When files in _darcs/current
are read, they are normally mmapped if they're big enough.  (mmap always
uses at least a page of memory, so it doesn't pay off for small files.)  Of
course, for small files that aren't mmapped, compressing them won't help
with disk usage.  _darcs/current gets read very often, including by
commands that you don't want to wait on, like whatsnew, so I hesitate to
slow things down.
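
The size-based choice between mmap and an ordinary read looks roughly like
this in Python.  This is only a sketch: the one-page threshold and the
function name are assumptions for illustration, not what darcs actually
uses.

```python
import mmap
import os

def count_newlines(path, threshold=mmap.PAGESIZE):
    """Count newline bytes, using mmap only for files of at least `threshold` bytes."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        if size == 0 or size < threshold:
            # Small file: a mapping would cost a whole page anyway,
            # so just read it into memory.
            return f.read().count(b"\n")
        # Large file: map it read-only and scan it in place.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            count = 0
            pos = m.find(b"\n")
            while pos != -1:
                count += 1
                pos = m.find(b"\n", pos + 1)
            return count
```

A compressed file could not be scanned in place like this; it would have to
be decompressed into memory first.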

I think the best direction to move in would be to support (optionally
or configurably) a Berkeley DB database for _darcs/current.  This
"shouldn't" be too hard, and would give us all sorts of nice features like
transactions, etc.  It would also probably be faster than the current
implementation, since we could tune the database to our access times, and
we'd be able to avoid system calls to a large extent.  On big repos, the
bottleneck is often stat(2) calls to find modification times and file
sizes.
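
As a very rough sketch of the idea, the following Python uses the standard
sqlite3 module purely as a stand-in for Berkeley DB (an assumption made so
the example is self-contained; the table layout and function names are
also hypothetical).  It shows the two properties mentioned above: updates
are transactional, and sizes and modification times for the whole tree
come from one query rather than one stat(2) call per pristine file.

```python
import sqlite3

def open_store(db_path):
    """Open (or create) a hypothetical single-file pristine store."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pristine ("
        "path TEXT PRIMARY KEY, mtime_ns INTEGER, size INTEGER, content BLOB)"
    )
    return conn

def record(conn, entries):
    """Store (path, mtime_ns, content) entries as one all-or-nothing update."""
    with conn:  # commits on success, rolls back if an exception escapes
        for path, mtime_ns, content in entries:
            conn.execute(
                "INSERT OR REPLACE INTO pristine VALUES (?, ?, ?, ?)",
                (path, mtime_ns, len(content), content),
            )

def metadata(conn):
    """Return {path: (mtime_ns, size)} for every stored file in one query."""
    rows = conn.execute("SELECT path, mtime_ns, size FROM pristine")
    return {path: (mtime_ns, size) for path, mtime_ns, size in rows}
```

The working copy would still need to be stat'ed, of course; this only
removes the calls on the pristine side.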

As for changing the names of files in _darcs/current, that would be very
awkward to do.
-- 
David Roundy
http://www.abridgegame.org



