[darcs-users] [OT] Larry McVoy on the Bitkeeper licence

Ralph Corderoy ralph at inputplus.co.uk
Sat Feb 19 13:37:06 UTC 2005


Hi David,

> > Stupid question: why does the application have to care for holding
> > the stuff in memory, rather than the operating system's buffer or
> > cache?
> 
> I just mean we parse it and store the parsed information.  Parsing
> does take some time, and that (plus the file or network IO) is what we
> want to avoid repeating, but of course for sufficiently large
> repositories (or sufficiently small memory present) you'd rather throw
> away the parsed result as you use it and then reparse later.  It's a
> tradeoff between memory and time behavior, and darcs right now
> optimizes on time in this instance.

Clearly, memory's exhaustible, and typically before disc is.  What's
the performance like at the other extreme, where nothing is kept and
everything is thrown away?  There's the system call overhead of asking
the OS for the data again.  Is parsing expensive, and if so, is that
because more is being parsed out than is often necessary, perhaps
because the files are compressed?
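
Out of curiosity, the re-read cost is easy to put a number on.  A
rough C sketch, nothing darcs-specific and with a made-up filename,
that slurps the same file twice; the second pass should be served from
the OS's page cache, so the gap between the two is roughly the syscall
and copying overhead of not holding the data in the application:

    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/time.h>

    /* Read a whole file and return the elapsed time in seconds.  If
     * the file was already cached the first run will be warm too, so
     * treat the absolute numbers with suspicion. */
    static double slurp(const char *path)
    {
        struct timeval t0, t1;
        char buf[65536];
        int fd = open(path, O_RDONLY);
        if (fd < 0) { perror(path); exit(1); }
        gettimeofday(&t0, NULL);
        while (read(fd, buf, sizeof buf) > 0)
            ;
        gettimeofday(&t1, NULL);
        close(fd);
        return (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    }

    int main(void)
    {
        /* "some-patch" stands in for whatever darcs would re-read. */
        printf("first:  %fs\n", slurp("some-patch"));
        printf("second: %fs\n", slurp("some-patch"));
        return 0;
    }

If the second figure is small compared with the parse time, throwing
the raw bytes away costs little; the real question is then just how
expensive the parse itself is.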

Personally, having a program store each of its files, e.g. patches, in
compressed form seems a bit of a pain.  They don't fit in well with
the usual Unix tools.  It takes time for backup programs to fail to
compress them.  If I'm short of disc space I'd rather use a
compressing filesystem and reap the benefits everywhere.  And I can't
fail to mention the ugly `they're called *.gz but they might not be
compressed'.
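
Sniffing the content rather than trusting the name is at least cheap:
RFC 1952 says a gzip stream always begins with the two magic bytes
0x1f 0x8b.  A small C illustration, not anything darcs itself does:

    #include <stdio.h>

    /* Return 1 if the file starts with the gzip magic bytes, 0 if
     * not, and -1 if it can't be opened at all. */
    int looks_gzipped(const char *path)
    {
        unsigned char magic[2];
        FILE *fp = fopen(path, "rb");
        if (fp == NULL)
            return -1;
        if (fread(magic, 1, 2, fp) != 2) {
            fclose(fp);
            return 0;
        }
        fclose(fp);
        return magic[0] == 0x1f && magic[1] == 0x8b;
    }

    int main(int argc, char **argv)
    {
        int i;
        for (i = 1; i < argc; i++)
            printf("%s: %s\n", argv[i],
                   looks_gzipped(argv[i]) == 1 ? "gzip" : "not gzip");
        return 0;
    }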

Given that darcs may have a large dataset to access, has using a
cache-friendly library like Judy arrays been considered?

    http://judy.sourceforge.net/
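
For reference, the JudyL flavour maps word-sized keys to word-sized
values through a handful of macros.  A minimal C sketch of insert,
lookup, and free, going by Judy's documented interface:

    #include <stdio.h>
    #include <Judy.h>   /* link with -lJudy */

    int main(void)
    {
        Pvoid_t judy = (Pvoid_t) NULL;  /* an empty JudyL array */
        Word_t  index, bytes;
        PWord_t pvalue;

        /* JLI yields a pointer to the value slot for a key,
         * creating the slot if need be. */
        for (index = 0; index < 1000000; index++) {
            JLI(pvalue, judy, index);
            *pvalue = index * 2;
        }

        /* JLG looks a key up; pvalue is NULL if it's absent. */
        JLG(pvalue, judy, 123456);
        if (pvalue != NULL)
            printf("123456 -> %lu\n", (unsigned long) *pvalue);

        /* JLFA frees the whole array, reporting how much memory
         * it had used. */
        JLFA(bytes, judy);
        printf("freed %lu bytes\n", (unsigned long) bytes);
        return 0;
    }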

Cheers,


Ralph.
