[darcs-users] Re: /proc/.../maps results

David Roundy droundy at abridgegame.org
Sun Feb 27 14:00:39 UTC 2005


On Sun, Feb 27, 2005 at 12:58:52AM +0100, Peter Strand wrote:
> David Roundy wrote:
> >It's reasonable if the garbage collector hasn't run recently.  Darcs only
> >unmaps files on GC, and when a file gets modified, darcs creates a new
> >inode and then (when it wants to read the modified version) mmaps the file
> >again.
> >
> >It may be that we should trigger the GC manually to ensure that mmapped
> >files are released in a timely manner, but on the other hand, we don't want
> >to waste time garbage collecting if we can avoid it.
> 
> I think the possibility to use a function like
> 
> withMappedFile :: FileName -> (FileContents -> IO ()) -> IO ()
> (or something similar)
> 
> should be considered as well, to get exact control of resource usage.
> Sadly, it doesn't fit at all with how darcs currently treats slurpies,
> from what I have seen.

I agree that this would be an improvement, and I'd like to see it
implemented, but don't think we should eliminate the current GC-based mmap
IO, since there are times (e.g. when reading a patch file) when we want to
hold the mmapped file in memory for a long time, during which we'll be
doing lots of interesting stuff.

But for simple things like applying patches to _darcs/current/,
withMappedFile would be perfect.  This is the sort of thing I'd like to do
with the DarcsIO module that is in darcs-unstable.  It's not complete, and
will take a bit more work, but my hope is that the polymorphic apply in
PatchApply will eventually get finished and be used to implement
apply_to_slurpy, after which we can start applying patches directly to the
filesystem without a Slurpy intermediate.  Then, ideally, we'd implement
the IO instance of mModifyFileXXX using withMappedFile rather than using
mmapFilePS as we currently do.
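
A minimal sketch of the bracket-style shape such a withMappedFile could
take is given here.  It is not darcs code and rests on several assumptions:
a strict ByteString stands in for PackedString, FilePath stands in for
FileName, and an ordinary file read stands in for a real mmap/munmap pair.
Only the scoping is meant to be illustrative: the resource is acquired,
handed to the callback, and released as soon as the callback returns or
throws.

```haskell
import Control.Exception (bracket)
import qualified Data.ByteString as B
import System.IO (IOMode (ReadMode), hClose, openBinaryFile)

-- | Run an action on a file's contents and release the underlying
-- resource deterministically, without waiting for the garbage collector.
--
-- ASSUMPTION: this stand-in reads the file through a Handle.  A real
-- version would acquire with mmap and release with munmap in the same
-- bracket, and the contents would then be invalid after the callback.
withMappedFile :: FilePath -> (B.ByteString -> IO ()) -> IO ()
withMappedFile path action =
  bracket (openBinaryFile path ReadMode) hClose $ \h -> do
    -- Strict hGetContents reads everything and closes the handle itself;
    -- the hClose in bracket is then a no-op, but it still covers the
    -- case where an exception is thrown before the read completes.
    contents <- B.hGetContents h
    action contents
```

A call such as withMappedFile "some/file" (\bs -> print (B.length bs))
touches the contents only inside the callback.  The () result type does not
make escape impossible, but it removes the most obvious way to hand the
contents back to the caller after the mapping is gone.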

> The use of unsafeInterleaveIO to get pure and lazy reading of files is
> nice, but it does have its share of drawbacks as well.
> 
> As a compromise, we could manually keep track of mapped files, and close 
> them between rounds of modifications (patch applications, in the get
> case?), like the AtExit stuff but not necessarily at exit.

This is frightening... not to say it's a bad idea, but it seems like asking
for hard-to-track-down bugs.  withMappedFile is nice, because it's clean
and safe--mostly because you give it a return type of ().
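
For concreteness, the tracked-mapping compromise might look roughly like
the sketch below.  All names here (MapRegistry, registerRelease,
releaseAll) are hypothetical and nothing like this exists in darcs; each
registered IO () stands for whatever would unmap one file.  It also shows
where the danger lies: nothing stops code from holding on to file contents
whose release action has already been run by releaseAll.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef)

-- | Hypothetical registry of pending release actions, one per mapping
-- (in a real version each action would unmap a file).
newtype MapRegistry = MapRegistry (IORef [IO ()])

newRegistry :: IO MapRegistry
newRegistry = MapRegistry <$> newIORef []

-- | Record how to release a mapping created since the last flush.
registerRelease :: MapRegistry -> IO () -> IO ()
registerRelease (MapRegistry ref) release =
  atomicModifyIORef' ref (\pending -> (release : pending, ()))

-- | Run every pending release action (newest first) and empty the
-- registry, e.g. between rounds of patch application.
releaseAll :: MapRegistry -> IO ()
releaseAll (MapRegistry ref) = do
  pending <- atomicModifyIORef' ref (\rs -> ([], rs))
  sequence_ pending
```

Calling releaseAll a second time does nothing, since the list is emptied
atomically before the actions run.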

> Deterministic mapping of files would also make it possible to use mmap
> on Windows, I believe. Last time I looked, overly long-lived mappings
> were the showstopper there.

Indeed, that's the one reason I'd lean towards something like this.  On the
plus side, I imagine we can use withMappedFile safely on Windows.  Since
it's always better to use withMappedFile (which I'd probably type as
FileName -> (PackedString -> IO ()) -> IO ()) than to rely on the GC to do
the unmapping, perhaps we could make a gradual transition, incrementally
using withMappedFile in more and more places.  If this were done in
conjunction with a switch to using apply directly on the filesystem, and
combined with lazy patch parsing (so that we can apply the patch as we
read it and don't need to hold the whole patch in memory), we ought to be
able to make considerable improvements in darcs' memory usage.
-- 
David Roundy
http://www.darcs.net



