[darcs-users] darcs patch: switch Darcs.Patch.FileName to be ByteString.Char8 int...

Simon Marlow marlowsd at gmail.com
Thu Oct 1 08:18:55 UTC 2009

On 01/10/2009 07:15, Jason Dagit wrote:
> Thanks Simon and Duncan.
> I'm still trying to wrap my head around this so I have some questions below.
> On Fri, Sep 25, 2009 at 6:58 AM, Simon Marlow <marlowsd at gmail.com> wrote:

> Does this also mean that the space consumed by ByteStrings is not
> considered for the purposes of memory pressure?  I've always noticed
> that darcs tends to allocate a lot of virtual memory, and now this
> discussion makes me wonder if that's partially because the GC isn't
> collecting ByteStrings as eagerly as it does other garbage?  If this is
> the case, is there a way to specify the memory pressure of a value?
> Basically, I'm saying, "Can I give hints to the GC about the relative
> need to collect something once it becomes garbage?"

No, ByteStrings should be taken into account in the same way as 
everything else for the purposes of deciding when to GC.  However, 
because of the pinning issue, a single ByteString may hold on to 4KB of 
memory even though it only requires a tiny fraction of that.
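To put rough numbers on that, here is a minimal sketch of the worst case. It assumes every surviving ByteString is shorter than a block and ends up alone in its own pinned 4KB block; real layouts depend on allocation order and will usually be less extreme. The names blockSize, worstCaseRetained, liveBytes and names are made up for illustration and are not part of darcs or the bytestring library.

```haskell
import qualified Data.ByteString.Char8 as B

-- Assumed block size, taken from the 4KB figure mentioned above.
blockSize :: Int
blockSize = 4096

-- Upper bound on retained memory: one whole pinned block per ByteString.
-- This is an assumption for illustration, not a measurement of the RTS.
worstCaseRetained :: [B.ByteString] -> Int
worstCaseRetained bss = blockSize * length bss

-- Bytes of payload that are actually needed.
liveBytes :: [B.ByteString] -> Int
liveBytes = sum . map B.length

-- Hypothetical workload: a thousand short file names.
names :: [B.ByteString]
names = map (B.pack . ("file" ++) . show) [1 .. 1000 :: Int]

main :: IO ()
main = do
  putStrLn $ "live payload bytes:        " ++ show (liveBytes names)
  putStrLn $ "worst-case retained bytes: " ++ show (worstCaseRetained names)
```

The gap between the two figures is the fragmentation overhead: memory that is neither garbage nor useful payload.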

The reason they don't show up in heap profiles is that we have no way 
of knowing how much of that 4KB is actually live.  Perhaps we should 
consider it all live.

Hmm, I've just had a thought about how I could improve the situation in 
GHC: if we used the mark-region style GC I could reclaim some unused 
parts of the pinned blocks, which would help with the fragmentation 
problem.  I need to mull this over some more.

> Or perhaps I'm completely misunderstanding the problem here.
> I've been looking at taking some dynamic library injection code that Ian
> wrote for linux and porting it to OS X so that I can track how much
> memory we allocate via mmap (which I believe should catch everything
> that is unaccounted for in this case).
> Is there a way to convince the RTS to collect more eagerly or run any
> finalizers (or whatever it would take) to get any ByteStrings cleaned up
> more eagerly?  This might make for a nice experiment.

By using the +RTS -F flag you can make GHC collect more often, but I'd 
be surprised if that actually helps.  It might reduce your memory 
overheads, but you'll probably get better results by using +RTS -c to 
enable in-place compaction (compaction doesn't apply to ByteStrings, 
however; they are still pinned and immovable).

If you have mmapped memory with finalizers, then you could improve 
things by finalizing eagerly using finalizeForeignPtr when you know that 
something is no longer required.
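A minimal sketch of eager finalization follows. It uses a malloc'd buffer as a stand-in for an mmapped region (an assumption, to keep the example self-contained), and the function name demo and the released flag are hypothetical. The finalizer is attached with Foreign.Concurrent.newForeignPtr and then run explicitly with finalizeForeignPtr rather than waiting for the GC.

```haskell
import Data.IORef (newIORef, readIORef, writeIORef)
import Data.Word (Word8)
import qualified Foreign.Concurrent as FC
import Foreign.ForeignPtr (finalizeForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Storable (poke)

-- Returns whether the finalizer had run before and after the explicit
-- call to finalizeForeignPtr.
demo :: IO (Bool, Bool)
demo = do
  released <- newIORef False
  -- Stand-in for an mmapped region: a plain malloc'd buffer.
  p <- mallocBytes 4096
  -- The finalizer releases the buffer and records that it ran.
  fp <- FC.newForeignPtr p (free p >> writeIORef released True)
  -- Use the memory while it is still needed.
  withForeignPtr fp $ \q -> poke q (42 :: Word8)
  before <- readIORef released
  -- We know the buffer is no longer required, so release it now.
  finalizeForeignPtr fp
  after <- readIORef released
  return (before, after)

main :: IO ()
main = demo >>= print
```

After finalizeForeignPtr the pointer must not be dereferenced again, so this only makes sense at a point where you can be sure nothing else still uses the memory.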

>     The underlying problem in your example may be fragmentation, due to
>     the way that ByteStrings are pinned and hence hold on to the whole
>     4KB block in which they were allocated until they die.  Duncan has
>     been thinking about how to improve the situation, but I'm not sure
>     of the current status - Duncan?
> What is it about fragmentation that is an issue?  We are definitely
> using a lot of bytestrings and a lot of mmap memory.  If you told me
> that the GC simply isn't tracking this memory in the heap profile then
> it makes a lot of sense what I'm seeing.  If you're suggesting something
> else is going on in addition to the deficient tracking then I need to
> spend some time understanding the problem better.

That's correct: mmapped memory is not tracked by the heap profiler or 
the GC.  I hope I've explained the ByteString fragmentation problem 
above; if not, please let me know.

