[darcs-users] darcs patch: optimize firstspace/firstnonspace to remove boxed retu...

Jason Dagit dagit at codersbase.com
Mon Sep 28 21:48:50 UTC 2009


On Mon, Sep 28, 2009 at 3:37 AM, Ian Lynagh <igloo at earth.li> wrote:

> On Sun, Sep 27, 2009 at 03:10:01PM -0700, Jason Dagit wrote:
> >
> > While profiling away on darcs I discovered that we create a lot of
> > intermediate garbage while parsing due to what is otherwise a very,
> > very small intermediate allocation.
> >
> > What magnifies this allocation into a problem is that we're allocating
> > an Int for each element of the bytestrings that we traverse with this
> > function.  GHC is unable to optimize this because in the IO monad it
> > must return a boxed Int instead of an unboxed one.
>
> MagicHash should be avoided wherever possible, IMO.
>
> Can you give a small, standalone program showing the problem and your
> solution, please?
>

I haven't been able to, no.  I tried several different things with toy
examples, but I wasn't able to reproduce the behaviour.  Perhaps I could if I
worked from the darcs source itself, but the functions I modified are pretty
simple definitions, so I don't think that's what you're looking for (or else
you'd just look at the darcs source).
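
For reference, they are essentially tight loops over a Ptr Word8 in IO,
roughly this shape (a sketch only, not the exact darcs definitions;
firstnonspace is the mirror image):

  import Data.Word        (Word8)
  import Foreign.Ptr      (Ptr)
  import Foreign.Storable (peekElemOff)

  -- scan forward from index n, stopping at index m or at the first
  -- space byte (ASCII 32), and return the index reached
  firstspace :: Ptr Word8 -> Int -> Int -> IO Int
  firstspace ptr n m
      | n >= m    = return n
      | otherwise = do w <- peekElemOff ptr n
                       if w /= 32 then firstspace ptr (n+1) m
                                  else return n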

Some things I do know:
1) adding an INLINE pragma to firstspace/firstnonspace without other changes
doesn't make a difference.
2) refactoring firstspace/firstnonspace so that they are wrappers around a
worker function and then inlining them (roughly the sketch after this list)
does reduce the allocations quite a bit for firstspace, but doesn't get rid
of the CPU usage.
3) the patch I sent in makes firstspace stop appearing among the functions at
the top of the .prof file (meaning it no longer accounts for much allocation
or CPU time).
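
For concreteness, the worker/wrapper refactoring in (2) was along these lines
(again a sketch rather than the exact code I tried; same imports as above):

  {-# INLINE firstspace #-}
  firstspace :: Ptr Word8 -> Int -> Int -> IO Int
  firstspace ptr n0 m = go n0
    where
      -- the recursion lives in the local worker, so the wrapper can be
      -- inlined at each call site
      go n | n >= m    = return n
           | otherwise = do w <- peekElemOff ptr n
                            if w /= 32 then go (n+1) else return n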

I think I mis-explained the problem previously.  I was thinking it had more
to do with the recursive call to firstspace, but on further inspection
(having failed to construct a minimal example based on the recursive call),
I'm not sure what causes the allocations.  I do know that the total run time
and the total number of allocated bytes go down, so I don't think it's just a
change in profiling accounting due to better inlining.  The patch does seem
to really solve the problem, and the other things I've tried do not.  I just
can't seem to reproduce it with a toy example.

This function is only used within the module, and it is used heavily.
Perhaps this MagicHash version is simply closer to the metal and works better
for that reason.
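
To give an idea of what I mean, the MagicHash style in question is roughly
the following (a sketch of the general idea only, not the exact patch; same
imports as above, plus GHC.Exts):

  {-# LANGUAGE MagicHash #-}
  import GHC.Exts (Int(I#), Int#, (+#))

  -- the loop index is an unboxed Int#; it is reboxed with I# only where a
  -- boxed Int is needed (the bounds check, peekElemOff, and the result)
  firstspace :: Ptr Word8 -> Int -> Int -> IO Int
  firstspace ptr (I# n0) m = go n0
    where
      go n | I# n >= m = return (I# n)
           | otherwise = do w <- peekElemOff ptr (I# n)
                            if w /= 32 then go (n +# 1#)
                                       else return (I# n)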

Sorry that I don't have better answers or explanations.

Jason