[darcs-users] Handling relative directories

Jason Dagit dagit at codersbase.com
Thu Apr 8 04:17:44 UTC 2010

On Wed, Apr 7, 2010 at 9:45 AM, Duncan Coutts
<duncan.coutts at googlemail.com> wrote:

> On Tue, 2010-04-06 at 09:03 -0700, Jason Dagit wrote:
> > On Tue, Apr 6, 2010 at 6:26 AM, Petr Rockai <me at mornfall.net> wrote:
> >         Duncan Coutts <duncan.coutts at googlemail.com> writes:
> >         > One option would be to use a representation like a
> >         > reverse-order list of path components, with each component
> >         > stored as a short packed string. That allows for sharing
> >         > between paths and would reduce the cost of using long absolute
> >         > paths.
> >
> >         Interesting idea. I have already started using a path type
> >         that is a list of components (represented as bytestrings), it
> >         just did not occur to me to make it reverse to improve
> >         sharing. I'll try to look into doing that.
> >
> > I could be wrong, but I was under the impression that many short
> > bytestrings lead to memory fragmentation in the GC.
> Indeed, which is why I said "short packed string" rather than
> ByteString. I would implement a short packed string as a wrapper around
> GHC's ByteArray# type which can be allocated unpinned. That gives an
> overhead of 2 or 4 words compared to 5 or 8 words (differences depend on
> sharing and if the type is unpacked into a containing data constructor).
> > I was also under the impression that small bytestrings have worse
> > overhead than small Strings.
> If I recall correctly they even up at around 3-5 characters. However the
> pinning of ByteString is a major PITA. So I would not suggest using them
> for short strings.
> IMHO, ByteStrings using ForeignPtrs was a mistake (and one I hope to
> correct at some point in the future).

<with my darcs dev hat on>
It sounds like we should keep using ByteString instead of rolling our own
solution.  We're in the version control business, not the low-level string
abstraction business.  Maybe we should use ByteString and let it catch up in
optimizations on its own schedule.
<removing the hat>

How much work/effort is the refactor you talk about?  I mean, the refactor
of ByteString to not use ForeignPtrs.

How much work/effort to convince the ByteString user base that the refactor
is ready for consumption?

I ask because it's a change that could add a lot of value to our
benchmarking efforts for darcs by giving us more accurate heap usage
information and better garbage collection if the fragmentation is reduced.
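
As a rough illustration of the reverse-order component list Duncan describes
above, here is a minimal Haskell sketch. The names RevPath, root, </>,
parentOf and render are hypothetical and not taken from darcs. It also assumes
a bytestring release that ships Data.ByteString.Short, whose ShortByteString
is an unpinned packed string; that stands in here for the hand-rolled
ByteArray# wrapper mentioned in the quoted text.

```haskell
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Short as SBS
import Data.List (intercalate)

-- | A path stored leaf-first: the head of the list is the last component
-- (e.g. the file name) and the tail is the parent directory.
-- Hypothetical type for illustration only.
newtype RevPath = RevPath [SBS.ShortByteString]
  deriving (Eq, Show)

-- | The empty path (the repository root in this sketch).
root :: RevPath
root = RevPath []

-- | Append one component. This is a single cons, so every child of a
-- directory shares the list cells of its parent.
-- Note: BC.pack truncates characters above 255; a real implementation
-- would need a proper encoding step.
(</>) :: RevPath -> String -> RevPath
RevPath parent </> name = RevPath (SBS.toShort (BC.pack name) : parent)

-- | The parent directory is just the tail of the list: O(1), no copying.
parentOf :: RevPath -> Maybe RevPath
parentOf (RevPath [])      = Nothing
parentOf (RevPath (_ : p)) = Just (RevPath p)

-- | Render in the usual root-first order, separated by '/'.
render :: RevPath -> String
render (RevPath cs) =
  intercalate "/" (map (BC.unpack . SBS.fromShort) (reverse cs))
```

Loaded into GHCi, something like `render (root </> "src" </> "Darcs" </>
"Patch.hs")` shows the round trip. The point of the reversed order is that two
files in the same directory are built from the same parent value, so a long
absolute prefix is stored once rather than once per file.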
