[darcs-users] filecache and making darcs faster

Eric Kow kowey at darcs.net
Wed Aug 26 13:00:20 UTC 2009

Hi Taylor, Benedikt and darcs users,

Filecache optimisation

On Wed, Aug 26, 2009 at 04:18:10 -0400, Taylor R Campbell wrote:
> I found two issues about performance of darcs changes and annotate,
> issue124 and issue984, as well as some discussion on the mailing list
> from October of last year (citing me, even) about such a cache,
> particularly

I've opened a ticket just for this: http://bugs.darcs.net/issue1566

> I see no recent evidence of progress toward a file cache in a Darcs
> release to speed up darcs changes and annotate.

I want to make it clear that YES the Darcs team is absolutely interested
in making the filecache optimisation happen.  It has been on our roadmap
for a while <http://wiki.darcs.net/Roadmap>.

We did not make it for Darcs 2.3.  We hope to have it done for Darcs
2.4, but it may well take longer than that.  I'm afraid I cannot offer
you much more than the "it will be done when it's done" timetable.

Darcs performance progress

Now let's revisit Bryan's observations:
| Why isn't everyone using Darcs, then?  For years, it had severe
| performance problems that made it completely impractical.  These have
| been addressed, to the point where it is now merely quite slow.

First: He's absolutely right.  Darcs is still quite slow.
We should use that to spur ourselves on.

Let's look at those performance bugs in more detail:

I think you can divide the problems and proposed solutions into five
broad categories.  Here are the ways Darcs has improved in the past
three years and how we hope to continue improving it in years to come:

   * patch theory
      - DONE: darcs-2 conflict handling
      - camp/darcs-3 conflict handling

   * patch practice
      - filecache <http://bugs.darcs.net/issue1566>
      - chunky hunks <http://bugs.darcs.net/issue1357>
      - better binary patches <http://bugs.darcs.net/issue1009>

   * network
      - DONE: hashed repos (--lazy)
      - DONE: HTTP pipelining (needs recent curl + cabal configure flag)
      - DONE: darcs transfer-mode (needs remote Darcs 2)
      - packing (future hashed-storage) <http://bugs.darcs.net/issue1535>

   * disk
      - DONE: hashed repos (--lazy)
      - DONE: GHC library bug <http://bugs.darcs.net/issue973>
      - packing (future hashed-storage) <http://bugs.darcs.net/issue1535>
      - GSoC hashed-storage work (to merge in)

   * memory
      - TODO: ???

How to help (profile!)

The point I'm trying to get across here is that YOU can help!  Because
the darcs performance issue is multi-faceted, we can chip away at a lot
of these issues in parallel.

Benedikt is still working on filecache, which I believe will be a big

Ganesh and Petr are working on getting the basic hashed-storage work
merged into Darcs, which I think should be our top priority.  It may
take some time because the work consists of moving a critical component,
one that has never really been well understood outside of Darcs, into a
third-party library.  We *will* get there, but it
will take time, and a lot of patience and good humour. Good thing that
third-party library is written by a Darcs hacker :-)

So where do the rest of us fit in?

Well, do you notice that big gaping (???) in the 'memory' section?  This
is where we need some loving from the wider Haskell community.  We need
people to profile Darcs.  We need people to make it easier for other
people to profile Darcs.  I would not be surprised if there are some
obvious space leaks in there that any Don-blog-reading Haskeller can
crush without caring about how Darcs works.  In fact, Patai Gergely was
working on a GSoC project to improve the heap profiling experience.
Maybe his hp2any work can be pressed into service!
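
For readers who have not chased a space leak before, here is a minimal
sketch of the classic kind.  It is a generic illustration, not code taken
from Darcs: the names sumLeaky and sumStrict are made up for this example.
When compiled without optimisation, the lazy fold builds a long chain of
unevaluated additions before reducing it, while the strict fold keeps a
single evaluated accumulator.  (With -O, GHC's strictness analysis may
remove the difference, so the leak is easiest to see in an unoptimised
build.)

```haskell
module Main where

import Data.List (foldl')

-- Hypothetical example, not from the Darcs source.
-- Lazy left fold: the accumulator is never forced along the way, so an
-- unoptimised build piles up one (+) thunk per list element.
sumLeaky :: [Integer] -> Integer
sumLeaky = foldl (+) 0

-- Strict left fold: foldl' forces the accumulator at every step, so the
-- running total stays a single evaluated number.
sumStrict :: [Integer] -> Integer
sumStrict = foldl' (+) 0

main :: IO ()
main = do
  let n = 1000000 :: Integer
  -- Both give the same answer; only their heap behaviour differs.
  print (sumStrict [1 .. n])
  print (sumLeaky [1 .. n] == sumStrict [1 .. n])
```

One way to see the difference, assuming a GHC with profiling libraries
installed, is to build with -prof, run the program with +RTS -hc, and
turn the resulting .hp file into a graph with hp2ps.  The same workflow
should apply to a profiled build of Darcs itself.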

How to help otherwise

One thing I think we need help with in particular is getting a good
understanding of what kinds of slowness people suffer from.  Are they
just dealing with residual issues like not having Darcs-2 on the server
side (no SSH connection sharing) or trying to fetch old-fashioned
repositories (Darcs now does a get --hashed by default, which means it's
converting old-fashioned to hashed on the fly, which can be slower)?  If
not, what actually is slow for them?  Are there any other obvious
problems lurking around?  Is Darcs slower on Windows and MacOS X than
on Linux?  If you can help us with the fog of war problem that would be
good too.  Maybe we need to start thinking about that Darcs User Survey
- http://bugs.darcs.net/issue1069

Otherwise, we have a sort of performance overview on
not to mention 22 performance tickets still open
and 93 probably easy bugs that you can work on if you are new to
Darcs hacking or Haskell in general:

Thanks :-)

Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
PGP Key ID: 08AC04F9