[darcs-users] hashed-storage work now merged in (woo!)

Eric Kow kowey at darcs.net
Thu Oct 8 16:40:29 UTC 2009


Hi everybody,

So you may have noticed me saying this in a couple of recent threads.
Petr Ročkai's hashed-storage work from his 2009 Google Summer of Code
project has been merged!

I thought I would take a few moments to give everybody an overview of
how this work benefits us, and where we'll be going in the future.

In a nutshell
-------------
What does this mean for you?  Faster repository-local operations.

Hashed format repositories (with darcs-1 and darcs-2 patches alike)
should now be faster to use on a daily basis.  We saw the very
beginnings of this work in Darcs 2.3.0 with a faster darcs whatsnew.
Now these speed improvements cover *all* repository-local operations.

The next Darcs beta is a couple of months away, but before that,
I would like to encourage you to try this out for yourself:

  darcs get --lazy http://darcs.net
  cd darcs.net
  cabal install

For best results, please run darcs optimize --upgrade followed by darcs
optimize --pristine.  Pay attention over the next couple of weeks when
you try a record, amend, revert or unrecord.  If we've done our work
right, there should be nothing to see.  Darcs should be less noticeable,
with fewer "Synchronizing pristine" messages and a faster return to the
command prompt.  We think you'll like it.  But please get back to us.
Is Darcs faster for you?

If you're particularly interested, I will step through these changes in
greater detail at the end of this message.  Meanwhile, I would like to
step back a little and take stock of how these improvements fit into
the bigger picture.

The road ahead
--------------
The hashed storage work is a big step forward and definitely a cause for
celebration.  I think it is useful to consider how it fits in with the
other improvements we have made since Darcs 1.0.9:

 - ssh connection sharing (darcs transfer mode)
 - HTTP pipelining
 - lazy repositories
 - the global cache

and now

 - index-based diffing
 - hashed-storage efficiency

We cannot promise that Darcs will magically become fast overnight.  But
what we can and will do is continue chipping away at it, solving
problems one at a time; release by release, a little bit better, a
little bit faster every time until one day we can look back and marvel
at all the progress we've made.

So Petr's work makes Darcs easier to live with on a day-to-day basis.
But that's not enough.  Now we need to turn our attention to that
crucial first impression; what happens when people try Darcs out for the
first time is that they darcs get a repository they want and... then...
they... wait...

This is embarrassing, but we can fix it.  In fact, we already have
started working on the problem.  The next version of hashed-storage will
likely introduce a notion of "packs" in which the many often very small
files that Darcs keeps track of will be concatenated into more
substantial "packs" that compress better and reduce the ill effects of
latency.  My hope is that we will be able to complete the packs work by
Darcs 2.5.

There's a lot more progress to be made: smarter patch representations,
tuning for large patches, file-to-patch caching for long histories.
And that's just performance!  For more details about our performance
work, please have a look at

  http://tinyurl.com/darcs-performance2

If you could do anything to help, benchmark, profile, anything at all,
please let us know :-)

The fight continues.

Thank you!
----------
Petr and Ganesh deserve a huge round of applause.  Petr, thanks for
thinking up this work, getting it done and pushing it through. Ganesh,
thanks for an extremely thorough and thoughtful review.  The two of you,
thanks for holding on, for tenacious cooperation in the face of
adversity.

Thanks also to the wider Darcs community for all your support,
comments and patch reviews.

I'm looking forward to seeing you at the upcoming Darcs hacking sprint.
The sprint will take place in Vienna, Austria on the weekend of 14-15
November.  Everybody, especially Darcs and Haskell newbies, is welcome
to join in.  Details on http://wiki.darcs.net/Sprints/2009-11

And if I may take a paragraph to mention this, Darcs needs your support.
Every little bit counts: you can send patches, review patches, tweak
documentation, profile, benchmark or submit bug reports.  Barring that,
you could also make a contribution to our travel fund via the Software
Freedom Conservancy.  See http://darcs.net/donations.html for details.

Thanks everybody and enjoy!

Eric

Changes in detail
-----------------
- Darcs now uses an "index" file to compute diffs between the working
  directory and the pristine cache.  This avoids the problem of
  timestamps going out of sync when you have multiple local branches,
  which used to cause a huge and needless slowdown.

- Hashed storage is more efficient in general.  Even if you
  already have perfect timestamps, the new optimisations should make
  Darcs faster.

- The new 'darcs optimize --pristine' reduces spurious mismatches
  on directories.

- Darcs no longer requires a one second sleep after applying patches.
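
For the curious, here is a rough sketch of the general idea behind
index-based diffing.  It is not the actual hashed-storage code (that
is written in Haskell), and the names build_index and changed_files
are made up for illustration.  The assumption is that the index
remembers each file's size, modification time and content hash, so
that a file is only re-read when its size or timestamp no longer
matches:

```python
import hashlib
import os


def file_hash(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def walk_files(root):
    """Yield (relative path, full path) for every file under root."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            yield os.path.relpath(full, root), full


def build_index(root):
    """Map each relative path to (size, mtime in ns, content hash)."""
    index = {}
    for rel, full in walk_files(root):
        st = os.stat(full)
        index[rel] = (st.st_size, st.st_mtime_ns, file_hash(full))
    return index


def changed_files(root, index):
    """List paths that were modified, added or removed since indexing."""
    changed = []
    seen = set()
    for rel, full in walk_files(root):
        seen.add(rel)
        entry = index.get(rel)
        if entry is None:
            changed.append(rel)  # new file, not in the index
            continue
        size, mtime_ns, digest = entry
        st = os.stat(full)
        if (st.st_size, st.st_mtime_ns) == (size, mtime_ns):
            continue  # cheap path: metadata matches, skip reading the file
        if file_hash(full) != digest:
            changed.append(rel)  # contents really did change
    # Anything in the index that we did not see on disk was removed.
    changed.extend(rel for rel in index if rel not in seen)
    return sorted(changed)
```

In this sketch a stale timestamp only costs one extra hash of that
file; it never produces a wrong answer.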

-- 
Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
PGP Key ID: 08AC04F9