[darcs-devel] announcing darcs 2.0.0pre1, the first prerelease for darcs 2

David Roundy daveroundy at gmail.com
Wed Dec 12 16:43:06 UTC 2007


Thanks for your report!

On Dec 12, 2007 8:45 AM, Simon Marlow <simonmarhaskell at gmail.com> wrote:
> David Roundy wrote:
>
> > === Creating a repository in the darcs-2 format ===
> >
> > Converting an existing repository to the darcs-2 format is as easy as
> >
> > darcs convert oldrepository newrepository
>
> I did this for GHC's repository.  I left it running last night, and I'm not
> sure whether it completed successfully - I certainly have a repository, but
> it had a _darcs/lock file left in it.  It seems to have all the patches in
> _darcs/patches, and the last one is dated about 3.5 hours after I started
> the conversion.

Hmmm.  I'll have to try that again myself.  But after my computer
comes back up (it's in Oregon, and is either crashed or off due to a
power outage).

Running darcs check should tell you whether the conversion went fine.

> $ darcs2 query repo
>            Type: darcs
>          Format: hashed, darcs-2-experimental
>            Root: /64playpen/simonmar/ghc-darcs2
>        Pristine: HashedPristine
>           Cache: thisrepo:/64playpen/simonmar/ghc-darcs2
>     Num Patches: 17532
>
> A few quick performance tests.  The darcs2 repository is on a local filesystem:
>
> $ time darcs2 whatsnew -s
> No changes!
> 2.25s real   2.04s user   0.18s system   98% darcs2 w -s
>
> In a darcs1 GHC repository mounted over NFS:
>
> $ time darcs whatsnew -s
> No changes!
> 0.13s real   0.03s user   0.05s system   58% darcs w -s

The difference here is that I haven't implemented the time-stamp
synchronizing feature for hashed repositories.  I wasn't sure it was
still needed (and would be nice to drop, as it's a bit hackish), since
for the darcs repository whatsnew is pretty fast.  Will have to add it
to the TODO list.

It's also possible that you're getting hurt by the cost of checking
the sha1 hashes, which we currently do in a rather paranoid way (I
like being paranoid, except when it hurts).  If this is the case, we
could speed things up by using a faster sha1 hash function.  Right now
we use one written in Haskell, but it wouldn't be hard to bind to a
well-optimized implementation (openssl or something).

But I guess I've been running on local disks long enough that I've
forgotten the cost of opening up a file over nfs... I'd best go ahead
and make this change.  It's potentially a little painful, as
synchronizing the modification time of files in the pristine cache
doesn't interact well with hard linking between files in the pristine
caches of different repositories.  Which means either we live with a
performance cost to hard linking of pristine caches, or we store
modification times in the file contents of the pristine cache, so that
we could have multiple modification times per file.  :(
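
To make the hard-linking problem concrete, here is a minimal sketch in
Python (not darcs code; the file names are hypothetical stand-ins for
entries in two repositories' pristine caches). Hard-linked names share
one inode, so there is only one modification time between them:
setting it through one repository's path changes what the other
repository sees too.

```python
import os
import tempfile


def mtimes_after_touch(new_mtime: int) -> tuple[int, int]:
    """Hard-link one file under two names, set the mtime through the first
    name only, and return the mtime observed through each name."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-ins for the same pristine file in two repos.
        a = os.path.join(tmp, "repoA_pristine_file")
        b = os.path.join(tmp, "repoB_pristine_file")
        with open(a, "w") as f:
            f.write("same contents\n")
        os.link(a, b)  # both names now refer to a single inode
        # "Synchronize" the timestamp on behalf of repo A only.
        os.utime(a, (new_mtime, new_mtime))
        return int(os.stat(a).st_mtime), int(os.stat(b).st_mtime)


if __name__ == "__main__":
    print(mtimes_after_touch(1_000_000_000))
```

Both values come back identical, which is why per-repository
timestamps would have to live somewhere other than the file's own
metadata (for instance in the file contents, as suggested above).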

Another optimization (which could potentially gain us a simple factor
of two) would be to check the hashes of files in the working
directory against the known sha1 hash of the pristine cache.  This
would save us half the IO (in the normal case where most files are
unmodified), and would maintain the robustness of not depending on
file modification times (which has always been a rather fragile
optimization).
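
A rough sketch of that idea, again in Python rather than darcs's own
Haskell, and assuming the known pristine hashes are available as a
simple mapping from path to hex sha1 (the names sha1_of_file,
changed_files and pristine_hashes are made up for illustration). Only
the working copy is read; the pristine file itself is never opened.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Return the hex sha1 of a file, reading it in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(working_paths, pristine_hashes):
    """Return the working files whose sha1 differs from the recorded one.

    pristine_hashes is an assumed mapping {path: hex sha1 of pristine copy};
    a path missing from the mapping is reported as changed.
    """
    return [p for p in working_paths
            if sha1_of_file(p) != pristine_hashes.get(p)]
```

Python's hashlib is typically backed by an optimized native sha1, so
this sketch says nothing about the cost of a pure Haskell
implementation; it only shows where the IO saving comes from.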

> "darcs changes" seems to have a big performance regression:
>
> $ time darcs2 changes --last=10 >/dev/null
>
> I killed it after 3 minutes of CPU time and the process had grown to 1.4Gb.
>   darcs1 does this in 0.05 seconds using 2Mb.  Perhaps the repository is
> corrupted somehow?

Yikes! That's actually a very surprising bug.  I'd be interested in
hearing whether it still shows up if you run darcs2 optimize first.
Either way, of course, it's a serious bug, but that'd give a hint
where the trouble is.

> I've tarred up the repo and put it here:

Thanks! I'll take a look at it when I get a chance.

> Documentation nits
>
> The 'darcs show' documentation appears in two places, under "Seeing what
> you've done" and "Advanced examination of the repository".

That should be easy to fix (once I get myself a working copy of darcs'
repository...).

> The docs still say that two patches making the same change are considered
> to be in conflict.

:( I should fix that...

> I can't find any docs about using lazy patch downloading and the
> ~/.darcs/sources file.

This is in the Prefs section
(http://darcs.net/doc-unstable/node5.html#SECTION00510000000000000000),
but that's not really the best place for it.  There should definitely
be a section there, but more needs to be added to the get
documentation.

Thanks for taking the time to try this out!

David

