[darcs-users] Handling pristine format transition(s) (Was: Re: [patch156] ...)

Petr Rockai me at mornfall.net
Fri Feb 26 19:44:27 UTC 2010


Hi,

one more thing that I ran into: supporting transparent write operations
on the old-style pristine.hashed poses considerable problems. What I
mean is that while reading an old-style pristine is not a problem, it is
a problem to modify it without also converting all of it to the
prefix-less form. The problem is the assumption (in hashed-storage) that
there is a 1:1 mapping (up to hash collisions) between subtrees and
hashes. Of course, if there are multiple valid representations of the
same directory, this assumption falls apart and there are a zillion
different hashes that could legally be assigned to each directory. In
darcs 2.3 we did not really need to care, since only read-only
operations were done using Tree.
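
To make the broken assumption concrete, here is a minimal sketch in
Python. It is not the hashed-storage code: the "name ref" listing format,
the ten-digit size prefix and the helper names are all assumptions made
for illustration. It only shows that the same directory contents hash
differently depending on how the entries are written down.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def ref(content: bytes, old_style: bool) -> str:
    """Reference to a blob: a bare hash, or (old style) a size-prefixed hash.

    The ten-digit zero-padded size is an assumption of this sketch.
    """
    digest = sha256_hex(content)
    return f"{len(content):010d}-{digest}" if old_style else digest


def render_dir(entries: dict, old_style: bool) -> bytes:
    """Toy directory listing: one 'name ref' line per entry, sorted by name."""
    lines = [
        f"{name} {ref(content, old_style)}"
        for name, content in sorted(entries.items())
    ]
    return "\n".join(lines).encode()


# The same two files, rendered in both styles.
entries = {"README": b"hello\n", "Setup.hs": b"main = defaultMain\n"}
old_hash = sha256_hex(render_dir(entries, old_style=True))
new_hash = sha256_hex(render_dir(entries, old_style=False))

# Identical contents, two different directory hashes.
print(old_hash)
print(new_hash)
```

With n entries that may each be written either way, a single directory
already has 2^n textual forms in this toy model, hence the zillion hashes.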

In 2.4, the only code that uses the Tree code for actually writing into
pristine is still check/repair. We went through some regressions with
that, and that was a good thing, since the noslurps patchset makes the
use of the Tree interface pervasive for both reading and writing
pristine.

Now check/repair is a relatively simple business: you start from zero,
so there are no compatibility issues involved. But when modifying
existing Trees, which is what happens now, we can end up with
directories that mix old- and new-style entries and cause general
suckiness, i.e. the hashes of those directories don't match what
hashed-storage expects (since their text representation is not
"canonical"). There are some ways around that, but all of them involve a
lot of ugliness and break what is, in my opinion, a quite basic and
useful assumption about the content<->hash correspondence.
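
A rough sketch of what "mixed" means here, again in Python and again
under assumptions of mine: an old-style reference is taken to be ten
decimal digits, a dash, then the hash, and classify/canonical are
hypothetical helpers, not hashed-storage functions.

```python
import re

# Assumed shape of an old-style reference: ten digits, a dash, then the hash.
SIZE_PREFIX = re.compile(r"^\d{10}-")


def is_old_style(ref: str) -> bool:
    return SIZE_PREFIX.match(ref) is not None


def classify(refs: list) -> str:
    """Return 'empty', 'old', 'new' or 'hybrid' for a directory's entry refs."""
    kinds = {is_old_style(r) for r in refs}
    if not kinds:
        return "empty"
    if kinds == {True}:
        return "old"
    if kinds == {False}:
        return "new"
    return "hybrid"


def canonical(ref: str) -> str:
    """Drop the size prefix, giving the prefix-less form of a reference."""
    return SIZE_PREFIX.sub("", ref)


# Placeholder hashes, only their shape matters for this sketch.
h1, h2 = "ab" * 32, "cd" * 32
untouched = ["0000000006-" + h1, "0000000019-" + h2]
after_partial_write = [h1, "0000000019-" + h2]  # one entry rewritten new-style

print(classify(untouched))            # old
print(classify(after_partial_write))  # hybrid
print(classify([canonical(r) for r in after_partial_write]))  # new
```

The hybrid case is the one a partial write produces, and it is the one
whose hash matches neither the pure old-style nor the pure new-style
rendering of the directory.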

The best solution to this, in my opinion, even if it is still a
compromise, is to convert the pristine in one go, as opposed to letting
it gradually change over. This means there'll be an upfront price to
pay, and that older darcsen will take an immediate performance hit on
the given repository, as opposed to deteriorating gradually as more of
the pristine becomes newstyle.

This could basically be handled by checking whether the top-level hash
(pristine root pointer) has a size in it and, in case it has, triggering
optimize --pristine upon the first pristine-modifying operation with a
new darcs. No hybrid trees (ones that come with a newstyle toplevel hash
but still have oldstyle bits in pristine) should exist, unless people
used a noslurps darcs.
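
The check itself could look roughly like the following Python sketch. The
ten-digit size prefix, the function names and the convert callback (a
stand-in for whatever optimize --pristine does internally) are
assumptions for illustration, not darcs code.

```python
import re
from typing import Callable

# Assumed shape of a size-carrying root pointer: ten digits, a dash, the hash.
SIZE_PREFIX = re.compile(r"^\d{10}-")


def root_has_size(root_pointer: str) -> bool:
    return SIZE_PREFIX.match(root_pointer) is not None


def before_pristine_write(root_pointer: str,
                          convert: Callable[[str], str]) -> str:
    """Return the root pointer that is safe to modify.

    If the root still carries a size, convert the whole pristine first
    (in one go) and use the root pointer the conversion hands back.
    """
    if root_has_size(root_pointer):
        return convert(root_pointer)
    return root_pointer


# Stub conversion: records that it ran and returns a prefix-less root.
calls = []


def fake_convert(old_root: str) -> str:
    calls.append(old_root)
    return SIZE_PREFIX.sub("", old_root)


old_root = "0000000123-" + "ef" * 32
new_root = before_pristine_write(old_root, fake_convert)   # converts once
same_root = before_pristine_write(new_root, fake_convert)  # no-op afterwards
print(len(calls), new_root == same_root)
```

Since the conversion yields a size-less root, the check is naturally
idempotent: later write operations see a newstyle root and skip it.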

Yours,
   Petr.

PS: I'd possibly agree that having a "go back" command, like darcs
optimize --oldstyle-pristine or something like that, could be useful in
this context, so that people who want to go back to an older darcs with
their repositories still can, without taking a major performance hit.

