[darcs-users] Some progress on hashed-storage.

Petr Rockai me at mornfall.net
Mon Feb 2 23:09:49 UTC 2009


Eric Kow <kowey at darcs.net> writes:
> On Sun, Feb 01, 2009 at 00:13:20 +0100, Petr Rockai wrote:
>> I'm starting to think it might be worth getting at least partial support for
>> this into darcs 2.3. (I'm wondering if Kowey will require a sunset procedure
>> for SlurpDirectory though, if we really take this route...)
> Yep! If I may clarify my position, to avoid future doubt:
I'm wondering: does this mean yes, you will require a sunset procedure?

> The two forces I'm trying to balance are that
>  (A) Real people depend on darcs.  People including us as users
>      keep their crown jewels in darcs and therefore, are rightfully
>      sensitive about darcs breaking.  What a scary position!  The surest
>      way to avoid one kind of breakage is not to change anything.

>  (B) Darcs *has* to change.  Like any piece of software, darcs has
>      its bugs, and there's no fixing bugs without changing darcs.
>      But in the big picture, we want to change lots of things
>      beyond fixing bugs: we want to make darcs fast fast fast; we
>      want better conflict marking; we want darcs 3 with hopefully
>      a smoother transition.  In the bigger picture still, we want to
>      change darcs to make it more sustainable, shaping our code
>      and our community in such a way to ensure that we can still
>      hack on darcs and keep our day jobs.  This is why I consider
>      work like what you're doing with hashed-storage to be very
>      important.  We need to keep spinning off things that aren't
>      really part of what makes darcs darcs, one for great modularity
>      and two so that we the darcs team can focus our energy on
>      hacking darcs.

> Hopefully my insistence on sunset procedures makes a little more sense
> in context of both (A) and (B).  This is not obstructionism in the name
> of cautious conservatism (that's where A comes in).  Rather, it is a
> means of giving ourselves /permission/ to change darcs, permission in
> our own eyes as responsible software engineers and in our users' eyes as
> owners of crown jewels.
[Somewhat bitter and sarcastic reply deleted. Glad that I did.]

Okay, now with a slightly cooler head. It's been a little frustrating working on
darcs. I understand all the concerns, and I will probably repeat myself, but,
let's try to make a point to a wider audience.

I don't believe that keeping old things around helps new things. I am not very
happy about our autoconf->cabal transition. It's sort of messed up and in
retrospect, it was a bad idea to do it the way we did (and I know I am more
than a little responsible for this). We should have either swallowed the
proverbial red pill and gone all cabal, no regrets, or stayed with autoconf. I
have been doing the RM-ing and it wasn't very pleasant dealing with two
complementary sets of bugs and annoyances, one from each system. The double
versioning (which, to make things worse, coincided on the final release, making
the tarballs indistinguishable by name) didn't help either.

The other thing we had a problem with was the zlib transition that we sort of
reversed now, due to the CRC issue. I believe, again in retrospect, that it
would have been better to immediately remove the internal bindings. Of course,
that would have left us with a broken darcs, but also with an incentive to fix
it, properly and quickly. As things were (and are), that incentive was missing
and the result is that there's no fix to this day.

Of course, both outcomes were hard to predict and it's always easy to say
we should have done things differently. But these are at least tangible
arguments against lengthy sunset procedures. The other argument, also related
to Eric's other mail, is that I believe our efforts could be better spent
(i.e. fixing the issues with new solutions, instead of trying to keep both old
and new working simultaneously).

Where I think caution is very well deserved is in those cases where irreparable
(or very hard to repair) corruption of precious data can happen. Neither
autoconf nor zlib is one of those cases. Especially bad are the cases where such
corruption might go unnoticed for a while. I believe these bugs fall into two
categories:

- Patch manipulation code. We currently aren't touching this as far as I can
  tell.
- Pristine cache accuracy with regard to the actual pristine state (i.e. things
  that check would report as an inconsistent repository). If this happens, bad
  patches get recorded and if people make branches of such repositories with
  "get", these will propagate unnoticed.

These are where we *really* need to be wary. The rest are generally nuisances
when they go wrong. If you get an unreadable repository that can be repaired
without data loss, that's not the end of the world. This is the case for
CRC-corrupt compressed patches. It sure won't help the image of darcs as a
reliable system, but it's not nearly as devastating as losing an entire project
history due to a botched commute or a corrupt patch buried beneath hundreds of
dependent patches.

Moreover, I maintain that for the above two critical cases, the best we can do
is cover our bases with automated testing. No amount of review can prevent such
bugs from happening. And a good testsuite can be more efficient *and* cheaper
in developer time. So, to be constructive (as opposed to merely critical), I'd
propose a different solution to the problems you posed.

Instead of keeping things around in case "something breaks", we should ask new
code to come with test coverage, and maybe correctness reasoning. I am not
asking for proofs of correctness, because nobody is going to supply them. But I
do believe that a good testsuite is crucial for our success. Whenever manpower
is in short supply, automated testing can be of great value, since it lets
people worry less about introducing bugs, and frees them from superfluous
manual testing and re-testing of their code. Of course, bugs will happen, but
if we are forced to live on our own dogfood, and on the bleeding edge of it,
even, I don't think bugs will take a long time to get fixed. We should be
forced to fix our own bugs, instead of giving ourselves easy ways to avoid them
by compiling with different options.

I think that's what I wanted to say.


PS: It'd be great if we could get those mmap patches in. Maybe it's time to get
a little adventurous. We have 5 or 6 months to fix breakage. No-one can blame
us if the *unstable* version of darcs is a little broken here and there. And,
if things spin out of control, we are using an RCS for a reason. Let's go for a
ride, what is there to lose? I won't hesitate to start over from branch-2.2 if
after two months we find ourselves in a dead end. Let's make an omelette or
two. Just keep me motivated and the patches will keep coming. Just give me a
deal, please! I'm begging you...

Peter Rockai | me()mornfall!net | prockai()redhat!com
 http://blog.mornfall.net | http://web.mornfall.net

"In My Egotistical Opinion, most people's C programs should be
 indented six feet downward and covered with dirt."
     -- Blair P. Houghton on the subject of C program indentation
