[darcs-users] Hashed-storage & darcs 2.3 (Was: Re: 2.3 release schedule)

Petr Rockai me at mornfall.net
Wed May 27 00:30:36 UTC 2009


Hi,

Eric Kow <kowey at darcs.net> writes:
> On Tue, May 26, 2009 at 12:19:59 +0200, Petr Rockai wrote:
>> There's my hashed-storage work. I am almost inclined to include the
>> "whatsnew-only" part of it in the release, with the provision that it can be
>> easily backed out after beta 1 if anything breaks.
>> 
>>     1202 darcs whatsnew significantly slower on a hashed repository (cache on
>>          network drive) (Petr)
>
> For what it's worth, my opinion (as usual) is that we should take our
> time.  I'd be happier to see something more complete and better tested
> in 7 months than something partial and highly experimental in 1 month.
> The idea would be that we could unveil a more fully hashed-storage-ised
> darcs in one swoop.
The problem is that waiting doesn't necessarily translate to "better tested";
quite the contrary. If we release darcs with whatsnew based on the index (which
cannot corrupt a repository in any way), we will get wide testing coverage from
real-world users, avoiding most of the risks associated with a release of
index-based record, even one 7 months later. If whatsnew misbehaves, people
will file bugs about it and we can fix it -- but if no-one uses the code for
another 7 months, no-one will notice the bugs anyway, and much more code (and
more critical code) will be affected by that time.

Moreover, annoying bugs in whatsnew can be fixed in point releases in the
summer (and I'll have a lot more time in summer for darcs than around 2.4, when
it will be long past SoC). So the compromise I propose is this:

For 2.3, make darcs create and maintain the index, but only ever use it for
answering the "whatsnew" query. This is safe (record keeps using the current
diffing code) and gives us testing coverage that is otherwise unattainable
(rolling out the code to most of our userbase). We also still get the same 7
months ourselves to perfect the index code before using it in any critical
code path.

If we want to be extra paranoid, it's not a problem to add an option to
whatsnew that flips between old and new code, so that people can say
--disable-index (in their defaults file, even) if they run into issues with
indexing.
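
To make the switch concrete, here is a minimal sketch in Python (not darcs's
Haskell, and not the actual hashed-storage code). Everything in it is an
assumption made for illustration: the function names, the in-memory dict used
as the index, the use of size and mtime to spot stale entries, and SHA-256 as
the hash. It also assumes every tracked file still exists in the working copy.
The only point is that both code paths answer the same question, and that a
flag selects between them.

```python
import hashlib
import os


def file_hash(path):
    """Hash a file's contents (SHA-256 chosen only for this sketch)."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def changes_full_scan(recorded, working_dir):
    """Old path: read and hash every tracked file on every call."""
    return [name for name in sorted(recorded)
            if file_hash(os.path.join(working_dir, name)) != recorded[name]]


def changes_with_index(recorded, index, working_dir):
    """New path: reuse a cached hash while size and mtime are unchanged."""
    changed = []
    for name in sorted(recorded):
        path = os.path.join(working_dir, name)
        info = os.stat(path)
        stamp = (info.st_size, info.st_mtime_ns)
        cached = index.get(name)
        if cached is None or cached[0] != stamp:
            # Stale or missing entry: rehash and refresh the index.
            cached = (stamp, file_hash(path))
            index[name] = cached
        if cached[1] != recorded[name]:
            changed.append(name)
    return changed


def whatsnew(recorded, index, working_dir, disable_index=False):
    """Pick the code path; disable_index mirrors the proposed --disable-index."""
    if disable_index:
        return changes_full_scan(recorded, working_dir)
    return changes_with_index(recorded, index, working_dir)
```

Here `recorded` maps each tracked file name to the hash of its recorded
contents. Because the index is only a cache, a wrong entry can at worst give a
wrong whatsnew answer; nothing in the repository is written by either path.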

> Another issue to consider is what if something changes in
> hashed-storage?  I wouldn't want us to have to manage a compatibility
> issue between darcs 2.3 and 2.x (or perhaps you've anticipated the need
> for a versioning scheme in the index?)
Forward compatibility shouldn't be an issue either. We can simply name the file
index_0 and, when an incompatible change in format happens, just increase the
number to index_1. This is much more robust than inline versioning metadata, as
no parsing of anything is involved: you just look for the file with the right
version number. As for hashed-storage, we can just say we want hashed-storage =
0.3 or something like that. I also don't expect to change the index format
(unless we add unexpected major new features, which we, by definition, don't
expect to happen *g*). In other words, it shouldn't be a problem to freeze the
index format in two weeks' time (I really just need to check it for any
outstanding endianness issues, and for that whitespace-in-paths bug, which is
probably somewhere other than the index code anyway).
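
The file-name versioning scheme can be sketched as follows. This is an
illustrative Python sketch only: the constant, the function names, and the
assumption that the index file sits directly under _darcs are all
hypothetical. A given darcs build looks only for the one file name it
understands and never opens or parses an index written in another format.

```python
import os

# Hypothetical: the single index format version this build understands.
INDEX_VERSION = 0


def index_path(repo_dir, version=INDEX_VERSION):
    """Build the versioned index file name, e.g. <repo>/_darcs/index_0."""
    return os.path.join(repo_dir, "_darcs", "index_%d" % version)


def find_usable_index(repo_dir, version=INDEX_VERSION):
    """Return the index path if a file of the wanted version exists, else None.

    Files named for other versions are never opened, so an incompatible
    format cannot be misread; the caller just rebuilds its own index.
    """
    path = index_path(repo_dir, version)
    return path if os.path.isfile(path) else None
```

With this layout, a repository containing only index_1 simply looks unindexed
to a build that wants index_0, and the two files can coexist side by side.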

> Anyway, I won't oppose this provided the patches make it through review
> and the changes are (as I understand them) completely backward
> compatible (old darcs just ignores the index that we would generate).
> Gently, gently, is all I'm saying.
Yes, older darcs will ignore the index just fine. There will likely be an
incompatible format change later in development, when a new pristine format is
designed and implemented, but this is not yet an issue.

As for review, that's something I can hardly influence. All I can promise is
that the patches to the commands themselves will be trivial, and that the new
code will be relatively isolated, not sprawling changes across darcs. In other
words, I'll just be adding new bits (probably first to Darcs.Gorsvet) and then
slowly converting commands to use those new bits. When we get to the point
that the pre-existing darcs functionality is no longer needed, we can remove
that code and migrate the Darcs.Gorsvet counterpart to a more appropriate
location. This should separate the concerns and (hopefully) make review easier.
It also makes it possible to work on the new code without compromising any
external functionality of darcs, since we can keep the (trivial) patches
flipping the commands out of the "production" tree.

Yours,
   Petr.

-- 
Peter Rockai | me()mornfall!net | prockai()redhat!com
 http://blog.mornfall.net | http://web.mornfall.net

"In My Egotistical Opinion, most people's C programs should be
 indented six feet downward and covered with dirt."
     -- Blair P. Houghton on the subject of C program indentation

