[darcs-devel] darcs patch: Include validators in slurpies. (and 1 more)

David Roundy droundy at abridgegame.org
Sun Oct 30 03:55:59 PST 2005


On Sat, Oct 29, 2005 at 09:24:28PM +0200, Juliusz Chroboczek wrote:
> Now I'm planning to define a new on-disk patch type, say Import, that
> is equivalent to an addfile patch followed with a hunk.  The import
> patch will just contain a byte count followed by the raw file data (no
> ``+'' signs at the beginning of lines), and a hash.
> 
> The net effect of this is that pulling a large import between two
> repositories with support for import patches will never need to
> linesPS the import.

Actually, this sounds a lot like the next-generation hunk patches that Ian
and I have talked about.  The realization we had was that for ordinary hunk
patches we can store the "before" and "after" portions without the "+" and
"-" before each line.  If we do this, and also store in the Hunk a (Maybe
PackedString), we can avoid linesPS when reading hunk patches.  This would
be an alternate on-disk format for ordinary hunk patches, and would be
controlled by the RepoFormat.  The format would be something like

hunk -OLDLINES/OLDBYTES +NEWLINES/NEWBYTES
(the raw old data)
changed to
(the raw new data)
end hunk

where the "changed to" and "end hunk" are added for human readability, but
don't affect the parsing.  OLDLINES would be the number of lines in the
"old data", and OLDBYTES would be the number of bytes included, which is
needed in order to parse this patch (since otherwise we wouldn't know when
the old data ends).
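
As a rough illustration of why the byte counts matter, here is a small
Python sketch (not darcs code) of writing and reading one such hunk.  The
function names, the exact placement of newlines around the two markers, and
counting lines as newline characters are all assumptions made for the
example; the format above doesn't pin them down.

```python
import re

# Header layout taken from the proposal: hunk -OLDLINES/OLDBYTES +NEWLINES/NEWBYTES
HEADER = re.compile(rb"hunk -(\d+)/(\d+) \+(\d+)/(\d+)\n")
MID = b"changed to\n"   # readability marker, assumed to sit right after the old data
END = b"end hunk\n"     # readability marker, assumed to sit right after the new data


def encode_hunk(old: bytes, new: bytes) -> bytes:
    """Serialize one hunk with raw (unprefixed) old and new data."""
    header = b"hunk -%d/%d +%d/%d\n" % (
        old.count(b"\n"), len(old), new.count(b"\n"), len(new))
    return header + old + MID + new + END


def _expect(buf: bytes, pos: int, marker: bytes) -> int:
    """Check that a marker sits at pos and return the offset just past it."""
    if not buf.startswith(marker, pos):
        raise ValueError(f"expected {marker!r} at offset {pos}")
    return pos + len(marker)


def decode_hunk(buf: bytes, pos: int = 0):
    """Return (old, new, next_pos), slicing by byte count instead of scanning lines."""
    m = HEADER.match(buf, pos)
    if m is None:
        raise ValueError(f"expected hunk header at offset {pos}")
    _old_lines, old_bytes, _new_lines, new_bytes = map(int, m.groups())
    pos = m.end()
    old = buf[pos:pos + old_bytes]
    pos = _expect(buf, pos + old_bytes, MID)
    new = buf[pos:pos + new_bytes]
    pos = _expect(buf, pos + new_bytes, END)
    return old, new, pos
```

Because the reader slices by OLDBYTES and NEWBYTES, the data may itself
contain a line reading "changed to" or "end hunk" without confusing the
parser, which is why the markers can be purely cosmetic.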

I think this patch format would render your import patch idea unnecessary.
I imagine we'd still use the "+/-" format for human-readable display, but
being able to skip over the contents should be a *huge* performance gain,
particularly for operations like annotate where we need to parse the entire
repository, but don't care about most of the patches.
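
To make the "skip over the contents" point concrete, here is a hedged
Python sketch (again, not darcs code) that walks a run of such hunks and
looks only at the headers.  The marker strings and their newline placement
are the same assumptions as in the format description, and hunk_sizes is a
made-up name.

```python
import re

HEADER = re.compile(rb"hunk -(\d+)/(\d+) \+(\d+)/(\d+)\n")
MID = b"changed to\n"   # assumed to follow the old data directly
END = b"end hunk\n"     # assumed to follow the new data directly


def hunk_sizes(buf: bytes):
    """Yield (old_lines, new_lines) per hunk without reading any hunk body."""
    pos = 0
    while pos < len(buf):
        m = HEADER.match(buf, pos)
        if m is None:
            raise ValueError(f"expected hunk header at offset {pos}")
        old_lines, old_bytes, new_lines, new_bytes = map(int, m.groups())
        # Jump past both bodies and both markers using only the byte counts.
        pos = m.end() + old_bytes + len(MID) + new_bytes + len(END)
        yield old_lines, new_lines
```

The work per hunk is one header match plus some arithmetic, regardless of
how large the old and new data are.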

As you might imagine, we would also want to introduce a similar raw binary
patch type, which would render darcs patches to binary files binary
themselves, but I don't think that's a serious issue, when compared with
the performance gain--plus this would remain an optional format.

> > Will the presence of this validator allow us to eliminate the GitSlurpy?
> 
> Not immediately -- GitSlurpies have this distinction between pure and
> dirty slurpies, and trees are ordered in a very peculiar manner.

Ah, I forgot about that.  Bummer.

> > This patch makes me think that perhaps it would be nice to switch
> > Slurpy to use record accessors
> 
> I'm willing to do that, but that's a change that will require careful
> coordination with both you and Ian, since otherwise we'll generate a
> lot of conflicts.

Right now Ian's not doing much active development on darcs because of his
thesis, so it's a good time to get something like this done, as
darcs-unstable isn't in a great deal of flux at the moment.  Of course,
you'd still want to post patches for review early and often, but I don't
think there's a great danger of conflicts at the moment.
-- 
David Roundy
http://www.darcs.net



