[darcs-users] on-disk formats of patches
Ganesh Sittampalam
ganesh at earth.li
Sat Oct 16 13:47:43 UTC 2010
Hi,
There are some differences in the structure of the datatypes for v1 and
v2 patches, and also in the way that they are stored on disk. I'm hoping
to clean this up while still maintaining backwards compatibility.
The fundamental distinction is that v1 patches (Patch) are stored in
repositories of type Repository Patch, which in turn means that a full
recorded patch is of type Named Patch. This works because one of the
constructors of Patch is ComP :: FL Patch -> Patch - i.e. Patch itself has
the ability to store multiple primitive patches.
For v2 patches (RealPatch), the repository is of type Repository (FL
RealPatch), and there is no ComP equivalent.
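As a rough illustration of the two shapes, here is a simplified sketch. It is not the actual darcs code: plain lists stand in for FL, the v2 type is reduced to a single constructor, and Prim, PatchV1, PatchV2, PrimV1, PrimV2, RecordedV1, RecordedV2 and flattenV1 are hypothetical names made up for this example. Treating a v2 recorded patch as Named over a list is an assumption by analogy with the v1 case.

```haskell
-- Simplified model: plain lists stand in for darcs's FL, and Prim is a
-- hypothetical stand-in for a primitive patch.
newtype Prim = Prim String deriving (Eq, Show)

-- v1 style: the patch type itself can hold a sequence via ComP, so
-- nesting (a ComP inside a ComP) is representable.
data PatchV1
  = PrimV1 Prim
  | ComP [PatchV1]
  deriving (Eq, Show)

-- v2 style (heavily reduced): no composite constructor.
newtype PatchV2 = PrimV2 Prim deriving (Eq, Show)

-- Named pairs patch contents with a name (metadata reduced to a String).
data Named p = Named String p deriving (Eq, Show)

type RecordedV1 = Named PatchV1    -- contents: one, possibly composite, patch
type RecordedV2 = Named [PatchV2]  -- contents: a flat list of patches (assumed)

-- Collapse any nesting of ComP into a flat list of primitives.
flattenV1 :: PatchV1 -> [Prim]
flattenV1 (PrimV1 p) = [p]
flattenV1 (ComP ps)  = concatMap flattenV1 ps
```

In this model only PatchV1 can express a list inside a list; the v2 shape keeps the list outside the patch type.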
One consequence of this distinction, which I suspect is accidental, is
that the on-disk format looks different. FL p is just printed out as each
p in sequence, with no separators. However the ComP constructor of Patch
is printed out with { } surrounding the sequence of inner patches.
Although in theory ComP could be nested, I've never seen this happen. So
the end result is that the on-disk format for v1 is a sequence of patches
surrounded by { } - unless there's a single primitive patch in which case
{ } is omitted. For v2 the on-disk format is a sequence of patches with no
{ }.
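The two printing behaviours can be sketched as follows. This is a toy model, not the darcs printer: each inner patch is assumed to be already rendered as one line of text, and showV1 and showV2 are hypothetical names.

```haskell
-- Each inner patch is assumed to be already rendered as one line of text.

-- v2 style: the patches in sequence, with no surrounding delimiters.
showV2 :: [String] -> [String]
showV2 ps = ps

-- v1 style: { } around the sequence, except that a single patch is
-- printed bare.
showV1 :: [String] -> [String]
showV1 [p] = [p]
showV1 ps  = ["{"] ++ ps ++ ["}"]
```

Here showV1 ["a", "b"] gives ["{", "a", "b", "}"], while showV1 ["a"] and showV2 ["a"] both give ["a"].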
For pending, the printing/reading always goes via v1 patches, and thus the
on-disk format is as for v1.
I would like to clean up the code to remove ComP and use Repository (FL
Patch) instead. The rationale is to get consistency between v1 and v2
patches, and to eliminate the possibility of representing nested lists,
which I don't think is useful and just complicates the code. However, we
still need to preserve the differences in the handling of { } to stay
compatible with older versions of darcs.
I guess the first decision is what the right on-disk format is. The
downside of no { } at all is that FL (FL p) doesn't roundtrip because you
can't tell where the boundaries between the inner lists are. (In fact,
right now parsing FL (FL p) overflows the stack because the parser can't cope,
but that's fixable.)
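To make the roundtrip problem concrete, here is a small sketch using lists of lines in place of real patches. The function names showNestedFlat and showNestedBraced are hypothetical.

```haskell
-- Print a list of lists with no delimiters at all: the inner
-- boundaries are lost.
showNestedFlat :: [[String]] -> [String]
showNestedFlat = concat

-- Print a list of lists with { } around each inner list: the
-- boundaries survive in the output.
showNestedBraced :: [[String]] -> [String]
showNestedBraced = concatMap (\inner -> ["{"] ++ inner ++ ["}"])
```

With no delimiters, [["a"], ["b"]] and [["a", "b"]] both print as ["a", "b"], so no reader can recover which one was written. With braces the two outputs differ.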
I also don't think the current behaviour of v1 patches, where a single
patch is printed without { }, is particularly useful.
So I suggest that for v3 patches and beyond, we should have { } around all
lists, and that therefore FL should be changed to print and read { }. For
reading I think it's ok for FL to accept either format. For printing v2
patches, it needs special casing to not print the { }. For printing v1
patches (and pending), this will mean a behaviour change for how single
primitive patches are printed, but I think that'll still be handled fine
by older versions of darcs.
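A minimal sketch of the proposed behaviour, again over lists of lines rather than the real darcs printer and parser. The names showBraced and readEitherFormat are hypothetical, and the sketch assumes that no rendered patch is itself the bare line "{" or "}".

```haskell
-- Proposed printing: always put { } around the list, even for a
-- single patch or an empty list.
showBraced :: [String] -> [String]
showBraced ps = ["{"] ++ ps ++ ["}"]

-- Lenient reading: accept either format by stripping one outer pair
-- of braces when present, and otherwise taking the lines as they are.
readEitherFormat :: [String] -> [String]
readEitherFormat ("{" : rest)
  | not (null rest), last rest == "}" = init rest
readEitherFormat ls = ls
```

Both readEitherFormat (showBraced ps) and readEitherFormat ps return ps under the stated assumption, which is the sense in which the reader accepts either format. Printing v2 patches would still need a separate path that omits the braces.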
Any thoughts/objections? I actually have most of this coded up already,
though there are a few unrelated issues left to sort out before I submit,
and also it would be nice to find some way of doing automated tests of
backwards compatibility.
Cheers,
Ganesh