[darcs-users] on-disk formats of patches

Ganesh Sittampalam ganesh at earth.li
Sat Oct 16 13:47:43 UTC 2010


Hi,

There are some differences in the structure of the datatypes for v1 and 
v2 patches, and also in the way that they are stored on disk. I'm hoping 
to clean this up while still maintaining backwards compatibility.

The fundamental distinction is that v1 patches (Patch) are stored in 
repositories of type Repository Patch, which in turn means that a full 
recorded patch is of type Named Patch. This works because one of the 
constructors of Patch is ComP :: FL Patch -> Patch - i.e. Patch itself has 
the ability to store multiple primitive patches.

For v2 patches (RealPatch), the repository is of type Repository (FL 
RealPatch), and there is no ComP equivalent.
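
To make the difference concrete, here is a rough sketch of the two shapes. It is a simplification, not the actual darcs definitions: the real types carry context witnesses, FL is replaced by a plain Haskell list, and the names Prim, PP, Normal, RecordedV1, RecordedV2, nested and depth are assumptions made for illustration. Only ComP, Named, Patch and RealPatch come from the description above.

```haskell
-- Simplified stand-ins: real darcs types also carry context witnesses,
-- and FL is a witness-indexed list rather than a plain Haskell list.
newtype Prim = Prim String deriving (Eq, Show)

-- v1: the patch type itself can hold a sequence (ComP).
-- PP is an assumed name for the primitive-patch case.
data Patch
  = PP Prim
  | ComP [Patch]
  deriving (Eq, Show)

-- v2: no composite constructor. Normal is an assumed name.
newtype RealPatch = Normal Prim deriving (Eq, Show)

-- A recorded patch: metadata (reduced to a name here) plus contents.
data Named p = Named String p deriving (Eq, Show)

type RecordedV1 = Named Patch        -- from Repository Patch
type RecordedV2 = Named [RealPatch]  -- from Repository (FL RealPatch)

-- Because ComP holds Patches, nesting is representable in v1.
nested :: Patch
nested = ComP [PP (Prim "a"), ComP [PP (Prim "b")]]

-- Nesting depth: 0 for a primitive, 1 for a flat ComP, more if nested.
depth :: Patch -> Int
depth (PP _) = 0
depth (ComP ps) = 1 + maximum (0 : map depth ps)
```

In this model nothing in the types rules out a value like nested, whereas a list of RealPatch is flat by construction.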

One consequence of this distinction, which I suspect is accidental, is 
that the on-disk format looks different. FL p is just printed out as each 
p in sequence, with no separators. However, the ComP constructor of Patch 
is printed out with { } surrounding the sequence of inner patches.

Although in theory ComP could be nested, I've never seen this happen. So 
the end result is that the on-disk format for v1 is a sequence of patches 
surrounded by { } - unless there's a single primitive patch in which case 
{ } is omitted. For v2 the on-disk format is a sequence of patches with no 
{ }.
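
The three cases can be illustrated with a toy printer. This is only a sketch under simplifying assumptions: primitive patches are reduced to single-line strings, FL is a plain list, and fromPrims, showV1 and showFL are hypothetical names rather than darcs functions.

```haskell
-- Toy v1 patch: a primitive (one line of text here) or a ComP sequence.
data Patch = PP String | ComP [Patch]

-- Hypothetical helper modelling how a v1 recorded patch ends up shaped:
-- a lone primitive is stored bare, anything else is wrapped in ComP.
fromPrims :: [String] -> Patch
fromPrims [p] = PP p
fromPrims ps  = ComP (map PP ps)

-- v1 printing: ComP puts { } around its inner patches.
showV1 :: Patch -> String
showV1 (PP s)    = s
showV1 (ComP ps) = "{\n" ++ concatMap (\p -> showV1 p ++ "\n") ps ++ "}"

-- FL printing (used for v2): each element in sequence, no delimiters.
showFL :: (p -> String) -> [p] -> String
showFL sh = concatMap (\p -> sh p ++ "\n")
```

With these definitions, showV1 (fromPrims ["hunk a", "hunk b"]) gives "{\nhunk a\nhunk b\n}", showV1 (fromPrims ["hunk a"]) gives just "hunk a", and showFL id ["hunk a", "hunk b"] gives "hunk a\nhunk b\n".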

For pending, the printing/reading always goes via v1 patches, and thus the 
on-disk format is as for v1.

I would like to clean up the code to remove ComP and use Repository (FL 
Patch) instead. The rationale is to get consistency between v1 and v2 
patches, and to eliminate the possibility of representing nested lists, 
which I don't think is useful and just complicates the code. However, we 
still need to keep the differences in the handling of { } to maintain 
compatibility with older versions of darcs.

I guess the first decision is what the right on-disk format is. The 
downside of no { } at all is that FL (FL p) doesn't roundtrip because you 
can't tell where the boundaries between the inner lists are. (In fact, 
right now parsing FL (FL p) overflows the stack because the parser can't 
cope, but that's fixable.)
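
A small sketch of the roundtrip problem, again with plain lists standing in for FL and single-line strings standing in for primitive patches (showFL, showPrim, split and joined are illustrative names):

```haskell
-- Delimiter-free printing of a list: just concatenate the elements.
showFL :: (p -> String) -> [p] -> String
showFL sh = concatMap sh

-- A primitive patch is one line of text in this toy model.
showPrim :: String -> String
showPrim s = s ++ "\n"

-- Two different nested lists...
split, joined :: [[String]]
split  = [["x"], ["y"]]
joined = [["x", "y"]]

-- ...that print to exactly the same text, so no reader can recover
-- which of the two was written.
ambiguous :: Bool
ambiguous = showFL (showFL showPrim) split == showFL (showFL showPrim) joined
```

Here ambiguous evaluates to True even though split and joined are different values, which is why the inner boundaries cannot be reconstructed on reading.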

I also don't think the current behaviour of v1 patches, where a single 
patch is printed without { }, is particularly useful.

So I suggest that for v3 patches and beyond, we should have { } around all 
lists, and that therefore FL should be changed to print and read { }. For 
reading I think it's ok for FL to accept either format. For printing v2 
patches, it needs special casing to not print the { }. For printing v1 
patches (and pending), this will mean a behaviour change for how single 
primitive patches are printed, but I think that'll still be handled fine 
by older versions of darcs.
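
As a rough illustration of that proposal (not the code I have written for darcs), here is a toy printer and reader over lines of text. Primitive patches are single-line strings, FL is a plain list, and showBraced, showV2 and readFLEither are hypothetical names.

```haskell
-- Proposed default: always print { } around a list, even a singleton.
showBraced :: [String] -> String
showBraced ps = unlines (["{"] ++ ps ++ ["}"])

-- Special case for v2: keep printing without { } for compatibility.
showV2 :: [String] -> String
showV2 = unlines

-- Reader that accepts either format. Returns Nothing on unbalanced or
-- stray braces. Nested lists are deliberately not handled here.
readFLEither :: String -> Maybe [String]
readFLEither s = case lines s of
  ("{" : rest) | not (null rest), last rest == "}" -> checkPrims (init rest)
  ls -> checkPrims ls
  where
    checkPrims ps
      | all isPrim ps = Just ps
      | otherwise     = Nothing
    isPrim l = l `notElem` ["{", "}"]
```

Both readFLEither (showBraced ["a", "b"]) and readFLEither (showV2 ["a", "b"]) yield Just ["a", "b"], and a singleton printed with showBraced reads back the same way, which is the behaviour change for v1 and pending described above.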

Any thoughts/objections? I actually have most of this coded up already, 
though there are a few unrelated issues left to sort out before I submit, 
and also it would be nice to find some way of doing automated tests of 
backwards compatibility.

Cheers,

Ganesh

