[darcs-users] Binary data (& XML)
Jan Scheffczyk
jan.scheffczyk at gmx.net
Wed Jun 18 05:58:07 UTC 2003
Hi all,
IMHO there is definitely a need to handle binary data, especially in
conjunction with XML.
E.g., OpenOffice stores its documents as zipped XML files.
Some folks at M$ also seem to like the XML idea ;-)
> I haven't gotten around to this largely because I haven't had need of it,
> but also because I haven't decided the best way to deal with binary files.
> For example, I could treat a tar.gz archive as a directory, which would
> provide version control of the files within the archive (assuming they are
> text).
Yes, that would definitely help in the office context.
But I'm afraid we need another patch to handle XML files correctly.
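To make the archive-as-directory idea a bit more concrete, here is a minimal
sketch in Python (not darcs code; the helper names archive_as_tree,
changed_members and make_zip are made up for illustration). It assumes the
archive is a plain zip, as with OpenOffice documents, and only reports which
members differ between two versions:

```python
import io
import zipfile


def archive_as_tree(data: bytes) -> dict:
    """Map each file path inside a zip archive to its uncompressed bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {
            info.filename: zf.read(info)
            for info in zf.infolist()
            if not info.is_dir()
        }


def changed_members(old: bytes, new: bytes) -> list:
    """Return the sorted paths that were added, removed, or modified."""
    a, b = archive_as_tree(old), archive_as_tree(new)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))


def make_zip(files: dict) -> bytes:
    """Hypothetical helper: build an in-memory zip from {path: text}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for path, text in files.items():
            zf.writestr(path, text)
    return buf.getvalue()
```

Each changed member could then be versioned like an ordinary file, which is
where a proper XML-aware patch type would come in.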
Recently I came across the following proposal:
@inproceedings{585073,
author = {Raymond K. Wong and Nicole Lam},
title = {Managing and querying multi-version XML data with update logging},
booktitle = {Proceedings of the 2002 ACM symposium on Document engineering},
year = {2002},
isbn = {1-58113-594-7},
pages = {74--81},
location = {McLean, Virginia, USA},
doi = {http://doi.acm.org/10.1145/585058.585073},
publisher = {ACM Press},
}
Implementing their XML deltas as patches should be possible in Haskell, making
use of XML parsers like HaXml, XmlToolbox, or HXML.
In sum, adding support for XML and compressed data would open the road to
handling office documents, for which there is a huge market.
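As a rough illustration of what an XML-aware patch could look like, here is a
small Python sketch. It is NOT the update-logging scheme from the paper above,
just one invertible edit operation that I made up (SetText), addressed by an
ElementTree path. The class and field names are assumptions for illustration:

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class SetText:
    """Hypothetical XML patch: change the text of one element."""
    path: str  # ElementTree path relative to the root, e.g. "body/p[2]"
    old: str   # text expected before the patch
    new: str   # text after the patch

    def apply(self, xml: str) -> str:
        root = ET.fromstring(xml)
        node = root.find(self.path)
        # Refuse to apply if the document is not in the expected state.
        if node is None or (node.text or "") != self.old:
            raise ValueError("patch does not apply at " + self.path)
        node.text = self.new
        return ET.tostring(root, encoding="unicode")

    def invert(self) -> "SetText":
        # Swapping old and new gives the patch that undoes this one.
        return SetText(self.path, self.new, self.old)
```

Because the patch addresses an element rather than a line, two edits to
different elements on the same line of the serialized file would not
conflict, which is the main attraction over plain text diffs.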
> A more normal (and more general) solution would be to introduce binary
> deltas, which would still be a bit of a pain, and much less entertaining.
> It also has the disadvantage that you lose a lot of the benefits of version
> control, since you can't merge binary patches that don't understand their
> content.
Huh, binary patches seem complex to me and I see no real benefit.
Maybe we should simply add a "dummy binary patch" saying
"replace all content in <oldBinFile> by all content in <newBinFile>"
This would correspond to CVS, which IMHO has no binary diffs and stores
complete(!) files instead.
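A minimal sketch of such a "dummy binary patch", in Python rather than
Haskell and with made-up names (ReplaceBinary, and a tree modelled as a dict
from path to bytes). I am assuming the patch should be invertible, so it
carries both the old and the new content:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplaceBinary:
    """Hypothetical patch: replace the whole content of one binary file."""
    path: str
    old: bytes  # complete content before the patch
    new: bytes  # complete content after the patch

    def apply(self, tree: dict) -> dict:
        # The patch only applies if the file currently holds the old content.
        if tree.get(self.path) != self.old:
            raise ValueError("unexpected content in " + self.path)
        result = dict(tree)  # leave the input tree untouched
        result[self.path] = self.new
        return result

    def invert(self) -> "ReplaceBinary":
        # Undoing the patch just swaps old and new.
        return ReplaceBinary(self.path, self.new, self.old)
```

The price is obvious: every revision stores two complete copies of the file,
and two such patches to the same file can never be merged.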
> Another interesting type of patch I've considered is an image change patch
> for image files (I'd have to find a good image reading library) that
> change in content but not in size. In that case you could perform rather
> interesting merges of changes, but while it might be fun to code, I'm not
> sure how useful it would actually be.
Fortunately, some image formats are plain text, e.g., SVG.
Cheers,
Jan