[darcs-devel] native XML/OpenOffice Patches in Darcs

Jan Scheffczyk jan.scheffczyk at gmx.net
Fri Jul 29 12:01:21 PDT 2005


Hi folks,

We have been thinking about supporting native XML patches in Darcs.
Since OpenOffice documents are essentially XML, this should be a way 
to support native OpenOffice patches.
The idea behind this is to give applications a chance to analyze and query 
those native patches, e.g., for better merging.

Initially, we did some performance measurements that confirmed that 
for OpenOffice XML trees (wide and flat) XyDiff seems to be 
appropriate.

http://www-rocq.inria.fr/gemo/XyDiff/cdrom/www/xydiff/index-eng.htm

There will be a research paper at this year's DocEng in Bristol.

Then I grabbed a rather recent darcs copy and tried to implement 
native XML patches based on XyDiff.
This is what I did:

- Haskell Interface to XyDiff (patches are invertible)
- XML Patch Wrapper Haskell module to abstract from actual XML diff 
implementation
- native XML patching for XML files (based on regexes, similar to 
binary files)
- simple diffing/patching/inversion for native XML patches
- autoconf support for this very experimental feature (nothing should 
break if you do not --enable-xml)
- XML patches are fully abstract to Darcs
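
To make the wrapper idea a bit more concrete, here is a minimal sketch of 
what such an abstraction could look like. It is not the actual module: 
the names XmlDiffBackend, diffDocs, applyPatch, invertPatch, WholeDoc, 
toyBackend and roundTrips are all hypothetical, and the toy backend simply 
replaces the whole document instead of calling XyDiff. It only shows how 
Darcs could treat the patch type as opaque while still relying on 
invertibility.

```haskell
-- | Hypothetical interface an XML diff backend would have to provide.
-- Darcs-side code only sees the abstract 'patch' type.
data XmlDiffBackend doc patch = XmlDiffBackend
  { diffDocs    :: doc -> doc -> patch        -- patch leading from old to new
  , applyPatch  :: patch -> doc -> Maybe doc  -- Nothing if the patch does not fit
  , invertPatch :: patch -> patch             -- patch undoing the given one
  }

-- | Toy patch (an assumption, not XyDiff's format): old and new document.
data WholeDoc = WholeDoc String String deriving (Eq, Show)

-- | Toy backend standing in for a real XML diff implementation.
toyBackend :: XmlDiffBackend String WholeDoc
toyBackend = XmlDiffBackend
  { diffDocs    = WholeDoc
  , applyPatch  = \(WholeDoc old new) doc ->
      if doc == old then Just new else Nothing
  , invertPatch = \(WholeDoc old new) -> WholeDoc new old
  }

-- | Check that applying a patch and then its inverse restores the original.
roundTrips :: Eq doc => XmlDiffBackend doc patch -> doc -> doc -> Bool
roundTrips backend old new =
  forward == Just new && backward == Just old
  where
    patch    = diffDocs backend old new
    forward  = applyPatch backend patch old
    backward = forward >>= applyPatch backend (invertPatch backend patch)
```

Swapping in another diff tool would then only mean providing another 
value of this record type; the code using it would stay the same.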


Limitations and issues:

- XML files in _darcs/current are equivalent to those in the actual 
repo (but not necessarily equal!)
- XML patches are only applicable to XML files, no other patches 
should be applied to XML files
- XML files remain XML files forever (i.e., no darcs mv foo.xml 
foo.txt), although this is not enforced at the moment

So the following sequence can indeed fail:

hunk patch A
xml patch B
invert (xml patch B)
invert (hunk patch A) -- this is not necessarily applicable!

A solution could be to convert an XML file to some canonical form 
first and to create an appropriate hunk patch.
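
The failure and the proposed fix can be illustrated with a small toy 
model. Everything here is an assumption made for illustration: Hunk, 
applyHunk, invertHunk and canonical are hypothetical names, not Darcs 
code, and the "canonical form" merely trims whitespace and drops blank 
lines, which is far weaker than a real XML canonicalization (whitespace 
can be significant in XML).

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd, isPrefixOf)

-- | Toy line-based hunk: at a 0-based line offset, replace old lines by new ones.
data Hunk = Hunk Int [String] [String] deriving (Eq, Show)

-- | Apply a hunk; fails unless the old lines match exactly at the offset.
applyHunk :: Hunk -> [String] -> Maybe [String]
applyHunk (Hunk at old new) ls
  | at < 0 || at > length ls = Nothing
  | old `isPrefixOf` rest    = Just (pre ++ new ++ drop (length old) rest)
  | otherwise                = Nothing
  where
    (pre, rest) = splitAt at ls

invertHunk :: Hunk -> Hunk
invertHunk (Hunk at old new) = Hunk at new old

-- | Toy canonical form (an assumption): trim each line, drop blank lines.
canonical :: [String] -> [String]
canonical = filter (not . null) . map trim
  where
    trim = dropWhileEnd isSpace . dropWhile isSpace

original :: [String]
original = ["<doc>", "<p>old</p>", "</doc>"]

hunkA :: Hunk
hunkA = Hunk 1 ["<p>old</p>"] ["<p>new</p>"]

-- | State after "xml patch B; invert (xml patch B)": equivalent XML,
-- but re-serialized with different indentation.
reserialized :: [String]
reserialized = ["<doc>", "  <p>new</p>", "</doc>"]
```

In this model, applyHunk (invertHunk hunkA) reserialized fails because 
the lines no longer match exactly, whereas applying the same inverted 
hunk to canonical reserialized gives back the original file.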

- XyDiff reads the DTD of an XML file from the reference within the 
XML file in order to improve speed and to check the result of a patch 
application.
I am trying to find a way to provide the DTD to XyDiff separately ...

- XML patches neither commute nor coalesce at the moment.
Here we would need to go deeper into the internal patch format of 
XyDiff.
Since XML patches are essentially abstract FilePatches, this only 
affects patches to the same XML file.
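
As a rough sketch of what this means for commutation: with an opaque 
payload, two patches can be swapped trivially when they touch different 
files, and commutation simply fails when they touch the same file. The 
type and function below are simplified, hypothetical stand-ins and not 
the actual Darcs definitions.

```haskell
-- | A patch to a single file. The payload is opaque here, just as the
-- XML patches are abstract to Darcs. (Hypothetical type for illustration.)
data FilePatch = FilePatch
  { patchFile :: FilePath
  , payload   :: String
  } deriving (Eq, Show)

-- | Try to swap the order of two patches applied in sequence.
commute :: (FilePatch, FilePatch) -> Maybe (FilePatch, FilePatch)
commute (p1, p2)
  | patchFile p1 /= patchFile p2 = Just (p2, p1)  -- different files: swap unchanged
  | otherwise                    = Nothing        -- same file: needs diff internals
```

Doing better than Nothing in the same-file case is exactly where 
knowledge of the internal patch format would be required.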

That's it for the moment.

Any criticism/comments/suggestions/remarks/ideas/help are most welcome!
Would anyone like some patches in order to discuss more technical stuff?
Should some non-breaking (of course) patches go into darcs-unstable?

Cheers,
Jan



