[darcs-users] Language aware darcs

Ian Lynagh igloo at earth.li
Fri Jan 14 01:21:07 UTC 2005


On Thu, Jan 13, 2005 at 06:47:01PM -0500, Michael Conrad wrote:
> 
> What if we had 'parse' patches that told darcs how to break the text of a
> file?  For instance, assume that a file is one indivisible chunk of data,
> like a binary patch.  Then, darcs runs into this patch:
> 
> parse ./filename split (' ')
> 
> Darcs then goes through and breaks the file on any occurrence of a space
> character, retaining the space as a token.

This is similar to an idea I had. I started off with wanting darcs to
transparently handle gzipped files (I have a large XML file I want
accessible over the web; my editor and {browser and/or server} can
transparently handle it being compressed, so both for consistency and
disk space reasons it's a pain that darcs can't). While this could be
hardcoded it would be nicer to say

<create foo.xml as normal>
darcs store --to=gzip --from="gzip -d" --as=foo.xml.gz foo.xml
darcs record
<darcs does gzip < foo.xml > foo.xml.gz and makes a note somewhere, e.g.
 _darcs/meta/foo.xml contains this info. It could be the result of calling
 show on a lookup table of type [(String, String)], to make it extensible>
Future edits of foo.xml are recorded by first running gzip -d on the old
and new versions and then continuing as normal.
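
To make the scheme above concrete, here is a minimal sketch in Python. It
assumes the gzip module stands in for the --to and --from commands; the
function names (to_stored, from_stored, record_change) are made up for
illustration and are not anything darcs provides.

```python
import difflib
import gzip
from typing import List


def to_stored(data: bytes) -> bytes:
    """Hypothetical stand-in for the --to command (gzip)."""
    return gzip.compress(data)


def from_stored(blob: bytes) -> bytes:
    """Hypothetical stand-in for the --from command (gzip -d)."""
    return gzip.decompress(blob)


def record_change(old_blob: bytes, new_blob: bytes) -> List[str]:
    """Diff two stored blobs by their decoded contents, not their raw bytes."""
    old_lines = from_stored(old_blob).decode("utf-8").splitlines(keepends=True)
    new_lines = from_stored(new_blob).decode("utf-8").splitlines(keepends=True)
    return list(
        difflib.unified_diff(old_lines, new_lines, "old/foo.xml", "new/foo.xml")
    )
```

The point of the sketch is only that the diff is taken on the decoded view,
so a one-line edit to the XML shows up as a one-line change even though the
compressed bytes may differ almost everywhere.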

In this case the from command gives you the real file contents, but it
could equally well give you a different view of it, e.g. a tokenised
view with tokens and whitespace on alternating lines or somesuch.
It's not going to make very pretty patches for normal editing, though,
so maybe you would also want to be able to specify it as a one off in a
more lightweight manner.
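
As a rough illustration of such a tokenised view, the Python sketch below
puts tokens and whitespace runs on alternating lines and can reverse the
transformation. The names to_token_view and from_token_view are hypothetical,
and escaping the whitespace lines with Python's unicode_escape codec is just
one assumed way of keeping each run on a single line.

```python
import re


def to_token_view(text: str) -> str:
    """Return a view with tokens on even lines and whitespace runs on odd lines."""
    # Splitting on a captured group keeps the separators, so the parts
    # alternate: token, whitespace, token, ... (edge tokens may be empty).
    parts = re.split(r"(\s+)", text)
    lines = []
    for i, part in enumerate(parts):
        if i % 2 == 1:
            # Escape the whitespace run so newlines and tabs stay on one line.
            part = part.encode("unicode_escape").decode("ascii")
        lines.append(part)
    return "\n".join(lines)


def from_token_view(view: str) -> str:
    """Invert to_token_view, recovering the original text exactly."""
    out = []
    for i, line in enumerate(view.split("\n")):
        if i % 2 == 1:
            line = line.encode("ascii").decode("unicode_escape")
        out.append(line)
    return "".join(out)
```

A line-based diff of two such views then works at token granularity, at the
cost of patches that are not very readable, as noted above.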

The main problem with this scheme is that you need the commands to work
everywhere there is a repo. Perhaps special cases are a saner way to go,
short of going the whole hog with a DSL.


Also, I think ideas involving parsing languages would require a lot of
work to implement properly. Unless I am mistaken there are issues
whenever you fix a bug in your language code, or extend it with new
language features, where the valid commutations with other patches can
change. Also, sometimes you /do/ have multiple languages in one file,
e.g. {LaTeX and Haskell in .lhs}, {$scripting_language and HTML}.
Finally, some common languages are a pain to deal with; e.g., to lex
TeX correctly you need to execute it, and once preprocessors get thrown
in things become even uglier.


Thanks
Ian




