[darcs-users] machine-readable formats

Petr Rockai me at mornfall.net
Fri Sep 3 10:35:43 UTC 2010


Hi,

Lele Gaifax <lele at nautilus.homeip.net> writes:
> I second that, XML is especially overkill when the structure is simple
> enough that most "plain-text markup" would be up to the task. JSON is
> simpler, but still not simple enough. I'd welcome (a strict subset of)
> something like YAML (see http://yaml.org/ for an example) instead, that
> offers a nice readable stream, based on common glyphs and indentation
> to render arbitrary structures.

I had a look at YAML and it seems nice; I think we can settle on a
regular subset of YAML quite easily. For changes --yaml, the output
could look something like:

- name: Remove redundant set -ev from tentative_revert.sh.
  author: Petr Rockai <me at mornfall.net>
  date: 2010-09-03 00:13:27 +0
  hash: 20100903001327-fb03a-045b1923d4b1b1b432d3e3b03840101f4f9891e3
  salt: 3927beb21fed7484d681854a7d7df2c5
  comment: |
    Some fancy comment that
    spans multiple lines.
- name: Update changes_with_move for differences in annotate.
  author: ...
  (etc.)

For readability, it would be nice if keys with empty values were omitted. I don't
think this poses any significant problems with parsing. Also, you don't
need a full YAML parser to process this, but if you already have one, it
should parse the output just fine.

The code to list just the patch hashes would then be:

  grep "^  hash:" | cut -c9-
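
As a rough sketch of what "no full YAML parser needed" could mean for a
consumer, the Python below reads the layout shown above line by line. It
assumes exactly that layout: records start with "- ", keys are indented
by two spaces, and a "|" block is indented by four spaces and contains
no blank lines. The name parse_changes is made up for illustration and
is not part of darcs.

```python
def parse_changes(text):
    """Parse the regular YAML subset sketched above into a list of dicts."""
    records = []
    current = None      # record currently being filled in
    block_key = None    # key of an open "|" block scalar, if any
    for raw in text.splitlines():
        if block_key is not None:
            if raw.startswith("    "):
                # Still inside the block: collect the line without its indent.
                current[block_key].append(raw[4:])
                continue
            # Block ended: join the collected lines into one string.
            current[block_key] = "\n".join(current[block_key])
            block_key = None
        if raw.startswith("- "):
            # A new list item starts a new patch record.
            current = {}
            records.append(current)
        elif not raw.startswith("  ") or current is None:
            continue    # blank or unrecognised line: skip it
        # Split at the first colon only, so values may contain colons.
        key, sep, value = raw[2:].partition(":")
        if not sep:
            continue    # not a "key: value" line, e.g. "(etc.)"
        value = value.strip()
        if value == "|":
            current[key] = []
            block_key = key
        else:
            current[key] = value
    if block_key is not None:
        # Input ended while a block scalar was still open.
        current[block_key] = "\n".join(current[block_key])
    return records
```

With this, listing the hashes is [r["hash"] for r in parse_changes(out)],
where out holds the text printed by the proposed changes --yaml.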

I am, however, a bit puzzled as to what to do about annotate, since YAML
doesn't seem to be very well suited for that. One option would be

- <hash>: line
- <hash>: line
...

which makes it a list of singleton hash->line maps... which may be about
as convenient as it gets with the data model at hand. For now, I'll
implement this and try to sell it as annotate --yaml in adventure. :)
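
To give an idea of how a script might consume such a list of singleton
maps, here is a minimal Python sketch. It assumes each entry sits on one
line as "- <hash>: line", that hashes never contain a colon, and that
the line text is emitted unquoted; parse_annotate is a hypothetical
name, not something darcs provides.

```python
def parse_annotate(text):
    """Return (patch_hash, line) pairs from '- <hash>: line' entries."""
    pairs = []
    for raw in text.splitlines():
        if not raw.startswith("- "):
            continue  # e.g. a blank line or a trailing "..."
        # Split at the first colon; the annotated line may contain more.
        patch_hash, sep, line = raw[2:].partition(":")
        if not sep:
            continue
        # Drop the single space after the colon, keep the rest verbatim
        # so that the indentation of the annotated file survives.
        if line.startswith(" "):
            line = line[1:]
        pairs.append((patch_hash, line))
    return pairs
```

Keeping the result as an ordered list of pairs, rather than one big
map, matters here: the same hash normally shows up on many lines.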

We could then adopt the same convention for changes --yaml -s:

- name: Update changes_with_move for differences in annotate.
  [snip some keys]
  changes:
  - M: [ tests/changes_with_move.sh, -10, 3 ]
  - A: ...

> IMO, the main problem is the encoding, even with darcs' current XML
> output.

Well, for annotate in the above form, it's not much of a problem (we
can't know the encoding of the annotated files anyway). The only thing
we still need (and which I haven't done yet) is to escape non-printable
characters in the output.

For changes, the metadata could probably be escaped like it is in the
terminal output (those <U+NNNN> bits), just using YAML escapes. I think
with Reinier's changes, we do a reasonable job of letting valid UTF-8
through, while catching any non-UTF garbage.
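
To make the "YAML escapes" idea concrete, here is a small Python sketch
of one way it could work: printable characters pass through unchanged
and everything else is written with the escapes YAML defines inside
double-quoted scalars (\n, \t, \xNN, \uNNNN, \UNNNNNNNN). It assumes the
input is already a decoded string, so catching invalid UTF-8 bytes is
out of scope, and yaml_quote is a made-up name rather than anything in
darcs.

```python
def yaml_quote(s):
    """Render s as a YAML double-quoted scalar, escaping non-printables."""
    # Characters with a short named escape in YAML double-quoted style.
    named = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\t": "\\t", "\r": "\\r"}
    out = []
    for ch in s:
        cp = ord(ch)
        if ch in named:
            out.append(named[ch])
        elif ch.isprintable():
            out.append(ch)                  # printable text passes through
        elif cp <= 0xFF:
            out.append("\\x%02X" % cp)      # 8-bit escape
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)      # 16-bit escape
        else:
            out.append("\\U%08X" % cp)      # 32-bit escape
    return '"' + "".join(out) + '"'
```

A full YAML parser should read these escapes back as the original
characters, while a grep-and-cut consumer still sees one value per line.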

Yours,
   Petr.

