[darcs-users] [patch374] the state of the adventure

Fri Sep 3 07:19:13 UTC 2010

Hi,

Max Battcher <me at worldmaker.net> writes:
>> 5) the tests pass, although I had to remove one --xml
>>
>> For point 5, I don't think we should really retain annotate --xml. My guess is
>> that a simple regular language would be much better for both us and darcs-using
>> tools. At least Lele (tracdarcs) agrees. The proposed format (to be
>> implemented) is
>>
>> <patch-hash>  | line of text
>> <patch-hash>  | another line
>> ...
>>
>> which is much easier to parse than the XML and also avoids the validity issues
>> (since we currently don't have code that'd enable us to generate actual valid
>> XML).
>
> Hmm... I'm not sure removing --xml is a good idea in the long-term. No matter
> how easy to parse the output is on day one, it isn't guaranteed to stay that
> way and eventually you get into compatibility fights between those that wish to
> keep the output parse-able (particularly with older tools) and those that want
> "prettier" output for humans. That is never a good place to end up.

I am not sure I see your point. The above is a dedicated
machine-readable format. The human readable variant is different. (See
the start of this thread or http://pastebin.dqd.cz/hSYd/ for example.)
Also, the format split (machine/human) should make it easy enough to
evolve the 'human' format without compromising compatibility.

> Honestly, I think the best course of action would be to find the appropriate
> haskell library to do XML output correctly. However, I'd be up for discussing
> the possibility of another markup format in its stead. For instance,
> --json-output might be a good compromise that can be easier to produce valid
> output than XML.

JSON is a CFL. What's so wrong with regular languages? People are going
to use regexps to parse the output *anyway*, so why not make it actually
the right thing to do? Pulling out a complex CFL parser for what is
essentially (ab)* is just a source of pain for everyone.
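
For illustration, here is a minimal sketch of what consuming the
proposed format could look like in Haskell. It assumes the format ends
up exactly as quoted above (hash, whitespace, '|', one space, then the
line) and that a hash never contains '|'. The format is not implemented
yet, and the names parseLine and parseAnnotate are made up for this
example.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | One annotated line: (patch hash, line content).
type Annotated = (String, String)

-- | Parse a single line of the *proposed* annotate output.
-- Assumption: everything before the first '|' is the hash plus padding.
parseLine :: String -> Maybe Annotated
parseLine l =
  case break (== '|') l of
    (lhs, '|' : rest) ->
      let hash = dropWhileEnd isSpace lhs
      in if null hash then Nothing else Just (hash, dropOneSpace rest)
    _ -> Nothing  -- no separator at all: not a valid line

-- | Drop the single space that follows the '|' separator, if present.
dropOneSpace :: String -> String
dropOneSpace (' ' : xs) = xs
dropOneSpace xs = xs

-- | Parse the whole output; Nothing if any line is malformed.
parseAnnotate :: String -> Maybe [Annotated]
parseAnnotate = mapM parseLine . lines
```

Only the first '|' on a line is treated as the separator, so a '|'
inside the annotated text survives untouched.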

How many times have you wished you could extract patch hashes from darcs
changes easily? How many times have you actually used a real parser to
get them? One example for all:

    hashes <- grep "hash='" <$> lines <$> darcs [ "changes", "--xml", f ]
    hash <- case hashes of
      [] -> fail $ "Bad file to show contents of: " ++ show f
      _ -> return $ extract (last $ take n hashes)
    return (f, hash)
    (...)
  where extract x | null x = error "Bad line in changes --xml..."
                  | "hash='" `isPrefixOf` x = take 61 (drop 6 x)
                  | otherwise = extract (tail x)

(That's from darcs-benchmark. I admit to writing it, and I honestly
don't know how to get that info in less code. Even though the JSON
parser from Hackage is fairly simple, using it would still likely take
more code. If the output were actually regular, the code wouldn't be a
hack, there would be less of it *and* it would be faster.)

Yours,
   Petr.