[darcs-users] Escaping of hunks and file names
alex at byzantine.no
Mon Nov 8 16:03:33 UTC 2004
David Roundy wrote:
>>I know next to nothing about Unix terminal emulation, so forgive me if
>>this is the expected behaviour. I hadn't noticed the colourization
> No, this isn't obvious. Oddly enough, it seems that perl does the same
> thing. I don't know what the haskell standard library "isTerminal"
> function checks, but apparently when these languages call external
> programs, they somehow are able to trick haskell into thinking it's in a
> terminal. :(
Given the lack of general Unix support for Haskell's "isTerminal", I see
two options here:
1) Figure out how to fix the terminal check. Clearly other programs ("ls
--color=auto", for example) do this successfully, apparently by using
isatty(fd). Is it easy to call arbitrary C library functions from
Haskell? Can you get at the output stream's file descriptor?
2) Add something like a --non-terminal option to all of Darcs' commands,
allowing one to force the desired behaviour.
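
The two options can be combined. Below is a minimal sketch, written in Python rather than Haskell purely for illustration. It assumes an isatty(fd)-style check on the output stream's file descriptor; the function name should_decorate and the force_non_terminal flag (standing in for a --non-terminal option) are hypothetical and not part of Darcs.

```python
import os
import sys


def should_decorate(stream=None, force_non_terminal=False):
    """Return True only if output should get terminal-specific treatment.

    force_non_terminal is a hypothetical stand-in for a --non-terminal
    command-line option (option 2); it overrides the isatty check (option 1).
    """
    if force_non_terminal:
        return False
    if stream is None:
        stream = sys.stdout
    try:
        fd = stream.fileno()
    except (AttributeError, OSError, ValueError):
        # Streams without a real file descriptor (e.g. in-memory buffers)
        # cannot be terminals.
        return False
    return os.isatty(fd)
```

With this shape the explicit flag is only needed in cases where the automatic check itself is fooled, which seems to be the situation described above.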
> The hex escaped are how things show up on terminals--it's an attempt to
> keep from messing up the terminal configuration by displaying escape
> characters (except for color codes that are intentional). On a terminal,
> the hex escaped characters always show up blue...
> If darcs isn't in a terminal, it never should escape.
(Btw, possible bug: "darcs annotate" does not do the to-terminal hex
>>Outputting file names as UTF-8 is fine. However, why is Darcs escaping
>>the UTF-8, and in such a non-standard (\yy\) format?
> Only whitespace (and backslashes) are escaped in that format, and the
> stupid format is because that is what I came up with when I was coding this
> ages back. Technically, only spaces and newlines actually need to be
> escaped, since they would mess up darcs' parsing of patches--tabs and
> carriage returns aren't used in darcs patch format as delimiters.
> Basically, I didn't put much thought into it, since at the time I was
> thinking it wouldn't often come into play, since I consider white space in
> filenames a bad idea, and backslashes in filenames also don't greatly
> enhance the portability of your code.
Would you be willing, at this stage, to move to a more Unixy escaping
syntax? The principle of least surprise applies here. Once people all
over start writing scripts around Darcs, I think this is going to be one
of Darcs' "little warts" that people complain about.
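
As a rough idea of what a "more Unixy" syntax could look like, here is a small Python sketch of C-style backslash escaping for file names. The escape table is entirely hypothetical (it is not the format Darcs uses, and the names escape_name and unescape_name are made up for this example); the only property it aims for is that spaces and newlines never appear raw and that the encoding round-trips.

```python
# Hypothetical escape table: an assumption for illustration, not Darcs' format.
_ESCAPES = {"\\": "\\\\", " ": "\\x20", "\n": "\\n", "\t": "\\t", "\r": "\\r"}
_UNESCAPES = {"\\": "\\", "n": "\n", "t": "\t", "r": "\r"}


def escape_name(name):
    """Replace backslashes and whitespace with C-style escapes."""
    return "".join(_ESCAPES.get(ch, ch) for ch in name)


def unescape_name(text):
    """Invert escape_name; raise ValueError on malformed input."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(text):
            raise ValueError("dangling backslash")
        code = text[i + 1]
        if code == "x":
            # \xHH: exactly two hex digits follow the 'x'.
            hex_digits = text[i + 2:i + 4]
            if len(hex_digits) != 2:
                raise ValueError("truncated \\x escape")
            out.append(chr(int(hex_digits, 16)))
            i += 4
        elif code in _UNESCAPES:
            out.append(_UNESCAPES[code])
            i += 2
        else:
            raise ValueError("unknown escape: \\" + code)
    return "".join(out)
```

Because the backslash itself is escaped first, a literal "\x20" in a file name stays distinguishable from an escaped space.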
>>However, XML handles unescaped Unicode (or UTF-8) just fine, as long as
>>you declare the appropriate encoding at the beginning, eg. <?xml
> We can't really declare the encoding, since we don't know what the encoding
> of the user's data is.
The default encoding in XML is UTF-8. So whether or not you declare it,
you must still adhere to a specific encoding.
For file names, enforcing UTF-8 -- and therefore pretty much outputting
them verbatim -- might not be such a bad idea.
For actual file data, the best way to do this, I think, is to escape
everything above 127 as character references, e.g. &#8364; for the euro
sign. I think you can safely output everything below that verbatim. But
you can't output all bytes as-is, because certain combinations can be
construed as UTF-8 multi-byte sequences even when they aren't.
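
A minimal Python sketch of that escaping follows. It assumes the data has already been decoded into text, which sidesteps the unknown-encoding problem raised above, and the function name xml_ascii_text is made up for this example. It relies on Python's standard "xmlcharrefreplace" error handler, which emits decimal character references.

```python
from xml.sax.saxutils import escape


def xml_ascii_text(text):
    """Return text as pure-ASCII XML character data.

    Markup characters (&, <, >) are escaped first; every character above
    127 then becomes a numeric character reference such as &#8364;.
    """
    return escape(text).encode("ascii", "xmlcharrefreplace").decode("ascii")
```

Note that this does nothing about control characters below 32. XML 1.0 permits only tab, line feed and carriage return in that range, so file data containing other control characters would still need separate handling.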
>>Speaking of output, Darcs also needs improvement when it comes to
>>detecting error conditions. For example, "darcs add a-non-existent-file"
>>will return with exit code 0, as will "darcs add a-file-already-added".
>>A script could perceive the lack of messages as meaning success and
>>everything else meaning error, but it's not exactly robust. One of the
>>things my code needs to do is determine whether a file is recorded in
> The problem here is that often one will run
> darcs add *
> trusting darcs to add only the relevant files. It's hard to see in this
> case exactly what the error code should indicate. I suppose if *no* files
> are added, we could consider that failure. It would be sufficient if you
> were adding a single file, but mightn't that definition of failure cause
> trouble when adding several files at once? I guess the answer may be that
> careful scripts should add files one at a time? Or maybe we should fail if
> any of the files couldn't be added, and figure that when run interactively
> the error code will be ignored and the user will just read the message.
The problem really is that 1) you have a composite command that
continues regardless of sub-failures, and 2) you want to capture the
aggregate status in a single numeric result value. The only completely
sane solution is to output the individual results, perhaps CVS-style:
$ darcs add *
or whatever, and then encapsulate the outcome in a three-state value: 0
(everything added, perhaps boring files ignored), 1 (some added, perhaps
some failed) or 2 (all failed).
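
The three-state value could be computed along these lines. This is a Python sketch of one possible reading of the scheme, not anything Darcs implements: the function name aggregate_status is hypothetical, each entry in results is assumed to be True when a file was added (or deliberately skipped as boring) and False when adding it failed, and an empty list is treated as success.

```python
def aggregate_status(results):
    """Collapse per-file outcomes into a single exit status.

    0: nothing failed, 1: some files failed, 2: every file failed.
    """
    failures = sum(1 for ok in results if not ok)
    if failures == 0:
        return 0
    if failures == len(results):
        return 2
    return 1
```

A script adding a single file only ever sees 0 or 2, while "darcs add *" style invocations can tell partial success apart from total failure.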