[darcs-users] Regular Expression libraries and linker errors

Trent W. Buck trentbuck at gmail.com
Sat Oct 3 13:12:28 UTC 2009

Jason Dagit <dagit at codersbase.com> writes:

> 5) The default regexes that we provide (e.g., on darcs init) are not fully
> optimized and may not do what people expect in all cases.

What regexps are used by darcs init?

> Here is what I propose:
> a) We switch to regex-posix.

Fewer dependencies, so it suits me.

> b) We invest a small bit of time writing a function to optimize a list of
> simple regexes into one big but efficient regex.

The objection to this is that the resulting regex is unreadable, and
apparently people want to be able to edit _darcs/prefs/boring as well as
merely appending to it.

> Originally it looked like:
> \.foo$
> \.FOO$
> Specifically, our current list looks like:
> \.(foo|FOO)$

I made that change, because I was sick of the list being so damn long.

> I think we should transform that to:
> \.[fF][oO][oO]$
> I think that better captures the case-insensitive intent that we had.

I can't remember why this was kiboshed... possibly "readability".
Other than being fugly, I have no objection to it.
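
For what it's worth, the transformation itself is mechanical.  Here is a
rough sketch in Python (not darcs code; case_insensitive and boring_suffix
are names I made up for illustration), assuming the input is a plain
literal extension rather than an arbitrary regex:

```python
import re

# Characters that need a backslash outside a bracket expression.
_SPECIAL = set(".[]()*+?{}|^$\\")

def case_insensitive(literal):
    """Turn a literal string into a regex that matches it in any letter case."""
    parts = []
    for ch in literal:
        lower, upper = ch.lower(), ch.upper()
        if lower != upper and len(lower) == 1 and len(upper) == 1:
            # Cased letter: emit a two-character bracket expression.
            parts.append("[" + lower + upper + "]")
        elif ch in _SPECIAL:
            parts.append("\\" + ch)
        else:
            parts.append(ch)
    return "".join(parts)

def boring_suffix(extension):
    """Build a boring-style pattern for a file extension (hypothetical helper)."""
    return "\\." + case_insensitive(extension) + "$"

print(boring_suffix("foo"))  # prints \.[fF][oO][oO]$
```

Note this matches mixed case like .Foo as well, which \.(foo|FOO)$ does
not, so it is a slightly wider net than the current list.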

> I can also imagine other stopgap proposals like making a standalone
> command-line tool that can optimize the regexes and write them back out
> so people have a chance to review them.

Emacs can do this for OR'd literals, using regexp-opt.  Here's one I prepared earlier:

    ## From Darcs 1.0.9

I think it basically looks for common substrings, and converts them to
[abc] or (aa|bb|cc) as appropriate.
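
To make the idea concrete, here is a minimal sketch of that kind of
factoring in Python.  It is NOT what Emacs actually does internally; it
only merges common prefixes via a trie, and it assumes the inputs are
literal strings, not regexes.  The name optimize_literals is hypothetical.

```python
import re

_END = ""  # trie key marking "a word ends here"

def optimize_literals(words):
    """Combine literal strings into one regex by factoring common prefixes."""
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node[_END] = {}
    return _emit(trie)

def _emit(node):
    optional = _END in node  # some word stops at this node
    singles, longer = [], []
    for ch, child in sorted(node.items()):
        if ch == _END:
            continue
        rest = _emit(child)
        if rest == "" and ch.isalnum():
            singles.append(ch)  # final letters can share one [abc] class
        else:
            longer.append(re.escape(ch) + rest)
    alts = list(longer)
    if len(singles) == 1:
        alts.append(singles[0])
    elif singles:
        alts.append("[" + "".join(singles) + "]")
    if not alts:
        return ""
    if len(alts) == 1 and not optional:
        return alts[0]
    group = "(" + "|".join(alts) + ")"
    return group + "?" if optional else group

print(optimize_literals(["foo", "fob", "bar"]))  # prints (bar|fo[bo])
```

The output is a fair illustration of the readability objection: correct,
compact, and not something you would want to edit by hand.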

> But, having darcs optimize them on the fly (or adding that to the
> regex-base library) is nice because then they throw any old regex at
> darcs and it tries to clean it up before using it.
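
One way to read that: the file on disk stays one readable regex per
line, and the combining happens only in memory.  A sketch of the
simplest possible version, in Python rather than anything darcs ships,
assuming that blank lines and lines starting with # are skipped
(combine_boring is a made-up name, and no real optimization is done
here, just alternation):

```python
import re

def combine_boring(lines):
    """Join one-regex-per-line entries into a single alternation."""
    patterns = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue  # skip blanks and comments
        # Parenthesize each entry so its own | and anchors stay local.
        patterns.append("(" + line + ")")
    return "|".join(patterns)

boring = ["# object files", "\\.o$", "", "\\.hi$"]
combined = combine_boring(boring)
print(combined)  # prints (\.o$)|(\.hi$)
print(bool(re.search(combined, "Main.hi")))  # prints True
```

The extra parentheses are capturing groups, so any entry that uses
backreferences would need renumbering; that is one of the details a real
implementation would have to handle.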

