[darcs-users] darcs patch: Refactor actual_boring_file_filter. (and 2 more)

Trent W. Buck trentbuck at gmail.com
Mon Jan 19 09:27:26 UTC 2009


On Mon, Jan 19, 2009 at 08:59:38AM +0000, Eric Kow wrote:
> Refactor actual_boring_file_filter.
> -----------------------------------
> > +-- | From a list of paths, filter out any that are within @_darcs@ or
> > +-- match a boring regexp.
> >  actual_boring_file_filter :: [Regex] -> [FilePath] -> [FilePath]
> > hunk ./src/Darcs/Repository/Prefs.lhs 289
> > -actual_boring_file_filter regexps fs =
> > -    filter (abf (not.is_darcsdir) regexps . normalize) fs
> > -    where
> > -    abf fi (r:rs) = abf (\f -> fi f && isNothing (matchRegex r f)) rs
> > -    abf fi [] = fi
> 
> > +actual_boring_file_filter regexps files = filter (not . boring) files
> > +    where boring file = is_darcsdir file ||
> > +                        any (\regexp -> isJust $ matchRegex regexp file) regexps
> 
> No more normalize?  Why not?

Zoiks!  It was removed accidentally.
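
For the record, a minimal sketch of the refactored filter with the
normalize step put back.  It only needs base, so the three helpers are
stand-ins, not the real darcs code: normalize is assumed to strip a
leading "./", isDarcsdir is assumed to test for the _darcs prefix, and
matchesBoring uses a plain suffix test in place of
isJust . matchRegex.

```haskell
import Data.List (isPrefixOf, isSuffixOf)

-- Stand-in for darcs' normalize (assumption: it strips leading "./").
normalize :: FilePath -> FilePath
normalize ('.':'/':rest) = normalize rest
normalize path           = path

-- Stand-in for darcs' is_darcsdir (assumption: a prefix test on _darcs).
isDarcsdir :: FilePath -> Bool
isDarcsdir path = path == "_darcs" || "_darcs/" `isPrefixOf` path

-- Stand-in for (isJust . matchRegex regexp): a "pattern" is just a suffix.
matchesBoring :: String -> FilePath -> Bool
matchesBoring suffix path = suffix `isSuffixOf` path

-- As in the original code, the predicate sees the normalized path,
-- but the path returned to the caller is left untouched.
boringFileFilter :: [String] -> [FilePath] -> [FilePath]
boringFileFilter patterns files = filter (not . boring . normalize) files
  where boring file = isDarcsdir file ||
                      any (\pat -> matchesBoring pat file) patterns
```

Without the normalize step, a path such as "./_darcs/inventory" would
slip past the isDarcsdir stand-in, which is the kind of regression the
accidental removal risks.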

> Also, you can eta-reduce your version if you want:
> 
>   actual_boring_file_filter regexps = filter (not . boring)

I'm no longer a fan of eta-reduce.  I tried point-free form (in the
guise of Factor, a Forth-like language) and discovered that it makes
my brain hurt when trying to trace execution.
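
To make the trade-off concrete, here is the same filter written three
ways.  The name dropBoring and its variants are made up for
illustration; only the shape of the definitions matters.

```haskell
-- Pointful: every argument is named.
dropBoring :: (FilePath -> Bool) -> [FilePath] -> [FilePath]
dropBoring boring files = filter (not . boring) files

-- Eta-reduced: the trailing 'files' argument is dropped on both sides.
dropBoring' :: (FilePath -> Bool) -> [FilePath] -> [FilePath]
dropBoring' boring = filter (not . boring)

-- Fully point-free: no argument is named at all.
dropBoring'' :: (FilePath -> Bool) -> [FilePath] -> [FilePath]
dropBoring'' = filter . (not .)
```

All three behave identically; the last one is the style that gets hard
to trace by eye.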

> Refactor darcs_binaries.
> ------------------------
> > Trent W. Buck <trentbuck at gmail.com>**20090117135906
> >  Ignore-this: 76174e648ec72ace6f26e6372a4e816
> >  
> >  Instead of creating a very long list of simple regexps, this now
> >  creates two regexps of the form \.(a|b|...|z)$, the latter being
> >  uppercase.  I have also combined some extension variants (e.g. .jpe?g
> >  instead of two entries .jpg and .jpeg) and sorted the extension list.
> >  
> >  I've elected NOT to use Emacs' regexp-opt to build a faster regexp,
> >  because that would make it very hard for end users to find and remove
> >  an extension from the default list.  I think merging .jpe?g is OK.
> 
> But would the very long list of simpler regexps be easier for human
> beings to work with?

IMO no, because they scroll offscreen.
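
A rough sketch of how the two combined regexps described in the patch
could be assembled from a sorted extension list.  The names
(extensions, extensionRegexp, binaryRegexps) and the three sample
entries are hypothetical, not the actual code or the actual default
list.

```haskell
import Data.Char (toUpper)
import Data.List (intercalate)

-- Hypothetical sample; the real default list is much longer.
extensions :: [String]
extensions = ["gif", "jpe?g", "png"]

-- Build a pattern of the form \.(a|b|...|z)$ from extension patterns.
extensionRegexp :: [String] -> String
extensionRegexp exts = "\\.(" ++ intercalate "|" exts ++ ")$"

-- The lowercase pattern followed by its uppercase twin.  Uppercasing
-- the whole pattern is only safe while entries contain no escapes
-- such as \d, which holds for plain extensions like these.
binaryRegexps :: [String] -> [String]
binaryRegexps exts =
    [ extensionRegexp exts
    , extensionRegexp (map (map toUpper) exts)
    ]
```

With the sample list this yields \.(gif|jpe?g|png)$ and
\.(GIF|JPE?G|PNG)$, so removing an extension still means deleting one
short alternative from each line.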

If people favour the old approach, I at least advocate

    \.(png|PNG)$

instead of

    \.png$
    \.PNG$

> (i.e. you look into the binaries file and instantly understand
> how to add something by copying and pasting... notably by inserting a
> whole new line, not by modifying a pre-existing one)

Hopefully the extended commentary makes it easier to understand how to
work with the file (and indeed, what it does).

> Also, I wonder if there is a way within the regexp to request
> case-insensitive matching

Yes, but it's extraordinarily ugly:

    \.[pP][nN][gG]$
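
If that form were ever wanted, it could at least be generated rather
than typed by hand.  A small sketch, assuming the input is a literal
extension and not an arbitrary regexp; caseInsensitive is a made-up
helper name.

```haskell
import Data.Char (isAlpha, toLower, toUpper)

-- Expand each letter into a two-character class, e.g. 'p' -> "[pP]".
-- Only meant for literal text: escapes and existing character
-- classes in the input would be mangled.
caseInsensitive :: String -> String
caseInsensitive = concatMap expand
  where
    expand c
      | isAlpha c = ['[', toLower c, toUpper c, ']']
      | otherwise = [c]
```

Wrapping the result as "\\." ++ caseInsensitive "png" ++ "$" gives the
pattern shown above.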

To add something like Perl's /\.png$/i, where the 'i' signifies
insensitivity, would require a change to the on-disk format of the
_darcs/prefs/binaries file.  Unless you want to change the *semantics*
of that file, such that it ALWAYS matches case-insensitively?

Under what circumstances would we want a case-sensitive regexp to mark
paths as binary (or as boring)?  I guess you wouldn't expect things
like ./src/rcs/ to be ignored just because ./RCS/ is a metadata
directory.

