[darcs-users] cgi script thoughts

David Roundy droundy at abridgegame.org
Sun Jul 4 13:45:13 UTC 2004

On Sun, Jul 04, 2004 at 02:43:13AM -0700, Will wrote:
> Hi Simon,
> Simon Michael <simon at joyful.com> writes:
> > Hello Will Glozer, all.. here are some notes from setting up the new
> > CGI script on my server.
> >
> > I moved cgi.conf into cgi/ and expanded cgi/README a little - patches
> > sent to David's repo.
> I suppose the cgi.conf is in the root dir because it is also used by
> the original darcs_cgi.lhs.

Yeah, but may as well move it, as it'll clean things up a bit (and I'm
still planning to some day remove darcs_cgi).

> > configure.ac (I think) sets sysconfdir to /usr/etc/darcs on my system
> > when it should be /etc/darcs. Not sure why that is.
> The version of darcsrv that I handed over didn't use the config file
> or have any autoconf config, so I'll defer to David on this one.

Hmmmm.  This is a problem.  The trouble is that the BSD hierarchy puts the
sysconf directory in /usr/local/etc or /usr/etc, while the LFS puts it
always in /etc.  I prefer the LFS, but don't want to alienate the BSD
folks.  Currently it's set so if you don't specify a --prefix, darcs
installs its conf files in /etc, and the rest in /usr/local.

I could change this... perhaps I could fall back to /etc if ${prefix}/etc
doesn't exist? That way as long as you don't have a /usr/etc (my Debian
doesn't), you'd get /etc as you want, but BSD people would get /usr/etc, as
they would want.
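
The fallback rule proposed above is small enough to sketch. This is only an illustration in Python, not the actual configure.ac logic (which would be shell/m4); the function name choose_sysconfdir and its fallback parameter are hypothetical.

```python
import os


def choose_sysconfdir(prefix, fallback="/etc"):
    """Pick the config directory: ${prefix}/etc if it exists, else the fallback.

    Hypothetical helper mirroring the rule discussed above; it is not
    taken from darcs' build system.
    """
    candidate = os.path.join(prefix, "etc")
    # Only use ${prefix}/etc when that directory is already present.
    return candidate if os.path.isdir(candidate) else fallback
```

With this rule, a system without /usr/etc would end up with /etc, while a system that already has /usr/etc would keep using it.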

> > The docs describe this as "darcsrv" and the script is named
> > "darcs.cgi". Is this optimal ? Should the filename be darcsrv ?
> >
> > The darcsrv link at the bottom of the UI - hmm, it was linking to the
> > darcsrv home page, which was inaccessible. Now it's linking to the
> > darcs home page on abridgegame for some reason.
> I named the project 'darcsrv' before I knew that David wanted to
> include it as part of darcs; it does seem nice to have a convenient
> name to refer to rather than "that darcs cgi script" =) Since it is
> part of darcs now I don't intend to maintain a product page.
> That said, I think 'darcs.cgi' is more aesthetically pleasing than
> 'darcsrv.cgi', but I don't have strong feelings on the matter.

Absolutely.  We could just call the script "darcs.cgi", perhaps,
which would clear up possible confusion?

> > I have two reservations about running this publicly:
> >
> > 1. Like the old cgi script, email addresses are served in the clear,
> >    exposing contributors to spammers. This is a problem. I guess we
> >    need to at least do a mailman-style "x at y.com" conversion ? Any
> >    ideas where in the script to do this, or should it be done in darcs
> >    itself ?
> It should be pretty easy to do this in the XSLT templates, but is it
> really effective?  I would expect the email harvesters to be able to
> parse many simple obfuscations and it would be an inconvenience to
> legitimate users.  Still, if there is a desire to do this as a default I
> will update the templates.

My general theory here is that you aren't required to use an email
address... perhaps that isn't made clear in the docs.  So I figure if you
don't want your address publicly known you could just use something else.
On the other hand, as Will says, if someone wants this obfuscation, I
have no objection either.
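
For reference, the mailman-style "x at y.com" conversion mentioned above could look roughly like this. Will suggests doing it in the XSLT templates; the Python below is only a stand-in to show the transformation, and the regular expression and function name are assumptions (the pattern is deliberately simple and does not cover every valid address).

```python
import re

# Simplified address pattern (an assumption, not a full RFC 5322 matcher).
EMAIL_RE = re.compile(r"([A-Za-z0-9._%+-]+)@([A-Za-z0-9.-]+\.[A-Za-z]{2,})")


def obfuscate_emails(text):
    """Rewrite user@host as "user at host" wherever an address appears."""
    return EMAIL_RE.sub(r"\1 at \2", text)
```

As noted in the thread, a harvester can undo this kind of rewrite easily, so it only raises the bar slightly.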

> > 2. Robot safety. The annotate links seem relatively expensive, taking
> >    several seconds. Also these pages form a large network of unique
> >    urls.. how many I'm not sure. This seems to add up to yet another
> >    way for robots (or a deliberate DDOS attack) to stress my web
> >    server. Any thoughts on this ?
> This is an important security consideration, everything but the file
> listings causes invocations of darcs which can be very expensive in
> terms of processor and memory use.  I would suggest using rlimit or
> whatever your OS's equivalent is to limit the resource consumption.
> I've also envisioned using something such as mod_cache to cache
> responses and avoid invocation of the CGI at all.
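
A rough sketch of the rlimit suggestion above, assuming a Unix host and using Python's standard resource module rather than anything in darcs.cgi itself. The names run_limited and _cap, and the default limit values, are hypothetical.

```python
import resource
import subprocess


def _cap(kind, wanted):
    """Set the soft limit for `kind`, clamped to the current hard limit."""
    _soft, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        # An unprivileged process cannot set a soft limit above the hard one.
        wanted = min(wanted, hard)
    resource.setrlimit(kind, (wanted, hard))


def run_limited(argv, cpu_seconds=10, address_space_bytes=512 * 1024 * 1024):
    """Run argv as a child process with CPU-time and memory limits applied."""

    def apply_limits():
        # Runs in the child just before exec, so only the child is limited.
        _cap(resource.RLIMIT_CPU, cpu_seconds)
        _cap(resource.RLIMIT_AS, address_space_bytes)

    return subprocess.run(argv, preexec_fn=apply_limits,
                          capture_output=True, text=True)
```

A CGI wrapper could call run_limited with the darcs command line it needs, so that a single expensive request is cut off instead of tying up the server.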

Hmmmm.  I've never heard of mod_cache... that sounds like a good idea (as
long as you don't cache the wrong pages).  Perhaps we could also add a
little locking to prevent too many cgi requests from running at once? I
don't imagine it would be too hard to have a file keep track of the number
of darcs.cgi calls currently running, and return a "try again later" page
if we're too busy.
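
One way this could be sketched, assuming a Unix host: instead of a single counter file, this variant uses a fixed set of lock files, one per allowed concurrent request, because flock locks are released automatically if a CGI process dies and so the count cannot go stale. The file names, the limit of 4, and the function names are all hypothetical.

```python
import fcntl
import os

MAX_CONCURRENT = 4  # hypothetical limit on simultaneous darcs.cgi runs


def acquire_slot(lock_dir, max_slots=MAX_CONCURRENT):
    """Return an open file holding a slot lock, or None if all slots are busy."""
    for i in range(max_slots):
        path = os.path.join(lock_dir, "darcs-cgi-slot-%d.lock" % i)
        handle = open(path, "w")
        try:
            # Non-blocking exclusive lock: fails at once if the slot is taken.
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            handle.close()
            continue
        return handle  # keep it open for the request; closing releases the slot
    return None


def busy_response():
    """CGI response for the "try again later" page."""
    return ("Status: 503 Service Unavailable\r\n"
            "Retry-After: 30\r\n"
            "Content-Type: text/plain\r\n\r\n"
            "Server busy, please try again later.\n")
```

The script would call acquire_slot at startup, print busy_response() and exit if it gets None, and otherwise hold the returned file open until the request is finished.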
David Roundy
