[darcs-users] GSoC: network optimisation vs cache vs library?

Zooko O'Whielacronx zookog at gmail.com
Mon Apr 19 02:56:17 UTC 2010


On Sun, Apr 18, 2010 at 3:02 PM, Max Battcher <me at worldmaker.net> wrote:
>
> Browsing/link-following (HTTP GET) is not a clear indicator that a user
> wishes to see a page.

In this case we have a robots.txt exclusion file, and the only time
an HTTP GET is issued (as far as I've noticed) is when a user wants
to see that page.

> I've made the suggestion to Lele that Trac+Darcs, in the case of "hard
> operations" such as this one, should make use of a processing queue

This would be a change to Trac -- it can't really be done in a
revision control plugin for Trac. Trac assumes that all calls to
the revision control tool to browse history will return quickly.
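
To make the queue idea concrete, here is a rough sketch of the general
pattern: the request path answers immediately from a cache, and a
background worker does the slow part. This is not Trac or trac-darcs
code; every name in it (cache, jobs, slow_lookup, request) is
hypothetical, and slow_lookup is only a stand-in for an expensive darcs
history query.

```python
import queue
import threading

cache = {}             # path -> result (hypothetical in-memory cache)
pending = set()        # paths already queued but not yet computed
jobs = queue.Queue()   # work waiting for the background worker
lock = threading.Lock()


def slow_lookup(path):
    """Stand-in for an expensive darcs history query (hypothetical)."""
    return "history of %s" % path


def worker():
    """Drain the queue forever, storing each result in the cache."""
    while True:
        path = jobs.get()
        try:
            result = slow_lookup(path)
            with lock:
                cache[path] = result
                pending.discard(path)
        finally:
            jobs.task_done()


def request(path):
    """Return at once: the cached result, or None after queueing the work."""
    with lock:
        if path in cache:
            return cache[path]
        if path not in pending:
            pending.add(path)
            jobs.put(path)
    return None


threading.Thread(target=worker, daemon=True).start()
```

The caller has to cope with a "not ready yet" answer (None here), which
is exactly the part that Trac's browsing code does not expect today.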

> I also made the suggestion that Trac+Darcs could provide a command to
> pre-populate the cache (easily built on top of a processing queue), which
> could be invoked outside of normal "business hours" (via cron) or even as a
> post-hook to darcs operations. (My own "darcsforge" system attempts to do as
> much of its caching as it can in an apply post-hook.)

Hey, that's a good idea! Thanks. Let's see, I could wget the log page
showing the most recent change:

http://tahoe-lafs.org/trac/tahoe-lafs/log/?limit=1

Then I could easily parse out the hyperlink to browse the revision, for example:

$ grep "Browse at revision" index.html\?limit\=1
                      <a title="Browse at revision 4268"
href="/trac/tahoe-lafs/browser/?rev=4268">

Then I could wget that URL:

http://tahoe-lafs.org/trac/tahoe-lafs/browser/?rev=4268

Hm, but now it gets slightly more complicated. The next step is
probably to call "darcs show files", parse out the version number
and/or patch ID, and construct a URL like this for every file and
directory in the repo:

http://tahoe-lafs.org/trac/tahoe-lafs/browser/setup.cfg?rev=4268

Lele: do you think it would be a good idea to do something like this?
When the patch is fresh it is cheap to do these darcs queries.
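
The steps above could be scripted roughly as follows. This is only a
sketch under some assumptions: that "darcs show files" prints one path
per line in the form "./setup.cfg" (with "." for the repository root),
that the revision number taken from the log page is the right one to
put in every URL, and that simply fetching each page is enough to fill
the cache. The function names are made up for illustration.

```python
import re
import subprocess
import urllib.request
from urllib.parse import quote

BASE = "http://tahoe-lafs.org"          # site root, as in the URLs above
BROWSER = "/trac/tahoe-lafs/browser/"   # source browser path, as above
LOG_URL = BASE + "/trac/tahoe-lafs/log/?limit=1"


def latest_rev(log_html):
    """Pull the revision number out of the 'Browse at revision N' link."""
    match = re.search(r'title="Browse at revision (\d+)"', log_html)
    if match is None:
        raise ValueError("no 'Browse at revision' link found")
    return int(match.group(1))


def browser_urls(show_files_output, rev):
    """Build one browser URL per path listed by 'darcs show files'.

    Assumes lines look like './setup.cfg', with '.' for the root.
    """
    urls = []
    for line in show_files_output.splitlines():
        path = line.strip()
        if not path:
            continue
        if path == ".":
            path = ""
        elif path.startswith("./"):
            path = path[2:]
        urls.append("%s%s%s?rev=%d" % (BASE, BROWSER, quote(path), rev))
    return urls


def warm_cache(repo_dir):
    """Fetch the browser page of every file and directory once."""
    with urllib.request.urlopen(LOG_URL) as resp:
        rev = latest_rev(resp.read().decode("utf-8", "replace"))
    listing = subprocess.run(
        ["darcs", "show", "files"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    for url in browser_urls(listing, rev):
        with urllib.request.urlopen(url) as page:
            page.read()  # the response is discarded; the fetch is the point
```

Something like warm_cache("/path/to/repo") could then be run from cron
or from a post-hook, as Max suggests.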

> (Also, if I may throw in a further recommendation, Zooko: sqlite is not
> well-optimized for web serving needs and you may want to consider
> installing a more traditionally optimized database server for your Trac
> needs, such as PostgreSQL.)

Well, I don't think that is causing any problems for us. The timings on
the SQL queries are always less than a few milliseconds. Here are some
of the diagnostic outputs:

2010-04-18 19:40:43,073 Trac[dbutil] DEBUG: SELECT dnc.node_id,
dnc.rev, dnc.path, dnc.parent_id
           FROM darcs_node_changes AS dnc
           WHERE dnc.repo_id = ''
             AND dnc.rev = (SELECT max(dnc2.rev)
                            FROM darcs_node_changes AS dnc2, darcs_nodes AS dn
                            WHERE dnc2.repo_id = dnc.repo_id
                              AND dnc2.node_id = dnc.node_id
                              AND dn.repo_id = dnc2.repo_id
                              AND dn.node_id = dnc2.node_id
                              AND dnc2.rev <= 5883 AND ((dn.remove_rev
IS NULL)       OR (dn.remove_rev > 5883))) AND dnc.parent_id = 306: 1
executions in 477 millisecs, fastest run 477 millisecs,
slowest 477 millisecs, average 477 millisecs

Oh, except hold up, what's this?

2010-01-18 19:28:24,565 Trac[dbutil] DEBUG: SELECT dnc.node_id,
dnc.rev, dnc.path, dnc.parent_id
           FROM darcs_node_changes AS dnc
           WHERE dnc.repo_id = ''
             AND dnc.rev = (SELECT max(dnc2.rev)
                            FROM darcs_node_changes AS dnc2, darcs_nodes AS dn
                            WHERE dnc2.repo_id = dnc.repo_id
                              AND dnc2.node_id = dnc.node_id
                              AND dn.repo_id = dnc2.repo_id
                              AND dn.node_id = dnc2.node_id
                              AND dnc2.rev <= 7926 AND ((dn.remove_rev
IS NULL)       OR (dn.remove_rev > 7926))) AND dnc.parent_id IS NULL:
1 executions in 5.67 seconds, fastest run 5.67 seconds, slowest 5.67
seconds, average 5.67 seconds

Here's an SQL query that took 5.67 seconds. I have no idea what it is
doing, though. There aren't anywhere near 7000 patches in the
tahoe-lafs repository. Lele: do you know what this is and whether it is
a real performance issue?
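
To find other queries like this one without reading the whole log by
eye, something along these lines might help. It is a sketch that
assumes every timing entry has the same "N executions in X millisecs"
or "N executions in X seconds" shape as the two entries quoted above;
the function names are made up.

```python
import re

# Matches the timing summary at the end of a dbutil DEBUG entry.
TIMING = re.compile(r"(\d+) executions in ([\d.]+) (millisecs|seconds)")


def total_seconds(entry):
    """Return the total time of one log entry in seconds, or None."""
    flat = " ".join(entry.split())  # undo any line wrapping in the entry
    match = TIMING.search(flat)
    if match is None:
        return None
    value = float(match.group(2))
    if match.group(3) == "millisecs":
        return value / 1000.0
    return value


def slow_entries(entries, threshold=1.0):
    """Keep the entries whose total time exceeds threshold seconds."""
    return [e for e in entries if (total_seconds(e) or 0.0) > threshold]
```

With a one-second threshold, the first entry above (477 millisecs)
would be dropped and the second (5.67 seconds) would be kept.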

Regards,

Zooko

