[darcs-users] GSoC: network optimisation vs cache vs library?

Sun Apr 18 20:07:19 UTC 2010

On Wednesday, 2010-04-14, at 18:18 , Max Battcher wrote:

> All of which goes to show that Trac+darcs still isn't well  
> optimized for caching darcs queries

False -- it caches the results of these queries in a sqlite db. (This  
is thanks to Lele Gaifax's work on the Trac-Darcs plugin.)
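
For illustration only, here is a minimal sketch of that kind of
query-result cache, in Python with the standard sqlite3 module. The
table name and the open_cache/cached_query helpers are hypothetical;
this is not the Trac-Darcs plugin's actual schema or code.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) a hypothetical cache database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS query_cache ("
        "query TEXT PRIMARY KEY, result TEXT NOT NULL)"
    )
    return conn


def cached_query(conn, query, run_query):
    """Return the cached result for `query`, computing it only on a miss.

    `run_query` stands in for the expensive step (for example, invoking
    darcs); it is an assumption of this sketch.
    """
    row = conn.execute(
        "SELECT result FROM query_cache WHERE query = ?", (query,)
    ).fetchone()
    if row is not None:
        return row[0]
    result = run_query(query)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO query_cache (query, result) VALUES (?, ?)",
            (query, result),
        )
    return result
```

The second request for the same query is served from the table without
calling run_query again.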

> or dealing gracefully with long running command invocations...

True -- it gives the user no indication that it is making progress,
and it times out hard after about 30 seconds. Also, it locks the
entire database, so other users can't do anything even though they
don't need the results of the query. (This is due to Trac's design and
is hard for Lele to fix in the Trac-Darcs plugin.)

> I still say the Trac reliance on CVS/SVN-style revision numbers  
> means that Trac is absolutely not well-adapted for serving darcs  
> repositories. It may be "revision 1782" to Trac, but 'show contents  
> --match "hash 2008..."' is "commute this file to how it would  
> appear if only the patches preceding or equal to this one with a  
> timestamp from two years ago were applied" to darcs. (Which ends up  
> being quite possibly not a "real" historic version at all, and  
> which does quite a bit of work to be so easily susceptible to  
> crawlers/DDoS/accidental DDoS...)

I'm not precisely sure what you mean by a "real" historic version,  
but we find this linear ordering eminently useful. We disable  
obliterate, amend-record, and optimize in the canonical repository,  
and the order that the patches were added to that repository is by  
our definition the canonical order. People may have different orders  
in their personal repositories, but we know that we all see the same  
order when we look at the trac, and we can use that fact. For  
example, we refer to patches by their number, e.g.:

http://tahoe-lafs.org/trac/tahoe-lafs/changeset/4268

when we mean "This patch in the context of all patches that preceded  
it in the canonical repository", or by their hash, e.g.:

http://tahoe-lafs.org/trac/tahoe-lafs/changeset/20100416220935-93fa1-f06ccbc164c83632380abcddd461d1296618a99d

when we mean "This patch by itself".

If someone is talking about patch [4267] and patch [4268], then you  
know that they are talking about two patches that arrived one after  
the other in the canonical repository, although you don't know when  
they were written or recorded, whether they were recorded on the same  
computer as one another, or whether they appear in that same order in  
any particular personal repository.
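
As a rough sketch of this numbering scheme (not how Trac or Trac-Darcs
implements it), the canonical repository can be modelled as an
append-only list of patch hashes. The CanonicalLog class, its method
names, the choice to start numbering at 1, and the placeholder hashes
are all assumptions made for illustration.

```python
class CanonicalLog:
    """Append-only record of the order patches arrived in one repository."""

    def __init__(self):
        self._hashes = []   # position i holds the hash of patch number i + 1
        self._numbers = {}  # hash -> number, for the reverse lookup

    def append(self, patch_hash):
        """Record an arriving patch and return its number in this repository."""
        if patch_hash in self._numbers:
            return self._numbers[patch_hash]  # already present, number unchanged
        self._hashes.append(patch_hash)
        number = len(self._hashes)
        self._numbers[patch_hash] = number
        return number

    def hash_of(self, number):
        """'This patch by itself': the hash behind a number."""
        return self._hashes[number - 1]

    def context_of(self, number):
        """'This patch in the context of all patches that preceded it'."""
        return list(self._hashes[:number])


log = CanonicalLog()
first = log.append("hash-a")   # placeholder hashes, not real darcs hashes
second = log.append("hash-b")
```

Because nothing is ever removed or reordered (no obliterate,
amend-record, or optimize), a number handed out once keeps meaning the
same patch and the same preceding context.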

For another example, we know that the official snapshot tarballs were
built in that order:

http://tahoe-lafs.org/source/tahoe-lafs/snapshots/

And, when someone runs Tahoe-LAFS it emits its version, which  
includes that number. Here is a live Tahoe-LAFS node -- you can see  
the version numbers on the upper right-hand side of the page:

http://testgrid.allmydata.org:3567

> 20secs doesn't sound unreasonable from the point of view that you  
> are asking darcs to create an entire new "version" of a file. While  
> I expect there is plenty of performance left to squeeze from this,  
> I don't think a query like this one will ever near git/svn/...  
> historic revision lookup, because this is an entirely different  
> beast. It doesn't make sense for me for Trac to rely on it for  
> common queries.

Currently Trac-Darcs issues these queries only when the user has  
indicated that this is what they want to see, for example by browsing  
the Trac-Darcs page about a patch and then clicking the link to see  
the file as it existed (in this repository -- the canonical one)  
before and/or after the patch was applied. It isn't the most common  
query. It is probably the fourth most common, after the Revision Log
(http://tahoe-lafs.org/trac/tahoe-lafs/log/ ), the Changeset
(http://tahoe-lafs.org/trac/tahoe-lafs/changeset/20100416190404-93fa1-9d35adab517f7d7b104a1f397fec8cd85399c7ae ),
and the Browse Source (http://tahoe-lafs.org/trac/tahoe-lafs/browser ).
Many people probably never use the view-of-historical-revision  
feature at all.

(I happen to use it a lot, and I complain on darcs-users a lot, so  
you've been hearing about it...)

By the way, there is a Trac browser like this for the darcs
repository itself, linked from the front page of http://darcs.net:

http://tahoe-lafs.org/trac/darcs-2/browser

Regards,

Zooko