[darcs-users] Request for early user feedback on patch index darcs

Ben Franksen benjamin.franksen at bessy.de
Sun Jul 22 23:35:59 UTC 2012


Aditya wrote:
> Patch index improves the speed of changes and annotate commands by
> quickly identifying the patches that modified a given file. This
> optimization is especially useful for large repositories.

How large is the expected gain? Can you share some performance numbers, e.g. 
a comparison of changes and annotate on the darcs repo with and without the 
index?
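
To make the request concrete, here is a rough sketch of the kind of timing 
harness I have in mind. It assumes that the patch index darcs binary is the 
first `darcs` on PATH, and that toggling the index with the optimize flags 
from the announcement is enough to compare both modes; the function names 
are made up for illustration.

```python
import subprocess
import time


def time_command(cmd, cwd=None, repeats=3):
    """Run cmd `repeats` times; return the wall-clock seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, cwd=cwd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def compare(repo, tracked_file, repeats=3):
    """Time `darcs changes FILE` with the patch index enabled and disabled.

    Assumption: `darcs` on PATH is the patch index build, and a disabled
    index stays disabled until it is re-enabled explicitly.
    """
    results = {}
    for label, flag in (("with index", "--patch-index"),
                        ("without index", "--no-patch-index")):
        # Enable or disable the index using the flags from the announcement.
        subprocess.run(["darcs", "optimize", flag], cwd=repo, check=True)
        results[label] = time_command(["darcs", "changes", tracked_file],
                                      cwd=repo, repeats=repeats)
    return results
```

Calling compare() on a checkout of the darcs repo with some frequently 
touched file would give a list of timings per mode. Since the buffer cache 
matters a lot here, the first run in each list is best read separately from 
the later, warm ones.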

>  The public repository is at:
>  http://den.darcs.net/Aditya/darcs-patch-index
> 
>  Patch index will be automatically created upon get or init.
> 
>  If you run changes/annotate/record/.. using patch index darcs on an
> existing repository, patch index will be created automatically.
> Alternatively, run optimize --patch-index to exclusively create patch
> index.
> 
> You can disable patch index using optimize --no-patch-index, and enable
> it back with optimize --patch-index. If you wish to disable patch index
> at creation, pass --no-patch-index. A lazy get will implicitly disable
> patch index, as you require to have all patches to create a patch index.

I cannot see a use case for optimize --no-patch-index, except for 
measurements or debugging; however, it doesn't hurt to have it, either.

>  I request users to try patch index darcs, and give feedback on potential
> ui changes, bugs etc
> 
> One of the concerns I already have is automatic creation of patch index
> on existing repositories. The time it takes to create patch index
> increases dramatically based on buffer cache
> <http://oss.sgi.com/LDP/LDP/sag/buffer-cache.html>.
> Creating a patch index just after get takes 6 sec, whereas it takes 1min
> after cleaning the cache (This is for darcs development repo). This
> suggests that a user could experience a potentially large delay on
> changes/record/.. .

I share your concern. Creating the patch index for a whole repo in one go 
will certainly take a long time for large repositories. To overcome this 
problem, I would propose creating the index incrementally. The idea is that 
the index caches whatever each darcs command finds out, so that it grows 
over time with use (an explicit optimize --patch-index notwithstanding). 
There would be no long extra delay for get, and the extra time needed to 
cache the information in the index would probably be negligible. For darcs 
init, it makes sense to always start with an index and to update it on each 
record or pull.
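
As a toy illustration of the incremental idea (this is not how the patch 
index is actually implemented), the index could simply memoize, per patch, 
the set of files that patch touches. The class name, the read_patch callback 
and the string patch ids are all hypothetical, and the sketch ignores file 
moves and renames, which a real index would have to track.

```python
class IncrementalPatchIndex:
    """Lazily remembers which files each patch touched."""

    def __init__(self, read_patch):
        # read_patch is the expensive step: patch id -> iterable of file paths.
        self._read_patch = read_patch
        # patch id -> frozenset of paths; filled in only as commands need it.
        self._touched = {}

    def touches(self, patch_id, path):
        """True if the patch modified path; reads the patch at most once."""
        if patch_id not in self._touched:
            self._touched[patch_id] = frozenset(self._read_patch(patch_id))
        return path in self._touched[patch_id]

    def patches_touching(self, patch_ids, path):
        """Filter the patches a command looks at down to those touching path."""
        return [p for p in patch_ids if self.touches(p, path)]

    def coverage(self):
        """Number of patches whose file set is already cached."""
        return len(self._touched)
```

A command that only looks at the most recent patches pays only for those, 
and a later command over the full history reads just the patches that are 
not yet cached. An explicit optimize --patch-index would then amount to 
filling in whatever is still missing.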

The explicit optimize command has the advantage that I can let darcs do the 
work in the background, maybe even at night from a cron job, when I don't 
care how long it takes.

Cheers
Ben


