[darcs-users] Is darcs optimize --compress still useful?

Trent W. Buck trentbuck at gmail.com
Tue Mar 24 01:14:48 UTC 2009


On Tue, Mar 24, 2009 at 11:53:08AM +1100, Trent W. Buck wrote:
> On Tue, Mar 24, 2009 at 11:37:11AM +1100, Trent W. Buck wrote:
> > As a case study, let's examine a --complete checkout of Darcs' repo,
> > both compressed and uncompressed.
> >
> >     $ find _darcs -type f \( -size 0b -o -size 1b \) | wc -l
> >     5093
> >     $ find _darcs -type f -not -size 0b -not -size 1b | wc -l
> >     2728
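
To reproduce that setup, something like the following should do (the
repository URL is from memory; --complete forces a full rather than
lazy checkout):

    $ darcs get --complete http://darcs.net/ darcs
    $ cd darcs
    $ darcs optimize --compress    # the compressed case above
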
> 
> Interestingly, using lzma compression actually does worse than using
> gzip compression (fewer files fit in a single block).  I guess this
> is because the files are so small: lzma's dictionary is some three
> orders of magnitude larger than gzip's 32 KiB window, which buys it
> nothing at this scale?
> 
>     $ darcs optimize --uncompress
>     $ find _darcs -type f -exec lzma {} +
>     $ find _darcs -type f \( -size 0b -o -size 1b \) | wc -l
>     4986
>     $ find _darcs -type f -not -size 0b -not -size 1b | wc -l
>     2835
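
The per-file overhead is easy to see on one small file in isolation.
A rough sketch (the sample path is hypothetical; lzma here is the
lzma-utils filter, which reads stdin and writes stdout like gzip):

    $ f=_darcs/patches/some-small-patch    # hypothetical sample file
    $ wc -c < "$f"                         # raw size in bytes
    $ gzip -9 < "$f" | wc -c               # gzipped size
    $ lzma < "$f" | wc -c                  # lzma'd size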

And just because I couldn't resist, here's what happens if we simulate
packing by compressing after tarring ("one-shot"), as compared to what
we have now ("per-file").  A post-pack lzma is one third of the size
of a pre-pack gzip.  (Tar itself doesn't compress; it just
concatenates the files into one stream, so a post-pack compressor can
exploit redundancy across file boundaries, where per-file compression
restarts with an empty dictionary for every file.)

4.9M _darcs-one-shot-lzma.tar.lzma
8.1M _darcs-one-shot-gzip.tar.gz
 15M _darcs-per-file-gzip.tar
 15M _darcs-per-file-lzma.tar
 29M _darcs-per-file-none.tar
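
For the record, those tarballs can be recreated along these lines
(reconstructed rather than pasted from my shell history; do the
per-file lzma step last or on a scratch copy, since it renames every
file to *.lzma):

    # per-file gzip (darcs' current default on-disk format):
    $ darcs optimize --compress
    $ tar -cf _darcs-per-file-gzip.tar _darcs
    # no per-file compression:
    $ darcs optimize --uncompress
    $ tar -cf _darcs-per-file-none.tar _darcs
    # one-shot: tar the uncompressed repo, compress the whole stream:
    $ tar -c _darcs | gzip -9 > _darcs-one-shot-gzip.tar.gz
    $ tar -c _darcs | lzma > _darcs-one-shot-lzma.tar.lzma
    # per-file lzma (destructive: replaces each file with file.lzma):
    $ find _darcs -type f -exec lzma {} +
    $ tar -cf _darcs-per-file-lzma.tar _darcs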

