[darcs-users] hashed repository issue
Dan Pascu
dan at ag-projects.com
Mon Dec 8 11:30:12 UTC 2008
On Sunday 07 December 2008, Max Battcher wrote:
> Dan Pascu wrote:
> > I tried today to experiment a bit with a darcs-2 hashed repository
> > format.
> > I added an empty file then made 4 changes and recorded them. I was
> > surprised to find that the hashed pristine directory did contain 11
> > files. Every recorded change to a single file added 2 more. I was
> > under the impression that the hashed pristine would only contain one
> > file for each corresponding file in the working tree plus some index
> > file. As it looks right now, with files added there with every
> > change, it is asking for trouble with large projects (remember the
> > 32k limit for files in a directory with ext3). That means that even
> > projects with fewer files, but many changes, can hit that limit.
>
> Pristine management is certainly something that seems like it can be
> optimized more than the current implementation, but I think it is
> smarter than what you saw in your test... Basically, from my
> experience (as a user; I can't comment on the code, hopefully someone
> else can) darcs optimize will clean out old pristine files. But darcs
> optimization also happens automatically now at moments of darcs'
> choosing (you will see one or more "Optimizing x ..." messages from
> time to time if you follow the progress reports), and I've particularly
> noticed that pristine optimizations seem most likely to occur after
> large records (changes to lots of files, don't ask me what that metric
> is), and many pulls/pushes.
Even so, it is still a problem, because I cannot predict when it will
hit me. Having fewer than 32k files does not make me safe: the directory
can still exceed the limit depending on how many records I have.
Not to mention that a directory with 32k files makes things _very_ slow.
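
To illustrate why the file count follows the number of records rather
than the number of tracked files, here is a minimal sketch of a flat,
content-addressed store that is never cleaned. It is a toy model, not
darcs' implementation: the class ToyPristine, the listing format and
the use of SHA-256 are assumptions made for illustration. The one
behaviour it mirrors from the report above is that every recorded
change to a single file leaves two new objects behind, modelled here as
the new file content plus a new listing of the root directory.

```python
import hashlib


def object_name(data):
    """Name an object by the hash of its content (assumption: SHA-256)."""
    return hashlib.sha256(data).hexdigest()


class ToyPristine:
    """Hypothetical flat content-addressed store that never deletes objects."""

    def __init__(self):
        self.objects = {}  # object name -> content
        self.tree = {}     # file name -> object name of its current content
        self._store_listing()  # listing of the (still empty) root directory

    def _store(self, data):
        name = object_name(data)
        self.objects[name] = data  # identical content is stored only once
        return name

    def _store_listing(self):
        # The root listing changes whenever any file's hash changes.
        lines = [f"{name} {digest}\n" for name, digest in sorted(self.tree.items())]
        return self._store("".join(lines).encode())

    def record(self, filename, content):
        """Record a new version of one file: file object plus new root listing."""
        self.tree[filename] = self._store(content)
        self._store_listing()


def simulate(changes):
    """Add one empty file, record `changes` edits to it, return the object count."""
    store = ToyPristine()
    store.record("file", b"")
    for i in range(1, changes + 1):
        store.record("file", b"line\n" * i)  # every change yields new content
    return len(store.objects)


if __name__ == "__main__":
    for n in (4, 100, 1000):
        print(n, "changes ->", simulate(n), "objects")
```

In this model the object count grows by two per recorded change no
matter how few files are tracked, so without periodic cleanup a single
long-lived file is enough to fill the directory.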
This flattened pristine directory may make darcs2 considerably slower
than darcs1 if the repository contains many files. The same can be said
for the flattened patches directory, but that problem is common to both
versions. Still, as the number of patches increases, things will
gradually get slower.
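
One way to keep an eye on this is to count the entries in the flat
directories from time to time. The following is a minimal sketch, not a
darcs feature: the helper names are hypothetical, the default limit of
32000 is only the "32k" figure quoted in this thread, and the directory
names in the usage note are assumptions about the repository layout.

```python
import os


def count_entries(directory):
    """Count the direct entries of a directory without reading file contents."""
    with os.scandir(directory) as entries:
        return sum(1 for _ in entries)


def headroom(directory, limit=32000):
    """Return how many more entries fit before `limit` is reached.

    `limit` is an assumed threshold taken from the discussion, not a value
    queried from the filesystem.
    """
    return limit - count_entries(directory)
```

For example, headroom("_darcs/pristine.hashed") or
headroom("_darcs/patches"), run from the top of a repository, would
show how close each directory is to the assumed limit.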
--
Dan