[darcs-users] Re: fptools in darcs now available

Wed May 18 18:36:44 UTC 2005

David Roundy wrote:

>
>Indeed, that should help.  But even as things are now, a darcs-unstable
>initial record of the linux kernel requires only 10 times the CPU time that
>tar czf does, and only 7.5 times the wallclock time.  So if we assume that
>tar is pretty much optimal, we only have one order of magnitude improvement
>left to be made.  I expect that changing the hunk format (as we've
>discussed) should pretty much get us that order of magnitude in
>improvement in CPU time.
>
>The memory usage is way worse than that of tar, but I'm optimistic that
>we can improve things a bit in that realm.  Perhaps (for example) by
>storing PackedString file paths, or by making the directory-reading portion
>of slurp lazy.  In any case, 450M isn't such bad maximum memory consumption
>for a project the size of the kernel.
>  
>
It would seem that if an addfile primitive included the hunk patch of
the file's initial contents instead of treating them as separate
primitives, then certain implementation optimizations would be much more
feasible.  If you are parsing an addfile patch, then you can assume:
1) the embedded hunk patch (or binary patch) only contains add lines and
never delete lines
2) the hunk is not offset; it always starts at the beginning of the file
3) if the addfile itself does not conflict, then it is impossible for
its embedded hunk to conflict
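
As a rough illustration of assumptions 1) and 2), here is a Python sketch
of the check a parser could make before taking a fast path. The function
name and the (start line, list of lines) representation are made up for
this example; they are not darcs's internal types, and I am only assuming
that added lines are marked with a leading "+".

```python
from typing import Iterable


def is_pure_addition(start_line: int, hunk_lines: Iterable[str]) -> bool:
    """Return True if a hunk starts at line 1 and contains only '+' lines.

    Hypothetical helper: `start_line` is the hunk's offset into the file and
    `hunk_lines` are its body lines, each assumed to begin with '+' or '-'.
    """
    if start_line != 1:
        return False  # assumption 2 violated: the hunk is offset
    # assumption 1: every line must be an addition
    return all(line.startswith("+") for line in hunk_lines)
```

A hunk embedded in an addfile would always pass this check, so the parser
would never need to handle removals or offsets for it.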

Given those assumptions, you need not parse the entire patch and load it
into memory first. Instead, you can stream the patch through hex decoding
and write it directly to the file.
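
To make the streaming idea concrete, here is a minimal Python sketch. It
assumes the patch body arrives as an iterable of hex-encoded text lines,
which is a simplification and not necessarily darcs's actual on-disk
format; the function name is hypothetical.

```python
import io
from typing import BinaryIO, Iterable


def stream_hex_to_file(hex_lines: Iterable[str], out: BinaryIO) -> int:
    """Decode hex-encoded lines one at a time and write the bytes to `out`.

    Only the current line is held in memory. Returns the number of bytes
    written. Raises ValueError if a line is not valid hex.
    """
    written = 0
    for line in hex_lines:
        chunk = line.strip()
        if not chunk:
            continue  # skip blank lines
        data = bytes.fromhex(chunk)
        out.write(data)
        written += len(data)
    return written


# Usage: decode two hex lines into an in-memory buffer.
buf = io.BytesIO()
count = stream_hex_to_file(["68656c6c6f0a", "776f726c640a"], buf)
```

Because each line is decoded and written before the next one is read,
peak memory would depend on the longest line rather than on the size of
the file being added.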

Presumably something similar would be possible for rmfile patches as well.

Thoughts?

-Tupshin