[darcs-devel] Performance and really large trees

Andrew McGregor andrew at indranet.co.nz
Sun Apr 17 04:52:23 PDT 2005


So, if I do this:

darcs init
mv ../snapgear .
for i in snapgear/*; do darcs add --case-ok -r "$i"; darcs record -m "initial $i" -a; done &

and wait about two hours, it works.

But why does it not work if I do the whole initial checkin in one go?

Oh, and:

time darcs whatsnew
No changes!

real    0m44.163s
user    0m40.100s
sys     0m3.930s

So, it will be usable.

Andrew

On 17/04/2005, at 12:27 PM, Andrew McGregor wrote:

> So, I tried one and straced it, because 'about an hour' is what I'd 
> estimated too.
>
> The last line that isn't a timer expiry in the trace is:
> 14179 open("_darcs/patches/pending", O_RDONLY|O_NONBLOCK|O_NOCTTY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
>
> This is bad, right?
>
> BTW, that line appeared after about 10 minutes, and a listing of 
> stat-related calls for every file and directory in the tree.
>
> It's a little odd though:
>
> 14179 stat64("minderOS/./vendors/senTec/romfs/etc/config/start", {st_mode=S_IFREG|0644, st_size=158, ...}) = 0
> 14179 stat64("minderOS/./vendors/senTec/romfs/etc/config/start", {st_mode=S_IFREG|0644, st_size=158, ...}) = 0
> 14179 lstat64("minderOS/./vendors/senTec/romfs/etc/config/start", {st_mode=S_IFREG|0644, st_size=158, ...}) = 0
>
> Why stat that file three times?
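
One way to see how widespread the repeated stat calls are is to tally them per path from the trace. This is only a sketch: count_stats is a made-up helper name, and it assumes each call sits on a single line of the strace output (unwrapped, unlike the excerpt above) in the usual "PID call(args) = result" form.

```shell
# Tally stat/lstat/stat64/lstat64 calls per path from strace output on stdin.
# Prints "count path" lines, most frequently stat'ed path first.
count_stats() {
    # Keep only the quoted path argument of each stat-family call.
    sed -nE 's/^([0-9]+ +)?l?stat(64)?\("([^"]*)".*/\3/p' |
        sort | uniq -c | sort -rn
}
```

With a trace saved to a file (trace.log is a placeholder name), "count_stats < trace.log | head" lists the paths that were stat'ed most often.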
>
> Andrew
>
> On 17/04/2005, at 9:05 AM, Ian Lynagh wrote:
>
>> On Sat, Apr 16, 2005 at 11:50:11PM +1200, Andrew McGregor wrote:
>>> In the light of the recent discussion about darcs performance for
>>> large trees (like the Linux kernel) I'd like to ask what can be done
>>> to help our case.
>>
>> I'm not sure I understand the question?
>>
>>> We're doing embedded Linux development, and we'd like to put, not
>>> just the kernel, but the whole OS, into darcs.  The tree contains two
>>> kernels, three C libraries and a reasonably complete command line
>>> Linux.  Unpacked, it's about 1.3GB in 100k files.
>>
>> Just in case the Linux kernel was too easy?  :-)
>>
>>> We have an initial checkin with vanilla 1.0.2 that has been running
>>> for the last three weeks without finishing, and one started on Friday
>>> with a pull from Thursday afternoon is still running (on a 2.8GHz
>>> dual-HT Xeon with plenty of memory).  Now, it doesn't really matter if
>>> an initial checkin takes a long time... but I'm worried that these are
>>> actually never going to finish.
>>
>> You can see what progress is being made; in _darcs/patches there will
>> either be a .gz-0 (I think) or a .gz file. If the former then
>> zcat | wc -c will tell you how many bytes have been written. When
>> complete, this will be slightly over your 1.3G.
>>
>> If it's a .gz file then darcs will be applying it to current.
>> If find _darcs/current | wc -l is 100k then it's finished creating the
>> files; find _darcs/current -size 0 | wc -l will tell you how many
>> still have to be written (modulo any that really are 0 bytes, of
>> course).
>>
>> With my latest patches (not yet in darcs unstable) I've just done a
>> checkin of the Linux kernel (18,175 files, 236M). It took under 2mins
>> to write the patch file and under 6 to apply it to current (darcs
>> record took 7m46 real / 7m26 user). I'd expect it to be pretty much
>> linear, so I'd hope your checkin would take around an hour.
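
The linear extrapolation can be worked through as below. This is a back-of-the-envelope sketch only: estimate_minutes is an invented helper, 7m46 is taken as 466 seconds, 1.3G is taken as 1300M, and the two machines are treated as equally fast, which they are not.

```shell
# Scale a measured time linearly from one tree size to another.
# $1 = measured seconds, $2 = measured size, $3 = target size (same unit)
estimate_minutes() {
    awk -v t="$1" -v a="$2" -v b="$3" \
        'BEGIN { printf "%.0f\n", t * b / a / 60 }'
}

# Figures from the thread: 18,175 files / 236M took 7m46 (466 s);
# the target tree is about 100k files / 1.3G.
echo "by file count: $(estimate_minutes 466 18175 100000) min"
echo "by size:       $(estimate_minutes 466 236 1300) min"
```

Both scalings come out at roughly 43 minutes, which is in line with the "around an hour" guess.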
>>
>> This was with:
>>     darcs init
>>     darcs add --case-ok -r .
>>     darcs record -a -m foo --no-test -v
>> on an "AMD Athlon(tm) XP 2200+".
>>
>> I've just sent most of the patches to the list. The other important
>> one is "Unsafe pull_firsts_middles optimisation" from an earlier mail
>> (which will probably break other darcs functionality at present).
>>
>>
>> Thanks
>> Ian
>>
>>
>> _______________________________________________
>> darcs-devel mailing list
>> darcs-devel at darcs.net
>> http://www.abridgegame.org/cgi-bin/mailman/listinfo/darcs-devel
>>
>>