[darcs-devel] Use System.Directory.copyFile for file copying
Kevin Quick
quick at sparq.org
Wed Aug 1 06:19:51 PDT 2007
Thanks! I had actually made the original change based on the preference
for system libraries over custom code (lacking any comments justifying the
preference for the latter), but it's nice to see it comes with a potential
performance boost for certain configurations.
Out of curiosity, and at the risk of boring folks, I added a third method
for comparison, which performs a
System.Process.runCommand("cp -r fromdir todir") >>= waitForProcess,
essentially testing against standard cp plus the overhead of the
subprocess creation.
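
For readers without the attachment, here is a minimal sketch of how the three methods could be timed side by side. It is not the attached copytest.hs: the helper names (timed, viaCopyFile, viaReadWrite, viaCp) are made up for illustration, Data.ByteString's readFile/writeFile stands in for darcs' own readFilePS/writeFilePS, and the "from/." form of the cp command is an assumption so that the directory contents land directly in an existing target directory.

```haskell
import qualified Data.ByteString as B
import Data.Time.Clock (NominalDiffTime, diffUTCTime, getCurrentTime)
import System.Directory (copyFile, listDirectory)
import System.Environment (getArgs)
import System.FilePath ((</>))
import System.Process (runCommand, waitForProcess)

-- Run an action and return the wall-clock time it took.
timed :: IO a -> IO NominalDiffTime
timed act = do
  start <- getCurrentTime
  _ <- act
  end <- getCurrentTime
  return (end `diffUTCTime` start)

-- Method 1: the library call, which also copies permissions.
viaCopyFile :: FilePath -> FilePath -> [FilePath] -> IO ()
viaCopyFile from to = mapM_ (\f -> copyFile (from </> f) (to </> f))

-- Method 2: strict read then write (stand-in for readFilePS >>= writeFilePS).
viaReadWrite :: FilePath -> FilePath -> [FilePath] -> IO ()
viaReadWrite from to =
  mapM_ (\f -> B.readFile (from </> f) >>= B.writeFile (to </> f))

-- Method 3: shell out to cp and wait for it to finish.
-- No shell quoting is done, so paths with spaces will not work.
viaCp :: FilePath -> FilePath -> IO ()
viaCp from to = do
  _ <- runCommand ("cp -r " ++ from ++ "/. " ++ to) >>= waitForProcess
  return ()

main :: IO ()
main = do
  args <- getArgs
  case args of
    [src, dst1, dst2, dst3] -> do
      files <- listDirectory src  -- assumes src holds plain files only
      putStrLn ("Copy of " ++ show (length files) ++ " files:")
      t1 <- timed (viaCopyFile src dst1 files)
      t2 <- timed (viaReadWrite src dst2 files)
      t3 <- timed (viaCp src dst3)
      putStrLn ("  via System.Directory.copyFile: " ++ show t1)
      putStrLn ("  via readFile >>= writeFile:    " ++ show t2)
      putStrLn ("  via System.Process(\"cp -r\"):   " ++ show t3)
    _ -> putStrLn "usage: copytest SRC DST1 DST2 DST3"
```

The target directories are assumed to exist already and to be emptied between runs, as in the shell loops below.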
I also ran this on Mac OS X Tiger x86.
Updated results for Linux x86 reiserfs:
$ for X in 1 2 3 4 5 6 ; do rm -r copytestdir/small{2,3,4}/*; ./copytest copytestdir/small?; done
Copy of 998 files:
via System.Directory.copyFile: 0.172032s
via readFilePS >= writeFilePS: 0.164614s
via System.Process("cp -r"): 0.115983s
Copy of 998 files:
via System.Directory.copyFile: 0.180447s
via readFilePS >= writeFilePS: 0.176695s
via System.Process("cp -r"): 0.120259s
Copy of 998 files:
via System.Directory.copyFile: 0.176865s
via readFilePS >= writeFilePS: 0.165542s
via System.Process("cp -r"): 0.128305s
Copy of 998 files:
via System.Directory.copyFile: 0.183365s
via readFilePS >= writeFilePS: 0.176855s
via System.Process("cp -r"): 0.126601s
Copy of 998 files:
via System.Directory.copyFile: 0.183976s
via readFilePS >= writeFilePS: 0.17487s
via System.Process("cp -r"): 0.125522s
Copy of 998 files:
via System.Directory.copyFile: 0.178875s
via readFilePS >= writeFilePS: 0.171737s
via System.Process("cp -r"): 0.127577s
$ for X in 1 2 3 4 5 6 ; do rm -r copytestdir/big{2,3,4}/*; /kwq/unstable/copytest copytestdir/big?; done
Copy of 4 files:
via System.Directory.copyFile: 8.204959s
via readFilePS >= writeFilePS: 11.083989s
via System.Process("cp -r"): 13.116506s
Copy of 4 files:
via System.Directory.copyFile: 7.864635s
via readFilePS >= writeFilePS: 9.829027s
via System.Process("cp -r"): 11.634171s
Copy of 4 files:
via System.Directory.copyFile: 7.08999s
via readFilePS >= writeFilePS: 10.456167s
via System.Process("cp -r"): 15.665767s
Copy of 4 files:
via System.Directory.copyFile: 8.318283s
via readFilePS >= writeFilePS: 10.496049s
via System.Process("cp -r"): 10.557133s
Copy of 4 files:
via System.Directory.copyFile: 5.262135s
via readFilePS >= writeFilePS: 13.083001s
via System.Process("cp -r"): 11.247626s
Copy of 4 files:
via System.Directory.copyFile: 7.754436s
via readFilePS >= writeFilePS: 10.117483s
via System.Process("cp -r"): 13.9466s
The results for Mac OS X Tiger x86:
$ for X in 1 2 3 4 5 6 ; do rm -r copytestdir/small{2,3,4}/*; ./copytest copytestdir/small?; done
Copy of 998 files:
via System.Directory.copyFile: 0.455989s
via readFilePS >= writeFilePS: 0.435122s
via System.Process("cp -r"): 0.25074s
Copy of 998 files:
via System.Directory.copyFile: 0.353156s
via readFilePS >= writeFilePS: 0.367436s
via System.Process("cp -r"): 0.250192s
Copy of 998 files:
via System.Directory.copyFile: 0.36179s
via readFilePS >= writeFilePS: 0.450544s
via System.Process("cp -r"): 0.249152s
Copy of 998 files:
via System.Directory.copyFile: 0.353648s
via readFilePS >= writeFilePS: 0.366946s
via System.Process("cp -r"): 0.250617s
Copy of 998 files:
via System.Directory.copyFile: 0.354281s
via readFilePS >= writeFilePS: 0.417799s
via System.Process("cp -r"): 0.256529s
Copy of 998 files:
via System.Directory.copyFile: 0.390028s
via readFilePS >= writeFilePS: 0.363963s
via System.Process("cp -r"): 0.248796s
$ for X in 1 2 3 4 5 6 ; do rm -r copytestdir/big{2,3,4}/*; ./copytest copytestdir/big?; done
Copy of 4 files:
via System.Directory.copyFile: 4.419766s
via readFilePS >= writeFilePS: 1.27034s
via System.Process("cp -r"): 1.649713s
Copy of 4 files:
via System.Directory.copyFile: 1.429918s
via readFilePS >= writeFilePS: 1.40437s
via System.Process("cp -r"): 1.456039s
Copy of 4 files:
via System.Directory.copyFile: 1.552916s
via readFilePS >= writeFilePS: 1.219946s
via System.Process("cp -r"): 1.465697s
Copy of 4 files:
via System.Directory.copyFile: 1.426121s
via readFilePS >= writeFilePS: 1.32494s
via System.Process("cp -r"): 0.985613s
Copy of 4 files:
via System.Directory.copyFile: 1.429138s
via readFilePS >= writeFilePS: 1.319917s
via System.Process("cp -r"): 1.40475s
Copy of 4 files:
via System.Directory.copyFile: 1.428278s
via readFilePS >= writeFilePS: 1.471474s
via System.Process("cp -r"): 1.425157s
There's clearly some variability in the results that could be
investigated, but I'm not sure it's worth the effort of fine-tuning the
benchmark since we're after "typical" results. copyFile
never seems to be significantly slower, and can be significantly faster.
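
To put a rough number on the small-file case, the per-file difference can be worked out from the six Linux small-file runs above. The sketch below hard-codes those timings; the function names are made up for illustration. The result should land in the single-digit microseconds per file, in line with the estimate of about 8 us/file in the earlier message quoted below.

```haskell
-- Mean of a non-empty list of timings, in seconds.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- Linux x86 reiserfs, 998 small files, six runs each (from the logs above).
copyFileRuns, readWriteRuns :: [Double]
copyFileRuns  = [0.172032, 0.180447, 0.176865, 0.183365, 0.183976, 0.178875]
readWriteRuns = [0.164614, 0.176695, 0.165542, 0.176855, 0.174870, 0.171737]

-- Extra time per file, in microseconds, of the first method over the second.
perFileOverheadMicros :: Int -> [Double] -> [Double] -> Double
perFileOverheadMicros nFiles slow fast =
  (mean slow - mean fast) / fromIntegral nFiles * 1e6

main :: IO ()
main =
  putStrLn ("copyFile overhead: "
            ++ show (perFileOverheadMicros 998 copyFileRuns readWriteRuns)
            ++ " us/file")
```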
-KQ
On 31 Jul 2007, at 10:46 AM, Jason Dagit wrote:
> Fantastic analysis. Exactly what I was looking for. I'm now
> convinced :)
>
> The icing on the cake would be to check this on various platforms but
> I'm convinced enough already.
>
> Thanks!
> Jason
>
> On 7/30/07, Kevin Quick <quick at sparq.org> wrote:
>> On Sat, 28 Jul 2007 17:23:13 -0700, "Jason Dagit"
>> <dagit at codersbase.com> wrote:
>>> I think it would be nice to see a benchmark between the old and new
>>> where there are thousands of tiny little files. Is there a noticeable
>>> performance difference? The reason for lots of little files is
>>> because of the permission copying. I don't know that it would affect
>>> the performance, but if the permission changes require a lot of
>>> function calls then it could have a big impact.
>>>
>>> Jason
>>
>> Attached is copytest.hs. It is given three directory names and
>> copies from the first to the second using System.Directory.copyFile,
>> then copies from the first to the third using readFilePS >>=
>> writeFilePS. It then reports the elapsed time for each test.
>>
>> To build (assuming it is located in the top-level of the darcs source
>> tree):
>>
>> $ ghc --make -isrc copytest src/fpstring.c -lz
>>
>> I used two input sets. The first directory set had 998 small files (50
>> bytes to 500 bytes) in the source directory. The second set had 4 big
>> files:
>>
>> $ ls -lh copytestdir/big1
>> total 83M
>> -rw-rw-r-- 1 kquick kquick 16M Jul 30 22:15 bigfile1
>> -rw-rw-r-- 1 kquick kquick 14M Jul 30 22:15 bigfile2
>> -rw-rw-r-- 1 kquick kquick 30M Jul 30 22:15 bigfile3
>> -rw-rw-r-- 1 kquick kquick 25M Jul 30 22:16 bigfile4
>>
>> Test runs:
>>
>> $ for X in 1 2 3 4 5 6 ; do rm copytestdir/small{2,3}/*; ./copytest copytestdir/small?; done
>> Copy of 998 files:
>> via System.Directory.copyFile: 0.160895s
>> via readFilePS >= writeFilePS: 0.153626s
>> Copy of 998 files:
>> via System.Directory.copyFile: 0.161436s
>> via readFilePS >= writeFilePS: 0.153718s
>> Copy of 998 files:
>> via System.Directory.copyFile: 0.163191s
>> via readFilePS >= writeFilePS: 0.155526s
>> Copy of 998 files:
>> via System.Directory.copyFile: 0.16112s
>> via readFilePS >= writeFilePS: 0.156255s
>> Copy of 998 files:
>> via System.Directory.copyFile: 0.162132s
>> via readFilePS >= writeFilePS: 0.157913s
>> Copy of 998 files:
>> via System.Directory.copyFile: 0.163213s
>> via readFilePS >= writeFilePS: 0.157451s
>> $
>>
>>
>> $ for X in 1 2 3 4 5 6 ; do rm copytestdir/big{2,3}/*; ./copytest copytestdir/big?; done
>> Copy of 4 files:
>> via System.Directory.copyFile: 10.418745s
>> via readFilePS >= writeFilePS: 11.420843s
>> Copy of 4 files:
>> via System.Directory.copyFile: 8.318079s
>> via readFilePS >= writeFilePS: 16.533595s
>> Copy of 4 files:
>> via System.Directory.copyFile: 8.384256s
>> via readFilePS >= writeFilePS: 11.35574s
>> Copy of 4 files:
>> via System.Directory.copyFile: 7.752898s
>> via readFilePS >= writeFilePS: 14.43615s
>> Copy of 4 files:
>> via System.Directory.copyFile: 8.029765s
>> via readFilePS >= writeFilePS: 14.187116s
>> Copy of 4 files:
>> via System.Directory.copyFile: 7.85273s
>> via readFilePS >= writeFilePS: 12.406907s
>>
>>
>> From the above, I conclude that System.Directory.copyFile is better at
>> actually copying the file data, and that the overhead of copying
>> permissions (visible from copying small files) is quite small (about 8
>> us/file).
>>
>>
>> --
>> Kevin Quick
>> quick at org after sparq