[darcs-users] darcs patch: use mmap for readFileLinesPSetc and fix mmap signature

Gwern Branwen gwern0 at gmail.com
Mon Apr 28 17:04:48 UTC 2008


On 2008.04.27 22:49:53 -0700, Jason Dagit <dagit at codersbase.com> scribbled 12K characters:
> Please do not apply this patch to darcs, I'm sending it in for testing
> purposes.
>
> Gwern, does this work on your 64bit machine or files over the 4GB
> limit?
>
> Thanks,
> Jason
>
> Sun Apr 27 19:49:21 PDT 2008  Jason Dagit <dagit at codersbase.com>
>   * use mmap for readFileLinesPSetc and fix mmap signature

[ghc] src/Exec.o
[ghc] src/FastPackedString.o
[ghc] src/OldFastPackedString.o

src/FastPackedString.hs:596:7:
    Not in scope: type constructor or class `CSize'

You forgot the import I told you about:

hunk ./src/FastPackedString.hs 104
-import Foreign.C.Types ( CInt, )
+import Foreign.C.Types ( CInt, CSize )
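
A minimal sketch of why a length typed as CSize rather than CInt matters for files this big. It assumes the signature fix is about the length argument, which the patch description does not spell out; asCInt and asCSize are hypothetical helper names, not anything in FastPackedString.hs.

```haskell
import Foreign.C.Types (CInt, CSize)

-- Hypothetical helpers: convert a file length to each C type.
asCInt :: Integer -> CInt
asCInt = fromIntegral

asCSize :: Integer -> CSize
asCSize = fromIntegral

main :: IO ()
main = do
  let len = 4079869184 :: Integer  -- size of one of the test files below
  -- CInt is a signed 32-bit type under GHC, so this length wraps negative.
  print (asCInt len)
  -- CSize is unsigned, so the same length survives the conversion.
  print (asCSize len)
  print (toInteger (maxBound :: CInt), toInteger (maxBound :: CSize))
```

The last line prints the platform's actual limits, so the difference between a 32-bit and a 64-bit build is visible directly.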

---

Now, that aside, there are some interesting performance characteristics:

gwern at localhost:2581~/foo>time whatsnew -s [11:51AM]
A ./bigtempfile
darcs whatsnew -s  0.99s user 0.84s system 99% cpu 1.834 total
gwern at localhost:2581~/foo>duh bigtempfile [11:52AM]
2.9G    bigtempfile
2.9G    total
gwern at localhost:2582~/foo>rm bigtempfile [11:52AM]
gwern at localhost:2583~/foo>time head -c 4079869184 /dev/zero > bigtempfile [11:52AM]
head -c 4079869184 /dev/zero > bigtempfile  0.57s user 15.19s system 11% cpu 2:15.27 total
gwern at localhost:2584~/foo>time whatsnew -s [11:54AM]
A ./bigtempfile
darcs whatsnew -s  1.74s user 3.43s system 3% cpu 2:13.51 total
gwern at localhost:2586~/foo>time whatsnew -s [11:58AM]
A ./bigtempfile
darcs whatsnew -s  1.84s user 3.48s system 3% cpu 2:46.19 total
gwern at localhost:2589~/foo>time whatsnew -s [12:02PM]
A ./bigtempfile
darcs whatsnew -s  1.90s user 3.12s system 4% cpu 2:01.31 total
gwern at localhost:2590~/foo>rm bigtempfile [12:05PM]
gwern at localhost:2591~/foo>time head -c 3079869184 /dev/zero > bigtempfile && time whatsnew -s                                       [12:05PM]
head -c 3079869184 /dev/zero > bigtempfile  0.42s user 11.21s system 11% cpu 1:37.89 total
A ./bigtempfile
darcs whatsnew -s  0.99s user 0.80s system 31% cpu 5.596 total
gwern at localhost:2592~/foo>rm bigtempfile&& time head -c 5079869184 /dev/zero > bigtempfile && time whatsnew -s                      [12:07PM]
head -c 5079869184 /dev/zero > bigtempfile  0.59s user 19.83s system 16% cpu 2:07.12 total
A ./bigtempfile
darcs whatsnew -s  2.12s user 4.37s system 2% cpu 4:11.74 total
gwern at localhost:2593~/foo>rm bigtempfile&& time head -c 6079869184 /dev/zero > bigtempfile && time whatsnew -s                      [12:15PM]
head -c 6079869184 /dev/zero > bigtempfile  0.78s user 22.74s system 12% cpu 3:08.20 total
A ./bigtempfile
darcs whatsnew -s  2.33s user 5.08s system 2% cpu 5:01.04 total
gwern at localhost:2596~/foo>rm bigtempfile&& time head -c 1079869184 /dev/zero > bigtempfile && time whatsnew -s                      [12:23PM]
head -c 1079869184 /dev/zero > bigtempfile  0.14s user 3.33s system 13% cpu 26.457 total
A ./bigtempfile
darcs whatsnew -s  0.28s user 0.44s system 7% cpu 9.879 total

In case you don't feel like reading through the whole log, the summary is this: a 1.1G file takes about 10 seconds, and a 5.7G file takes about 300 seconds. So a roughly 6x increase in file size gives a 30x increase in time (and memory usage is probably just as bad or even worse, though I didn't track it).
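
The ratios can be checked against the raw numbers in the log (byte counts passed to head -c, wall-clock totals from time). This is only the arithmetic; ratio is a hypothetical helper name.

```haskell
import Text.Printf (printf)

-- How many times larger the second value is than the first.
ratio :: Double -> Double -> Double
ratio small large = large / small

main :: IO ()
main = do
  let sizeRatio = ratio 1079869184 6079869184  -- bytes, from the head -c arguments
      timeRatio = ratio 9.879 301.04           -- seconds: 9.879 and 5:01.04
  printf "size grew %.2fx, time grew %.2fx\n" sizeRatio timeRatio
  -- prints: size grew 5.63x, time grew 30.47x
```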

Or here's another example:

gwern at localhost:2601~/foo>rm bigtempfile&& time head -c 079869184 /dev/zero > bigtempfile && time whatsnew -s && duh bigtempfile; rm bigtempfile&& time head -c 979869184 /dev/zero > bigtempfile && time whatsnew -s && duh bigtempfile; rm bigtempfile&& time head -c 1079869184 /dev/zero > bigtempfile && time whatsnew -s && duh bigtempfile; rm bigtempfile&& time head -c 2079869184 /dev/zero > bigtempfile && time whatsnew -s && duh bigtempfile; rm bigtempfile&& time head -c 3079869184 /dev/zero > bigtempfile && time whatsnew -s && duh bigtempfile; rm bigtempfile&& time head -c 4079869184 /dev/zero > bigtempfile && time whatsnew -s && duh bigtempfile
head -c 079869184 /dev/zero > bigtempfile  0.01s user 0.24s system 90% cpu 0.277 total
A ./bigtempfile
darcs whatsnew -s  0.07s user 0.03s system 5% cpu 1.702 total
77M   bigtempfile
77M   total
head -c 979869184 /dev/zero > bigtempfile  0.17s user 3.19s system 25% cpu 13.127 total
A ./bigtempfile
darcs whatsnew -s  0.36s user 0.25s system 99% cpu 0.610 total
936M  bigtempfile
936M  total
head -c 1079869184 /dev/zero > bigtempfile  0.12s user 3.40s system 14% cpu 25.045 total
A ./bigtempfile
darcs whatsnew -s  0.43s user 0.27s system 99% cpu 0.702 total
1.1G  bigtempfile
1.1G  total
head -c 2079869184 /dev/zero > bigtempfile  0.31s user 7.50s system 12% cpu 1:01.27 total
A ./bigtempfile
darcs whatsnew -s  0.69s user 0.50s system 99% cpu 1.192 total
2.0G  bigtempfile
2.0G  total
head -c 3079869184 /dev/zero > bigtempfile  0.41s user 11.40s system 12% cpu 1:34.22 total
A ./bigtempfile
darcs whatsnew -s  1.07s user 0.71s system 99% cpu 1.781 total
2.9G  bigtempfile
2.9G  total
head -c 4079869184 /dev/zero > bigtempfile  0.57s user 15.20s system 11% cpu 2:14.94 total
A ./bigtempfile
darcs whatsnew -s  1.72s user 3.42s system 2% cpu 2:56.71 total
3.9G  bigtempfile
3.9G  total

Notice how it blows up between 2.9G and 3.9G: a single extra gigabyte, and our whatsnew -s time goes from 1.8 seconds to about 177 seconds (2:56.71).

I tried it with profiling, and it claims the majority of the time is being used by linesPS. get_unrecorded calls smart_diff which calls gen_diff, which begat diff_files, and in the fullness of time, diff_files summoned get_text, which duly appointed linesPS to eat 100% of CPU time... So it's not clear to me why >3 gig files blow up.
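
One property of the test file is worth keeping in mind here: it is all zero bytes from /dev/zero, so it contains no newlines. A minimal sketch of a newline splitter, assuming linesPS behaves broadly like one (this uses Data.ByteString.Char8 rather than darcs's FastPackedString, and linesSketch is a hypothetical name), shows that such input comes back as a single line after a scan over every byte. It illustrates the access pattern only and is not a diagnosis of the slowdown.

```haskell
import qualified Data.ByteString.Char8 as B

-- Split on '\n'. Each search walks forward until it finds a newline,
-- so input with no newline at all is scanned end to end in one pass.
linesSketch :: B.ByteString -> [B.ByteString]
linesSketch s
  | B.null s  = []
  | otherwise = case B.elemIndex '\n' s of
      Nothing -> [s]
      Just i  -> B.take i s : linesSketch (B.drop (i + 1) s)

main :: IO ()
main = do
  -- A small stand-in for bigtempfile: zero bytes only, no newlines.
  let zeros = B.replicate 1000000 '\0'
  print (length (linesSketch zeros))
  print (linesSketch (B.pack "one\ntwo\nthree"))
```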

---

Testing-wise, this patch looks good to me - I didn't see any segfaults on large files like with my own attempts, and make test is clean as usual with the exception of either_dependency.sh and whatsnew.pl. I think either_dependency.sh usually fails, but I don't know about whatsnew.pl.

---

Anyway, if the whatsnew.pl issue gets cleared up and a cleaner patch written (I think usage of mmap is supposed to be conditional on Autoconf.lhs), I think this change would be worth applying. It's an enormous speedup on small to large files, and it isn't *that* much worse for >3gig files (which folks on 32-bit systems can't even use in the first place, as I understand it).

Future work might be to change FPS.hs's 'mmap' call to just call readFilePS on too-large files (given that it already does something other than mmap on too-small files).
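
A rough sketch of that fallback, under stated assumptions: the two readers are passed in as arguments because the real readFilePS and mmap wrappers live in FastPackedString.hs and are not reproduced here, and both thresholds are made-up illustrative values, not numbers taken from darcs.

```haskell
import System.IO (IOMode (ReadMode), hFileSize, withFile)

data ReadStrategy = PlainRead | MmapRead
  deriving (Eq, Show)

-- Hypothetical cutoffs, chosen only for illustration.
minMmapBytes, maxMmapBytes :: Integer
minMmapBytes = 4096
maxMmapBytes = 3 * 1024 * 1024 * 1024

-- Use mmap only for files between the two cutoffs.
chooseStrategy :: Integer -> ReadStrategy
chooseStrategy size
  | size < minMmapBytes = PlainRead
  | size > maxMmapBytes = PlainRead
  | otherwise           = MmapRead

-- Dispatch to whichever reader the file size calls for.
readFileWith :: (FilePath -> IO a) -> (FilePath -> IO a) -> FilePath -> IO a
readFileWith plainRead mmapRead path = do
  size <- withFile path ReadMode hFileSize
  case chooseStrategy size of
    PlainRead -> plainRead path
    MmapRead  -> mmapRead path

main :: IO ()
main = mapM_ (print . chooseStrategy) [100, 979869184, 4079869184]
```

Keeping the size decision in a pure function makes the cutoffs easy to test and to tune once real measurements on more machines are in.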

--
gwern
Terrorism CMS 1080H Choe Firewalls Lander 669 Zen HF STEP