[darcs-users] darcs patch: make reading the pending lazy in summary mode

Gwern Branwen gwern0 at gmail.com
Mon Apr 28 15:39:00 UTC 2008


On 2008.04.28 06:56:28 -0700, David Roundy <droundy at darcs.net> scribbled 1.5K characters:
> On Sun, Apr 27, 2008 at 07:41:44PM -0700, Jason Dagit wrote:
> > On Sat, Apr 26, 2008 at 4:22 AM, David Roundy <droundy at darcs.net> wrote:
> > >  Gwern's idea of making is_funky faster is always good (since it speeds
> > > up many darcs commands, if only a little), but I don't think it touches
> > > the real problem, which is that we shouldn't be opening these files at
> > > all.
> >
> > Right, and actually I've been working with Gwern over IRC about optimizing
> > whatsnew and specifically the binary detection.  He's noticed that co_slurpy
> > is strict in the IO which is interesting.  I think he's looking at getting
> > mmap working for address spaces that are greater than 32 bits.  If you look
> > in fpstring.c you'll see that my_mmap takes an int where it should work with
> > size_t, but I don't think that's the only problem.  I think some of the FFI
> > stuff needs changing too.  By the way, what does the "co" in co_slurpy mean?
>
> The "co" in co_slurpy means that it's slurping "along with" another slurpy,
> rather than grabbing everything that is available.  This is an optimization
> for the case where you have many unrecorded files.  A friend used to have
> whatsnew take several minutes to run when he had no changes, simply because
> darcs had to sort through all the directory listings in his working
> directory (since it was using slurp).
>
> I'm not sure what you mean by co_slurpy being strict.  It looks to me like
> it's got adequate unsafeInterleaveIO to make it lazy.
> --
> David Roundy

Well, it does have plenty of unsafeInterleaveIO, that is true, but the issue here is readFilePS. readFilePS is completely strict: it reads the entire file into memory (per its docs and implementation). So the call to readFilePS may get delayed to the last second, but once its result is inspected, it will immediately do its best to suck in all 9 gigs or whatever the file size is.

This is why replacing readFilePS in co_slurp_helper with mmapFilePS is such a time saver: mmapFilePS is lazy. It maps all 9 gigs at once without actually reading them, and since with -s we ultimately only look at the first 4096 bytes, only a little bit will ever actually get page-faulted into memory.
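
To make the page-fault point concrete, here is a minimal sketch in plain C rather than the Haskell of darcs itself. The function name prefix_has_nul is hypothetical, and the NUL-byte test is only a stand-in for whatever is_funky really checks. The sketch maps the whole file but touches only the first 4096 bytes, so the kernel should only need to fault in the first page or so regardless of file size.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not darcs code: map an entire file, but inspect
 * only its first 4096 bytes.
 * Returns 1 if a NUL byte occurs in that prefix, 0 if not, -1 on error. */
int prefix_has_nul(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap rejects a zero-length mapping */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    /* Mapping reserves address space; no file data is read yet. */
    unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    /* Only these accesses cause pages to be faulted in from disk. */
    size_t limit = len < 4096 ? len : 4096;
    int found = 0;
    for (size_t i = 0; i < limit; i++) {
        if (p[i] == 0) {
            found = 1;
            break;
        }
    }

    munmap(p, len);
    return found;
}
```

A strict read of the same file would have to copy every byte into memory before the loop could even start, which is the cost readFilePS pays.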

(The problem with mmapFilePS is that, as lispy mentions, on my 64-bit system it cannot handle files larger than 3 gigs, while readFilePS scaled up to at least 9 gigs, albeit slowly.)
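
As a rough illustration of the int versus size_t point raised about my_mmap, here is a small C sketch. The helper names are hypothetical and nothing here is taken from fpstring.c. It only checks whether a given length is representable in each type, and it assumes the usual 32-bit int. It does not account for the exact 3 gig threshold I see, which fits the suspicion that the int parameter is not the only problem.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, not taken from fpstring.c. */

/* Can this length be passed through an int parameter without loss? */
int fits_in_int(uint64_t len)
{
    return len <= (uint64_t)INT_MAX;
}

/* Can this length be passed through a size_t parameter without loss? */
int fits_in_size_t(uint64_t len)
{
    return len <= (uint64_t)SIZE_MAX;
}
```

With a 32-bit int, fits_in_int rejects anything above 2 gigs (INT_MAX bytes), while on a typical 64-bit system fits_in_size_t accepts a 9 gig length without trouble.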

--
gwern
Terrorism CMS 1080H Choe Firewalls Lander 669 Zen HF STEP