[darcs-devel] Repository.writePatch (issue80)

Jason Dagit dagit at eecs.oregonstate.edu
Mon Jan 16 08:09:23 PST 2006


On Jan 16, 2006, at 5:41 AM, Juliusz Chroboczek wrote:

>> Total time (wall clock)
>> orig: 5 hours +
>> no-reread: 6 minutes
>
>> Peak RES (as measured by top)
>> orig: 940MB
>> no-reread: 950MB
>
> Am I reading this correctly?  This unoptimisation makes Darcs 50 times
> faster on large records while not using significantly more memory?
>
> This looks like some serious problem with gzReadPatchFileLazily.  I'm
> tempted to apply the workaround unless someone can explain this  
> behaviour.

There is at least one problem with gzReadPatchFileLazily.  But the
problem I'm thinking of is well known: the format of hunks is not
good for parsing.  I suspect that there is also a space leak in
gzReadPatchFileLazily.  I suspect there are space leaks before we get
to that function as well, but I'm finding it nearly impossible to
reason about space leaks (how to spot them, how to treat them, how to
verify that they are a problem, etc.).
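
To illustrate what "space leak" means here, a minimal sketch follows.
It is not darcs code and does not reproduce whatever may be going on in
gzReadPatchFileLazily; the names leakySum and strictSum are made up for
the example.  It shows the textbook case, where a lazy accumulator can
pile up unevaluated thunks while a strict one stays in constant space.
Whether the lazy version leaks in practice depends on the compiler:
GHC's strictness analysis can remove this particular leak when
optimisation is on.

```haskell
import Data.List (foldl')

-- Hypothetical helpers for illustration only; not part of darcs.

-- Lazy left fold: the accumulator is not forced during the traversal,
-- so without optimisation a chain of pending (+) thunks can build up
-- and is only collapsed when the final result is demanded.
leakySum :: [Int] -> Int
leakySum = foldl (+) 0

-- Strict left fold: the accumulator is forced at every step, so the
-- running total is always a plain evaluated Int.
strictSum :: [Int] -> Int
strictSum = foldl' (+) 0

main :: IO ()
main = do
  let n = 1000000 :: Int
  -- Both calls compute the same value; they differ only in how much
  -- memory may be held while the list is being consumed.
  print (leakySum [1 .. n])
  print (strictSum [1 .. n])
```

Compiling without optimisation and running with +RTS -s is one way to
compare the peak memory of the two versions.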

As for applying the workaround, I've realized two problems with
doing that.  Say this 500MB hunk gets into a patch and darcs later
needs to read that hunk for some reason.  A record that took only 6
minutes is now going to take more than 5 hours to deal with.  Maybe
the hunk shouldn't get in there in the first place?  The other
problem is that we need to remember to revert this change when the
real problem is fixed, as this only treats the symptom.

Initially I thought applying this would be a very good idea, but now
I'm having doubts.  On the other hand, how long will it take to find
and fix the real problem?  Who knows.  I've looked at it some, but I'm
making little or no progress, and the troubleshooting feels like
voodoo so far (just making changes without understanding them).

Thanks,
Jason



