[darcs-users] [darcs-devel] buildbot failure in darcs on karel opensolaris OpenSolaris-2008.05 nevada-86-rc3 i386
Karel Gardas
kgardas at objectsecurity.com
Wed May 21 21:27:32 UTC 2008
zooko wrote:
> On May 21, 2008, at 2:46 PM, Karel Gardas wrote:
>
>> sorry, but due to this issue in VirtualBox software:
>> http://www.virtualbox.org/ticket/1562 -- I'm switching current
>> OpenSolaris buildbot off.
>
> Thanks, Karel!
>
> But I kind of wonder if this error:
>
> http://buildbot.darcs.net/builders/karel%20opensolaris%20OpenSolaris-2008.05%20nevada-86-rc3%20i386/builds/18/steps/test_configure/logs/stdio
>
>
> Doesn't show an unnecessary "fragility" of darcs, that it hangs if the
> underlying gettimeofday jumps during a get. Hm, in fact I know
> something about this -- darcs is using libcurl, right? Let me see if it
> shows in your configure output...
>
> Yes:
>
> checking for libcurl... 7.15.5
> checking for curl_global_init in -lcurl... yes
> checking for curl_multi_timeout in -lcurl... yes
>
> So, libcurl is probably the one having this problem. If your libcurl
> was configured with libevent, and if your libevent was compiled on your
> system and detected the CLOCK_MONOTONIC, then darcs would have worked
> normally on an HTTP GET even though the gettimeofday was wrong.
>
> I would be interested in pursuing this. For one thing, buggy
> VirtualBoxes aren't the only way that this kind of thing can happen --
> I've seen it before with issues involving NTP, system administrators
> changing the system clock, dual-booting with Windows, using SMP on older
> Linux kernels, and embedded systems that rely on the network to set
> their clock.
>
> I would like it if libcurl were configured so that HTTP GETs succeeded
> regardless of how the gettimeofday behaves.
Zooko,
it's interesting how you spotted the issue even without seeing the real
machine here. Well, you are right. As you know, we patched twisted so that
it does not time out for me, and so I saw a very long run (recorded in the
last build) of darcs running the network test. That alone would not have
bothered me, but then I saw that VBox had started to eat one of the cores.
I was curious about what was happening there and found darcs consuming
~90% of the CPU. So I used Solaris's pstack tool to print the stacks of the
running process. It was quite strange to see all the threads ending in
calls that should block and wait. So I ran pstack several times and then
saw that one thread was executing some call from curl and that the top
item on that thread's stack was a gettimeofday call.
So basically I think you are right. But this was enough for me with VBox,
and I decided to switch it off.
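To illustrate the kind of failure being discussed, here is a minimal sketch of a timeout loop whose deadline is computed from a clock. It is not libcurl's or darcs's code, only an assumption about the general pattern: FakeClocks, wait_with_deadline and run are hypothetical names, the clocks are simulated so the result is repeatable, and the max_polls cap exists only so that the demo terminates. In real code the two clocks would be something like time.time() (wall clock, may jump) and time.monotonic() (never goes backwards).

```python
class FakeClocks:
    """Simulated clocks: the wall clock can be made to jump, the monotonic one cannot."""

    def __init__(self):
        self.mono = 0.0                 # seconds since an arbitrary start
        self.wall_offset = 1_000_000.0  # pretend epoch offset of the wall clock

    def tick(self, seconds):
        self.mono += seconds

    def jump_wall(self, seconds):
        # Models NTP steps, an admin resetting the clock, or a buggy hypervisor.
        self.wall_offset += seconds

    def wall(self):
        return self.mono + self.wall_offset

    def monotonic(self):
        return self.mono


def wait_with_deadline(clock, timeout, poll, max_polls=1000):
    """Call poll() until `timeout` seconds have passed according to `clock`.

    Returns the number of polls performed. max_polls is a safety cap for the demo.
    """
    deadline = clock() + timeout
    polls = 0
    while clock() < deadline and polls < max_polls:
        poll()
        polls += 1
    return polls


def run(use_monotonic):
    """Wait 5 simulated seconds while the wall clock jumps back one hour."""
    clocks = FakeClocks()
    state = {"n": 0}

    def poll():
        state["n"] += 1
        clocks.tick(1.0)                # each poll takes one simulated second
        if state["n"] == 2:
            clocks.jump_wall(-3600.0)   # wall clock jumps back during the wait

    clock = clocks.monotonic if use_monotonic else clocks.wall
    return wait_with_deadline(clock, 5.0, poll)


if __name__ == "__main__":
    print("monotonic clock polls:", run(use_monotonic=True))
    print("wall clock polls:     ", run(use_monotonic=False))
```

With the monotonic clock the loop ends after the intended five polls. With the wall clock the backward jump pushes the deadline an hour away, so the loop keeps polling until it hits the safety cap, which is one way a jumping gettimeofday can turn a short wait into an apparent hang.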
Anyway, if you would like to test anything on this broken "box", just let
me know and I'll test it for you, but it's IMHO useless to keep it running
just to have darcs unable to finish its curl calls while consuming one
core. Also, let's see how Sun's engineers solve the issue for us...
Cheers,
Karel
--
Karel Gardas kgardas at objectsecurity.com
ObjectSecurity Ltd. http://www.objectsecurity.com