[darcs-users] test_network failing with deadlock

Karel Gardas kgardas at objectsecurity.com
Fri Aug 15 07:39:49 UTC 2008


Hello,

A few days ago Zooko asked for the curl versions on my Solaris and OpenBSD
buildbots. I replied that those are not the only ones affected by the issue
in the test_network test; the Linux buildbots are affected as well. Today I
see that heffalump (Linux, Debian-lenny(ish)) also has test_network running
for a very long time.

Anyway, at least on my Solaris machine, where I can use the pstack utility
to get the stacks of all threads of a running process, the state of a darcs
process that runs test_network indefinitely looks like this:

24154:  /buildbot/workspace/karel solaris/build/tests-network.dir/../darcs get
-----------------  lwp# 1 / thread# 1  --------------------
 fee5f559 lwp_park (0, 0, 0)
 fee59656 cond_wait_queue (85a426c, 85a427c, 0, 0) + 41
 fee59b3a _cond_wait (85a426c, 85a427c) + 69
 fee59b78 cond_wait (85a426c, 85a427c) + 24
 fee59bb2 pthread_cond_wait (85a426c, 85a427c, 85a427c, 0, fe772000, fe912a00) + 1e
 084fc3e0 waitCondition (0, 0, fe772000, 0, 0, 10300) + 10
 08598860 MainCapability ()
-----------------  lwp# 2 / thread# 2  --------------------
 fee635a5 pollsys  (fe6fbe80, 1, fe6fbf18, 0)
 fee1a16e pselect  (4, fe77e018, fe77e0a0, feef8608, fe6fbf18, 0) + 19e
 fee1a47e select   (4, fe77e018, fe77e0a0, 0, fe77e128) + 7e
 0849abce ???????? (fe7862a6, 4, fe77e0a0, fe77e018, fe7862b8, 849b1f8)
 fe786316 ???????? (75fffe83, 45c722, 849aa9c, e9f4c783, 21b80, c7043c7)
 14775c7b ???????? ()
-----------------  lwp# 3 / thread# 3  --------------------
 fee5f559 lwp_park (0, 0, 0)
 fee59656 cond_wait_queue (85a5df4, 85a5e04, 0, 0) + 41
 fee59b3a _cond_wait (85a5df4, 85a5e04) + 69
 fee59b78 cond_wait (85a5df4, 85a5e04) + 24
 fee59bb2 pthread_cond_wait (85a5df4, 85a5e04, 0, 8598953, 85bcf30, fe8c0a00) + 1e
 084fc3e0 waitCondition (0, 0, 0, 0, 0, 10300) + 10
 08598860 MainCapability ()
-----------------  lwp# 5 / thread# 5  --------------------
 fee5f559 lwp_park (0, 0, 0)
 fee59656 cond_wait_queue (85bcf4c, 85bcf5c, 0, 0) + 41
 fee59b3a _cond_wait (85bcf4c, 85bcf5c) + 69
 fee59b78 cond_wait (85bcf4c, 85bcf5c) + 24
 fee59bb2 pthread_cond_wait (85bcf4c, 85bcf5c, 0, 0, 0, fe8c1a00) + 1e
 084fc3e0 waitCondition (0, 0, 0, 0, 0, 10300) + 10
 08598860 MainCapability ()


When sampled over time, the stack trace differs in only one line:

--- /tmp/s1.txt Fri Aug 15 08:47:13 2008
+++ /tmp/s4.txt Fri Aug 15 09:38:02 2008
@@ -11,7 +11,7 @@
  fee635a5 pollsys  (fe6fbe80, 1, fe6fbf18, 0)
  fee1a16e pselect  (4, fe77e018, fe77e0a0, feef8608, fe6fbf18, 0) + 19e
  fee1a47e select   (4, fe77e018, fe77e0a0, 0, fe77e128) + 7e
- 0849abce ???????? (fe7862a6, 4, fe77e0a0, fe77e018, fe7862b8, 849b1f8)
+ 0849abce ???????? (fe78640e, 4, fe77e0a0, fe77e018, fe786420, 849b1f8)
  fe786316 ???????? (75fffe83, 45c722, 849aa9c, e9f4c783, 21b80, c7043c7)
  14775c7b ???????? ()
 -----------------  lwp# 3 / thread# 3  --------------------


But even this change is not common: for example, over the last hour the
stack trace has stayed the same.
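
For anyone who wants to repeat this comparison on another buildbot, here is
a minimal sketch in Python. It assumes two pstack outputs have already been
saved as text. The function names and the default file names are only
illustrative, and it does not run pstack itself.

```python
import difflib


def diff_snapshots(old_text, new_text, old_name="s1.txt", new_name="s4.txt"):
    """Return the unified-diff lines between two saved stack snapshots."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",
    ))


def changed_frames(old_text, new_text):
    """Return only the removed/added lines, without the diff headers."""
    # The first two lines of a non-empty unified diff are the ---/+++ file
    # headers; hunk headers start with "@@" and context lines with a space.
    body = diff_snapshots(old_text, new_text)[2:]
    return [line for line in body if line.startswith(("+", "-"))]


if __name__ == "__main__":
    # Two shortened frames taken from the traces above.
    old = (" 0849abce ???????? (fe7862a6, 4)\n"
           " fe786316 ???????? (75fffe83, 45c722)")
    new = (" 0849abce ???????? (fe78640e, 4)\n"
           " fe786316 ???????? (75fffe83, 45c722)")
    for line in changed_frames(old, new):
        print(line)
```

An empty result means the two snapshots are identical, which is what I see
for most of the samples.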

Any idea what's going wrong here? My bet is still that something is going
wrong either in curl or in the darcs/curl interaction...

Thanks,
Karel
-- 
Karel Gardas                  kgardas at objectsecurity.com
ObjectSecurity Ltd.           http://www.objectsecurity.com

