* Tasks stuck in futex code (in 3.14-rc6)
From: Srikar Dronamraju @ 2014-03-19 15:26 UTC (permalink / raw)
  To: davidlohr; +Cc: peterz, torvalds, LKML, paulus, tglx, linuxppc-dev, mingo

Hi, 

When running SPECjbb on a POWER7 NUMA box, I am seeing Java threads
getting stuck in the futex code:

#  ps -Ao pid,tt,user,fname,tmout,f,wchan | grep futex
 14808 pts/0    root     java         - 0 futex_wait_queue_me
 14925 pts/0    root     java         - 0 futex_wait_queue_me
#

In the stack traces, I see:
[ 1843.426591] Call Trace:
[ 1843.426595] [c0000017101d74d0] [0000000000000020] 0x20 (unreliable)
[ 1843.426601] [c0000017101d76a0] [c000000000014c50] .__switch_to+0x1e0/0x390
[ 1843.426607] [c0000017101d7750] [c0000000006ed314] .__schedule+0x364/0x8c0
[ 1843.426613] [c0000017101d79d0] [c000000000139c28] .futex_wait_queue_me+0xf8/0x1a0
[ 1843.426619] [c0000017101d7a60] [c00000000013afbc] .futex_wait+0x17c/0x2a0
[ 1843.426626] [c0000017101d7c10] [c00000000013cee4] .do_futex+0x254/0xd80
[ 1843.426631] [c0000017101d7d60] [c00000000013db2c] .SyS_futex+0x11c/0x1d0
[ 1843.426638] [c0000017101d7e30] [c000000000009efc] syscall_exit+0x0/0x7c
[ 1843.426643] java            S 00003fffa08b16a0     0 14812  14203 0x00000080
[ 1843.426650] Call Trace:
[ 1843.426653] [c00000170c6034d0] [c000001710b09cf8] 0xc000001710b09cf8 (unreliable)
[ 1843.426660] [c00000170c6036a0] [c000000000014c50] .__switch_to+0x1e0/0x390
[ 1843.426666] [c00000170c603750] [c0000000006ed314] .__schedule+0x364/0x8c0
[ 1843.426672] [c00000170c6039d0] [c000000000139c28] .futex_wait_queue_me+0xf8/0x1a0
[ 1843.426679] [c00000170c603a60] [c00000000013afbc] .futex_wait+0x17c/0x2a0
[ 1843.453383] [c00000170c603c10] [c00000000013cee4] .do_futex+0x254/0xd80
[ 1843.453389] [c00000170c603d60] [c00000000013db2c] .SyS_futex+0x11c/0x1d0
[ 1843.453395] [c00000170c603e30] [c000000000009efc] syscall_exit+0x0/0x7c
[ 1843.453400] java            S 00003fffa08b1a74     0 14813  14203 0x00000080
[ 1843.453407] Call Trace:

There are 332 tasks all stuck in futex_wait_queue_me().
I am able to reproduce this consistently.
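
For anyone wanting to repeat the count, here is a minimal sketch that
tallies tasks by wait channel. The helper name count_wchan is
hypothetical (not part of any tool); it assumes ps output in the same
column layout as above, with wchan as the last field.

```shell
# Hypothetical helper: count tasks whose wait channel (last column)
# exactly matches the name given as $1.
# Reads `ps -Ao pid,tt,user,fname,tmout,f,wchan` style output on stdin.
count_wchan() {
    # n + 0 forces a numeric 0 when no line matched
    awk -v w="$1" '$NF == w { n++ } END { print n + 0 }'
}

# Usage on a live system:
#   ps -Ao pid,tt,user,fname,tmout,f,wchan | count_wchan futex_wait_queue_me
```

Matching on the last field rather than grepping the whole line avoids
counting a task merely because "futex" appears in its command name.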

In fact, I can reproduce this if the java_constraint is node, socket, or system.
However, I am not able to reproduce it if java_constraint is set to core.

I ran git bisect between v3.12 and v3.14-rc6 and found that

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b0c29f79ecea0b6fbcefc999e70f2843ae8306db

commit b0c29f79ecea0b6fbcefc999e70f2843ae8306db
Author: Davidlohr Bueso <davidlohr@hp.com>
Date:   Sun Jan 12 15:31:25 2014 -0800

futexes: Avoid taking the hb->lock if there's nothing to wake up

was the commit that's causing the threads to get stuck in futex.

I reverted b0c29f79ecea0b6fbcefc999e70f2843ae8306db on top of v3.14-rc6 and confirmed that
reverting the commit solved the problem.

-- 
Thanks and Regards
Srikar Dronamraju


