From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
To: davidlohr@hp.com
Cc: peterz@infradead.org, torvalds@linux-foundation.org,
	LKML <linux-kernel@vger.kernel.org>,
	paulus@samba.org, tglx@linutronix.de,
	linuxppc-dev@lists.ozlabs.org, mingo@kernel.org
Subject: Tasks stuck in futex code (in 3.14-rc6)
Date: Wed, 19 Mar 2014 20:56:19 +0530
Message-ID: <20140319152619.GB10406@linux.vnet.ibm.com>

Hi, 

When running specjbb on a power7 numa box, I am seeing java threads
getting stuck in the futex code:

#  ps -Ao pid,tt,user,fname,tmout,f,wchan | grep futex
 14808 pts/0    root     java         - 0 futex_wait_queue_me
 14925 pts/0    root     java         - 0 futex_wait_queue_me
#

In the stack traces, I see:
[ 1843.426591] Call Trace:
[ 1843.426595] [c0000017101d74d0] [0000000000000020] 0x20 (unreliable)
[ 1843.426601] [c0000017101d76a0] [c000000000014c50] .__switch_to+0x1e0/0x390
[ 1843.426607] [c0000017101d7750] [c0000000006ed314] .__schedule+0x364/0x8c0
[ 1843.426613] [c0000017101d79d0] [c000000000139c28] .futex_wait_queue_me+0xf8/0x1a0
[ 1843.426619] [c0000017101d7a60] [c00000000013afbc] .futex_wait+0x17c/0x2a0
[ 1843.426626] [c0000017101d7c10] [c00000000013cee4] .do_futex+0x254/0xd80
[ 1843.426631] [c0000017101d7d60] [c00000000013db2c] .SyS_futex+0x11c/0x1d0
[ 1843.426638] [c0000017101d7e30] [c000000000009efc] syscall_exit+0x0/0x7c
[ 1843.426643] java            S 00003fffa08b16a0     0 14812  14203 0x00000080
[ 1843.426650] Call Trace:
[ 1843.426653] [c00000170c6034d0] [c000001710b09cf8] 0xc000001710b09cf8 (unreliable)
[ 1843.426660] [c00000170c6036a0] [c000000000014c50] .__switch_to+0x1e0/0x390
[ 1843.426666] [c00000170c603750] [c0000000006ed314] .__schedule+0x364/0x8c0
[ 1843.426672] [c00000170c6039d0] [c000000000139c28] .futex_wait_queue_me+0xf8/0x1a0
[ 1843.426679] [c00000170c603a60] [c00000000013afbc] .futex_wait+0x17c/0x2a0
[ 1843.453383] [c00000170c603c10] [c00000000013cee4] .do_futex+0x254/0xd80
[ 1843.453389] [c00000170c603d60] [c00000000013db2c] .SyS_futex+0x11c/0x1d0
[ 1843.453395] [c00000170c603e30] [c000000000009efc] syscall_exit+0x0/0x7c
[ 1843.453400] java            S 00003fffa08b1a74     0 14813  14203 0x00000080
[ 1843.453407] Call Trace:

There are 332 tasks all stuck in futex_wait_queue_me().
I am able to reproduce this consistently.
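
For anyone wanting to confirm a similar count on their own box, here is a
minimal sketch (not part of the original report) that tallies tasks per wait
channel from ps output. It assumes WCHAN is the last column, as in the ps
command above; the helper name count_by_wchan and the header row in the
sample are illustrative assumptions.

```python
from collections import Counter


def count_by_wchan(ps_output: str) -> Counter:
    """Count tasks per wait channel, taking WCHAN as the last column."""
    counts = Counter()
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row printed by ps.
        if not fields or fields[-1].upper() == "WCHAN":
            continue
        counts[fields[-1]] += 1
    return counts


# Sample rows copied from the report; the header line is an assumption.
sample = """\
  PID TT       USER     COMMAND  TMOUT F WCHAN
14808 pts/0    root     java         - 0 futex_wait_queue_me
14925 pts/0    root     java         - 0 futex_wait_queue_me
"""

if __name__ == "__main__":
    for wchan, n in count_by_wchan(sample).most_common():
        print(n, wchan)
```

On a live system the same function could be fed the captured output of
"ps -Ao pid,tt,user,fname,tmout,f,wchan" instead of the sample string.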

In fact, I can reproduce this if the java_constraint is node, socket, or system.
However, I am not able to reproduce it if java_constraint is set to core.

I ran git bisect between v3.12 and v3.14-rc6 and found that

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b0c29f79ecea0b6fbcefc999e70f2843ae8306db

commit b0c29f79ecea0b6fbcefc999e70f2843ae8306db
Author: Davidlohr Bueso <davidlohr@hp.com>
Date:   Sun Jan 12 15:31:25 2014 -0800

futexes: Avoid taking the hb->lock if there's nothing to wake up

was the commit that is causing the threads to be stuck in futex.

I reverted b0c29f79ecea0b6fbcefc999e70f2843ae8306db on top of v3.14-rc6 and confirmed that
reverting the commit solved the problem.

-- 
Thanks and Regards
Srikar Dronamraju
