From: Sasha Levin <sasha.levin@oracle.com>
To: paulmck@linux.vnet.ibm.com
Cc: "David S. Miller" <davem@davemloft.net>,
courmisch@gmail.com, LKML <linux-kernel@vger.kernel.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
Dave Jones <davej@redhat.com>
Subject: Re: net, phonet, rcu: rcu hang within gprs_attach
Date: Fri, 25 Jul 2014 19:23:06 -0400
Message-ID: <53D2E6DA.9020306@oracle.com>
In-Reply-To: <20140725231903.GI11241@linux.vnet.ibm.com>

On 07/25/2014 07:19 PM, Paul E. McKenney wrote:
> On Thu, Jul 24, 2014 at 07:28:35PM -0400, Sasha Levin wrote:
>> > On 07/24/2014 06:54 PM, Paul E. McKenney wrote:
>>> > > On Thu, Jul 24, 2014 at 06:19:11PM -0400, Sasha Levin wrote:
>>>> > >> Hi all,
>>>> > >>
>>>> > >> While fuzzing with trinity inside a KVM tools guest running the latest -next
>>>> > >> kernel I've stumbled on the following stack trace (full log attached):
>>>> > >>
>>>> > >> [ 370.662014] INFO: task trinity-main:8727 blocked for more than 120 seconds.
>>>> > >> [ 370.662891] Not tainted 3.16.0-rc6-next-20140724-sasha-00046-g7324c87-dirty #932
>>>> > >> [ 370.663655] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> > >> [ 370.664562] trinity-main D ffff88053cc80000 13064 8727 8714 0x00000000
>>>> > >> [ 370.665328] ffff88053da6fc10 0000000000000002 ffff8805483e2dc8 ffff880541873000
>>>> > >> [ 370.666147] 000000276ed30787 ffff88053da6c010 ffff88053da6c000 ffff8805452a0000
>>>> > >> [ 370.667243] ffff880541873000 0000000000000000 7fffffffffffffff ffffffffb3ec51d8
>>>> > >> [ 370.668788] Call Trace:
>>>> > >> [ 370.669118] schedule (kernel/sched/core.c:2847)
>>>> > >> [ 370.670538] schedule_timeout (kernel/time/timer.c:1476)
>>>> > >> [ 370.671524] ? mark_lock (kernel/locking/lockdep.c:2894)
>>>> > >> [ 370.672299] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
>>>> > >> [ 370.673227] ? get_parent_ip (kernel/sched/core.c:2561)
>>>> > >> [ 370.674085] wait_for_completion (include/linux/spinlock.h:328 kernel/sched/completion.c:76 kernel/sched/completion.c:93 kernel/sched/completion.c:101 kernel/sched/completion.c:122)
>>>> > >> [ 370.674960] ? wake_up_state (kernel/sched/core.c:2942)
>>>> > >> [ 370.675576] _rcu_barrier (kernel/rcu/tree.c:3325 (discriminator 8))
>>>> > >> [ 370.676109] rcu_barrier (kernel/rcu/tree_plugin.h:920)
>>>> > >> [ 370.676627] netdev_run_todo (net/core/dev.c:6323)
>>>> > >> [ 370.677202] rtnl_unlock (net/core/rtnetlink.c:80)
>>>> > >> [ 370.677714] unregister_netdev (net/core/dev.c:6687)
>>>> > >> [ 370.678266] gprs_attach (net/phonet/pep-gprs.c:311)
>>>> > >> [ 370.679641] pep_setsockopt (net/phonet/pep.c:1016)
>>>> > >> [ 370.681082] sock_common_setsockopt (net/core/sock.c:2603)
>>>> > >> [ 370.682048] SyS_setsockopt (net/socket.c:1914 net/socket.c:1894)
>>>> > >> [ 370.682854] tracesys (arch/x86/kernel/entry_64.S:541)
>>>> > >> [ 370.683586] 1 lock held by trinity-main/8727:
>>>> > >> [ 370.684232] #0: (rcu_preempt_state.barrier_mutex){+.+...}, at: _rcu_barrier (kernel/rcu/tree.c:3233)
>>>> > >>
>>>> > >> This has reproduced a couple of times, and has always originated from gprs_attach. I don't see any obvious
>>>> > >> issues with the code there, so I'm not sure whether the fault lies in the phonet code or the RCU code.
>>> > >
>>> > > Can't tell much from this. Any chance of a .config?
>>> > >
>>> > > Thanx, Paul
>>> > >
>> >
>> > Attached.
> If you were doing partial nohz_full= CPUs, there is a recent RCU bug
> that would result in these symptoms. No idea how you would make it
> happen without specifying the nohz_full= boot parameter, but I should
> be getting the fix into -next in a few days.
>
> But you never know. So if you are interested in testing sooner, and if
> my local tests pass, I could send you a modified patch that applies on
> top of rcu/next. If you would like such a patch, let me know.

Sure, if you Cc me on it I'll be happy to test it out. Just don't go out
of your way: I've disabled phonet for now anyway, so this isn't really
delaying me.

Thanks,
Sasha
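
For anyone following the thread who wants to check whether the nohz_full=
condition Paul mentions applies to their own setup, here is a minimal
sketch. It only parses a kernel command line string for that parameter. The
function name nohz_full_value is hypothetical and not part of any kernel
tooling, and if the parameter appears more than once it simply returns the
first value, which is a simplification.

```python
from typing import Optional

PARAM = "nohz_full="


def nohz_full_value(cmdline: str) -> Optional[str]:
    """Return the value of the first nohz_full= token, or None if absent.

    `cmdline` is the kernel command line as a single string, for example
    the contents of /proc/cmdline on a running Linux system.
    """
    # Boot parameters are whitespace-separated tokens.
    for token in cmdline.split():
        if token.startswith(PARAM):
            return token[len(PARAM):]
    return None
```

On a running system the input would typically come from /proc/cmdline,
e.g. nohz_full_value(open("/proc/cmdline").read()). A return value of None
means the parameter was not given at boot, which is the case Paul says he
cannot explain.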