From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Ding Tianhong <dingtianhong@huawei.com>
Cc: gregkh@linuxfoundation.org, dhaval.giani@oracle.com,
stable@vger.kernel.org, stable-commits@vger.kernel.org
Subject: Re: Patch "rcu: Fix soft lockup for rcu_nocb_kthread" has been added to the 4.8-stable tree
Date: Sun, 4 Dec 2016 18:06:47 -0800
Message-ID: <20161205020647.GX3924@linux.vnet.ibm.com>
In-Reply-To: <32e0a260-b726-08b3-9124-99bdc1045278@huawei.com>

Sorry, Ding, but this patch has been shown to fix other people's problems
as well as a few of yours. The fact that it doesn't solve -all- your
problems is absolutely no reason for you to stand in the way of its
solving these other people's problems.

Greg, please do add this patch to your -stable trees.

Thanx, Paul

On Mon, Dec 05, 2016 at 09:30:22AM +0800, Ding Tianhong wrote:
> Hi Greg:
>
> Please don't add this patch to the stable tree. I am still discussing this problem with Paul, and the original solution may not be
> the best one. I need more time to check, thanks.
>
> Ding
>
> On 2016/12/3 16:53, gregkh@linuxfoundation.org wrote:
> >
> > This is a note to let you know that I've just added the patch titled
> >
> > rcu: Fix soft lockup for rcu_nocb_kthread
> >
> > to the 4.8-stable tree which can be found at:
> > http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
> >
> > The filename of the patch is:
> > rcu-fix-soft-lockup-for-rcu_nocb_kthread.patch
> > and it can be found in the queue-4.8 subdirectory.
> >
> > If you, or anyone else, feels it should not be added to the stable tree,
> > please let <stable@vger.kernel.org> know about it.
> >
> >
> > From bedc1969150d480c462cdac320fa944b694a7162 Mon Sep 17 00:00:00 2001
> > From: Ding Tianhong <dingtianhong@huawei.com>
> > Date: Wed, 15 Jun 2016 15:27:36 +0800
> > Subject: rcu: Fix soft lockup for rcu_nocb_kthread
> >
> > From: Ding Tianhong <dingtianhong@huawei.com>
> >
> > commit bedc1969150d480c462cdac320fa944b694a7162 upstream.
> >
> > Carrying out the following steps results in a softlockup in the
> > RCU callback-offload (rcuo) kthreads:
> >
> > 1. Connect to ixgbevf, and set the speed to 10Gb/s.
> > 2. Use ifconfig to bring the nic up and down repeatedly.
> >
> > [ 317.005148] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
> > [ 368.106005] BUG: soft lockup - CPU#1 stuck for 22s! [rcuos/1:15]
> > [ 368.106005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> > [ 368.106005] task: ffff88057dd8a220 ti: ffff88057dd9c000 task.ti: ffff88057dd9c000
> > [ 368.106005] RIP: 0010:[<ffffffff81579e04>] [<ffffffff81579e04>] fib_table_lookup+0x14/0x390
> > [ 368.106005] RSP: 0018:ffff88061fc83ce8 EFLAGS: 00000286
> > [ 368.106005] RAX: 0000000000000001 RBX: 00000000020155c0 RCX: 0000000000000001
> > [ 368.106005] RDX: ffff88061fc83d50 RSI: ffff88061fc83d70 RDI: ffff880036d11a00
> > [ 368.106005] RBP: ffff88061fc83d08 R08: 0000000000000001 R09: 0000000000000000
> > [ 368.106005] R10: ffff880036d11a00 R11: ffffffff819e0900 R12: ffff88061fc83c58
> > [ 368.106005] R13: ffffffff816154dd R14: ffff88061fc83d08 R15: 00000000020155c0
> > [ 368.106005] FS: 0000000000000000(0000) GS:ffff88061fc80000(0000) knlGS:0000000000000000
> > [ 368.106005] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 368.106005] CR2: 00007f8c2aee9c40 CR3: 000000057b222000 CR4: 00000000000407e0
> > [ 368.106005] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 368.106005] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > [ 368.106005] Stack:
> > [ 368.106005] 00000000010000c0 ffff88057b766000 ffff8802e380b000 ffff88057af03e00
> > [ 368.106005] ffff88061fc83dc0 ffffffff815349a6 ffff88061fc83d40 ffffffff814ee146
> > [ 368.106005] ffff8802e380af00 00000000e380af00 ffffffff819e0900 020155c0010000c0
> > [ 368.106005] Call Trace:
> > [ 368.106005] <IRQ>
> > [ 368.106005]
> > [ 368.106005] [<ffffffff815349a6>] ip_route_input_noref+0x516/0xbd0
> > [ 368.106005] [<ffffffff814ee146>] ? skb_release_data+0xd6/0x110
> > [ 368.106005] [<ffffffff814ee20a>] ? kfree_skb+0x3a/0xa0
> > [ 368.106005] [<ffffffff8153698f>] ip_rcv_finish+0x29f/0x350
> > [ 368.106005] [<ffffffff81537034>] ip_rcv+0x234/0x380
> > [ 368.106005] [<ffffffff814fd656>] __netif_receive_skb_core+0x676/0x870
> > [ 368.106005] [<ffffffff814fd868>] __netif_receive_skb+0x18/0x60
> > [ 368.106005] [<ffffffff814fe4de>] process_backlog+0xae/0x180
> > [ 368.106005] [<ffffffff814fdcb2>] net_rx_action+0x152/0x240
> > [ 368.106005] [<ffffffff81077b3f>] __do_softirq+0xef/0x280
> > [ 368.106005] [<ffffffff8161619c>] call_softirq+0x1c/0x30
> > [ 368.106005] <EOI>
> > [ 368.106005]
> > [ 368.106005] [<ffffffff81015d95>] do_softirq+0x65/0xa0
> > [ 368.106005] [<ffffffff81077174>] local_bh_enable+0x94/0xa0
> > [ 368.106005] [<ffffffff81114922>] rcu_nocb_kthread+0x232/0x370
> > [ 368.106005] [<ffffffff81098250>] ? wake_up_bit+0x30/0x30
> > [ 368.106005] [<ffffffff811146f0>] ? rcu_start_gp+0x40/0x40
> > [ 368.106005] [<ffffffff8109728f>] kthread+0xcf/0xe0
> > [ 368.106005] [<ffffffff810971c0>] ? kthread_create_on_node+0x140/0x140
> > [ 368.106005] [<ffffffff816147d8>] ret_from_fork+0x58/0x90
> > [ 368.106005] [<ffffffff810971c0>] ? kthread_create_on_node+0x140/0x140
> >
> > ==================================cut here==============================
> >
> > It turns out that the rcuos callback-offload kthread is busy processing
> > a very large quantity of RCU callbacks, and it is not relinquishing the
> > CPU while doing so. This commit therefore adds a cond_resched_rcu_qs()
> > within the loop to allow other tasks to run.
> >
> > Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
> > [ paulmck: Substituted cond_resched_rcu_qs for cond_resched. ]
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: Dhaval Giani <dhaval.giani@oracle.com>
> > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >
> > ---
> > kernel/rcu/tree_plugin.h | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > --- a/kernel/rcu/tree_plugin.h
> > +++ b/kernel/rcu/tree_plugin.h
> > @@ -2173,6 +2173,7 @@ static int rcu_nocb_kthread(void *arg)
> > cl++;
> > c++;
> > local_bh_enable();
> > + cond_resched_rcu_qs();
> > list = next;
> > }
> > trace_rcu_batch_end(rdp->rsp->name, c, !!list, 0, 0, 1);
> >
> >
> > Patches currently in stable-queue which might be from dingtianhong@huawei.com are
> >
> > queue-4.8/rcu-fix-soft-lockup-for-rcu_nocb_kthread.patch
> >
> > .
> >
>
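
To make the quoted fix concrete, here is a minimal user-space sketch of the
pattern it applies: a loop draining a long callback list, with a voluntary
yield after each callback. Everything in it is an assumption made for
illustration. struct cb, count_cb(), and drain_callbacks() are hypothetical
names, not kernel code, and sched_yield() is only a rough stand-in for
cond_resched_rcu_qs(), which in the kernel also reports an RCU quiescent state.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a queued RCU callback. */
struct cb {
	struct cb *next;
	void (*func)(struct cb *cb);
};

/* Illustrative callback: only counts how many times it has run. */
static long invoked;

static void count_cb(struct cb *cb)
{
	(void)cb;
	invoked++;
}

/*
 * Invoke every callback on the list and return how many were run.
 * The sched_yield() call sits where the patch adds cond_resched_rcu_qs():
 * after each callback, so a very long list cannot monopolize the CPU.
 * This is an analogy only; sched_yield() knows nothing about RCU.
 */
static long drain_callbacks(struct cb *list)
{
	long c = 0;

	while (list) {
		/* Save the successor first: func may free or reuse the node. */
		struct cb *next = list->next;

		assert(list->func != NULL);
		list->func(list);
		c++;
		sched_yield();	/* give other runnable tasks a chance */
		list = next;
	}
	return c;
}
```

Without the yield, the loop body is unchanged but the thread keeps the CPU
until the list is empty, which is the behavior the soft-lockup report above
describes for a very large callback backlog.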