From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail.linuxfoundation.org ([140.211.169.12]:41656 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755136AbeCVNsV (ORCPT ); Thu, 22 Mar 2018 09:48:21 -0400 Subject: Patch "IB/ipoib: Fix deadlock between ipoib_stop and mcast join flow" has been added to the 4.9-stable tree To: ferasda@mellanox.com, alexander.levin@microsoft.com, dledford@redhat.com, gregkh@linuxfoundation.org, leon@kernel.org Cc: , From: Date: Thu, 22 Mar 2018 14:47:42 +0100 Message-ID: <152172646214186@kroah.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ANSI_X3.4-1968 Content-Transfer-Encoding: 8bit Sender: stable-owner@vger.kernel.org List-ID: This is a note to let you know that I've just added the patch titled IB/ipoib: Fix deadlock between ipoib_stop and mcast join flow to the 4.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: ib-ipoib-fix-deadlock-between-ipoib_stop-and-mcast-join-flow.patch and it can be found in the queue-4.9 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let know about it. >>From foo@baz Thu Mar 22 14:40:23 CET 2018 From: Feras Daoud Date: Sun, 19 Mar 2017 11:18:55 +0200 Subject: IB/ipoib: Fix deadlock between ipoib_stop and mcast join flow From: Feras Daoud [ Upstream commit 3e31a490e01a6e67cbe9f6e1df2f3ff0fbf48972 ] Before calling ipoib_stop, rtnl_lock should be taken, then the flow clears the IPOIB_FLAG_ADMIN_UP and IPOIB_FLAG_OPER_UP flags, and waits for mcast completion if IPOIB_MCAST_FLAG_BUSY is set. On the other hand, the flow of multicast join task initializes a mcast completion, sets the IPOIB_MCAST_FLAG_BUSY and calls ipoib_mcast_join. If IPOIB_FLAG_OPER_UP flag is not set, this call returns EINVAL without setting the mcast completion and leads to a deadlock. ipoib_stop | | | clear_bit(IPOIB_FLAG_ADMIN_UP) | | | Context Switch | | ipoib_mcast_join_task | | | spin_lock_irq(lock) | | | init_completion(mcast) | | | set_bit(IPOIB_MCAST_FLAG_BUSY) | | | Context Switch | | clear_bit(IPOIB_FLAG_OPER_UP) | | | spin_lock_irqsave(lock) | | | Context Switch | | ipoib_mcast_join | return (-EINVAL) | | | spin_unlock_irq(lock) | | | Context Switch | | ipoib_mcast_dev_flush | wait_for_completion(mcast) | ipoib_stop will wait for mcast completion for ever, and will not release the rtnl_lock. As a result panic occurs with the following trace: [13441.639268] Call Trace: [13441.640150] [] schedule+0x29/0x70 [13441.641038] [] schedule_timeout+0x239/0x2d0 [13441.641914] [] ? complete+0x47/0x50 [13441.642765] [] ? flush_workqueue_prep_pwqs+0x16d/0x200 [13441.643580] [] wait_for_completion+0x116/0x170 [13441.644434] [] ? wake_up_state+0x20/0x20 [13441.645293] [] ipoib_mcast_dev_flush+0x150/0x190 [ib_ipoib] [13441.646159] [] ipoib_ib_dev_down+0x37/0x60 [ib_ipoib] [13441.647013] [] ipoib_stop+0x75/0x150 [ib_ipoib] Fixes: 08bc327629cb ("IB/ipoib: fix for rare multicast join race condition") Signed-off-by: Feras Daoud Signed-off-by: Leon Romanovsky Signed-off-by: Doug Ledford Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman --- drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -487,6 +487,9 @@ static int ipoib_mcast_join(struct net_d !test_bit(IPOIB_FLAG_OPER_UP, &priv->flags)) return -EINVAL; + init_completion(&mcast->done); + set_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags); + ipoib_dbg_mcast(priv, "joining MGID %pI6\n", mcast->mcmember.mgid.raw); rec.mgid = mcast->mcmember.mgid; @@ -645,8 +648,6 @@ void ipoib_mcast_join_task(struct work_s if (mcast->backoff == 1 || time_after_eq(jiffies, mcast->delay_until)) { /* Found the next unjoined group */ - init_completion(&mcast->done); - set_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags); if (ipoib_mcast_join(dev, mcast)) { spin_unlock_irq(&priv->lock); return; @@ -666,11 +667,9 @@ out: queue_delayed_work(priv->wq, &priv->mcast_task, delay_until - jiffies); } - if (mcast) { - init_completion(&mcast->done); - set_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags); + if (mcast) ipoib_mcast_join(dev, mcast); - } + spin_unlock_irq(&priv->lock); } Patches currently in stable-queue which might be from ferasda@mellanox.com are queue-4.9/ib-ipoib-fix-deadlock-between-ipoib_stop-and-mcast-join-flow.patch queue-4.9/ib-ipoib-update-broadcast-object-if-pkey-value-was-changed-in-index-0.patch