public inbox for linux-kernel@vger.kernel.org
From: Sasha Levin <Alexander.Levin@microsoft.com>
To: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>
Cc: Erez Shitrit <erezsh@mellanox.com>,
	Leon Romanovsky <leon@kernel.org>,
	Jason Gunthorpe <jgg@mellanox.com>,
	Sasha Levin <Alexander.Levin@microsoft.com>
Subject: [PATCH AUTOSEL for 4.9 33/52] IB/ipoib: Fix race condition in neigh creation
Date: Sat, 3 Feb 2018 18:03:52 +0000
Message-ID: <20180203180303.8490-33-alexander.levin@microsoft.com>
In-Reply-To: <20180203180303.8490-1-alexander.levin@microsoft.com>

From: Erez Shitrit <erezsh@mellanox.com>

[ Upstream commit 16ba3defb8bd01a9464ba4820a487f5b196b455b ]

When using enhanced mode for IPoIB, two threads may execute xmit in
parallel on two different TX queues for the same target.
In this case, both of them add the same neighbor to the path's
neigh linked list, and the following message may appear:

  list_add double add: new=ffff88024767a348, prev=ffff88024767a348...
  WARNING: lib/list_debug.c:31__list_add_valid+0x4e/0x70
  ipoib_start_xmit+0x477/0x680 [ib_ipoib]
  dev_hard_start_xmit+0xb9/0x3e0
  sch_direct_xmit+0xf9/0x250
  __qdisc_run+0x176/0x5d0
  __dev_queue_xmit+0x1f5/0xb10
  __dev_queue_xmit+0x55/0xb10

Analysis:
Two SKBs are scheduled for transmission from two cores.
In ipoib_start_xmit, both threads get NULL from ipoib_neigh_get, so
neigh_add_path is called twice. One thread takes the spin-lock
and calls ipoib_neigh_alloc, which creates the neigh structure;
then (after the __path_find) the neigh is added to the path's neigh
linked list. When the second thread enters the critical section it
also calls ipoib_neigh_alloc, but this time it gets the already
allocated ipoib_neigh structure, which is already linked to the
path's neigh linked list, and adds it to the list again. Besides
triggering the list debug warning shown above, this creates a loop in
the linked list, which leads to an endless loop inside
path_rec_completion.
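
The effect of the second add can be reproduced outside the kernel. The
following is a minimal user-space sketch, not driver code: struct
list_head and list_add_tail are re-implemented here to mirror the
kernel's pointer updates, and init_list_head, walk_terminates and
double_add_terminates are hypothetical helper names used only for this
illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal user-space stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Same pointer updates as the kernel's list_add_tail(), without the
 * CONFIG_DEBUG_LIST sanity checks.
 */
static void list_add_tail(struct list_head *new, struct list_head *head)
{
	struct list_head *prev = head->prev;

	head->prev = new;
	new->next = head;
	new->prev = prev;
	prev->next = new;
}

/* Walk the list with a step limit; true if the walk gets back to head. */
static bool walk_terminates(const struct list_head *head, int max_steps)
{
	const struct list_head *pos = head->next;

	while (max_steps-- > 0) {
		if (pos == head)
			return true;
		pos = pos->next;
	}
	return false;
}

/* Add the same entry twice, as the two racing senders do. */
static bool double_add_terminates(void)
{
	struct list_head path_neigh_list, neigh_entry;

	init_list_head(&path_neigh_list);
	init_list_head(&neigh_entry);

	list_add_tail(&neigh_entry, &path_neigh_list);	/* first sender */
	list_add_tail(&neigh_entry, &path_neigh_list);	/* second sender */

	/* prev == new on the second add, so neigh_entry.next now points
	 * at neigh_entry itself and the walk never reaches the head.
	 */
	return walk_terminates(&path_neigh_list, 1000);
}
```

In this sketch double_add_terminates() returns false: on the second
add prev and new are the same node, which matches the new == prev
addresses in the warning above, and an unbounded traversal of the list
would spin forever.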

Solution:
Check list_empty(&neigh->list) before adding the neigh to the list, so
that it is added only once.
Add a similar fix in "ipoib_multicast.c::ipoib_mcast_send".
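
The guard can be sketched in user space as follows. This is an
illustration under stated assumptions, not the driver code: the list
primitives are re-implemented locally, add_neigh_once and list_length
are hypothetical helpers, and the entry's list node is assumed to be
initialized to point at itself when it is allocated, so that
list_empty() on the entry means "not linked anywhere yet". The sketch
is single-threaded; in the driver the check runs under priv->lock.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal user-space stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* A node that still points at itself is not on any list. */
static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *new, struct list_head *head)
{
	struct list_head *prev = head->prev;

	head->prev = new;
	new->next = head;
	new->prev = prev;
	prev->next = new;
}

/* Hypothetical helper: link entry to head only if it is not linked yet.
 * Returns true if the entry was added by this call.
 */
static bool add_neigh_once(struct list_head *entry, struct list_head *head)
{
	if (!list_empty(entry))
		return false;	/* already linked by an earlier caller */

	list_add_tail(entry, head);
	return true;
}

/* Count entries on the list; -1 if head is not reached in max_steps. */
static int list_length(const struct list_head *head, int max_steps)
{
	const struct list_head *pos = head->next;
	int n = 0;

	while (max_steps-- > 0) {
		if (pos == head)
			return n;
		n++;
		pos = pos->next;
	}
	return -1;
}
```

With this guard the second call for the same entry is a no-op, and the
list keeps exactly one entry and stays walkable.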

Fixes: b63b70d87741 ('IPoIB: Use a private hash table for path lookup in xmit path')
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c      | 25 ++++++++++++++++++-------
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |  5 ++++-
 2 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 183db0cd849e..e37e918cb935 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -919,8 +919,8 @@ static int path_rec_start(struct net_device *dev,
 	return 0;
 }
 
-static void neigh_add_path(struct sk_buff *skb, u8 *daddr,
-			   struct net_device *dev)
+static struct ipoib_neigh *neigh_add_path(struct sk_buff *skb, u8 *daddr,
+					  struct net_device *dev)
 {
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
 	struct ipoib_path *path;
@@ -933,7 +933,15 @@ static void neigh_add_path(struct sk_buff *skb, u8 *daddr,
 		spin_unlock_irqrestore(&priv->lock, flags);
 		++dev->stats.tx_dropped;
 		dev_kfree_skb_any(skb);
-		return;
+		return NULL;
+	}
+
+	/* To avoid race condition, make sure that the
+	 * neigh will be added only once.
+	 */
+	if (unlikely(!list_empty(&neigh->list))) {
+		spin_unlock_irqrestore(&priv->lock, flags);
+		return neigh;
 	}
 
 	path = __path_find(dev, daddr + 4);
@@ -971,7 +979,7 @@ static void neigh_add_path(struct sk_buff *skb, u8 *daddr,
 			spin_unlock_irqrestore(&priv->lock, flags);
 			ipoib_send(dev, skb, path->ah, IPOIB_QPN(daddr));
 			ipoib_neigh_put(neigh);
-			return;
+			return NULL;
 		}
 	} else {
 		neigh->ah  = NULL;
@@ -988,7 +996,7 @@ static void neigh_add_path(struct sk_buff *skb, u8 *daddr,
 
 	spin_unlock_irqrestore(&priv->lock, flags);
 	ipoib_neigh_put(neigh);
-	return;
+	return NULL;
 
 err_path:
 	ipoib_neigh_free(neigh);
@@ -998,6 +1006,8 @@ err_drop:
 
 	spin_unlock_irqrestore(&priv->lock, flags);
 	ipoib_neigh_put(neigh);
+
+	return NULL;
 }
 
 static void unicast_arp_send(struct sk_buff *skb, struct net_device *dev,
@@ -1103,8 +1113,9 @@ static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	case htons(ETH_P_TIPC):
 		neigh = ipoib_neigh_get(dev, phdr->hwaddr);
 		if (unlikely(!neigh)) {
-			neigh_add_path(skb, phdr->hwaddr, dev);
-			return NETDEV_TX_OK;
+			neigh = neigh_add_path(skb, phdr->hwaddr, dev);
+			if (likely(!neigh))
+				return NETDEV_TX_OK;
 		}
 		break;
 	case htons(ETH_P_ARP):
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index fddff403d5d2..6b6826f3e446 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -818,7 +818,10 @@ void ipoib_mcast_send(struct net_device *dev, u8 *daddr, struct sk_buff *skb)
 		spin_lock_irqsave(&priv->lock, flags);
 		if (!neigh) {
 			neigh = ipoib_neigh_alloc(daddr, dev);
-			if (neigh) {
+			/* Make sure that the neigh will be added only
+			 * once to mcast list.
+			 */
+			if (neigh && list_empty(&neigh->list)) {
 				kref_get(&mcast->ah->ref);
 				neigh->ah	= mcast->ah;
 				list_add_tail(&neigh->list, &mcast->neigh_list);
-- 
2.11.0
