netdev.vger.kernel.org archive mirror
* [pull request][net v2 0/7] mlx5 fixes 2020-11-03
@ 2020-11-05 20:21 Saeed Mahameed
  2020-11-05 20:21 ` [net v2 1/7] net/mlx5e: Fix modify header actions memory leak Saeed Mahameed
                   ` (7 more replies)
  0 siblings, 8 replies; 10+ messages in thread
From: Saeed Mahameed @ 2020-11-05 20:21 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, David S. Miller, Saeed Mahameed

Hi Jakub,

This series introduces some fixes to the mlx5 driver.

v1->v2:
 - Fix the Fixes: tag line in patch #1
 - Toss ktls refcount leak fix, Maxim will look further into the root
   cause.
 - Toss eswitch chain 0 prio patch, until we determine if it is needed
   for -rc and net.

Please pull and let me know if there is any problem.

For -stable v5.1
 ('net/mlx5: Fix deletion of duplicate rules')

For -stable v5.4
 ('net/mlx5e: Fix modify header actions memory leak')

For -stable v5.8
 ('net/mlx5e: Protect encap route dev from concurrent release')

For -stable v5.9
 ('net/mlx5e: Fix VXLAN synchronization after function reload')
 ('net/mlx5e: Use spin_lock_bh for async_icosq_lock')
 ('net/mlx5e: Fix incorrect access of RCU-protected xdp_prog')
 ('net/mlx5: E-switch, Avoid extack error log for disabled vport')

Thanks,
Saeed.

---
The following changes since commit 9621618130bf7e83635367c13b9a6ee53935bb37:

  sfp: Fix error handing in sfp_probe() (2020-11-02 17:19:59 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-fixes-2020-11-03

for you to fetch changes up to 1a50cf9a67ff2241c2949d30bc11c8dd4280eef8:

  net/mlx5e: Fix incorrect access of RCU-protected xdp_prog (2020-11-05 12:17:06 -0800)

----------------------------------------------------------------
mlx5-fixes-2020-11-03

----------------------------------------------------------------
Aya Levin (1):
      net/mlx5e: Fix VXLAN synchronization after function reload

Maor Dickman (1):
      net/mlx5e: Fix modify header actions memory leak

Maor Gottlieb (1):
      net/mlx5: Fix deletion of duplicate rules

Maxim Mikityanskiy (2):
      net/mlx5e: Use spin_lock_bh for async_icosq_lock
      net/mlx5e: Fix incorrect access of RCU-protected xdp_prog

Parav Pandit (1):
      net/mlx5: E-switch, Avoid extack error log for disabled vport

Vlad Buslov (1):
      net/mlx5e: Protect encap route dev from concurrent release

 .../net/ethernet/mellanox/mlx5/core/en/rep/tc.c    |  6 +-
 .../net/ethernet/mellanox/mlx5/core/en/tc_tun.c    | 72 ++++++++++++++--------
 .../net/ethernet/mellanox/mlx5/core/en/xsk/setup.c |  4 +-
 .../net/ethernet/mellanox/mlx5/core/en/xsk/tx.c    |  4 +-
 .../ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c | 14 ++---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |  1 +
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.h   |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c    |  2 +
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c  |  2 -
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c  |  7 ++-
 .../net/ethernet/mellanox/mlx5/core/lib/vxlan.c    | 23 +++++--
 .../net/ethernet/mellanox/mlx5/core/lib/vxlan.h    |  2 +
 13 files changed, 90 insertions(+), 51 deletions(-)


* [net v2 1/7] net/mlx5e: Fix modify header actions memory leak
  2020-11-05 20:21 [pull request][net v2 0/7] mlx5 fixes 2020-11-03 Saeed Mahameed
@ 2020-11-05 20:21 ` Saeed Mahameed
  2020-11-07 20:50   ` patchwork-bot+netdevbpf
  2020-11-05 20:21 ` [net v2 2/7] net/mlx5e: Protect encap route dev from concurrent release Saeed Mahameed
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 10+ messages in thread
From: Saeed Mahameed @ 2020-11-05 20:21 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, David S. Miller, Maor Dickman, Paul Blakey,
	Saeed Mahameed

From: Maor Dickman <maord@nvidia.com>

Modify header actions are allocated while parsing the TC actions and are
freed only during flow creation. On the error flow, however, the allocated
memory is never freed, so it leaks.

Fix this by calling dealloc_mod_hdr_actions in the error flows of
__mlx5e_add_fdb_flow and mlx5e_add_nic_flow.

Fixes: d7e75a325cb2 ("net/mlx5e: Add offloading of E-Switch TC pedit (header re-write) actions")
Fixes: 2f4fe4cab073 ("net/mlx5e: Add offloading of NIC TC pedit (header re-write) actions")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 2 ++
 1 file changed, 2 insertions(+)
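
The leak described in the commit message can be illustrated outside the
kernel. The following is a minimal userspace sketch, not the driver code:
parse_actions(), create_flow(), dealloc_acts(), add_flow() and the
live_buffers counter are hypothetical stand-ins, and only the goto-based
cleanup shape is taken from the diff below.

```c
/* Userspace sketch of the error-path cleanup; all names are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct mod_hdr_acts {
	void *actions;		/* buffer filled while parsing actions */
	int num_actions;
};

static int live_buffers;	/* buffers allocated and not yet freed */

static int parse_actions(struct mod_hdr_acts *acts, int n)
{
	acts->actions = calloc(n, 16);
	if (!acts->actions)
		return -ENOMEM;
	acts->num_actions = n;
	live_buffers++;
	return 0;
}

static void dealloc_acts(struct mod_hdr_acts *acts)
{
	if (!acts->actions)
		return;
	free(acts->actions);
	acts->actions = NULL;
	acts->num_actions = 0;
	live_buffers--;
}

/* Stand-in for flow creation: it frees the actions only on success. */
static int create_flow(struct mod_hdr_acts *acts, int fail)
{
	if (fail)
		return -EINVAL;
	dealloc_acts(acts);
	return 0;
}

static int add_flow(int n_actions, int fail_create)
{
	struct mod_hdr_acts acts = { 0 };
	int err;

	err = parse_actions(&acts, n_actions);
	if (err)
		goto out;

	err = create_flow(&acts, fail_create);
	if (err)
		goto err_free;

	return 0;

err_free:
	/* Without this call the buffer would leak on the error flow. */
	dealloc_acts(&acts);
out:
	return err;
}
```

In this model, add_flow(4, 1) fails in create_flow() and still leaves
live_buffers at 0, which is the property the two added lines are meant to
restore.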

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index e3a968e9e2a0..2e2fa0440032 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -4658,6 +4658,7 @@ __mlx5e_add_fdb_flow(struct mlx5e_priv *priv,
 	return flow;
 
 err_free:
+	dealloc_mod_hdr_actions(&parse_attr->mod_hdr_acts);
 	mlx5e_flow_put(priv, flow);
 out:
 	return ERR_PTR(err);
@@ -4802,6 +4803,7 @@ mlx5e_add_nic_flow(struct mlx5e_priv *priv,
 	return 0;
 
 err_free:
+	dealloc_mod_hdr_actions(&parse_attr->mod_hdr_acts);
 	mlx5e_flow_put(priv, flow);
 out:
 	return err;
-- 
2.26.2



* [net v2 2/7] net/mlx5e: Protect encap route dev from concurrent release
  2020-11-05 20:21 [pull request][net v2 0/7] mlx5 fixes 2020-11-03 Saeed Mahameed
  2020-11-05 20:21 ` [net v2 1/7] net/mlx5e: Fix modify header actions memory leak Saeed Mahameed
@ 2020-11-05 20:21 ` Saeed Mahameed
  2020-11-05 20:21 ` [net v2 3/7] net/mlx5e: Use spin_lock_bh for async_icosq_lock Saeed Mahameed
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Saeed Mahameed @ 2020-11-05 20:21 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, David S. Miller, Vlad Buslov, Roi Dayan, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

In mlx5e_route_lookup_ipv{4|6}(), route_dev can be an arbitrary net device
and is not necessarily an mlx5 eswitch port representor. To ensure that
route_dev is not destroyed concurrently, the code must either explicitly
take a reference to the device before releasing its reference to the
rtable instance, or ensure that the caller holds the rtnl lock. The first
approach is chosen as the fix, since the rtnl lock dependency was
intentionally removed from the mlx5 TC layer.

To prevent unprotected usage of route_dev in the encap code, take a
reference to the device before releasing rt. Don't save a direct pointer to
the device in the mlx5e_encap_entry structure; store its ifindex instead.
Modify users of the route_dev pointer to obtain the net device instance
from its ifindex.

Fixes: 61086f391044 ("net/mlx5e: Protect encap hash table with mutex")
Fixes: 6707f74be862 ("net/mlx5e: Update hw flows when encap source mac changed")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../ethernet/mellanox/mlx5/core/en/rep/tc.c   |  6 +-
 .../ethernet/mellanox/mlx5/core/en/tc_tun.c   | 72 ++++++++++++-------
 .../net/ethernet/mellanox/mlx5/core/en_rep.h  |  2 +-
 3 files changed, 52 insertions(+), 28 deletions(-)
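
The idea of storing an ifindex instead of a pointer can be sketched in
userspace. This is a toy model under stated assumptions, not the kernel
API: every toy_* name is hypothetical, the device table stands in for the
per-namespace device list, and the plain refcnt field only mimics what
dev_hold()/dev_put() do in the diff below.

```c
/* Toy userspace model; every toy_* name is hypothetical, not a kernel API. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_ETH_ALEN 6
#define TOY_MAX_DEVS 4

struct toy_dev {
	int ifindex;
	int refcnt;
	int registered;			/* 0 once the device is gone */
	unsigned char addr[TOY_ETH_ALEN];
};

static struct toy_dev toy_devs[TOY_MAX_DEVS];

static struct toy_dev *toy_dev_get_by_index(int ifindex)
{
	for (size_t i = 0; i < TOY_MAX_DEVS; i++)
		if (toy_devs[i].registered && toy_devs[i].ifindex == ifindex)
			return &toy_devs[i];
	return NULL;
}

struct toy_encap_entry {
	int route_dev_ifindex;		/* an index, not a pointer */
	unsigned char h_source[TOY_ETH_ALEN];
};

static int toy_create_entry(struct toy_encap_entry *e, int ifindex)
{
	struct toy_dev *dev = toy_dev_get_by_index(ifindex);

	if (!dev)
		return -1;
	dev->refcnt++;			/* hold while the device is in use */
	e->route_dev_ifindex = dev->ifindex;
	memcpy(e->h_source, dev->addr, TOY_ETH_ALEN);
	dev->refcnt--;			/* put; only the ifindex is kept */
	return 0;
}

/* Returns 1 if the source MAC was refreshed, 0 if the device is gone. */
static int toy_update_source(struct toy_encap_entry *e)
{
	struct toy_dev *dev = toy_dev_get_by_index(e->route_dev_ifindex);

	if (!dev)
		return 0;
	memcpy(e->h_source, dev->addr, TOY_ETH_ALEN);
	return 1;
}
```

Because the entry keeps no pointer, a later update has to look the device
up again and simply skips the MAC refresh when the lookup fails, which
mirrors the NULL check added in mlx5e_rep_update_flows().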

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
index e36e505d38ad..d29af7b9c695 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
@@ -107,12 +107,16 @@ void mlx5e_rep_update_flows(struct mlx5e_priv *priv,
 		mlx5e_tc_encap_flows_del(priv, e, &flow_list);
 
 	if (neigh_connected && !(e->flags & MLX5_ENCAP_ENTRY_VALID)) {
+		struct net_device *route_dev;
+
 		ether_addr_copy(e->h_dest, ha);
 		ether_addr_copy(eth->h_dest, ha);
 		/* Update the encap source mac, in case that we delete
 		 * the flows when encap source mac changed.
 		 */
-		ether_addr_copy(eth->h_source, e->route_dev->dev_addr);
+		route_dev = __dev_get_by_index(dev_net(priv->netdev), e->route_dev_ifindex);
+		if (route_dev)
+			ether_addr_copy(eth->h_source, route_dev->dev_addr);
 
 		mlx5e_tc_encap_flows_add(priv, e, &flow_list);
 	}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
index 7cce85faa16f..90930e54b6f2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
@@ -77,13 +77,13 @@ static int get_route_and_out_devs(struct mlx5e_priv *priv,
 	return 0;
 }
 
-static int mlx5e_route_lookup_ipv4(struct mlx5e_priv *priv,
-				   struct net_device *mirred_dev,
-				   struct net_device **out_dev,
-				   struct net_device **route_dev,
-				   struct flowi4 *fl4,
-				   struct neighbour **out_n,
-				   u8 *out_ttl)
+static int mlx5e_route_lookup_ipv4_get(struct mlx5e_priv *priv,
+				       struct net_device *mirred_dev,
+				       struct net_device **out_dev,
+				       struct net_device **route_dev,
+				       struct flowi4 *fl4,
+				       struct neighbour **out_n,
+				       u8 *out_ttl)
 {
 	struct neighbour *n;
 	struct rtable *rt;
@@ -117,18 +117,28 @@ static int mlx5e_route_lookup_ipv4(struct mlx5e_priv *priv,
 		ip_rt_put(rt);
 		return ret;
 	}
+	dev_hold(*route_dev);
 
 	if (!(*out_ttl))
 		*out_ttl = ip4_dst_hoplimit(&rt->dst);
 	n = dst_neigh_lookup(&rt->dst, &fl4->daddr);
 	ip_rt_put(rt);
-	if (!n)
+	if (!n) {
+		dev_put(*route_dev);
 		return -ENOMEM;
+	}
 
 	*out_n = n;
 	return 0;
 }
 
+static void mlx5e_route_lookup_ipv4_put(struct net_device *route_dev,
+					struct neighbour *n)
+{
+	neigh_release(n);
+	dev_put(route_dev);
+}
+
 static const char *mlx5e_netdev_kind(struct net_device *dev)
 {
 	if (dev->rtnl_link_ops)
@@ -193,8 +203,8 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv,
 	fl4.saddr = tun_key->u.ipv4.src;
 	ttl = tun_key->ttl;
 
-	err = mlx5e_route_lookup_ipv4(priv, mirred_dev, &out_dev, &route_dev,
-				      &fl4, &n, &ttl);
+	err = mlx5e_route_lookup_ipv4_get(priv, mirred_dev, &out_dev, &route_dev,
+					  &fl4, &n, &ttl);
 	if (err)
 		return err;
 
@@ -223,7 +233,7 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv,
 	e->m_neigh.family = n->ops->family;
 	memcpy(&e->m_neigh.dst_ip, n->primary_key, n->tbl->key_len);
 	e->out_dev = out_dev;
-	e->route_dev = route_dev;
+	e->route_dev_ifindex = route_dev->ifindex;
 
 	/* It's important to add the neigh to the hash table before checking
 	 * the neigh validity state. So if we'll get a notification, in case the
@@ -278,7 +288,7 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv,
 
 	e->flags |= MLX5_ENCAP_ENTRY_VALID;
 	mlx5e_rep_queue_neigh_stats_work(netdev_priv(out_dev));
-	neigh_release(n);
+	mlx5e_route_lookup_ipv4_put(route_dev, n);
 	return err;
 
 destroy_neigh_entry:
@@ -286,18 +296,18 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv,
 free_encap:
 	kfree(encap_header);
 release_neigh:
-	neigh_release(n);
+	mlx5e_route_lookup_ipv4_put(route_dev, n);
 	return err;
 }
 
 #if IS_ENABLED(CONFIG_INET) && IS_ENABLED(CONFIG_IPV6)
-static int mlx5e_route_lookup_ipv6(struct mlx5e_priv *priv,
-				   struct net_device *mirred_dev,
-				   struct net_device **out_dev,
-				   struct net_device **route_dev,
-				   struct flowi6 *fl6,
-				   struct neighbour **out_n,
-				   u8 *out_ttl)
+static int mlx5e_route_lookup_ipv6_get(struct mlx5e_priv *priv,
+				       struct net_device *mirred_dev,
+				       struct net_device **out_dev,
+				       struct net_device **route_dev,
+				       struct flowi6 *fl6,
+				       struct neighbour **out_n,
+				       u8 *out_ttl)
 {
 	struct dst_entry *dst;
 	struct neighbour *n;
@@ -318,15 +328,25 @@ static int mlx5e_route_lookup_ipv6(struct mlx5e_priv *priv,
 		return ret;
 	}
 
+	dev_hold(*route_dev);
 	n = dst_neigh_lookup(dst, &fl6->daddr);
 	dst_release(dst);
-	if (!n)
+	if (!n) {
+		dev_put(*route_dev);
 		return -ENOMEM;
+	}
 
 	*out_n = n;
 	return 0;
 }
 
+static void mlx5e_route_lookup_ipv6_put(struct net_device *route_dev,
+					struct neighbour *n)
+{
+	neigh_release(n);
+	dev_put(route_dev);
+}
+
 int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv,
 				    struct net_device *mirred_dev,
 				    struct mlx5e_encap_entry *e)
@@ -348,8 +368,8 @@ int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv,
 	fl6.daddr = tun_key->u.ipv6.dst;
 	fl6.saddr = tun_key->u.ipv6.src;
 
-	err = mlx5e_route_lookup_ipv6(priv, mirred_dev, &out_dev, &route_dev,
-				      &fl6, &n, &ttl);
+	err = mlx5e_route_lookup_ipv6_get(priv, mirred_dev, &out_dev, &route_dev,
+					  &fl6, &n, &ttl);
 	if (err)
 		return err;
 
@@ -378,7 +398,7 @@ int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv,
 	e->m_neigh.family = n->ops->family;
 	memcpy(&e->m_neigh.dst_ip, n->primary_key, n->tbl->key_len);
 	e->out_dev = out_dev;
-	e->route_dev = route_dev;
+	e->route_dev_ifindex = route_dev->ifindex;
 
 	/* It's importent to add the neigh to the hash table before checking
 	 * the neigh validity state. So if we'll get a notification, in case the
@@ -433,7 +453,7 @@ int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv,
 
 	e->flags |= MLX5_ENCAP_ENTRY_VALID;
 	mlx5e_rep_queue_neigh_stats_work(netdev_priv(out_dev));
-	neigh_release(n);
+	mlx5e_route_lookup_ipv6_put(route_dev, n);
 	return err;
 
 destroy_neigh_entry:
@@ -441,7 +461,7 @@ int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv,
 free_encap:
 	kfree(encap_header);
 release_neigh:
-	neigh_release(n);
+	mlx5e_route_lookup_ipv6_put(route_dev, n);
 	return err;
 }
 #endif
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
index 9020d1419bcf..8932c387d46a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
@@ -186,7 +186,7 @@ struct mlx5e_encap_entry {
 	unsigned char h_dest[ETH_ALEN];	/* destination eth addr	*/
 
 	struct net_device *out_dev;
-	struct net_device *route_dev;
+	int route_dev_ifindex;
 	struct mlx5e_tc_tunnel *tunnel;
 	int reformat_type;
 	u8 flags;
-- 
2.26.2



* [net v2 3/7] net/mlx5e: Use spin_lock_bh for async_icosq_lock
  2020-11-05 20:21 [pull request][net v2 0/7] mlx5 fixes 2020-11-03 Saeed Mahameed
  2020-11-05 20:21 ` [net v2 1/7] net/mlx5e: Fix modify header actions memory leak Saeed Mahameed
  2020-11-05 20:21 ` [net v2 2/7] net/mlx5e: Protect encap route dev from concurrent release Saeed Mahameed
@ 2020-11-05 20:21 ` Saeed Mahameed
  2020-11-05 20:21 ` [net v2 4/7] net/mlx5: Fix deletion of duplicate rules Saeed Mahameed
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Saeed Mahameed @ 2020-11-05 20:21 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, David S. Miller, Maxim Mikityanskiy, Tariq Toukan,
	Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@mellanox.com>

async_icosq_lock may be taken from softirq and non-softirq contexts. It
requires protection with spin_lock_bh, otherwise a softirq may be
triggered in the middle of the critical section, and it may deadlock if
it tries to take the same lock. This patch fixes such a scenario by
using spin_lock_bh to disable softirqs on that CPU while inside the
critical section.

Fixes: 8d94b590f1e4 ("net/mlx5e: Turn XSK ICOSQ into a general asynchronous one")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/en/xsk/setup.c |  4 ++--
 .../net/ethernet/mellanox/mlx5/core/en/xsk/tx.c    |  4 ++--
 .../ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c | 14 +++++++-------
 3 files changed, 11 insertions(+), 11 deletions(-)
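
The deadlock scenario from the commit message can be modelled in a
single-threaded userspace sketch. This is only a toy: the toy_* functions
are hypothetical, a "deadlock" is counted instead of actually spinning,
and toy_interrupt() stands in for a softirq being raised on the same CPU
in the middle of the critical section.

```c
/* Toy single-CPU model; toy_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdbool.h>

static bool lock_held;
static int bh_disable_depth;	/* > 0 means softirqs are deferred */
static bool softirq_pending;
static int softirq_runs;	/* handler invocations that got the lock */
static int deadlocks;		/* handler invocations that found it held */

static void toy_softirq_handler(void)
{
	if (lock_held) {
		/* A real spinlock would spin here forever on this CPU. */
		deadlocks++;
		return;
	}
	lock_held = true;
	softirq_runs++;
	lock_held = false;
}

/* A softirq is raised; it runs right away unless BHs are disabled. */
static void toy_interrupt(void)
{
	softirq_pending = true;
	if (bh_disable_depth == 0) {
		softirq_pending = false;
		toy_softirq_handler();
	}
}

static void toy_spin_lock(void)   { lock_held = true; }
static void toy_spin_unlock(void) { lock_held = false; }

static void toy_spin_lock_bh(void)
{
	bh_disable_depth++;
	lock_held = true;
}

static void toy_spin_unlock_bh(void)
{
	lock_held = false;
	/* Re-enabling BHs runs whatever was deferred meanwhile. */
	if (--bh_disable_depth == 0 && softirq_pending) {
		softirq_pending = false;
		toy_softirq_handler();
	}
}

static void process_context_plain(void)
{
	toy_spin_lock();
	toy_interrupt();	/* softirq fires inside the critical section */
	toy_spin_unlock();
}

static void process_context_bh(void)
{
	toy_spin_lock_bh();
	toy_interrupt();	/* softirq is deferred until unlock */
	toy_spin_unlock_bh();
}
```

In this model process_context_plain() records a deadlock, while
process_context_bh() lets the handler run once, after the lock has been
dropped.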

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c
index 4e574ac73019..be3465ba38ca 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c
@@ -122,9 +122,9 @@ void mlx5e_activate_xsk(struct mlx5e_channel *c)
 	set_bit(MLX5E_RQ_STATE_ENABLED, &c->xskrq.state);
 	/* TX queue is created active. */
 
-	spin_lock(&c->async_icosq_lock);
+	spin_lock_bh(&c->async_icosq_lock);
 	mlx5e_trigger_irq(&c->async_icosq);
-	spin_unlock(&c->async_icosq_lock);
+	spin_unlock_bh(&c->async_icosq_lock);
 }
 
 void mlx5e_deactivate_xsk(struct mlx5e_channel *c)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c
index fb671a457129..8e96260fce1d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c
@@ -36,9 +36,9 @@ int mlx5e_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
 		if (test_and_set_bit(MLX5E_SQ_STATE_PENDING_XSK_TX, &c->async_icosq.state))
 			return 0;
 
-		spin_lock(&c->async_icosq_lock);
+		spin_lock_bh(&c->async_icosq_lock);
 		mlx5e_trigger_irq(&c->async_icosq);
-		spin_unlock(&c->async_icosq_lock);
+		spin_unlock_bh(&c->async_icosq_lock);
 	}
 
 	return 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
index ccaccb9fc2f7..7f6221b8b1f7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
@@ -188,7 +188,7 @@ static int post_rx_param_wqes(struct mlx5e_channel *c,
 
 	err = 0;
 	sq = &c->async_icosq;
-	spin_lock(&c->async_icosq_lock);
+	spin_lock_bh(&c->async_icosq_lock);
 
 	cseg = post_static_params(sq, priv_rx);
 	if (IS_ERR(cseg))
@@ -199,7 +199,7 @@ static int post_rx_param_wqes(struct mlx5e_channel *c,
 
 	mlx5e_notify_hw(&sq->wq, sq->pc, sq->uar_map, cseg);
 unlock:
-	spin_unlock(&c->async_icosq_lock);
+	spin_unlock_bh(&c->async_icosq_lock);
 
 	return err;
 
@@ -265,10 +265,10 @@ resync_post_get_progress_params(struct mlx5e_icosq *sq,
 
 	BUILD_BUG_ON(MLX5E_KTLS_GET_PROGRESS_WQEBBS != 1);
 
-	spin_lock(&sq->channel->async_icosq_lock);
+	spin_lock_bh(&sq->channel->async_icosq_lock);
 
 	if (unlikely(!mlx5e_wqc_has_room_for(&sq->wq, sq->cc, sq->pc, 1))) {
-		spin_unlock(&sq->channel->async_icosq_lock);
+		spin_unlock_bh(&sq->channel->async_icosq_lock);
 		err = -ENOSPC;
 		goto err_dma_unmap;
 	}
@@ -299,7 +299,7 @@ resync_post_get_progress_params(struct mlx5e_icosq *sq,
 	icosq_fill_wi(sq, pi, &wi);
 	sq->pc++;
 	mlx5e_notify_hw(&sq->wq, sq->pc, sq->uar_map, cseg);
-	spin_unlock(&sq->channel->async_icosq_lock);
+	spin_unlock_bh(&sq->channel->async_icosq_lock);
 
 	return 0;
 
@@ -360,7 +360,7 @@ static int resync_handle_seq_match(struct mlx5e_ktls_offload_context_rx *priv_rx
 	err = 0;
 
 	sq = &c->async_icosq;
-	spin_lock(&c->async_icosq_lock);
+	spin_lock_bh(&c->async_icosq_lock);
 
 	cseg = post_static_params(sq, priv_rx);
 	if (IS_ERR(cseg)) {
@@ -372,7 +372,7 @@ static int resync_handle_seq_match(struct mlx5e_ktls_offload_context_rx *priv_rx
 	mlx5e_notify_hw(&sq->wq, sq->pc, sq->uar_map, cseg);
 	priv_rx->stats->tls_resync_res_ok++;
 unlock:
-	spin_unlock(&c->async_icosq_lock);
+	spin_unlock_bh(&c->async_icosq_lock);
 
 	return err;
 }
-- 
2.26.2



* [net v2 4/7] net/mlx5: Fix deletion of duplicate rules
  2020-11-05 20:21 [pull request][net v2 0/7] mlx5 fixes 2020-11-03 Saeed Mahameed
                   ` (2 preceding siblings ...)
  2020-11-05 20:21 ` [net v2 3/7] net/mlx5e: Use spin_lock_bh for async_icosq_lock Saeed Mahameed
@ 2020-11-05 20:21 ` Saeed Mahameed
  2020-11-05 20:21 ` [net v2 5/7] net/mlx5: E-switch, Avoid extack error log for disabled vport Saeed Mahameed
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Saeed Mahameed @ 2020-11-05 20:21 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, David S. Miller, Maor Gottlieb, Mark Bloch,
	Saeed Mahameed

From: Maor Gottlieb <maorg@nvidia.com>

When a rule is duplicated, the refcount of the rule is increased, so only
the second deletion of the rule should destroy the FTE. Currently, the FTE
is destroyed on the first deletion of the rule, since the modify_mask is 0.
Fix this by destroying the FTE only if all of its rules (the FTE's
children) have been removed.

Fixes: 718ce4d601db ("net/mlx5: Consolidate update FTE for all removal changes")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
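
The change in the deletion decision can be written down as two small pure
functions. This is a sketch only: struct toy_fte, the enum and both
decide_*() helpers are hypothetical, and returning FTE_KEEP for the case
"no destinations but children remain" is an assumption, since the hunk
below does not show that branch.

```c
/* Decision-logic sketch; toy_fte and decide_*() are hypothetical names. */
#include <assert.h>

enum fte_action { FTE_KEEP, FTE_MODIFY, FTE_DELETE };

struct toy_fte {
	int dests_size;			/* destinations still attached */
	unsigned int modify_mask;	/* non-zero if HW state changed */
	int num_children;		/* rules still linked to the FTE */
};

/* Before the fix: anything that needs no modify is deleted. */
static enum fte_action decide_old(const struct toy_fte *fte)
{
	if (fte->modify_mask && fte->dests_size)
		return FTE_MODIFY;
	return FTE_DELETE;
}

/* After the fix: delete only when no children are left. */
static enum fte_action decide_new(const struct toy_fte *fte)
{
	if (fte->dests_size)
		return fte->modify_mask ? FTE_MODIFY : FTE_KEEP;
	if (fte->num_children == 0)
		return FTE_DELETE;
	return FTE_KEEP;	/* assumption: branch not shown in the hunk */
}
```

For a duplicated rule whose first deletion only drops a refcount
(dests_size 1, modify_mask 0, one child), decide_old() picks FTE_DELETE
while decide_new() picks FTE_KEEP.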

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 16091838bfcf..325a5b0d6829 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -2010,10 +2010,11 @@ void mlx5_del_flow_rules(struct mlx5_flow_handle *handle)
 	down_write_ref_node(&fte->node, false);
 	for (i = handle->num_rules - 1; i >= 0; i--)
 		tree_remove_node(&handle->rule[i]->node, true);
-	if (fte->modify_mask && fte->dests_size) {
-		modify_fte(fte);
+	if (fte->dests_size) {
+		if (fte->modify_mask)
+			modify_fte(fte);
 		up_write_ref_node(&fte->node, false);
-	} else {
+	} else if (list_empty(&fte->node.children)) {
 		del_hw_fte(&fte->node);
 		/* Avoid double call to del_hw_fte */
 		fte->node.del_hw_func = NULL;
-- 
2.26.2



* [net v2 5/7] net/mlx5: E-switch, Avoid extack error log for disabled vport
  2020-11-05 20:21 [pull request][net v2 0/7] mlx5 fixes 2020-11-03 Saeed Mahameed
                   ` (3 preceding siblings ...)
  2020-11-05 20:21 ` [net v2 4/7] net/mlx5: Fix deletion of duplicate rules Saeed Mahameed
@ 2020-11-05 20:21 ` Saeed Mahameed
  2020-11-05 20:21 ` [net v2 6/7] net/mlx5e: Fix VXLAN synchronization after function reload Saeed Mahameed
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Saeed Mahameed @ 2020-11-05 20:21 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, David S. Miller, Parav Pandit, Roi Dayan, Saeed Mahameed

From: Parav Pandit <parav@nvidia.com>

When an E-switch vport is disabled, querying its hardware address is
unsupported.
Avoid setting an extack error message in that case.

Fixes: f099fde16db3 ("net/mlx5: E-switch, Support querying port function mac address")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index 6e6a9a563992..e8e6294c7cca 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -1902,8 +1902,6 @@ int mlx5_devlink_port_function_hw_addr_get(struct devlink *devlink,
 		ether_addr_copy(hw_addr, vport->info.mac);
 		*hw_addr_len = ETH_ALEN;
 		err = 0;
-	} else {
-		NL_SET_ERR_MSG_MOD(extack, "Eswitch vport is disabled");
 	}
 	mutex_unlock(&esw->state_lock);
 	return err;
-- 
2.26.2



* [net v2 6/7] net/mlx5e: Fix VXLAN synchronization after function reload
  2020-11-05 20:21 [pull request][net v2 0/7] mlx5 fixes 2020-11-03 Saeed Mahameed
                   ` (4 preceding siblings ...)
  2020-11-05 20:21 ` [net v2 5/7] net/mlx5: E-switch, Avoid extack error log for disabled vport Saeed Mahameed
@ 2020-11-05 20:21 ` Saeed Mahameed
  2020-11-05 20:21 ` [net v2 7/7] net/mlx5e: Fix incorrect access of RCU-protected xdp_prog Saeed Mahameed
  2020-11-07 20:41 ` [pull request][net v2 0/7] mlx5 fixes 2020-11-03 Jakub Kicinski
  7 siblings, 0 replies; 10+ messages in thread
From: Saeed Mahameed @ 2020-11-05 20:21 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, David S. Miller, Aya Levin, Moshe Shemesh, Saeed Mahameed

From: Aya Levin <ayal@nvidia.com>

During driver reload, the driver performs a firmware tear-down, which
causes the firmware to lose the configured VXLAN ports, while the ports
remain in the driver's database. Fix this by cleaning up the driver's
VXLAN database in the nic unload flow, before firmware tear-down. With
that, reduce mlx5_vxlan_destroy() to remove only what was added in
mlx5_vxlan_create() and to warn on leftover UDP ports.

Fixes: 18a2b7f969c9 ("net/mlx5: convert to new udp_tunnel infrastructure")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/en_main.c |  1 +
 .../ethernet/mellanox/mlx5/core/lib/vxlan.c   | 23 ++++++++++++++-----
 .../ethernet/mellanox/mlx5/core/lib/vxlan.h   |  2 ++
 3 files changed, 20 insertions(+), 6 deletions(-)
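
The "reset to default" behaviour can be sketched with a flat array in
place of the driver's hash table. This is an illustration under
assumptions: struct toy_vxlan and the toy_vxlan_*() helpers are
hypothetical, and the only detail taken from outside the diff is that the
IANA-assigned VXLAN UDP port is 4789.

```c
/* Array-based sketch; toy_vxlan_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_IANA_VXLAN_UDP_PORT 4789
#define TOY_MAX_PORTS 8

struct toy_vxlan {
	uint16_t ports[TOY_MAX_PORTS];
	size_t num_ports;
};

static bool toy_vxlan_has_port(const struct toy_vxlan *v, uint16_t port)
{
	for (size_t i = 0; i < v->num_ports; i++)
		if (v->ports[i] == port)
			return true;
	return false;
}

static int toy_vxlan_add_port(struct toy_vxlan *v, uint16_t port)
{
	if (v->num_ports == TOY_MAX_PORTS || toy_vxlan_has_port(v, port))
		return -1;
	v->ports[v->num_ports++] = port;
	return 0;
}

static int toy_vxlan_del_port(struct toy_vxlan *v, uint16_t port)
{
	for (size_t i = 0; i < v->num_ports; i++) {
		if (v->ports[i] != port)
			continue;
		/* Fill the hole with the last entry and shrink. */
		v->ports[i] = v->ports[--v->num_ports];
		return 0;
	}
	return -1;
}

/* Drop every user-configured port, keep the default one. */
static void toy_vxlan_reset_to_default(struct toy_vxlan *v)
{
	size_t i = 0;

	while (i < v->num_ports) {
		if (v->ports[i] == TOY_IANA_VXLAN_UDP_PORT) {
			i++;
			continue;
		}
		/* Slot i is refilled by the delete, so do not advance. */
		toy_vxlan_del_port(v, v->ports[i]);
	}
}
```

After the reset, a table that held 4789, 4790 and 8472 contains only 4789,
so a later destroy step has just the default port left to remove.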

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index b3f02aac7f26..ebce97921e03 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -5253,6 +5253,7 @@ static void mlx5e_nic_disable(struct mlx5e_priv *priv)
 
 	mlx5e_disable_async_events(priv);
 	mlx5_lag_remove(mdev);
+	mlx5_vxlan_reset_to_default(mdev->vxlan);
 }
 
 int mlx5e_update_nic_rx(struct mlx5e_priv *priv)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/vxlan.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/vxlan.c
index 3315afe2f8dc..38084400ee8f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/vxlan.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/vxlan.c
@@ -167,6 +167,17 @@ struct mlx5_vxlan *mlx5_vxlan_create(struct mlx5_core_dev *mdev)
 }
 
 void mlx5_vxlan_destroy(struct mlx5_vxlan *vxlan)
+{
+	if (!mlx5_vxlan_allowed(vxlan))
+		return;
+
+	mlx5_vxlan_del_port(vxlan, IANA_VXLAN_UDP_PORT);
+	WARN_ON(!hash_empty(vxlan->htable));
+
+	kfree(vxlan);
+}
+
+void mlx5_vxlan_reset_to_default(struct mlx5_vxlan *vxlan)
 {
 	struct mlx5_vxlan_port *vxlanp;
 	struct hlist_node *tmp;
@@ -175,12 +186,12 @@ void mlx5_vxlan_destroy(struct mlx5_vxlan *vxlan)
 	if (!mlx5_vxlan_allowed(vxlan))
 		return;
 
-	/* Lockless since we are the only hash table consumers*/
 	hash_for_each_safe(vxlan->htable, bkt, tmp, vxlanp, hlist) {
-		hash_del(&vxlanp->hlist);
-		mlx5_vxlan_core_del_port_cmd(vxlan->mdev, vxlanp->udp_port);
-		kfree(vxlanp);
+		/* Don't delete default UDP port added by the HW.
+		 * Remove only user configured ports
+		 */
+		if (vxlanp->udp_port == IANA_VXLAN_UDP_PORT)
+			continue;
+		mlx5_vxlan_del_port(vxlan, vxlanp->udp_port);
 	}
-
-	kfree(vxlan);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/vxlan.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/vxlan.h
index ec766529f49b..34ef662da35e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/vxlan.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/vxlan.h
@@ -56,6 +56,7 @@ void mlx5_vxlan_destroy(struct mlx5_vxlan *vxlan);
 int mlx5_vxlan_add_port(struct mlx5_vxlan *vxlan, u16 port);
 int mlx5_vxlan_del_port(struct mlx5_vxlan *vxlan, u16 port);
 bool mlx5_vxlan_lookup_port(struct mlx5_vxlan *vxlan, u16 port);
+void mlx5_vxlan_reset_to_default(struct mlx5_vxlan *vxlan);
 #else
 static inline struct mlx5_vxlan*
 mlx5_vxlan_create(struct mlx5_core_dev *mdev) { return ERR_PTR(-EOPNOTSUPP); }
@@ -63,6 +64,7 @@ static inline void mlx5_vxlan_destroy(struct mlx5_vxlan *vxlan) { return; }
 static inline int mlx5_vxlan_add_port(struct mlx5_vxlan *vxlan, u16 port) { return -EOPNOTSUPP; }
 static inline int mlx5_vxlan_del_port(struct mlx5_vxlan *vxlan, u16 port) { return -EOPNOTSUPP; }
 static inline bool mlx5_vxlan_lookup_port(struct mlx5_vxlan *vxlan, u16 port) { return false; }
+static inline void mlx5_vxlan_reset_to_default(struct mlx5_vxlan *vxlan) { return; }
 #endif
 
 #endif /* __MLX5_VXLAN_H__ */
-- 
2.26.2



* [net v2 7/7] net/mlx5e: Fix incorrect access of RCU-protected xdp_prog
  2020-11-05 20:21 [pull request][net v2 0/7] mlx5 fixes 2020-11-03 Saeed Mahameed
                   ` (5 preceding siblings ...)
  2020-11-05 20:21 ` [net v2 6/7] net/mlx5e: Fix VXLAN synchronization after function reload Saeed Mahameed
@ 2020-11-05 20:21 ` Saeed Mahameed
  2020-11-07 20:41 ` [pull request][net v2 0/7] mlx5 fixes 2020-11-03 Jakub Kicinski
  7 siblings, 0 replies; 10+ messages in thread
From: Saeed Mahameed @ 2020-11-05 20:21 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, David S. Miller, Maxim Mikityanskiy, Tariq Toukan,
	Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@mellanox.com>

rq->xdp_prog is RCU-protected and should be accessed only with
rcu_access_pointer for the NULL check in mlx5e_poll_rx_cq.

rq->xdp_prog may change on the fly only from one non-NULL value to
another non-NULL value, so the checks in mlx5e_xdp_handle and
mlx5e_poll_rx_cq will have the same result during one NAPI cycle,
meaning that no additional synchronization is needed.

Fixes: fe45386a2082 ("net/mlx5e: Use RCU to protect rq->xdp_prog")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 599f5b5ebc97..6628a0197b4e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1584,7 +1584,7 @@ int mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget)
 	} while ((++work_done < budget) && (cqe = mlx5_cqwq_get_cqe(cqwq)));
 
 out:
-	if (rq->xdp_prog)
+	if (rcu_access_pointer(rq->xdp_prog))
 		mlx5e_xdp_rx_poll_complete(rq);
 
 	mlx5_cqwq_update_db_record(cqwq);
-- 
2.26.2



* Re: [pull request][net v2 0/7] mlx5 fixes 2020-11-03
  2020-11-05 20:21 [pull request][net v2 0/7] mlx5 fixes 2020-11-03 Saeed Mahameed
                   ` (6 preceding siblings ...)
  2020-11-05 20:21 ` [net v2 7/7] net/mlx5e: Fix incorrect access of RCU-protected xdp_prog Saeed Mahameed
@ 2020-11-07 20:41 ` Jakub Kicinski
  7 siblings, 0 replies; 10+ messages in thread
From: Jakub Kicinski @ 2020-11-07 20:41 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: netdev, David S. Miller

On Thu, 5 Nov 2020 12:21:22 -0800 Saeed Mahameed wrote:
> This series introduces some fixes to the mlx5 driver.
> 
> v1->v2:
>  - Fix the Fixes: tag line in patch #1
>  - Toss ktls refcount leak fix, Maxim will look further into the root
>    cause.
>  - Toss eswitch chain 0 prio patch, until we determine if it is needed
>    for -rc and net.
> 
> Please pull and let me know if there is any problem.
> 
> For -stable v5.1
>  ('net/mlx5: Fix deletion of duplicate rules')
> 
> For -stable v5.4
>  ('net/mlx5e: Fix modify header actions memory leak')
> 
> For -stable v5.8
>  ('net/mlx5e: Protect encap route dev from concurrent release')
> 
> For -stable v5.9
>  ('net/mlx5e: Fix VXLAN synchronization after function reload')
>  ('net/mlx5e: Use spin_lock_bh for async_icosq_lock')
>  ('net/mlx5e: Fix incorrect access of RCU-protected xdp_prog')
>  ('net/mlx5: E-switch, Avoid extack error log for disabled vport')

Pulled, thanks!


* Re: [net v2 1/7] net/mlx5e: Fix modify header actions memory leak
  2020-11-05 20:21 ` [net v2 1/7] net/mlx5e: Fix modify header actions memory leak Saeed Mahameed
@ 2020-11-07 20:50   ` patchwork-bot+netdevbpf
  0 siblings, 0 replies; 10+ messages in thread
From: patchwork-bot+netdevbpf @ 2020-11-07 20:50 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: kuba, netdev, davem, maord, paulb

Hello:

This series was applied to netdev/net.git (refs/heads/master):

On Thu, 5 Nov 2020 12:21:23 -0800 you wrote:
> From: Maor Dickman <maord@nvidia.com>
> 
> Modify header actions are allocated while parsing the TC actions and are
> freed only during flow creation. On the error flow, however, the allocated
> memory is never freed, so it leaks.
> 
> Fix this by calling dealloc_mod_hdr_actions in the error flows of
> __mlx5e_add_fdb_flow and mlx5e_add_nic_flow.
> 
> [...]

Here is the summary with links:
  - [net,v2,1/7] net/mlx5e: Fix modify header actions memory leak
    https://git.kernel.org/netdev/net/c/e68e28b4a9d7
  - [net,v2,2/7] net/mlx5e: Protect encap route dev from concurrent release
    https://git.kernel.org/netdev/net/c/78c906e430b1
  - [net,v2,3/7] net/mlx5e: Use spin_lock_bh for async_icosq_lock
    https://git.kernel.org/netdev/net/c/f42139ba4979
  - [net,v2,4/7] net/mlx5: Fix deletion of duplicate rules
    https://git.kernel.org/netdev/net/c/465e7baab6d9
  - [net,v2,5/7] net/mlx5: E-switch, Avoid extack error log for disabled vport
    https://git.kernel.org/netdev/net/c/ae3585944560
  - [net,v2,6/7] net/mlx5e: Fix VXLAN synchronization after function reload
    https://git.kernel.org/netdev/net/c/c5eb51adf06b
  - [net,v2,7/7] net/mlx5e: Fix incorrect access of RCU-protected xdp_prog
    https://git.kernel.org/netdev/net/c/1a50cf9a67ff

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


