netdev.vger.kernel.org archive mirror
* [pull request][net 0/8] mlx5 fixes 2024-09-25
@ 2024-09-25 20:20 Saeed Mahameed
  2024-09-25 20:20 ` [net 1/8] net/mlx5: Fix error path in multi-packet WQE transmit Saeed Mahameed
                   ` (7 more replies)
  0 siblings, 8 replies; 10+ messages in thread
From: Saeed Mahameed @ 2024-09-25 20:20 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman,
	Leon Romanovsky

From: Saeed Mahameed <saeedm@nvidia.com>

This series provides bug fixes to the mlx5 driver.
Please pull and let me know if there is any problem.

Thanks,
Saeed.


The following changes since commit 0cbfd45fbcf0cb26d85c981b91c62fe73cdee01c:

  bonding: Fix unnecessary warnings and logs from bond_xdp_get_xmit_slave() (2024-09-24 15:19:50 +0200)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-fixes-2024-09-25

for you to fetch changes up to 7b124695db40d5c9c5295a94ae928a8d67a01c3d:

  net/mlx5e: Fix crash caused by calling __xfrm_state_delete() twice (2024-09-25 13:15:46 -0700)

----------------------------------------------------------------
mlx5-fixes-2024-09-25

----------------------------------------------------------------
Dragos Tatulea (1):
      net/mlx5e: SHAMPO, Fix overflow of hd_per_wq

Elena Salomatkina (1):
      net/mlx5e: Fix NULL deref in mlx5e_tir_builder_alloc()

Gerd Bayer (1):
      net/mlx5: Fix error path in multi-packet WQE transmit

Jianbo Liu (1):
      net/mlx5e: Fix crash caused by calling __xfrm_state_delete() twice

Mohamed Khalfella (1):
      net/mlx5: Added cond_resched() to crdump collection

Yevgeny Kliteynik (3):
      net/mlx5: Fix wrong reserved field in hca_cap_2 in mlx5_ifc
      net/mlx5: HWS, fixed double-free in error flow of creating SQ
      net/mlx5: HWS, changed E2BIG error to a negative return code

 drivers/net/ethernet/mellanox/mlx5/core/en.h                   |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en/tir.c               |  3 +++
 drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c       |  8 +++++++-
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c                |  1 -
 drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c          | 10 ++++++++++
 .../mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.c      |  2 +-
 .../ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.c |  4 ++--
 .../ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c |  2 +-
 .../ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.c    |  8 +++++++-
 include/linux/mlx5/mlx5_ifc.h                                  |  2 +-
 10 files changed, 33 insertions(+), 9 deletions(-)


* [net 1/8] net/mlx5: Fix error path in multi-packet WQE transmit
  2024-09-25 20:20 [pull request][net 0/8] mlx5 fixes 2024-09-25 Saeed Mahameed
@ 2024-09-25 20:20 ` Saeed Mahameed
  2024-10-03  0:30   ` patchwork-bot+netdevbpf
  2024-09-25 20:20 ` [net 2/8] net/mlx5: Added cond_resched() to crdump collection Saeed Mahameed
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 10+ messages in thread
From: Saeed Mahameed @ 2024-09-25 20:20 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman,
	Leon Romanovsky, Gerd Bayer, Zhu Yanjun, Maxim Mikityanskiy

From: Gerd Bayer <gbayer@linux.ibm.com>

Remove the erroneous unmap in case no DMA mapping was established

The multi-packet WQE transmit code attempts to obtain a DMA mapping for
the skb. This could fail, e.g. under memory pressure, when the IOMMU
driver just can't allocate more memory for page tables. While the code
tries to handle this in the path below the err_unmap label it erroneously
unmaps one entry from the sq's FIFO list of active mappings. Since the
current map attempt failed this unmap is removing some random DMA mapping
that might still be required. If the PCI function now presents that IOVA,
the IOMMU may assume a rogue DMA access and, e.g. on s390, put the PCI
function into an error state.

The erroneous behavior was seen in a stress-test environment that created
memory pressure.
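
A minimal user-space sketch of the failure mode described above, not the
driver code: the FIFO layout, the names dma_fifo, fifo_push(),
fifo_drop_last() and xmit(), and the capacity are all hypothetical. It only
illustrates why dropping an entry after a failed map attempt removes a
mapping that belongs to an earlier skb.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FIFO_SIZE 8 /* hypothetical capacity, must be a power of two */

/* Toy model of an SQ's FIFO of active DMA mappings. */
struct dma_fifo {
	unsigned long iova[FIFO_SIZE];
	size_t pc; /* producer counter: number of live entries in this model */
};

static void fifo_push(struct dma_fifo *f, unsigned long iova)
{
	assert(f->pc < FIFO_SIZE);
	f->iova[f->pc++ & (FIFO_SIZE - 1)] = iova;
}

/* Drop the most recently pushed entry, as an error-path unmap would. */
static void fifo_drop_last(struct dma_fifo *f)
{
	f->pc--;
}

/*
 * Attempt one transmit. map_fails simulates a failing DMA mapping;
 * buggy selects the pre-patch error path. Returns the number of
 * mappings still tracked afterwards.
 */
static size_t xmit(struct dma_fifo *f, unsigned long iova,
		   bool map_fails, bool buggy)
{
	if (map_fails) {
		/* Nothing was pushed for this skb, so any drop here
		 * removes an entry that belongs to an earlier skb.
		 */
		if (buggy)
			fifo_drop_last(f);
		return f->pc;
	}

	fifo_push(f, iova);
	return f->pc;
}
```

In this model, one successful xmit() followed by a failed one leaves one
tracked mapping with the patched path and zero with the pre-patch path.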

Fixes: 5af75c747e2a ("net/mlx5e: Enhanced TX MPWQE for SKBs")
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Acked-by: Maxim Mikityanskiy <maxtram95@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index b09e9abd39f3..f8c7912abe0e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -642,7 +642,6 @@ mlx5e_sq_xmit_mpwqe(struct mlx5e_txqsq *sq, struct sk_buff *skb,
 	return;
 
 err_unmap:
-	mlx5e_dma_unmap_wqe_err(sq, 1);
 	sq->stats->dropped++;
 	dev_kfree_skb_any(skb);
 	mlx5e_tx_flush(sq);
-- 
2.46.1



* [net 2/8] net/mlx5: Added cond_resched() to crdump collection
  2024-09-25 20:20 [pull request][net 0/8] mlx5 fixes 2024-09-25 Saeed Mahameed
  2024-09-25 20:20 ` [net 1/8] net/mlx5: Fix error path in multi-packet WQE transmit Saeed Mahameed
@ 2024-09-25 20:20 ` Saeed Mahameed
  2024-09-25 20:20 ` [net 3/8] net/mlx5e: Fix NULL deref in mlx5e_tir_builder_alloc() Saeed Mahameed
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Saeed Mahameed @ 2024-09-25 20:20 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman,
	Leon Romanovsky, Mohamed Khalfella, Yuanyuan Zhong, Moshe Shemesh

From: Mohamed Khalfella <mkhalfella@purestorage.com>

Collecting crdump involves reading vsc registers from the pci config space
of the mlx device, which can take a long time to complete. This might result
in starving other threads waiting to run on the cpu.

Numbers I got from testing ConnectX-5 Ex MCX516A-CDAT in the lab:

- mlx5_vsc_gw_read_block_fast() was called with length = 1310716.
- mlx5_vsc_gw_read_fast() reads 4 bytes at a time. It was not used to
  read the entire 1310716 bytes. It was called 53813 times because
  there are jumps in read_addr.
- On average mlx5_vsc_gw_read_fast() took 35284.4ns.
- In total mlx5_vsc_wait_on_flag() called vsc_read() 54707 times.
  The average time for each call was 17548.3ns. In some instances
  vsc_read() was called more than one time when the flag was not set.
  As expected the thread released the cpu after 16 iterations in
  mlx5_vsc_wait_on_flag().
- Total time to read crdump was 35284.4ns * 53813 ~= 1.898s.

It was seen in the field that crdump can take more than 5 seconds to
complete. During that time mlx5_vsc_wait_on_flag() did not release the
cpu because it did not complete 16 iterations. It is believed that pci
config reads were slow. Adding cond_resched() every 128 register reads
improves the situation. In the common case, the crdump takes ~1.8989s,
the thread yields the cpu every ~4.51ms. If crdump takes ~5s, the thread
yields the cpu every ~18.0ms.
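
The arithmetic for the measured (common) case can be reproduced with a
small sketch. The helper names crdump_total_s() and yield_interval_ms() are
hypothetical; the inputs are the lab numbers quoted above and the block
count is the constant added by this patch.

```c
#include <assert.h>

#define VSC_GW_READ_BLOCK_COUNT 128 /* value introduced by the patch */

/* Total crdump time in seconds from the per-read average and read count. */
static double crdump_total_s(double avg_read_ns, unsigned long reads)
{
	return avg_read_ns * (double)reads / 1e9;
}

/* Average time between two cond_resched() calls, in milliseconds. */
static double yield_interval_ms(double avg_read_ns)
{
	assert(avg_read_ns > 0.0);
	return avg_read_ns * VSC_GW_READ_BLOCK_COUNT / 1e6;
}
```

With 35284.4ns per read and 53813 reads, crdump_total_s() gives roughly
1.9s and yield_interval_ms() roughly 4.5ms, in line with the measured case.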

Fixes: 8b9d8baae1de ("net/mlx5: Add Crdump support")
Reviewed-by: Yuanyuan Zhong <yzhong@purestorage.com>
Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
index d0b595ba6110..432c98f2626d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
@@ -24,6 +24,11 @@
 	pci_write_config_dword((dev)->pdev, (dev)->vsc_addr + (offset), (val))
 #define VSC_MAX_RETRIES 2048
 
+/* Reading VSC registers can take relatively long time.
+ * Yield the cpu every 128 registers read.
+ */
+#define VSC_GW_READ_BLOCK_COUNT 128
+
 enum {
 	VSC_CTRL_OFFSET = 0x4,
 	VSC_COUNTER_OFFSET = 0x8,
@@ -273,6 +278,7 @@ int mlx5_vsc_gw_read_block_fast(struct mlx5_core_dev *dev, u32 *data,
 {
 	unsigned int next_read_addr = 0;
 	unsigned int read_addr = 0;
+	unsigned int count = 0;
 
 	while (read_addr < length) {
 		if (mlx5_vsc_gw_read_fast(dev, read_addr, &next_read_addr,
@@ -280,6 +286,10 @@ int mlx5_vsc_gw_read_block_fast(struct mlx5_core_dev *dev, u32 *data,
 			return read_addr;
 
 		read_addr = next_read_addr;
+		if (++count == VSC_GW_READ_BLOCK_COUNT) {
+			cond_resched();
+			count = 0;
+		}
 	}
 	return length;
 }
-- 
2.46.1



* [net 3/8] net/mlx5e: Fix NULL deref in mlx5e_tir_builder_alloc()
  2024-09-25 20:20 [pull request][net 0/8] mlx5 fixes 2024-09-25 Saeed Mahameed
  2024-09-25 20:20 ` [net 1/8] net/mlx5: Fix error path in multi-packet WQE transmit Saeed Mahameed
  2024-09-25 20:20 ` [net 2/8] net/mlx5: Added cond_resched() to crdump collection Saeed Mahameed
@ 2024-09-25 20:20 ` Saeed Mahameed
  2024-09-25 20:20 ` [net 4/8] net/mlx5: Fix wrong reserved field in hca_cap_2 in mlx5_ifc Saeed Mahameed
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Saeed Mahameed @ 2024-09-25 20:20 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman,
	Leon Romanovsky, Elena Salomatkina, Simon Horman, Kalesh AP

From: Elena Salomatkina <esalomatkina@ispras.ru>

In mlx5e_tir_builder_alloc() kvzalloc() may return NULL
which is dereferenced on the next line in a reference
to the modify field.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: a6696735d694 ("net/mlx5e: Convert TIR to a dedicated object")
Signed-off-by: Elena Salomatkina <esalomatkina@ispras.ru>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/tir.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tir.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tir.c
index d4239e3b3c88..11f724ad90db 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tir.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tir.c
@@ -23,6 +23,9 @@ struct mlx5e_tir_builder *mlx5e_tir_builder_alloc(bool modify)
 	struct mlx5e_tir_builder *builder;
 
 	builder = kvzalloc(sizeof(*builder), GFP_KERNEL);
+	if (!builder)
+		return NULL;
+
 	builder->modify = modify;
 
 	return builder;
-- 
2.46.1



* [net 4/8] net/mlx5: Fix wrong reserved field in hca_cap_2 in mlx5_ifc
  2024-09-25 20:20 [pull request][net 0/8] mlx5 fixes 2024-09-25 Saeed Mahameed
                   ` (2 preceding siblings ...)
  2024-09-25 20:20 ` [net 3/8] net/mlx5e: Fix NULL deref in mlx5e_tir_builder_alloc() Saeed Mahameed
@ 2024-09-25 20:20 ` Saeed Mahameed
  2024-09-25 20:20 ` [net 5/8] net/mlx5: HWS, fixed double-free in error flow of creating SQ Saeed Mahameed
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Saeed Mahameed @ 2024-09-25 20:20 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman,
	Leon Romanovsky, Yevgeny Kliteynik

From: Yevgeny Kliteynik <kliteyn@nvidia.com>

Fixing the wrong size of a field in hca_cap_2.
The bug was introduced by adding new fields for HWS
and not fixing the reserved field size.
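
A small sketch of the offset bookkeeping, assuming the usual mlx5_ifc
convention that an array size is a field width in bits and that a
reserved_at_<N> name records the field's bit offset. The helper
next_field_offset() is hypothetical.

```c
#include <assert.h>

/*
 * Bit offset of the field that follows a field starting at 'offset'
 * with a width of 'width' bits.
 */
static unsigned int next_field_offset(unsigned int offset, unsigned int width)
{
	assert(width > 0);
	return offset + width;
}
```

Under that assumption, reserved_at_260[0x20] ends at bit 0x280, while the
old width of 0x120 ended at bit 0x380, so every following field was placed
0x100 bits too far.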

Fixes: 34c626c3004a ("net/mlx5: Added missing mlx5_ifc definition for HW Steering")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 include/linux/mlx5/mlx5_ifc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 620a5c305123..04df1610736e 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -2115,7 +2115,7 @@ struct mlx5_ifc_cmd_hca_cap_2_bits {
 	u8	   ts_cqe_metadata_size2wqe_counter[0x5];
 	u8	   reserved_at_250[0x10];
 
-	u8	   reserved_at_260[0x120];
+	u8	   reserved_at_260[0x20];
 
 	u8	   format_select_dw_gtpu_dw_0[0x8];
 	u8	   format_select_dw_gtpu_dw_1[0x8];
-- 
2.46.1



* [net 5/8] net/mlx5: HWS, fixed double-free in error flow of creating SQ
  2024-09-25 20:20 [pull request][net 0/8] mlx5 fixes 2024-09-25 Saeed Mahameed
                   ` (3 preceding siblings ...)
  2024-09-25 20:20 ` [net 4/8] net/mlx5: Fix wrong reserved field in hca_cap_2 in mlx5_ifc Saeed Mahameed
@ 2024-09-25 20:20 ` Saeed Mahameed
  2024-09-25 20:20 ` [net 6/8] net/mlx5: HWS, changed E2BIG error to a negative return code Saeed Mahameed
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Saeed Mahameed @ 2024-09-25 20:20 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman,
	Leon Romanovsky, Yevgeny Kliteynik, Dan Carpenter

From: Yevgeny Kliteynik <kliteyn@nvidia.com>

When SQ creation fails, call the appropriate mlx5_core destroy function.

This fixes the following smatch warnings:
  drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.c:739
    hws_send_ring_open_sq() warn: 'sq->dep_wqe' double freed
    hws_send_ring_open_sq() warn: 'sq->wq_ctrl.buf.frags' double freed
    hws_send_ring_open_sq() warn: 'sq->wr_priv' double freed

Fixes: 2ca62599aa0b ("net/mlx5: HWS, added send engine and context handling")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/e4ebc227-4b25-49bf-9e4c-14b7ea5c6a07@stanley.mountain/
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/steering/hws/mlx5hws_send.c        | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.c
index a1adbb48735c..0c7989184c30 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.c
@@ -653,6 +653,12 @@ static int hws_send_ring_create_sq(struct mlx5_core_dev *mdev, u32 pdn,
 	return err;
 }
 
+static void hws_send_ring_destroy_sq(struct mlx5_core_dev *mdev,
+				     struct mlx5hws_send_ring_sq *sq)
+{
+	mlx5_core_destroy_sq(mdev, sq->sqn);
+}
+
 static int hws_send_ring_set_sq_rdy(struct mlx5_core_dev *mdev, u32 sqn)
 {
 	void *in, *sqc;
@@ -696,7 +702,7 @@ static int hws_send_ring_create_sq_rdy(struct mlx5_core_dev *mdev, u32 pdn,
 
 	err = hws_send_ring_set_sq_rdy(mdev, sq->sqn);
 	if (err)
-		hws_send_ring_close_sq(sq);
+		hws_send_ring_destroy_sq(mdev, sq);
 
 	return err;
 }
-- 
2.46.1



* [net 6/8] net/mlx5: HWS, changed E2BIG error to a negative return code
  2024-09-25 20:20 [pull request][net 0/8] mlx5 fixes 2024-09-25 Saeed Mahameed
                   ` (4 preceding siblings ...)
  2024-09-25 20:20 ` [net 5/8] net/mlx5: HWS, fixed double-free in error flow of creating SQ Saeed Mahameed
@ 2024-09-25 20:20 ` Saeed Mahameed
  2024-09-25 20:20 ` [net 7/8] net/mlx5e: SHAMPO, Fix overflow of hd_per_wq Saeed Mahameed
  2024-09-25 20:20 ` [net 8/8] net/mlx5e: Fix crash caused by calling __xfrm_state_delete() twice Saeed Mahameed
  7 siblings, 0 replies; 10+ messages in thread
From: Saeed Mahameed @ 2024-09-25 20:20 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman,
	Leon Romanovsky, Yevgeny Kliteynik, Dan Carpenter

From: Yevgeny Kliteynik <kliteyn@nvidia.com>

Fixed all the 'E2BIG' returns in error flow of functions to
the negative '-E2BIG' as we are using negative error codes
everywhere in HWS code.
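
A minimal user-space sketch of the convention, with hypothetical stand-ins
find_fit() and needs_complex_matcher() rather than the HWS functions: the
callee returns a negative errno value and the caller compares against the
same negative value.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical layout search: fails with -E2BIG when the match does not fit. */
static int find_fit(int needed, int available)
{
	assert(needed >= 0 && available >= 0);
	if (needed <= available)
		return 0;

	return -E2BIG; /* negative, like other kernel error returns */
}

/* The caller must test for the negative code it actually receives. */
static bool needs_complex_matcher(int needed, int available)
{
	int ret = find_fit(needed, available);

	return ret == -E2BIG;
}
```

Keeping the sign consistent also lets generic checks such as 'ret < 0'
treat this error like any other.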

This also fixes the following smatch warnings:
	"warn: was negative '-E2BIG' intended?"

Fixes: 74a778b4a63f ("net/mlx5: HWS, added definers handling")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/f8c77688-7d83-4937-baba-ac844dfe2e0b@stanley.mountain/
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.c     | 2 +-
 .../mellanox/mlx5/core/steering/hws/mlx5hws_definer.c         | 4 ++--
 .../mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c         | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.c
index bb563f50ef09..601fad5fc54a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.c
@@ -33,7 +33,7 @@ bool mlx5hws_bwc_match_params_is_complex(struct mlx5hws_context *ctx,
 		 * and let the usual match creation path handle it,
 		 * both for good and bad flows.
 		 */
-		if (ret == E2BIG) {
+		if (ret == -E2BIG) {
 			is_complex = true;
 			mlx5hws_dbg(ctx, "Matcher definer layout: need complex matcher\n");
 		} else {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.c
index 3bdb5c90efff..d566d2ddf424 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.c
@@ -1845,7 +1845,7 @@ hws_definer_find_best_match_fit(struct mlx5hws_context *ctx,
 		return 0;
 	}
 
-	return E2BIG;
+	return -E2BIG;
 }
 
 static void
@@ -1931,7 +1931,7 @@ mlx5hws_definer_calc_layout(struct mlx5hws_context *ctx,
 	/* Find the match definer layout for header layout match union */
 	ret = hws_definer_find_best_match_fit(ctx, match_definer, match_hl);
 	if (ret) {
-		if (ret == E2BIG)
+		if (ret == -E2BIG)
 			mlx5hws_dbg(ctx,
 				    "Failed to create match definer from header layout - E2BIG\n");
 		else
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c
index 33d2b31e4b46..61a1155d4b4f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c
@@ -675,7 +675,7 @@ static int hws_matcher_bind_mt(struct mlx5hws_matcher *matcher)
 	if (!(matcher->flags & MLX5HWS_MATCHER_FLAGS_COLLISION)) {
 		ret = mlx5hws_definer_mt_init(ctx, matcher->mt);
 		if (ret) {
-			if (ret == E2BIG)
+			if (ret == -E2BIG)
 				mlx5hws_err(ctx, "Failed to set matcher templates with match definers\n");
 			return ret;
 		}
-- 
2.46.1



* [net 7/8] net/mlx5e: SHAMPO, Fix overflow of hd_per_wq
  2024-09-25 20:20 [pull request][net 0/8] mlx5 fixes 2024-09-25 Saeed Mahameed
                   ` (5 preceding siblings ...)
  2024-09-25 20:20 ` [net 6/8] net/mlx5: HWS, changed E2BIG error to a negative return code Saeed Mahameed
@ 2024-09-25 20:20 ` Saeed Mahameed
  2024-09-25 20:20 ` [net 8/8] net/mlx5e: Fix crash caused by calling __xfrm_state_delete() twice Saeed Mahameed
  7 siblings, 0 replies; 10+ messages in thread
From: Saeed Mahameed @ 2024-09-25 20:20 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman,
	Leon Romanovsky, Dragos Tatulea

From: Dragos Tatulea <dtatulea@nvidia.com>

When having large RQ sizes and small MTU sizes, the hd_per_wq variable
can overflow, as in the following case:

$> ethtool --set-ring eth1 rx 8192
$> ip link set dev eth1 mtu 144
$> ethtool --features eth1 rx-gro-hw on

... yields in dmesg:

mlx5_core 0000:08:00.1: mlx5_cmd_out_err:808:(pid 194797): CREATE_MKEY(0x200) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x3bf6f), err(-22)

because hd_per_wq is 64K which overflows to 0 and makes the command
fail.

This patch increases the variable size to 32 bit.
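
The truncation itself can be shown with a short user-space sketch. The
helpers store_u16(), store_u32() and check_64k() are hypothetical and only
model the width of the field, not how the driver computes the value.

```c
#include <assert.h>
#include <stdint.h>

/* Store a count in a 16-bit field, as the old u16 hd_per_wq did. */
static uint16_t store_u16(uint32_t count)
{
	return (uint16_t)count; /* keeps only the low 16 bits */
}

/* Store the same count in a 32-bit field, as the patched u32 hd_per_wq does. */
static uint32_t store_u32(uint32_t count)
{
	return count;
}

/* 64K is the first value that no longer fits in 16 bits. */
static void check_64k(void)
{
	assert(store_u16(64 * 1024) == 0);
	assert(store_u32(64 * 1024) == 65536);
}
```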

Fixes: 99be56171fa9 ("net/mlx5e: SHAMPO, Re-enable HW-GRO")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index da0a1c65ec4a..57b7298a0e79 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -627,7 +627,7 @@ struct mlx5e_shampo_hd {
 	struct mlx5e_dma_info *info;
 	struct mlx5e_frag_page *pages;
 	u16 curr_page_index;
-	u16 hd_per_wq;
+	u32 hd_per_wq;
 	u16 hd_per_wqe;
 	unsigned long *bitmap;
 	u16 pi;
-- 
2.46.1



* [net 8/8] net/mlx5e: Fix crash caused by calling __xfrm_state_delete() twice
  2024-09-25 20:20 [pull request][net 0/8] mlx5 fixes 2024-09-25 Saeed Mahameed
                   ` (6 preceding siblings ...)
  2024-09-25 20:20 ` [net 7/8] net/mlx5e: SHAMPO, Fix overflow of hd_per_wq Saeed Mahameed
@ 2024-09-25 20:20 ` Saeed Mahameed
  7 siblings, 0 replies; 10+ messages in thread
From: Saeed Mahameed @ 2024-09-25 20:20 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman,
	Leon Romanovsky, Jianbo Liu

From: Jianbo Liu <jianbol@nvidia.com>

The km.state is not checked in the driver's delayed work. When
xfrm_state_check_expire() is called, the state can be reset to
XFRM_STATE_EXPIRED, even if it is XFRM_STATE_DEAD already. This
happens when the xfrm state is deleted, but not freed yet. As
__xfrm_state_delete() is called again in the xfrm timer, the following
crash occurs.

To fix this issue, skip xfrm_state_check_expire() if km.state is not
XFRM_STATE_VALID.
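
A simplified user-space model of the ordering problem, separate from the
trace below. The types and helpers (sa_state, struct sa, check_expire(),
handle_sw_limits()) are hypothetical; the model only assumes that the
expiry check sets the expired state without looking at the current one.

```c
#include <assert.h>
#include <stdbool.h>

enum sa_state { SA_VALID, SA_EXPIRED, SA_DEAD };

struct sa {
	enum sa_state state;
	bool hard_limit_hit;
};

/* Model of the expiry check: it does not look at the current state. */
static void check_expire(struct sa *x)
{
	if (x->hard_limit_hit)
		x->state = SA_EXPIRED;
}

/* 'patched' selects the new ordering: only VALID states are checked. */
static void handle_sw_limits(struct sa *x, bool patched)
{
	assert(x);

	if (!patched) {
		check_expire(x); /* may turn DEAD back into EXPIRED */
		return;
	}

	if (x->state != SA_VALID)
		return; /* already expired or dead: leave it alone */

	check_expire(x);
}
```

In this model a DEAD entry with its limit hit becomes EXPIRED again under
the old ordering and stays DEAD under the patched one.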

 Oops: general protection fault, probably for non-canonical address 0xdead000000000108: 0000 [#1] SMP
 CPU: 5 UID: 0 PID: 7448 Comm: kworker/u102:2 Not tainted 6.11.0-rc2+ #1
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Workqueue: mlx5e_ipsec: eth%d mlx5e_ipsec_handle_sw_limits [mlx5_core]
 RIP: 0010:__xfrm_state_delete+0x3d/0x1b0
 Code: 0f 84 8b 01 00 00 48 89 fd c6 87 c8 00 00 00 05 48 8d bb 40 10 00 00 e8 11 04 1a 00 48 8b 95 b8 00 00 00 48 8b 85 c0 00 00 00 <48> 89 42 08 48 89 10 48 8b 55 10 48 b8 00 01 00 00 00 00 ad de 48
 RSP: 0018:ffff88885f945ec8 EFLAGS: 00010246
 RAX: dead000000000122 RBX: ffffffff82afa940 RCX: 0000000000000036
 RDX: dead000000000100 RSI: 0000000000000000 RDI: ffffffff82afb980
 RBP: ffff888109a20340 R08: ffff88885f945ea0 R09: 0000000000000000
 R10: 0000000000000000 R11: ffff88885f945ff8 R12: 0000000000000246
 R13: ffff888109a20340 R14: ffff88885f95f420 R15: ffff88885f95f400
 FS:  0000000000000000(0000) GS:ffff88885f940000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f2163102430 CR3: 00000001128d6001 CR4: 0000000000370eb0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <IRQ>
  ? die_addr+0x33/0x90
  ? exc_general_protection+0x1a2/0x390
  ? asm_exc_general_protection+0x22/0x30
  ? __xfrm_state_delete+0x3d/0x1b0
  ? __xfrm_state_delete+0x2f/0x1b0
  xfrm_timer_handler+0x174/0x350
  ? __xfrm_state_delete+0x1b0/0x1b0
  __hrtimer_run_queues+0x121/0x270
  hrtimer_run_softirq+0x88/0xd0
  handle_softirqs+0xcc/0x270
  do_softirq+0x3c/0x50
  </IRQ>
  <TASK>
  __local_bh_enable_ip+0x47/0x50
  mlx5e_ipsec_handle_sw_limits+0x7d/0x90 [mlx5_core]
  process_one_work+0x137/0x2d0
  worker_thread+0x28d/0x3a0
  ? rescuer_thread+0x480/0x480
  kthread+0xb8/0xe0
  ? kthread_park+0x80/0x80
  ret_from_fork+0x2d/0x50
  ? kthread_park+0x80/0x80
  ret_from_fork_asm+0x11/0x20
  </TASK>

Fixes: b2f7b01d36a9 ("net/mlx5e: Simulate missing IPsec TX limits hardware functionality")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
index 3d274599015b..ca92e518be76 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
@@ -67,7 +67,6 @@ static void mlx5e_ipsec_handle_sw_limits(struct work_struct *_work)
 		return;
 
 	spin_lock_bh(&x->lock);
-	xfrm_state_check_expire(x);
 	if (x->km.state == XFRM_STATE_EXPIRED) {
 		sa_entry->attrs.drop = true;
 		spin_unlock_bh(&x->lock);
@@ -75,6 +74,13 @@ static void mlx5e_ipsec_handle_sw_limits(struct work_struct *_work)
 		mlx5e_accel_ipsec_fs_modify(sa_entry);
 		return;
 	}
+
+	if (x->km.state != XFRM_STATE_VALID) {
+		spin_unlock_bh(&x->lock);
+		return;
+	}
+
+	xfrm_state_check_expire(x);
 	spin_unlock_bh(&x->lock);
 
 	queue_delayed_work(sa_entry->ipsec->wq, &dwork->dwork,
-- 
2.46.1



* Re: [net 1/8] net/mlx5: Fix error path in multi-packet WQE transmit
  2024-09-25 20:20 ` [net 1/8] net/mlx5: Fix error path in multi-packet WQE transmit Saeed Mahameed
@ 2024-10-03  0:30   ` patchwork-bot+netdevbpf
  0 siblings, 0 replies; 10+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-10-03  0:30 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: davem, kuba, pabeni, edumazet, saeedm, netdev, tariqt, gal,
	leonro, gbayer, yanjun.zhu, maxtram95

Hello:

This series was applied to netdev/net.git (main)
by Saeed Mahameed <saeedm@nvidia.com>:

On Wed, 25 Sep 2024 13:20:06 -0700 you wrote:
> From: Gerd Bayer <gbayer@linux.ibm.com>
> 
> Remove the erroneous unmap in case no DMA mapping was established
> 
> The multi-packet WQE transmit code attempts to obtain a DMA mapping for
> the skb. This could fail, e.g. under memory pressure, when the IOMMU
> driver just can't allocate more memory for page tables. While the code
> tries to handle this in the path below the err_unmap label it erroneously
> unmaps one entry from the sq's FIFO list of active mappings. Since the
> current map attempt failed this unmap is removing some random DMA mapping
> that might still be required. If the PCI function now presents that IOVA,
> the IOMMU may assume a rogue DMA access and, e.g. on s390, put the PCI
> function into an error state.
> 
> [...]

Here is the summary with links:
  - [net,1/8] net/mlx5: Fix error path in multi-packet WQE transmit
    https://git.kernel.org/netdev/net/c/2bcae12c795f
  - [net,2/8] net/mlx5: Added cond_resched() to crdump collection
    https://git.kernel.org/netdev/net/c/ec7931558941
  - [net,3/8] net/mlx5e: Fix NULL deref in mlx5e_tir_builder_alloc()
    https://git.kernel.org/netdev/net/c/f25389e77950
  - [net,4/8] net/mlx5: Fix wrong reserved field in hca_cap_2 in mlx5_ifc
    https://git.kernel.org/netdev/net/c/19da17010a55
  - [net,5/8] net/mlx5: HWS, fixed double-free in error flow of creating SQ
    https://git.kernel.org/netdev/net/c/d8c561741ef8
  - [net,6/8] net/mlx5: HWS, changed E2BIG error to a negative return code
    https://git.kernel.org/netdev/net/c/d15525f30010
  - [net,7/8] net/mlx5e: SHAMPO, Fix overflow of hd_per_wq
    https://git.kernel.org/netdev/net/c/023d2a43ed0d
  - [net,8/8] net/mlx5e: Fix crash caused by calling __xfrm_state_delete() twice
    https://git.kernel.org/netdev/net/c/7b124695db40

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



