public inbox for linux-rdma@vger.kernel.org
* [PATCH net 0/5] mlx5 misc fixes 2025-12-25
@ 2025-12-25 13:27 Mark Bloch
  2025-12-25 13:27 ` [PATCH net 1/5] net/mlx5: Lag, multipath, give priority for routes with smaller network prefix Mark Bloch
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Mark Bloch @ 2025-12-25 13:27 UTC (permalink / raw)
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch, netdev,
	linux-rdma, linux-kernel, Gal Pressman

Hi,

This patchset provides misc bug fixes from the team to the mlx5 core and
Eth drivers.

Alexei Lazar (1):
  net/mlx5e: Don't gate FEC histograms on ppcnt_statistical_group

Cosmin Ratiu (1):
  net/mlx5e: Dealloc forgotten PSP RX modify header

Gal Pressman (2):
  net/mlx5e: Fix NULL pointer dereference in ioctl module EEPROM query
  net/mlx5e: Don't print error message due to invalid module

Patrisious Haddad (1):
  net/mlx5: Lag, multipath, give priority for routes with smaller network prefix

 .../net/ethernet/mellanox/mlx5/core/en_accel/psp.c | 14 +++++++++++---
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c |  9 +++++----
 drivers/net/ethernet/mellanox/mlx5/core/lag/mp.c   |  9 +++++++--
 drivers/net/ethernet/mellanox/mlx5/core/port.c     |  9 ++++++---
 4 files changed, 29 insertions(+), 12 deletions(-)


base-commit: 6402078bd9d1ed46e79465e1faaa42e3458f8a33
-- 
2.34.1



* [PATCH net 1/5] net/mlx5: Lag, multipath, give priority for routes with smaller network prefix
  2025-12-25 13:27 [PATCH net 0/5] mlx5 misc fixes 2025-12-25 Mark Bloch
@ 2025-12-25 13:27 ` Mark Bloch
  2025-12-25 13:27 ` [PATCH net 2/5] net/mlx5e: Don't gate FEC histograms on ppcnt_statistical_group Mark Bloch
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Mark Bloch @ 2025-12-25 13:27 UTC (permalink / raw)
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch, netdev,
	linux-rdma, linux-kernel, Gal Pressman, Patrisious Haddad

From: Patrisious Haddad <phaddad@nvidia.com>

Today, multipath offload is controlled by a single route, and a route
becomes the controlling route if it meets one of the following criteria:
        1. No controlling route is set.
        2. New route destination is the same as old one.
        3. New route metric is lower than old route metric.

This can cause unwanted behaviour when a new route with a smaller
network prefix is added, since that route should get priority.

Fix this by adding a new criterion that gives priority to a new route
with a smaller network prefix.

Fixes: ad11c4f1d8fd ("net/mlx5e: Lag, Only handle events from highest priority multipath entry")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/lag/mp.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/mp.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/mp.c
index aee17fcf3b36..cdc99fe5c956 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/mp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/mp.c
@@ -173,10 +173,15 @@ static void mlx5_lag_fib_route_event(struct mlx5_lag *ldev, unsigned long event,
 	}
 
 	/* Handle multipath entry with lower priority value */
-	if (mp->fib.mfi && mp->fib.mfi != fi &&
+	if (mp->fib.mfi &&
 	    (mp->fib.dst != fen_info->dst || mp->fib.dst_len != fen_info->dst_len) &&
-	    fi->fib_priority >= mp->fib.priority)
+	    mp->fib.dst_len <= fen_info->dst_len &&
+	    !(mp->fib.dst_len == fen_info->dst_len &&
+	      fi->fib_priority < mp->fib.priority)) {
+		mlx5_core_dbg(ldev->pf[idx].dev,
+			      "Multipath entry with lower priority was rejected\n");
 		return;
+	}
 
 	nh_dev0 = mlx5_lag_get_next_fib_dev(ldev, fi, NULL);
 	nh_dev1 = mlx5_lag_get_next_fib_dev(ldev, fi, nh_dev0);
-- 
2.34.1
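
The rejection condition in the hunk above can be read as a standalone
predicate. The following is only a sketch of that logic under stated
assumptions: struct route_key and route_rejected() are hypothetical
names that do not exist in the driver, and the fields merely mirror
dst, dst_len and priority as they are used in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the route fields compared in the diff. */
struct route_key {
	uint32_t dst;      /* destination network */
	int dst_len;       /* prefix length, compared as in the diff */
	uint32_t priority; /* route metric, lower value wins */
};

/*
 * Returns true when the candidate route would be rejected, i.e. the
 * current controlling route is kept. Mirrors the patched condition:
 * a candidate is accepted when no controlling route exists, when it
 * has the same destination and prefix length, when its dst_len is
 * smaller, or when dst_len is equal and its metric is lower.
 */
static bool route_rejected(bool have_cur, struct route_key cur,
			   struct route_key cand)
{
	if (!have_cur)
		return false;
	if (cur.dst == cand.dst && cur.dst_len == cand.dst_len)
		return false;
	if (cand.dst_len < cur.dst_len)
		return false;
	if (cand.dst_len == cur.dst_len && cand.priority < cur.priority)
		return false;
	return true;
}
```

Written this way, the prefix-length comparison takes precedence over
the metric: the metric only breaks ties between routes whose dst_len
values are equal.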



* [PATCH net 2/5] net/mlx5e: Don't gate FEC histograms on ppcnt_statistical_group
  2025-12-25 13:27 [PATCH net 0/5] mlx5 misc fixes 2025-12-25 Mark Bloch
  2025-12-25 13:27 ` [PATCH net 1/5] net/mlx5: Lag, multipath, give priority for routes with smaller network prefix Mark Bloch
@ 2025-12-25 13:27 ` Mark Bloch
  2025-12-25 13:27 ` [PATCH net 3/5] net/mlx5e: Fix NULL pointer dereference in ioctl module EEPROM query Mark Bloch
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Mark Bloch @ 2025-12-25 13:27 UTC (permalink / raw)
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch, netdev,
	linux-rdma, linux-kernel, Gal Pressman, Alexei Lazar

From: Alexei Lazar <alazar@nvidia.com>

Currently, the ppcnt_statistical_group capability check
incorrectly gates access to FEC histogram statistics.
This capability applies only to statistical and physical
counter groups, not to histogram data.

Restrict the ppcnt_statistical_group check to the
Physical_Layer_Counters and Physical_Layer_Statistical_Counters
groups.
Histogram statistics access remains gated by the pphcr
capability.

The issue is harmless as of today, as it happens that
ppcnt_statistical_group is set on all existing devices that
have pphcr set.

Fixes: 6b81b8a0b197 ("net/mlx5e: Don't query FEC statistics when FEC is disabled")
Signed-off-by: Alexei Lazar <alazar@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index a2802cfc9b98..a8af84fc9763 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -1608,12 +1608,13 @@ void mlx5e_stats_fec_get(struct mlx5e_priv *priv,
 {
 	int mode = fec_active_mode(priv->mdev);
 
-	if (mode == MLX5E_FEC_NOFEC ||
-	    !MLX5_CAP_PCAM_FEATURE(priv->mdev, ppcnt_statistical_group))
+	if (mode == MLX5E_FEC_NOFEC)
 		return;
 
-	fec_set_corrected_bits_total(priv, fec_stats);
-	fec_set_block_stats(priv, mode, fec_stats);
+	if (MLX5_CAP_PCAM_FEATURE(priv->mdev, ppcnt_statistical_group)) {
+		fec_set_corrected_bits_total(priv, fec_stats);
+		fec_set_block_stats(priv, mode, fec_stats);
+	}
 
 	if (MLX5_CAP_PCAM_REG(priv->mdev, pphcr))
 		fec_set_histograms_stats(priv, mode, hist);
-- 
2.34.1
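
The gating after this patch can be summarised as a small decision
function. This is an illustrative sketch only: fec_groups_to_query()
and the FEC_Q_* flags are hypothetical names, and the three booleans
stand in for the FEC mode check and the two capability checks in the
diff.

```c
#include <stdbool.h>

/* Hypothetical flags naming the two kinds of FEC data in the diff. */
enum {
	FEC_Q_COUNTERS   = 1 << 0, /* corrected bits and block stats */
	FEC_Q_HISTOGRAMS = 1 << 1, /* FEC histogram stats */
};

/*
 * Returns which groups would be queried. Nothing is queried when FEC
 * is disabled; otherwise each group depends only on its own
 * capability, so histograms no longer depend on the
 * ppcnt_statistical_group capability.
 */
static unsigned int fec_groups_to_query(bool fec_enabled,
					bool cap_stat_group,
					bool cap_pphcr)
{
	unsigned int groups = 0;

	if (!fec_enabled)
		return 0;
	if (cap_stat_group)
		groups |= FEC_Q_COUNTERS;
	if (cap_pphcr)
		groups |= FEC_Q_HISTOGRAMS;
	return groups;
}
```

The case that changes is pphcr set with ppcnt_statistical_group clear:
before the patch nothing was queried, afterwards only the histograms
are.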



* [PATCH net 3/5] net/mlx5e: Fix NULL pointer dereference in ioctl module EEPROM query
  2025-12-25 13:27 [PATCH net 0/5] mlx5 misc fixes 2025-12-25 Mark Bloch
  2025-12-25 13:27 ` [PATCH net 1/5] net/mlx5: Lag, multipath, give priority for routes with smaller network prefix Mark Bloch
  2025-12-25 13:27 ` [PATCH net 2/5] net/mlx5e: Don't gate FEC histograms on ppcnt_statistical_group Mark Bloch
@ 2025-12-25 13:27 ` Mark Bloch
  2025-12-25 13:27 ` [PATCH net 4/5] net/mlx5e: Don't print error message due to invalid module Mark Bloch
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Mark Bloch @ 2025-12-25 13:27 UTC (permalink / raw)
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch, netdev,
	linux-rdma, linux-kernel, Gal Pressman, Dragos Tatulea

From: Gal Pressman <gal@nvidia.com>

The mlx5_query_mcia() function unconditionally dereferences the status
pointer to store the MCIA register status value.
However, mlx5e_get_module_id() passes NULL since it doesn't need the
status value.

Add a NULL check before dereferencing the status pointer to prevent a
NULL pointer dereference.

Fixes: 2e4c44b12f4d ("net/mlx5: Refactor EEPROM query error handling to return status separately")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/port.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c
index 85a9e534f442..8f36454dd196 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/port.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c
@@ -393,9 +393,11 @@ static int mlx5_query_mcia(struct mlx5_core_dev *dev,
 	if (err)
 		return err;
 
-	*status = MLX5_GET(mcia_reg, out, status);
-	if (*status)
+	if (MLX5_GET(mcia_reg, out, status)) {
+		if (status)
+			*status = MLX5_GET(mcia_reg, out, status);
 		return -EIO;
+	}
 
 	ptr = MLX5_ADDR_OF(mcia_reg, out, dword_0);
 	memcpy(data, ptr, size);
-- 
2.34.1
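
The fix follows the usual pattern for an optional output parameter.
A minimal sketch of that pattern, assuming a hypothetical helper
query_with_optional_status() in which hw_status stands in for the
value read from the MCIA register:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: reports a non-zero hardware status as -EIO and
 * stores it through 'status' only if the caller supplied a pointer.
 * As in the patched hunk, 'status' is left untouched on success.
 */
static int query_with_optional_status(uint8_t hw_status, uint8_t *status)
{
	if (hw_status) {
		if (status)
			*status = hw_status;
		return -EIO;
	}
	return 0;
}
```

A caller that does not care about the status value can pass NULL, as
mlx5e_get_module_id() does according to the commit message, and still
receive the error code.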



* [PATCH net 4/5] net/mlx5e: Don't print error message due to invalid module
  2025-12-25 13:27 [PATCH net 0/5] mlx5 misc fixes 2025-12-25 Mark Bloch
                   ` (2 preceding siblings ...)
  2025-12-25 13:27 ` [PATCH net 3/5] net/mlx5e: Fix NULL pointer dereference in ioctl module EEPROM query Mark Bloch
@ 2025-12-25 13:27 ` Mark Bloch
  2025-12-25 13:27 ` [PATCH net 5/5] net/mlx5e: Dealloc forgotten PSP RX modify header Mark Bloch
  2026-01-04 19:12 ` [PATCH net 0/5] mlx5 misc fixes 2025-12-25 patchwork-bot+netdevbpf
  5 siblings, 0 replies; 7+ messages in thread
From: Mark Bloch @ 2025-12-25 13:27 UTC (permalink / raw)
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch, netdev,
	linux-rdma, linux-kernel, Gal Pressman

From: Gal Pressman <gal@nvidia.com>

Dumping module EEPROM on newer modules is supported through the netlink
interface only.

Querying with old userspace ethtool (or other tools, such as 'lshw')
which still uses the ioctl interface results in an error message that
could flood dmesg (in addition to the expected error return value).
The original message was added under the assumption that the driver
should be able to handle all module types, but now that such flows are
easily triggered from userspace, it doesn't serve its purpose.

Change the log level of the print in mlx5_query_module_eeprom() to
debug.

Fixes: bb64143eee8c ("net/mlx5e: Add ethtool support for dump module EEPROM")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/port.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c
index 8f36454dd196..7f8bed353e67 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/port.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c
@@ -431,7 +431,8 @@ int mlx5_query_module_eeprom(struct mlx5_core_dev *dev,
 		mlx5_qsfp_eeprom_params_set(&query.i2c_address, &query.page, &offset);
 		break;
 	default:
-		mlx5_core_err(dev, "Module ID not recognized: 0x%x\n", module_id);
+		mlx5_core_dbg(dev, "Module ID not recognized: 0x%x\n",
+			      module_id);
 		return -EINVAL;
 	}
 
-- 
2.34.1



* [PATCH net 5/5] net/mlx5e: Dealloc forgotten PSP RX modify header
  2025-12-25 13:27 [PATCH net 0/5] mlx5 misc fixes 2025-12-25 Mark Bloch
                   ` (3 preceding siblings ...)
  2025-12-25 13:27 ` [PATCH net 4/5] net/mlx5e: Don't print error message due to invalid module Mark Bloch
@ 2025-12-25 13:27 ` Mark Bloch
  2026-01-04 19:12 ` [PATCH net 0/5] mlx5 misc fixes 2025-12-25 patchwork-bot+netdevbpf
  5 siblings, 0 replies; 7+ messages in thread
From: Mark Bloch @ 2025-12-25 13:27 UTC (permalink / raw)
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch, netdev,
	linux-rdma, linux-kernel, Gal Pressman, Cosmin Ratiu,
	Dragos Tatulea

From: Cosmin Ratiu <cratiu@nvidia.com>

The commit which added RX steering rules for PSP forgot to free a modify
header HW object on the cleanup path, which led to health errors when
reloading the driver and uninitializing the device:

mlx5_core 0000:08:00.0: poll_health:803:(pid 3021): Fatal error 3 detected

Fix that by saving the modify header pointer in the PSP steering struct
and deallocating it after freeing the rule which references it.

Fixes: 9536fbe10c9d ("net/mlx5e: Add PSP steering in local NIC RX")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/en_accel/psp.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c
index 38e7c77cc851..9a74438ce10a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c
@@ -44,6 +44,7 @@ struct mlx5e_accel_fs_psp_prot {
 	struct mlx5_flow_table *ft;
 	struct mlx5_flow_group *miss_group;
 	struct mlx5_flow_handle *miss_rule;
+	struct mlx5_modify_hdr *rx_modify_hdr;
 	struct mlx5_flow_destination default_dest;
 	struct mlx5e_psp_rx_err rx_err;
 	u32 refcnt;
@@ -286,13 +287,19 @@ static int accel_psp_fs_rx_err_create_ft(struct mlx5e_psp_fs *fs,
 	return err;
 }
 
-static void accel_psp_fs_rx_fs_destroy(struct mlx5e_accel_fs_psp_prot *fs_prot)
+static void accel_psp_fs_rx_fs_destroy(struct mlx5e_psp_fs *fs,
+				       struct mlx5e_accel_fs_psp_prot *fs_prot)
 {
 	if (fs_prot->def_rule) {
 		mlx5_del_flow_rules(fs_prot->def_rule);
 		fs_prot->def_rule = NULL;
 	}
 
+	if (fs_prot->rx_modify_hdr) {
+		mlx5_modify_header_dealloc(fs->mdev, fs_prot->rx_modify_hdr);
+		fs_prot->rx_modify_hdr = NULL;
+	}
+
 	if (fs_prot->miss_rule) {
 		mlx5_del_flow_rules(fs_prot->miss_rule);
 		fs_prot->miss_rule = NULL;
@@ -396,6 +403,7 @@ static int accel_psp_fs_rx_create_ft(struct mlx5e_psp_fs *fs,
 		modify_hdr = NULL;
 		goto out_err;
 	}
+	fs_prot->rx_modify_hdr = modify_hdr;
 
 	flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST |
 			  MLX5_FLOW_CONTEXT_ACTION_CRYPTO_DECRYPT |
@@ -416,7 +424,7 @@ static int accel_psp_fs_rx_create_ft(struct mlx5e_psp_fs *fs,
 	goto out;
 
 out_err:
-	accel_psp_fs_rx_fs_destroy(fs_prot);
+	accel_psp_fs_rx_fs_destroy(fs, fs_prot);
 out:
 	kvfree(flow_group_in);
 	kvfree(spec);
@@ -433,7 +441,7 @@ static int accel_psp_fs_rx_destroy(struct mlx5e_psp_fs *fs, enum accel_fs_psp_ty
 	/* The netdev unreg already happened, so all offloaded rule are already removed */
 	fs_prot = &accel_psp->fs_prot[type];
 
-	accel_psp_fs_rx_fs_destroy(fs_prot);
+	accel_psp_fs_rx_fs_destroy(fs, fs_prot);
 
 	accel_psp_fs_rx_err_destroy_ft(fs, &fs_prot->rx_err);
 
-- 
2.34.1



* Re: [PATCH net 0/5] mlx5 misc fixes 2025-12-25
  2025-12-25 13:27 [PATCH net 0/5] mlx5 misc fixes 2025-12-25 Mark Bloch
                   ` (4 preceding siblings ...)
  2025-12-25 13:27 ` [PATCH net 5/5] net/mlx5e: Dealloc forgotten PSP RX modify header Mark Bloch
@ 2026-01-04 19:12 ` patchwork-bot+netdevbpf
  5 siblings, 0 replies; 7+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-01-04 19:12 UTC (permalink / raw)
  To: Mark Bloch
  Cc: edumazet, kuba, pabeni, andrew+netdev, davem, saeedm, leon,
	tariqt, netdev, linux-rdma, linux-kernel, gal

Hello:

This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 25 Dec 2025 15:27:12 +0200 you wrote:
> Hi,
> 
> This patchset provides misc bug fixes from the team to the mlx5 core and
> Eth drivers.
> 
> Alexei Lazar (1):
>   net/mlx5e: Don't gate FEC histograms on ppcnt_statistical_group
> 
> [...]

Here is the summary with links:
  - [net,1/5] net/mlx5: Lag, multipath, give priority for routes with smaller network prefix
    https://git.kernel.org/netdev/net/c/31057979cdad
  - [net,2/5] net/mlx5e: Don't gate FEC histograms on ppcnt_statistical_group
    https://git.kernel.org/netdev/net/c/6c75dc9de40f
  - [net,3/5] net/mlx5e: Fix NULL pointer dereference in ioctl module EEPROM query
    https://git.kernel.org/netdev/net/c/7d36a4a8bf62
  - [net,4/5] net/mlx5e: Don't print error message due to invalid module
    https://git.kernel.org/netdev/net/c/144297e2a24e
  - [net,5/5] net/mlx5e: Dealloc forgotten PSP RX modify header
    https://git.kernel.org/netdev/net/c/0462a15d2d1f

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


