* [PATCH net V3 1/3] net/mlx5e: psp: Fix invalid access on PSP dev registration fail
2026-05-04 18:10 [PATCH net V3 0/3] net/mlx5e: PSP fixes Tariq Toukan
@ 2026-05-04 18:10 ` Tariq Toukan
2026-05-06 2:11 ` Jakub Kicinski
2026-05-04 18:10 ` [PATCH net V3 2/3] net/mlx5e: psp: Expose only a fully initialized priv->psp Tariq Toukan
` (2 subsequent siblings)
3 siblings, 1 reply; 6+ messages in thread
From: Tariq Toukan @ 2026-05-04 18:10 UTC
To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller
Cc: Boris Pismenny, Saeed Mahameed, Leon Romanovsky, Tariq Toukan,
Mark Bloch, Daniel Zahka, Willem de Bruijn, Cosmin Ratiu,
Raed Salem, Rahul Rameshbabu, Dragos Tatulea, Kees Cook, netdev,
linux-rdma, linux-kernel, Gal Pressman
From: Cosmin Ratiu <cratiu@nvidia.com>
priv->psp->psp is initialized with the PSP device as returned by
psp_dev_create(). This could also return an error pointer, in which case
a future psp_dev_unregister() would operate on that error pointer,
causing an invalid access.

Avoid that by using a local variable and only saving the PSP device when
registration succeeds.

In case psp_dev_create() fails, priv->psp and steering structs are left
in place, but they will be inert. The unchecked access of priv->psp in
mlx5e_psp_offload_handle_rx_skb() won't happen because without a PSP
device, no SAs can be added and therefore no packets will be
successfully decrypted and handed off to the SW handler.

Fixes: 89ee2d92f66c ("net/mlx5e: Support PSP offload functionality")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
.../mellanox/mlx5/core/en_accel/psp.c | 26 ++++++++++++-------
1 file changed, 17 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c
index 6a50b6dec0fa..1ff818fb48df 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c
@@ -1070,29 +1070,37 @@ static struct psp_dev_ops mlx5_psp_ops = {

 void mlx5e_psp_unregister(struct mlx5e_priv *priv)
 {
-	if (!priv->psp || !priv->psp->psp)
+	struct mlx5e_psp *psp = priv->psp;
+
+	if (!psp || !psp->psp)
 		return;

-	psp_dev_unregister(priv->psp->psp);
+	psp_dev_unregister(psp->psp);
+	psp->psp = NULL;
 }

 void mlx5e_psp_register(struct mlx5e_priv *priv)
 {
+	struct mlx5e_psp *psp = priv->psp;
+	struct psp_dev *psd;
+
 	/* FW Caps missing */
 	if (!priv->psp)
 		return;

-	priv->psp->caps.assoc_drv_spc = sizeof(u32);
-	priv->psp->caps.versions = 1 << PSP_VERSION_HDR0_AES_GCM_128;
+	psp->caps.assoc_drv_spc = sizeof(u32);
+	psp->caps.versions = 1 << PSP_VERSION_HDR0_AES_GCM_128;
 	if (MLX5_CAP_PSP(priv->mdev, psp_crypto_esp_aes_gcm_256_encrypt) &&
 	    MLX5_CAP_PSP(priv->mdev, psp_crypto_esp_aes_gcm_256_decrypt))
-		priv->psp->caps.versions |= 1 << PSP_VERSION_HDR0_AES_GCM_256;
+		psp->caps.versions |= 1 << PSP_VERSION_HDR0_AES_GCM_256;

-	priv->psp->psp = psp_dev_create(priv->netdev, &mlx5_psp_ops,
-					&priv->psp->caps, NULL);
-	if (IS_ERR(priv->psp->psp))
+	psd = psp_dev_create(priv->netdev, &mlx5_psp_ops, &psp->caps, NULL);
+	if (IS_ERR(psd)) {
 		mlx5_core_err(priv->mdev, "PSP failed to register due to %pe\n",
-			      priv->psp->psp);
+			      psd);
+		return;
+	}
+	psp->psp = psd;
 }

 int mlx5e_psp_init(struct mlx5e_priv *priv)
--
2.44.0
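
The idea in patch 1/3 (keep the result of the create call in a local variable and store it in the long-lived struct only on success) can be sketched in plain userspace C. This is only an illustration under stated assumptions: the ERR_PTR helpers below are simplified stand-ins for the kernel ones, and fake_dev, fake_psp, fake_dev_create(), fake_register() and fake_unregister() are hypothetical names, not the mlx5 or PSP core API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(intptr_t err) { return (void *)err; }
static inline intptr_t PTR_ERR(const void *p) { return (intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct fake_dev { int registered; };		/* hypothetical device object */
struct fake_psp { struct fake_dev *dev; };	/* hypothetical holder, plays the role of priv->psp */

static struct fake_dev the_dev;

/* Hypothetical creator: returns a device, or an encoded error if 'fail' is set. */
static struct fake_dev *fake_dev_create(int fail)
{
	if (fail)
		return ERR_PTR(-ENOMEM);
	the_dev.registered = 1;
	return &the_dev;
}

/* Store the device only when creation succeeded. */
static int fake_register(struct fake_psp *psp, int fail)
{
	struct fake_dev *dev = fake_dev_create(fail);

	if (IS_ERR(dev))
		return (int)PTR_ERR(dev);	/* holder keeps NULL, never an error pointer */
	psp->dev = dev;
	return 0;
}

/* Returns 1 if a device was unregistered, 0 if there was nothing to do. */
static int fake_unregister(struct fake_psp *psp)
{
	if (!psp || !psp->dev)
		return 0;
	psp->dev->registered = 0;
	psp->dev = NULL;			/* a repeated call becomes a no-op */
	return 1;
}
```

With this shape, a NULL check in the unregister path is sufficient, because the holder can only ever contain NULL or a valid device.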
* Re: [PATCH net V3 1/3] net/mlx5e: psp: Fix invalid access on PSP dev registration fail
From: Jakub Kicinski @ 2026-05-06 2:11 UTC
To: Tariq Toukan
Cc: Eric Dumazet, Paolo Abeni, Andrew Lunn, David S. Miller,
Boris Pismenny, Saeed Mahameed, Leon Romanovsky, Mark Bloch,
Daniel Zahka, Willem de Bruijn, Cosmin Ratiu, Raed Salem,
Rahul Rameshbabu, Dragos Tatulea, Kees Cook, netdev, linux-rdma,
linux-kernel, Gal Pressman
On Mon, 4 May 2026 21:10:58 +0300 Tariq Toukan wrote:
> - if (!priv->psp || !priv->psp->psp)
> + struct mlx5e_psp *psp = priv->psp;
> +
> + if (!psp || !psp->psp)
> return;
>
> - psp_dev_unregister(priv->psp->psp);
> + psp_dev_unregister(psp->psp);
> + psp->psp = NULL;
TBH the pointless churn to add a local variable here was what I was
referring to when talking about unnecessary refactoring. One line
change to clear the pointer and you're turning it to a full rewrite
of the helper.
Whatever. Some things can't be taught I guess :\
* [PATCH net V3 2/3] net/mlx5e: psp: Expose only a fully initialized priv->psp
From: Tariq Toukan @ 2026-05-04 18:10 UTC
To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller
Cc: Boris Pismenny, Saeed Mahameed, Leon Romanovsky, Tariq Toukan,
Mark Bloch, Daniel Zahka, Willem de Bruijn, Cosmin Ratiu,
Raed Salem, Rahul Rameshbabu, Dragos Tatulea, Kees Cook, netdev,
linux-rdma, linux-kernel, Gal Pressman
From: Cosmin Ratiu <cratiu@nvidia.com>
Currently, during PSP init, priv->psp is initialized to an incompletely
built psp struct. Additionally, on fs init failure priv->psp is reset to
NULL.

Change this so that only a fully initialized priv->psp is set, which
makes the code easier to reason about in failure scenarios.

Fixes: af2196f49480 ("net/mlx5e: Implement PSP operations .assoc_add and .assoc_del")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c
index 1ff818fb48df..d9adb993e64d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c
@@ -1139,22 +1139,18 @@ int mlx5e_psp_init(struct mlx5e_priv *priv)
 	if (!psp)
 		return -ENOMEM;

-	priv->psp = psp;
 	fs = mlx5e_accel_psp_fs_init(priv);
 	if (IS_ERR(fs)) {
 		err = PTR_ERR(fs);
-		goto out_err;
+		kfree(psp);
+		return err;
 	}

 	psp->fs = fs;
+	priv->psp = psp;

 	mlx5_core_dbg(priv->mdev, "PSP attached to netdevice\n");
 	return 0;
-
-out_err:
-	priv->psp = NULL;
-	kfree(psp);
-	return err;
 }

 void mlx5e_psp_cleanup(struct mlx5e_priv *priv)
--
2.44.0
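
The "publish last" ordering from patch 2/3 can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: demo_priv, demo_psp, demo_fs and the demo_*() functions are hypothetical, and the sub-init reports failure with NULL instead of an error pointer to keep the example short.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct demo_fs { int tables; };			/* hypothetical steering state */
struct demo_psp { struct demo_fs *fs; };	/* hypothetical PSP context */
struct demo_priv { struct demo_psp *psp; };	/* hypothetical owner, like priv */

/* Hypothetical sub-init: returns NULL on failure (simplified). */
static struct demo_fs *demo_fs_init(int fail)
{
	if (fail)
		return NULL;
	return calloc(1, sizeof(struct demo_fs));
}

static int demo_psp_init(struct demo_priv *priv, int fail_fs)
{
	struct demo_psp *psp = calloc(1, sizeof(*psp));
	struct demo_fs *fs;

	if (!psp)
		return -ENOMEM;

	fs = demo_fs_init(fail_fs);
	if (!fs) {
		free(psp);		/* nothing was published, so nothing to reset */
		return -ENOMEM;
	}

	psp->fs = fs;
	priv->psp = psp;		/* publish only the fully built object */
	return 0;
}

static void demo_psp_cleanup(struct demo_priv *priv)
{
	if (!priv->psp)
		return;
	free(priv->psp->fs);
	free(priv->psp);
	priv->psp = NULL;
}
```

Because the owner pointer is written as the final step, the failure path needs no "undo" of that pointer, and readers of priv->psp never observe a half-built struct.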
* [PATCH net V3 3/3] net/mlx5e: psp: Hook PSP dev reg/unreg to profile enable/disable
From: Tariq Toukan @ 2026-05-04 18:11 UTC
To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller
Cc: Boris Pismenny, Saeed Mahameed, Leon Romanovsky, Tariq Toukan,
Mark Bloch, Daniel Zahka, Willem de Bruijn, Cosmin Ratiu,
Raed Salem, Rahul Rameshbabu, Dragos Tatulea, Kees Cook, netdev,
linux-rdma, linux-kernel, Gal Pressman
From: Cosmin Ratiu <cratiu@nvidia.com>
devlink reload while PSP connections are active does:

  mlx5_unload_one_devl_locked() -> mlx5_detach_device()
  -> _mlx5e_suspend()
  -> mlx5e_detach_netdev()
  -> profile->cleanup_rx
  -> profile->cleanup_tx
  -> mlx5e_destroy_mdev_resources() -> mlx5_core_dealloc_pd() fails:
  ...
  mlx5_core 0000:08:00.0: mlx5_cmd_out_err:821:(pid 19722):
  DEALLOC_PD(0x801) op_mod(0x0) failed, status bad resource state(0x9),
  syndrome (0xef0c8a), err(-22)
  ...

The reason for failure is the existence of TX keys, which are removed by
the PSP dev unregistration happening in:

  profile->cleanup() -> mlx5e_psp_unregister() -> mlx5e_psp_cleanup()
  -> psp_dev_unregister()

...but this isn't invoked in the devlink reload flow, only when changing
the NIC profile (e.g. when transitioning to switchdev mode) or on dev
teardown.

Move PSP device registration into mlx5e_nic_enable(), and unregistration
into the corresponding mlx5e_nic_disable(). These functions are called
during netdev attach/detach after RX & TX are set up.

This ensures that the keys will be gone by the time the PD is destroyed.

Fixes: 89ee2d92f66c ("net/mlx5e: Support PSP offload functionality")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 5a46870c4b74..8e9443caa933 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -6023,7 +6023,6 @@ static int mlx5e_nic_init(struct mlx5_core_dev *mdev,
 	if (take_rtnl)
 		rtnl_lock();

-	mlx5e_psp_register(priv);
 	/* update XDP supported features */
 	mlx5e_set_xdp_feature(priv);

@@ -6036,7 +6035,6 @@ static int mlx5e_nic_init(struct mlx5_core_dev *mdev,
 static void mlx5e_nic_cleanup(struct mlx5e_priv *priv)
 {
 	mlx5e_health_destroy_reporters(priv);
-	mlx5e_psp_unregister(priv);
 	mlx5e_ktls_cleanup(priv);
 	mlx5e_psp_cleanup(priv);
 	mlx5e_fs_cleanup(priv->fs);
@@ -6160,6 +6158,7 @@ static void mlx5e_nic_enable(struct mlx5e_priv *priv)

 	mlx5e_fs_init_l2_addr(priv->fs, netdev);
 	mlx5e_ipsec_init(priv);
+	mlx5e_psp_register(priv);

 	err = mlx5e_macsec_init(priv);
 	if (err)
@@ -6230,6 +6229,7 @@ static void mlx5e_nic_disable(struct mlx5e_priv *priv)
 	mlx5_lag_remove_netdev(mdev, priv->netdev);
 	mlx5_vxlan_reset_to_default(mdev->vxlan);
 	mlx5e_macsec_cleanup(priv);
+	mlx5e_psp_unregister(priv);
 	mlx5e_ipsec_cleanup(priv);
 }

--
2.44.0
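
The ordering problem described in patch 3/3 can be modelled with a toy example: a parent resource (standing in for the PD) refuses to be freed while child objects (standing in for TX keys) still exist, so the step that drops the children has to run on the detach path before the parent is released. This is a rough sketch under assumptions; fake_nic and all fake_*() functions are hypothetical and do not mirror the real mlx5 call chain or firmware behaviour beyond that one dependency.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of a NIC whose PD cannot be freed while TX keys exist. */
struct fake_nic {
	int pd_allocated;
	int tx_keys;
	int psp_registered;
};

/* Enable step: registration makes it possible to add keys. */
static void fake_nic_enable(struct fake_nic *nic)
{
	nic->psp_registered = 1;
}

static int fake_add_tx_key(struct fake_nic *nic)
{
	if (!nic->psp_registered)
		return -ENODEV;
	nic->tx_keys++;
	return 0;
}

/* Disable step: unregistration drops all keys (assumed for this model). */
static void fake_nic_disable(struct fake_nic *nic)
{
	if (!nic->psp_registered)
		return;
	nic->tx_keys = 0;
	nic->psp_registered = 0;
}

static int fake_dealloc_pd(struct fake_nic *nic)
{
	if (nic->tx_keys > 0)
		return -EINVAL;		/* parent still referenced by keys */
	nic->pd_allocated = 0;
	return 0;
}

/* Detach path: only succeeds if unregistration is part of disable. */
static int fake_detach(struct fake_nic *nic, int unregister_in_disable)
{
	if (unregister_in_disable)
		fake_nic_disable(nic);
	return fake_dealloc_pd(nic);
}
```

In this model, calling fake_detach() without the disable step leaves the key in place and the PD release fails with -EINVAL, while running the disable step first lets it succeed.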
* Re: [PATCH net V3 0/3] net/mlx5e: PSP fixes
From: patchwork-bot+netdevbpf @ 2026-05-06 2:20 UTC
To: Tariq Toukan
Cc: edumazet, kuba, pabeni, andrew+netdev, davem, borisp, saeedm,
leon, mbloch, daniel.zahka, willemdebruijn.kernel, cratiu, raeds,
rrameshbabu, dtatulea, kees, netdev, linux-rdma, linux-kernel,
gal
Hello:
This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Mon, 4 May 2026 21:10:57 +0300 you wrote:
> Hi,
>
> This patchset provides bug fixes from Cosmin to the mlx5e PSP feature.
>
> Thanks,
> Tariq.
>
> [...]
Here is the summary with links:
- [net,V3,1/3] net/mlx5e: psp: Fix invalid access on PSP dev registration fail
https://git.kernel.org/netdev/net/c/ae9582cd0b9c
- [net,V3,2/3] net/mlx5e: psp: Expose only a fully initialized priv->psp
https://git.kernel.org/netdev/net/c/50690733db59
- [net,V3,3/3] net/mlx5e: psp: Hook PSP dev reg/unreg to profile enable/disable
https://git.kernel.org/netdev/net/c/c4a5c46199b5
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html