* [PATCH net] net/mlx5e: Fix oops from ERR_PTR in act-miss restore teardown
From: Tariq Toukan @ 2026-06-11 13:48 UTC (permalink / raw)
To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller
Cc: Saeed Mahameed, Tariq Toukan, Mark Bloch, Leon Romanovsky,
Vlad Buslov, Paul Blakey, netdev, linux-rdma, linux-kernel,
Gal Pressman, Lama Kayal, Cosmin Ratiu
From: Lama Kayal <lkayal@nvidia.com>
Restore-rule creation stores ERR_PTR(errno) in act_id_restore_rule
on failure. Teardown still called mlx5_del_flow_rules() with that
value, which dereferenced it like a real mlx5_flow_handle and could
crash.
Clear act_id_restore_rule to NULL in the error branch after
esw_add_restore_rule() fails so teardown only sees NULL or a valid
handle.
Call Trace:
? page_fault+0x1e/0x30
? mlx5_del_flow_rules+0x12/0x140 [mlx5_core]
mlx5e_tc_action_miss_mapping_put+0x49/0x50 [mlx5_core]
mlx5_tc_ct_delete_flow+0x4d/0x70 [mlx5_core]
mlx5_free_flow_attr_actions+0xd2/0x160 [mlx5_core]
mlx5e_tc_del_fdb_flow+0x15d/0x210 [mlx5_core]
mlx5e_flow_put+0x23/0x40 [mlx5_core]
__mlx5e_add_fdb_flow+0xf3/0x430 [mlx5_core]
mlx5e_tc_add_flow+0x2ab/0x9c0 [mlx5_core]
mlx5e_configure_flower+0x2f4/0x620 [mlx5_core]
tc_setup_cb_add+0xca/0x1e0
fl_hw_replace_filter+0x143/0x1e0 [cls_flower]
[...]
Fixes: dfa1e46d6093 ("net/mlx5e: TC, Fix using eswitch mapping in nic mode")
Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index a9001d1c902f..4c135858f297 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -5863,6 +5863,7 @@ int mlx5e_tc_action_miss_mapping_get(struct mlx5e_priv *priv, struct mlx5_flow_a
 	attr->act_id_restore_rule = esw_add_restore_rule(esw, *act_miss_mapping);
 	if (IS_ERR(attr->act_id_restore_rule)) {
 		err = PTR_ERR(attr->act_id_restore_rule);
+		attr->act_id_restore_rule = NULL;
 		goto err_rule;
 	}
base-commit: 0068940907d33217ae01217f84910a5cde606c17
--
2.44.0
* Re: [PATCH net] net/mlx5e: Fix oops from ERR_PTR in act-miss restore teardown
From: Alexander Lobakin @ 2026-06-11 16:01 UTC (permalink / raw)
To: Tariq Toukan
Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller, Saeed Mahameed, Mark Bloch, Leon Romanovsky,
Vlad Buslov, Paul Blakey, netdev, linux-rdma, linux-kernel,
Gal Pressman, Lama Kayal, Cosmin Ratiu
From: Tariq Toukan <tariqt@nvidia.com>
Date: Thu, 11 Jun 2026 16:48:36 +0300
> From: Lama Kayal <lkayal@nvidia.com>
>
> Restore-rule creation stores ERR_PTR(errno) in act_id_restore_rule
> on failure. Teardown still called mlx5_del_flow_rules() with that
> value, which dereferenced it like a real mlx5_flow_handle and could
> crash.
>
> Clear act_id_restore_rule to NULL in the error branch after
> esw_add_restore_rule() fails so teardown only sees NULL or a valid
> handle.
>
> Call Trace:
> ? page_fault+0x1e/0x30
> ? mlx5_del_flow_rules+0x12/0x140 [mlx5_core]
> mlx5e_tc_action_miss_mapping_put+0x49/0x50 [mlx5_core]
> mlx5_tc_ct_delete_flow+0x4d/0x70 [mlx5_core]
> mlx5_free_flow_attr_actions+0xd2/0x160 [mlx5_core]
> mlx5e_tc_del_fdb_flow+0x15d/0x210 [mlx5_core]
> mlx5e_flow_put+0x23/0x40 [mlx5_core]
> __mlx5e_add_fdb_flow+0xf3/0x430 [mlx5_core]
> mlx5e_tc_add_flow+0x2ab/0x9c0 [mlx5_core]
> mlx5e_configure_flower+0x2f4/0x620 [mlx5_core]
> tc_setup_cb_add+0xca/0x1e0
> fl_hw_replace_filter+0x143/0x1e0 [cls_flower]
> [...]
>
> Fixes: dfa1e46d6093 ("net/mlx5e: TC, Fix using eswitch mapping in nic mode")
> Signed-off-by: Lama Kayal <lkayal@nvidia.com>
> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Thanks,
Olek
* Re: [PATCH net] net/mlx5e: Fix oops from ERR_PTR in act-miss restore teardown
From: Jakub Kicinski @ 2026-06-13 1:42 UTC (permalink / raw)
To: tariqt
Cc: Jakub Kicinski, edumazet, pabeni, andrew+netdev, davem, saeedm,
mbloch, leon, vladbu, paulb, netdev, linux-rdma, linux-kernel,
gal, lkayal, cratiu
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net/mlx5e: Fix oops from ERR_PTR in act-miss restore teardown
This change clears attr->act_id_restore_rule to NULL after
esw_add_restore_rule() fails, so the teardown path no longer
dereferences an ERR_PTR cookie as if it were a real mlx5_flow_handle.
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> index a9001d1c902f..4c135858f297 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> @@ -5863,6 +5863,7 @@ int mlx5e_tc_action_miss_mapping_get(struct mlx5e_priv *priv, struct mlx5_flow_a
>  	attr->act_id_restore_rule = esw_add_restore_rule(esw, *act_miss_mapping);
>  	if (IS_ERR(attr->act_id_restore_rule)) {
>  		err = PTR_ERR(attr->act_id_restore_rule);
> +		attr->act_id_restore_rule = NULL;
>  		goto err_rule;
>  	}
[High]
The commit message states that after this change "teardown only sees
NULL or a valid handle", framing NULL as benign. Is that invariant
actually enforced on the consumer side?
Looking at the teardown helper:
    drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:mlx5e_tc_action_miss_mapping_put() {
        struct mapping_ctx *ctx = mlx5e_get_priv_obj_mapping(priv);
        if (is_mdev_switchdev_mode(priv->mdev))
            mlx5_del_flow_rules(attr->act_id_restore_rule);
        mapping_remove(ctx, act_miss_mapping);
    }
There is no NULL check before mlx5_del_flow_rules(), and
mlx5_del_flow_rules() itself dereferences handle->rule[0] without a
NULL check:
    drivers/net/ethernet/mellanox/mlx5/core/fs_core.c:mlx5_del_flow_rules() {
        struct fs_fte *fte;
        int i;
        ...
        fs_get_obj(fte, handle->rule[0]->node.parent);
        down_write_ref_node(&fte->node, false);
        ...
    }
So on any path where mlx5e_tc_action_miss_mapping_put() is reached with
attr->act_id_restore_rule == NULL in switchdev mode, this patch
converts the original ERR_PTR dereference into a NULL dereference at
the same site rather than avoiding the crash.
Today the put is gated by attr->ct_attr.offloaded in
mlx5_tc_ct_delete_flow(), and __mlx5_tc_ct_flow_offload() leaves
offloaded clear when mlx5e_tc_action_miss_mapping_get() fails, so the
specific oops in the call trace should no longer be reachable. But
should the put be made symmetric with the new "NULL or valid" producer
contract, e.g.:
    if (is_mdev_switchdev_mode(priv->mdev) && attr->act_id_restore_rule)
        mlx5_del_flow_rules(attr->act_id_restore_rule);
so that any future caller that relies on the documented invariant
(including mlx5e_clone_flow_attr_for_post_act() in en_tc.c, which
already inherits this field) does not reintroduce the same oops with a
slightly different fault address?
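As a rough illustration of the symmetric teardown asked about above, here is a minimal userspace sketch. Everything in it is a hypothetical model: struct flow_handle, struct flow_attr, del_flow_rules() and miss_mapping_put() only mimic the shape of the mlx5 code, and the switchdev check is reduced to a boolean parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the mlx5 definitions. */
struct flow_handle { int num_rules; bool deleted; };
struct flow_attr { struct flow_handle *act_id_restore_rule; };

/* Like mlx5_del_flow_rules(), this dereferences its argument unconditionally. */
static void del_flow_rules(struct flow_handle *handle)
{
	handle->deleted = true;
}

/*
 * Teardown that matches a "NULL or valid handle" producer contract.
 * Returns true if a rule was actually deleted.
 */
static bool miss_mapping_put(struct flow_attr *attr, bool switchdev_mode)
{
	if (switchdev_mode && attr->act_id_restore_rule) {
		del_flow_rules(attr->act_id_restore_rule);
		return true;
	}
	return false;	/* nothing to delete: rule creation failed or not switchdev */
}

/* Small self-check that can be called from a main() of your own. */
static void put_demo(void)
{
	struct flow_handle h = { .num_rules = 1, .deleted = false };
	struct flow_attr failed = { .act_id_restore_rule = NULL };
	struct flow_attr ok = { .act_id_restore_rule = &h };

	assert(!miss_mapping_put(&failed, true));	/* NULL is skipped, no fault */
	assert(miss_mapping_put(&ok, true));
	assert(h.deleted);
}
```

With the guard in place, the consumer tolerates the NULL that the patched producer now stores, so correctness no longer depends on every caller gating the put elsewhere.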
--
pw-bot: cr