From: Jakub Kicinski <kuba@kernel.org>
To: tariqt@nvidia.com
Cc: Jakub Kicinski <kuba@kernel.org>,
edumazet@google.com, pabeni@redhat.com, andrew+netdev@lunn.ch,
davem@davemloft.net, saeedm@nvidia.com, mbloch@nvidia.com,
leon@kernel.org, vladbu@nvidia.com, paulb@nvidia.com,
netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org, gal@nvidia.com, lkayal@nvidia.com,
cratiu@nvidia.com
Subject: Re: [PATCH net] net/mlx5e: Fix oops from ERR_PTR in act-miss restore teardown
Date: Fri, 12 Jun 2026 18:42:16 -0700
Message-ID: <20260613014216.2828397-1-kuba@kernel.org>
In-Reply-To: <20260611134836.534015-1-tariqt@nvidia.com>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net/mlx5e: Fix oops from ERR_PTR in act-miss restore teardown

This change clears attr->act_id_restore_rule to NULL after
esw_add_restore_rule() fails, so the teardown path no longer
dereferences an ERR_PTR cookie as if it were a real mlx5_flow_handle.

> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> index a9001d1c902f..4c135858f297 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> @@ -5863,6 +5863,7 @@ int mlx5e_tc_action_miss_mapping_get(struct mlx5e_priv *priv, struct mlx5_flow_a
>       attr->act_id_restore_rule = esw_add_restore_rule(esw, *act_miss_mapping);
>       if (IS_ERR(attr->act_id_restore_rule)) {
>               err = PTR_ERR(attr->act_id_restore_rule);
> +             attr->act_id_restore_rule = NULL;
>               goto err_rule;
>       }

[High]
The commit message states that after this change "teardown only sees
NULL or a valid handle", framing NULL as benign. Is that invariant
actually enforced on the consumer side?

Looking at the teardown helper:

drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:mlx5e_tc_action_miss_mapping_put() {
    struct mapping_ctx *ctx = mlx5e_get_priv_obj_mapping(priv);

    if (is_mdev_switchdev_mode(priv->mdev))
        mlx5_del_flow_rules(attr->act_id_restore_rule);
    mapping_remove(ctx, act_miss_mapping);
}

There is no NULL check before mlx5_del_flow_rules(), and
mlx5_del_flow_rules() itself dereferences handle->rule[0] without a
NULL check:

drivers/net/ethernet/mellanox/mlx5/core/fs_core.c:mlx5_del_flow_rules() {
    struct fs_fte *fte;
    int i;
    ...
    fs_get_obj(fte, handle->rule[0]->node.parent);
    down_write_ref_node(&fte->node, false);
    ...
}

So on any path where mlx5e_tc_action_miss_mapping_put() is reached with
attr->act_id_restore_rule == NULL in switchdev mode, this patch
converts the original ERR_PTR dereference into a NULL dereference at
the same site rather than avoiding the crash.
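
To make the distinction concrete, here is a minimal userspace sketch of the
error-pointer encoding. The helpers below are stand-ins modelled on
<linux/err.h>, not the kernel headers themselves, and handle_is_usable()
is a hypothetical name used only for illustration. The point it shows is
that IS_ERR() does not treat NULL as an error, and an ERR_PTR value is not
NULL, so a consumer that dereferences unconditionally is unsafe with
either value.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins modelled on the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

/* Encode a negative errno in the top page of the address space. */
static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True only for encoded errors; NULL is NOT considered an error here. */
static inline bool IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
    return !ptr || IS_ERR(ptr);
}

/* Hypothetical helper: may this handle be dereferenced at all? */
static bool handle_is_usable(const void *handle)
{
    return !IS_ERR_OR_NULL(handle);
}
```

With these definitions, both NULL and ERR_PTR(-ENOMEM) are rejected by
handle_is_usable(), while IS_ERR(NULL) alone is false.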
Today the put is gated by attr->ct_attr.offloaded in
mlx5_tc_ct_delete_flow(), and __mlx5_tc_ct_flow_offload() leaves
offloaded clear when mlx5e_tc_action_miss_mapping_get() fails, so the
specific oops in the call trace should no longer be reachable. But
should the put be made symmetric with the new "NULL or valid" producer
contract, e.g.:

    if (is_mdev_switchdev_mode(priv->mdev) && attr->act_id_restore_rule)
        mlx5_del_flow_rules(attr->act_id_restore_rule);

so that any future caller that relies on the documented invariant
(including mlx5e_clone_flow_attr_for_post_act() in en_tc.c, which
already inherits this field) does not reintroduce the same oops with a
slightly different fault address?
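
As a rough illustration of the guarded put, the following userspace model
may help. Every name in it (fake_handle, fake_attr, fake_del_flow_rules,
fake_mapping_put, deleted_count) is hypothetical and stands in for the
real mlx5 types and functions; it is a sketch of the shape of the check,
not driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real mlx5 structures. */
struct fake_handle {
    int live;
};

struct fake_attr {
    struct fake_handle *restore_rule;
};

/* Counts how many times the delete path actually ran. */
static int deleted_count;

/* Models a delete routine that dereferences its argument unconditionally. */
static void fake_del_flow_rules(struct fake_handle *h)
{
    h->live = 0;
    deleted_count++;
}

/*
 * Put side with the guard: the delete is skipped both outside switchdev
 * mode and when the producer left the rule pointer NULL.
 */
static void fake_mapping_put(bool switchdev, struct fake_attr *attr)
{
    if (switchdev && attr->restore_rule)
        fake_del_flow_rules(attr->restore_rule);
}
```

In this model, calling fake_mapping_put(true, &attr) with
attr.restore_rule == NULL is a no-op instead of a NULL dereference.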
--
pw-bot: cr