From: Simon Horman <horms@kernel.org>
To: Petr Machata <petrm@nvidia.com>
Cc: "David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
netdev@vger.kernel.org, Ido Schimmel <idosch@nvidia.com>,
Jiri Pirko <jiri@resnulli.us>,
Alexander Zubkov <green@qrator.net>,
mlxsw@nvidia.com
Subject: Re: [PATCH net 7/9] mlxsw: spectrum_acl_tcam: Fix warning during rehash
Date: Wed, 24 Apr 2024 15:52:32 +0100
Message-ID: <20240424145232.GI42092@kernel.org>
In-Reply-To: <cc17eed86b41dd829d39b07906fec074a9ce580e.1713797103.git.petrm@nvidia.com>

On Mon, Apr 22, 2024 at 05:26:00PM +0200, Petr Machata wrote:
> From: Ido Schimmel <idosch@nvidia.com>
>
> As previously explained, the rehash delayed work migrates filters from
> one region to another. This is done by iterating over all chunks (all
> the filters with the same priority) in the region and in each chunk
> iterating over all the filters.
>
> When the work runs out of credits it stores the current chunk and entry
> as markers in the per-work context so that it would know where to resume
> the migration from the next time the work is scheduled.
>
> Upon error, the chunk marker is reset to NULL, but the entry markers are
> not reset even though they are relative to it. This can result in
> migration being resumed from an entry that does not belong to the chunk
> being migrated. In turn, this will eventually lead to a chunk being
> iterated over as if it were an entry. Because of how the two structures
> happen to be defined, this does not lead to KASAN splats, but to
> warnings such as [1].
>
> Fix by creating a helper that resets all the markers and calling it
> from all the places that currently only reset the chunk marker. For
> good measure, also call it when starting a completely new rehash. Add a
> warning to catch such cases in the future.
>
> [1]
> WARNING: CPU: 7 PID: 1076 at drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.c:407 mlxsw_afk_encode+0x242/0x2f0
> Modules linked in:
> CPU: 7 PID: 1076 Comm: kworker/7:24 Tainted: G W 6.9.0-rc3-custom-00880-g29e61d91b77b #29
> Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019
> Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
> RIP: 0010:mlxsw_afk_encode+0x242/0x2f0
> [...]
> Call Trace:
> <TASK>
> mlxsw_sp_acl_atcam_entry_add+0xd9/0x3c0
> mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0
> mlxsw_sp_acl_tcam_vchunk_migrate_all+0x109/0x290
> mlxsw_sp_acl_tcam_vregion_rehash_work+0x6c/0x470
> process_one_work+0x151/0x370
> worker_thread+0x2cb/0x3e0
> kthread+0xd0/0x100
> ret_from_fork+0x34/0x50
> </TASK>
>
> Fixes: 6f9579d4e302 ("mlxsw: spectrum_acl: Remember where to continue rehash migration")
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> Tested-by: Alexander Zubkov <green@qrator.net>
> Reviewed-by: Petr Machata <petrm@nvidia.com>
> Signed-off-by: Petr Machata <petrm@nvidia.com>

Reviewed-by: Simon Horman <horms@kernel.org>
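
For readers following the thread, the resume-and-reset logic described
in the quoted commit message can be sketched roughly as below. This is
a standalone userspace illustration, not the driver code: the struct
layouts, the array-based chunk, the fail_id parameter and every name
here (rehash_ctx, rehash_ctx_markers_reset, chunk_migrate, ...) are
assumptions made up for the example. It only aims to show why the chunk
marker and the entry marker should be cleared together, and what a
sanity check on resume might look like.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct chunk;

/* Hypothetical stand-in for a filter entry; it remembers its chunk. */
struct entry {
	const struct chunk *owner;
	int id;
	bool migrated;
};

/* Hypothetical stand-in for a chunk (all filters of one priority). */
struct chunk {
	struct entry *entries;
	size_t n_entries;
};

/* Per-work context holding the resume markers. */
struct rehash_ctx {
	struct chunk *current_chunk; /* chunk marker */
	struct entry *start_entry;   /* entry marker, relative to the chunk */
};

enum { MIGRATE_DONE = 0, MIGRATE_AGAIN = 1, MIGRATE_ERR = -1 };

/* Reset all markers at once so the entry marker never outlives the
 * chunk marker it is relative to.
 */
static void rehash_ctx_markers_reset(struct rehash_ctx *ctx)
{
	ctx->current_chunk = NULL;
	ctx->start_entry = NULL;
}

/* Migrate entries of one chunk while credits last. fail_id simulates a
 * migration error on the entry with that id (use -1 for no error).
 */
static int chunk_migrate(struct rehash_ctx *ctx, struct chunk *chunk,
			 int *credits, int fail_id)
{
	size_t i = 0;

	/* Warn if asked to resume from an entry of a different chunk. */
	if (ctx->start_entry && ctx->start_entry->owner != chunk) {
		fprintf(stderr, "warning: stale entry marker\n");
		rehash_ctx_markers_reset(ctx);
		return MIGRATE_ERR;
	}
	if (ctx->start_entry)
		i = (size_t)(ctx->start_entry - chunk->entries);

	ctx->current_chunk = chunk;
	for (; i < chunk->n_entries; i++) {
		struct entry *e = &chunk->entries[i];

		if (*credits == 0) {
			/* Out of credits: remember where to resume. */
			ctx->start_entry = e;
			return MIGRATE_AGAIN;
		}
		if (e->id == fail_id) {
			/* Error: drop every marker, not only the chunk one. */
			rehash_ctx_markers_reset(ctx);
			return MIGRATE_ERR;
		}
		e->migrated = true;
		(*credits)--;
	}

	/* Chunk fully migrated: nothing left to resume. */
	rehash_ctx_markers_reset(ctx);
	return MIGRATE_DONE;
}
```

In this sketch, a caller would invoke chunk_migrate() once per chunk
with a fresh credit budget on each run of the work. If only
current_chunk were cleared on the error path, the next run would start
a different chunk from an entry that belongs to the old one, which is
the situation the owner check is meant to flag.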