From: Simon Horman <horms@kernel.org>
To: Petr Machata <petrm@nvidia.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	netdev@vger.kernel.org, Ido Schimmel <idosch@nvidia.com>,
	Jiri Pirko <jiri@resnulli.us>,
	Alexander Zubkov <green@qrator.net>,
	mlxsw@nvidia.com
Subject: Re: [PATCH net 6/9] mlxsw: spectrum_acl_tcam: Fix memory leak during rehash
Date: Wed, 24 Apr 2024 15:52:02 +0100
Message-ID: <20240424145202.GH42092@kernel.org>
In-Reply-To: <d5edd4f4503934186ae5cfe268503b16345b4e0f.1713797103.git.petrm@nvidia.com>

On Mon, Apr 22, 2024 at 05:25:59PM +0200, Petr Machata wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> The rehash delayed work migrates filters from one region to another.
> This is done by iterating over all chunks (all the filters with the same
> priority) in the region and in each chunk iterating over all the
> filters.
> 
> If the migration fails, the code tries to migrate the filters back to
> the old region. However, the rollback itself can also fail, in which case
> another migration will be erroneously performed. Besides the fact that
> this ping pong is not a very good idea, it also creates a problem.
> 
> Each virtual chunk references two chunks: The currently used one
> ('vchunk->chunk') and a backup ('vchunk->chunk2'). During migration the
> first holds the chunk we want to migrate filters to and the second holds
> the chunk we are migrating filters from.
> 
> The code currently assumes - but does not verify - that the backup chunk
> does not exist (NULL) if the currently used chunk does not reference the
> target region. This assumption breaks when we are trying to roll back a
> rollback, resulting in the backup chunk being overwritten and leaked
> [1].
> 
> Fix by not rolling back a failed rollback and add a warning to avoid
> future cases.
> 
> [1]
> WARNING: CPU: 5 PID: 1063 at lib/parman.c:291 parman_destroy+0x17/0x20
> Modules linked in:
> CPU: 5 PID: 1063 Comm: kworker/5:11 Tainted: G        W          6.9.0-rc2-custom-00784-gc6a05c468a0b #14
> Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019
> Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
> RIP: 0010:parman_destroy+0x17/0x20
> [...]
> Call Trace:
>  <TASK>
>  mlxsw_sp_acl_atcam_region_fini+0x19/0x60
>  mlxsw_sp_acl_tcam_region_destroy+0x49/0xf0
>  mlxsw_sp_acl_tcam_vregion_rehash_work+0x1f1/0x470
>  process_one_work+0x151/0x370
>  worker_thread+0x2cb/0x3e0
>  kthread+0xd0/0x100
>  ret_from_fork+0x34/0x50
>  ret_from_fork_asm+0x1a/0x30
>  </TASK>
> 
> Fixes: 843500518509 ("mlxsw: spectrum_acl: Do rollback as another call to mlxsw_sp_acl_tcam_vchunk_migrate_all()")
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> Tested-by: Alexander Zubkov <green@qrator.net>
> Reviewed-by: Petr Machata <petrm@nvidia.com>
> Signed-off-by: Petr Machata <petrm@nvidia.com>

Reviewed-by: Simon Horman <horms@kernel.org>
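
For readers following the thread, the invariant described in the quoted
commit message can be sketched as a small userspace model. This is only an
illustration under stated assumptions, not the driver code: the types
(struct chunk, struct vchunk), the functions (vchunk_migrate_start,
vchunk_migrate_end, rehash) and the return codes are hypothetical names
invented for this sketch, and the migrate_ok / rollback_ok flags merely
simulate whether moving the filters succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the mlxsw driver's types. */
struct chunk {
	int region_id;
};

struct vchunk {
	struct chunk *chunk;	/* currently used chunk */
	struct chunk *chunk2;	/* backup chunk while a migration is in flight */
};

/* Start migrating to 'target'. Refuse if a backup chunk is already held,
 * because overwriting it would leak it.
 */
static int vchunk_migrate_start(struct vchunk *vc, struct chunk *target)
{
	if (vc->chunk2)
		return -1;
	vc->chunk2 = vc->chunk;
	vc->chunk = target;
	return 0;
}

/* Finish a migration: the backup chunk is no longer needed. */
static void vchunk_migrate_end(struct vchunk *vc)
{
	assert(vc->chunk2 != NULL);
	vc->chunk2 = NULL;
}

/* Returns 0 if the migration succeeded, 1 if it failed and was rolled back,
 * -1 if the rollback failed too (no second rollback is attempted), and
 * -2 if the migration could not be started at all.
 */
static int rehash(struct vchunk *vc, struct chunk *target,
		  bool migrate_ok, bool rollback_ok)
{
	struct chunk *tmp;

	if (vchunk_migrate_start(vc, target))
		return -2;

	if (migrate_ok) {
		vchunk_migrate_end(vc);
		return 0;
	}

	/* Roll back once: the old chunk becomes the current one again. */
	tmp = vc->chunk;
	vc->chunk = vc->chunk2;
	vc->chunk2 = tmp;

	if (!rollback_ok)
		return -1;	/* do not roll back a failed rollback */

	vchunk_migrate_end(vc);
	return 1;
}
```

In this model a failed rollback leaves both chunks referenced, and a later
call to rehash() is refused with -2 instead of silently overwriting the
backup pointer, which is the leak the commit message describes.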
