From: Farhan Ali <alifm@linux.ibm.com>
To: Gerd Bayer <gbayer@linux.ibm.com>,
Saeed Mahameed <saeedm@nvidia.com>,
Leon Romanovsky <leon@kernel.org>,
Tariq Toukan <tariqt@nvidia.com>, Mark Bloch <mbloch@nvidia.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Shay Drory <shayd@nvidia.com>, Simon Horman <horms@kernel.org>
Cc: Lukas Wunner <lukas@wunner.de>,
Bjorn Helgaas <helgaas@kernel.org>,
Niklas Schnelle <schnelle@linux.ibm.com>,
netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org,
linux-pci@vger.kernel.org
Subject: Re: [PATCH net] net/mlx5: Fix double unregister of HCA_PORTS component
Date: Wed, 3 Dec 2025 13:10:40 -0800
Message-ID: <99db437a-be91-4e85-a201-ec3a890900c8@linux.ibm.com>
In-Reply-To: <20251202-fix_lag-v1-1-59e8177ffce0@linux.ibm.com>
On 12/2/2025 3:12 AM, Gerd Bayer wrote:
> Clear hca_devcom_comp in device's private data after unregistering it in
> LAG teardown. Otherwise a slightly lagging second pass through
> mlx5_unload_one() might try to unregister it again and trip over
> use-after-free.
>
> On s390 almost all PCI level recovery events trigger two passes through
> mlx5_unload_one() - one through the poll_health() method and one through
> mlx5_pci_err_detected() as callback from generic PCI error recovery.
> While testing PCI error recovery paths with more kernel debug features
> enabled, this issue reproducibly led to kernel panics with the following
> call chain:
>
> Unable to handle kernel pointer dereference in virtual kernel address space
> Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6803 ESOP-2 FSI
> Fault in home space mode while using kernel ASCE.
> AS:00000000705c4007 R3:0000000000000024
> Oops: 0038 ilc:3 [#1]SMP
>
> CPU: 14 UID: 0 PID: 156 Comm: kmcheck Kdump: loaded Not tainted
> 6.18.0-20251130.rc7.git0.16131a59cab1.300.fc43.s390x+debug #1 PREEMPT
>
> Krnl PSW : 0404e00180000000 0000020fc86aa1dc (__lock_acquire+0x5c/0x15f0)
> R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
> Krnl GPRS: 0000000000000000 0000020f00000001 6b6b6b6b6b6b6c33 0000000000000000
> 0000000000000000 0000000000000000 0000000000000001 0000000000000000
> 0000000000000000 0000020fca28b820 0000000000000000 0000010a1ced8100
> 0000010a1ced8100 0000020fc9775068 0000018fce14f8b8 0000018fce14f7f8
> Krnl Code: 0000020fc86aa1cc: e3b003400004 lg %r11,832
> 0000020fc86aa1d2: a7840211 brc 8,0000020fc86aa5f4
> *0000020fc86aa1d6: c09000df0b25 larl %r9,0000020fca28b820
> >0000020fc86aa1dc: d50790002000 clc 0(8,%r9),0(%r2)
> 0000020fc86aa1e2: a7840209 brc 8,0000020fc86aa5f4
> 0000020fc86aa1e6: c0e001100401 larl %r14,0000020fca8aa9e8
> 0000020fc86aa1ec: c01000e25a00 larl %r1,0000020fca2f55ec
> 0000020fc86aa1f2: a7eb00e8 aghi %r14,232
>
> Call Trace:
> __lock_acquire+0x5c/0x15f0
> lock_acquire.part.0+0xf8/0x270
> lock_acquire+0xb0/0x1b0
> down_write+0x5a/0x250
> mlx5_detach_device+0x42/0x110 [mlx5_core]
> mlx5_unload_one_devl_locked+0x50/0xc0 [mlx5_core]
> mlx5_unload_one+0x42/0x60 [mlx5_core]
> mlx5_pci_err_detected+0x94/0x150 [mlx5_core]
> zpci_event_attempt_error_recovery+0xcc/0x388
>
> Fixes: 5a977b5833b7 ("net/mlx5: Lag, move devcom registration to LAG layer")
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> ---
> Hi Shay et al,
>
> while checking for potential regressions by Lukas Wunner's recent work
> on pci_save/restore_state() for the recoverability of mlx5 functions I
> consistently hit this bug. (Bjorn has queued this up for 6.19, according
> to [0] and [1])
>
> Apparently, the issue is unrelated to Lukas' work but can be reproduced
> with master. It appears to be timing-sensitive, since it shows up only
> when I use s390's debug_defconfig, but I think needs fixing anyhow, as
> timing can change for other reasons, too.
>
> I've spotted two additional places where the devcom reference is not
> cleared after calling mlx5_devcom_unregister_component() in
> drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c that I have not
> addressed with a patch, since I'm unclear about how to test these
> paths.
>
> Thanks,
> Gerd
>
> [0] https://lore.kernel.org/all/cover.1760274044.git.lukas@wunner.de/
> [1] https://lore.kernel.org/linux-pci/cover.1763483367.git.lukas@wunner.de/
> ---
> drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
> index 3db0387bf6dcb727a65df9d0253f242554af06db..8ec04a5f434dd4f717d6d556649fcc2a584db847 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
> @@ -1413,6 +1413,7 @@ static int __mlx5_lag_dev_add_mdev(struct mlx5_core_dev *dev)
>  static void mlx5_lag_unregister_hca_devcom_comp(struct mlx5_core_dev *dev)
>  {
>  	mlx5_devcom_unregister_component(dev->priv.hca_devcom_comp);
> +	dev->priv.hca_devcom_comp = NULL;
>  }
This fix looks correct to me in clearing hca_devcom_comp (though I am not
too familiar with mlx5 internals). I wonder if it would be better to just
set devcom = NULL in devcom_free_comp_dev() after the kfree()? That might
also take care of the other places where devcom is not set to NULL.
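
To make the two options concrete, here is a minimal userspace sketch. All
names in it (struct comp, struct priv, comp_register(), comp_unregister(),
comp_unregister_and_clear(), teardown()) are hypothetical stand-ins, not the
mlx5 API, and it assumes the unregister helper ignores a NULL argument. It
illustrates that assigning NULL to a by-value parameter inside the free
routine only changes the callee's copy, so the stored pointer has to be
cleared either by the caller (as the patch does) or through a
pointer-to-pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the real mlx5 types. */
struct comp { int id; };
struct priv { struct comp *hca_comp; };

int live_comps; /* number of components currently allocated */

struct comp *comp_register(int id)
{
	struct comp *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->id = id;
	live_comps++;
	return c;
}

/* Pointer passed by value; assumed to tolerate NULL. */
void comp_unregister(struct comp *c)
{
	if (!c)
		return;
	free(c);
	live_comps--;
	c = NULL; /* only the local copy changes; the caller's pointer still dangles */
}

/* Alternative: take the address of the stored pointer and clear it in the callee. */
void comp_unregister_and_clear(struct comp **cp)
{
	comp_unregister(*cp);
	*cp = NULL;
}

/* Same shape as the one-line fix: the caller clears its own field. */
void teardown(struct priv *p)
{
	comp_unregister(p->hca_comp);
	p->hca_comp = NULL;
}

/* Two teardown passes over one device; returns the remaining live count. */
int demo_double_teardown(void)
{
	struct priv p = { .hca_comp = comp_register(1) };

	teardown(&p);
	teardown(&p); /* second pass sees NULL and does nothing */
	return live_comps;
}
```

Without the assignment in teardown(), the second pass would hand an already
freed pointer to free(). Clearing inside the callee would therefore need a
signature along the lines of comp_unregister_and_clear(), which means
touching every caller.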
Thanks
Farhan