From: Farhan Ali <alifm@linux.ibm.com>
To: Gerd Bayer <gbayer@linux.ibm.com>,
Saeed Mahameed <saeedm@nvidia.com>,
Leon Romanovsky <leon@kernel.org>,
Tariq Toukan <tariqt@nvidia.com>, Mark Bloch <mbloch@nvidia.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Shay Drory <shayd@nvidia.com>, Simon Horman <horms@kernel.org>
Cc: Lukas Wunner <lukas@wunner.de>,
Bjorn Helgaas <helgaas@kernel.org>,
Niklas Schnelle <schnelle@linux.ibm.com>,
netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org,
linux-pci@vger.kernel.org
Subject: Re: [PATCH net] net/mlx5: Fix double unregister of HCA_PORTS component
Date: Wed, 3 Dec 2025 13:10:40 -0800
Message-ID: <99db437a-be91-4e85-a201-ec3a890900c8@linux.ibm.com>
In-Reply-To: <20251202-fix_lag-v1-1-59e8177ffce0@linux.ibm.com>
On 12/2/2025 3:12 AM, Gerd Bayer wrote:
> Clear hca_devcom_comp in device's private data after unregistering it in
> LAG teardown. Otherwise a slightly lagging second pass through
> mlx5_unload_one() might try to unregister it again and trip over
> use-after-free.
>
> On s390 almost all PCI level recovery events trigger two passes through
> mlx5_unload_one() - one through the poll_health() method and one through
> mlx5_pci_err_detected() as callback from generic PCI error recovery.
> While testing PCI error recovery paths with more kernel debug features
> enabled, this issue reproducibly led to kernel panics with the following
> call chain:
>
> Unable to handle kernel pointer dereference in virtual kernel address space
> Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6803 ESOP-2 FSI
> Fault in home space mode while using kernel ASCE.
> AS:00000000705c4007 R3:0000000000000024
> Oops: 0038 ilc:3 [#1]SMP
>
> CPU: 14 UID: 0 PID: 156 Comm: kmcheck Kdump: loaded Not tainted
> 6.18.0-20251130.rc7.git0.16131a59cab1.300.fc43.s390x+debug #1 PREEMPT
>
> Krnl PSW : 0404e00180000000 0000020fc86aa1dc (__lock_acquire+0x5c/0x15f0)
> R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
> Krnl GPRS: 0000000000000000 0000020f00000001 6b6b6b6b6b6b6c33 0000000000000000
> 0000000000000000 0000000000000000 0000000000000001 0000000000000000
> 0000000000000000 0000020fca28b820 0000000000000000 0000010a1ced8100
> 0000010a1ced8100 0000020fc9775068 0000018fce14f8b8 0000018fce14f7f8
> Krnl Code: 0000020fc86aa1cc: e3b003400004 lg %r11,832
> 0000020fc86aa1d2: a7840211 brc 8,0000020fc86aa5f4
> *0000020fc86aa1d6: c09000df0b25 larl %r9,0000020fca28b820
> >0000020fc86aa1dc: d50790002000 clc 0(8,%r9),0(%r2)
> 0000020fc86aa1e2: a7840209 brc 8,0000020fc86aa5f4
> 0000020fc86aa1e6: c0e001100401 larl %r14,0000020fca8aa9e8
> 0000020fc86aa1ec: c01000e25a00 larl %r1,0000020fca2f55ec
> 0000020fc86aa1f2: a7eb00e8 aghi %r14,232
>
> Call Trace:
> __lock_acquire+0x5c/0x15f0
> lock_acquire.part.0+0xf8/0x270
> lock_acquire+0xb0/0x1b0
> down_write+0x5a/0x250
> mlx5_detach_device+0x42/0x110 [mlx5_core]
> mlx5_unload_one_devl_locked+0x50/0xc0 [mlx5_core]
> mlx5_unload_one+0x42/0x60 [mlx5_core]
> mlx5_pci_err_detected+0x94/0x150 [mlx5_core]
> zpci_event_attempt_error_recovery+0xcc/0x388
>
> Fixes: 5a977b5833b7 ("net/mlx5: Lag, move devcom registration to LAG layer")
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> ---
> Hi Shay et al,
>
> while checking for potential regressions by Lukas Wunner's recent work
> on pci_save/restore_state() for the recoverability of mlx5 functions I
> consistently hit this bug. (Bjorn has queued this up for 6.19, according
> to [0] and [1])
>
> Apparently, the issue is unrelated to Lukas' work and can be reproduced
> with master. It appears to be timing-sensitive, since it shows up only
> when I use s390's debug_defconfig, but I think it needs fixing anyhow, as
> timing can change for other reasons, too.
>
> I've spotted two additional places where the devcom reference is not
> cleared after calling mlx5_devcom_unregister_component() in
> drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c that I have not
> addressed with a patch, since I'm unclear about how to test these
> paths.
>
> Thanks,
> Gerd
>
> [0] https://lore.kernel.org/all/cover.1760274044.git.lukas@wunner.de/
> [1] https://lore.kernel.org/linux-pci/cover.1763483367.git.lukas@wunner.de/
> ---
> drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
> index 3db0387bf6dcb727a65df9d0253f242554af06db..8ec04a5f434dd4f717d6d556649fcc2a584db847 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
> @@ -1413,6 +1413,7 @@ static int __mlx5_lag_dev_add_mdev(struct mlx5_core_dev *dev)
>  static void mlx5_lag_unregister_hca_devcom_comp(struct mlx5_core_dev *dev)
>  {
>  	mlx5_devcom_unregister_component(dev->priv.hca_devcom_comp);
> +	dev->priv.hca_devcom_comp = NULL;
>  }
This fix looks correct to me for clearing hca_devcom_comp once it has
been unregistered, though I am not too familiar with mlx5 internals.
I wonder whether it would be better to set devcom = NULL in
devcom_free_comp_dev() right after the kfree() instead? That would also
take care of the other places where devcom is not set to NULL.
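
To make the two options concrete, here is a minimal user-space sketch.
It is not the mlx5 devcom code: struct comp_dev, struct core_dev and all
function names are hypothetical stand-ins, malloc()/free() stand in for
the kernel allocator, and it assumes the unregister helper treats a NULL
pointer as a no-op. Variant A mirrors the patch (the caller clears its
own pointer). Variant B shows one way the clearing could live in the
helper; for that to reach the caller's copy in C, the helper would have
to take the address of the pointer, since assigning NULL to a by-value
parameter only changes the local copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the real mlx5 devcom types. */
struct comp_dev {
	int id;
};

struct core_dev {
	struct comp_dev *hca_comp;	/* plays the role of dev->priv.hca_devcom_comp */
};

static int live_components;	/* registered components not yet freed */

static struct comp_dev *comp_register(int id)
{
	struct comp_dev *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->id = id;
	live_components++;
	return c;
}

/* Variant A helper: frees only. Assumed to ignore NULL. */
static void comp_unregister(struct comp_dev *c)
{
	if (!c)
		return;
	live_components--;
	free(c);
}

/* Variant B helper: frees and clears the caller's pointer. */
static void comp_unregister_and_clear(struct comp_dev **cp)
{
	if (!cp || !*cp)
		return;
	live_components--;
	free(*cp);
	*cp = NULL;
}

/* Variant A teardown: the caller clears its pointer, as in the patch. */
static void teardown_a(struct core_dev *dev)
{
	comp_unregister(dev->hca_comp);
	dev->hca_comp = NULL;
}

/* Variant B teardown: the helper clears the pointer for the caller. */
static void teardown_b(struct core_dev *dev)
{
	comp_unregister_and_clear(&dev->hca_comp);
}

/* Runs each teardown twice; returns the number of leaked components. */
int demo_double_teardown(void)
{
	struct core_dev dev = { 0 };

	dev.hca_comp = comp_register(1);
	if (!dev.hca_comp)
		return -1;
	teardown_a(&dev);
	teardown_a(&dev);	/* second pass sees NULL and does nothing */
	assert(dev.hca_comp == NULL);

	dev.hca_comp = comp_register(2);
	if (!dev.hca_comp)
		return -1;
	teardown_b(&dev);
	teardown_b(&dev);	/* second pass sees NULL and does nothing */
	assert(dev.hca_comp == NULL);

	return live_components;
}
```

Calling demo_double_teardown() from a small main() exercises both
variants. Without the NULL assignment in teardown_a(), the second call
would hand the already freed pointer back to comp_unregister(), which
is the double unregister the patch description reports.
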
Thanks
Farhan