From: Dragos Tatulea <dtatulea@nvidia.com>
To: Benjamin Block <bblock@linux.ibm.com>
Cc: Niklas Schnelle <schnelle@linux.ibm.com>,
Lukas Wunner <lukas@wunner.de>, Keith Busch <kbusch@kernel.org>,
Gerd Bayer <gbayer@linux.ibm.com>,
Matthew Rosato <mjrosato@linux.ibm.com>,
Halil Pasic <pasic@linux.ibm.com>,
Farhan Ali <alifm@linux.ibm.com>,
Julian Ruess <julianr@linux.ibm.com>,
Heiko Carstens <hca@linux.ibm.com>,
Bjorn Helgaas <helgaas@kernel.org>,
Vasily Gorbik <gor@linux.ibm.com>,
Alexander Gordeev <agordeev@linux.ibm.com>,
linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
Tariq Toukan <tariqt@nvidia.com>,
"Ionut Nechita (Wind River)" <ionut.nechita@windriver.com>
Subject: Re: [PATCH v3 0/2] PCI/IOV: Fix deadlock when removing PF with enabled SR-IOV
Date: Mon, 23 Feb 2026 19:34:49 +0100
Message-ID: <13979db1-4513-4841-8595-b30e42eb1769@nvidia.com>
In-Reply-To: <20260223173346.GD25740@p1gen4-pw042f0m>

On 23.02.26 18:33, Benjamin Block wrote:
> On Mon, Feb 23, 2026 at 03:10:35PM +0100, Dragos Tatulea wrote:
>> After pulling in these commits in our internal tree we can see the
>> lockdep splat from below in many internal tests. We are still trying to
>> find an easy repro for this. We had to internally revert both of them.
>>
>> I noticed some similar discussion in another thread [1] but there it
>> seems that these changes are actually fixing the issue which is not
>> the case for us.
>>
>> ------------[ cut here ]------------
>> WARNING: drivers/pci/remove.c:130 at pci_stop_and_remove_bus_device+0x39/0x40, CPU#2: modprobe/12956
>> Modules linked in: mlx5_core(-) act_tunnel_key vxlan dummy act_mirred act_gact cls_flower act_police act_ct nf_flow_table [...]
>> CPU: 2 UID: 0 PID: 12956 Comm: modprobe Not tainted 6.19.0net_next_e834b5e #1 PREEMPT
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>> RIP: 0010:pci_stop_and_remove_bus_device+0x39/0x40
>> Code: [...]
>> RSP: 0018:ffff888164c9fd10 EFLAGS: 00010246
>> RAX: 0000000000000000 RBX: ffff888188ff2000 RCX: 0000000000000001
>> RDX: 0000000000000046 RSI: ffffffff8307e068 RDI: ffff88816bf4c9c0
>> RBP: ffff888188ff2000 R08: 00000000000000f4 R09: ffff88816bf4c080
>> R10: 0000000000000001 R11: 0000000000000003 R12: 0000000000000000
>> R13: ffff888164c9fd27 R14: 0000000000000002 R15: 0000000000000000
>> FS: 00007f52364bd740(0000) GS:ffff8885a9019000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00005622dbf749d8 CR3: 0000000169132004 CR4: 0000000000372eb0
>> Call Trace:
>> <TASK>
>> pci_iov_remove_virtfn+0xbd/0x120
>> sriov_disable+0x30/0xe0
>> mlx5_sriov_disable+0x50/0xa0 [mlx5_core]
>> remove_one+0x68/0xe0 [mlx5_core]
>> pci_device_remove+0x39/0xa0
>> device_release_driver_internal+0x1e4/0x240
>> driver_detach+0x47/0x90
>> bus_remove_driver+0x84/0x110
>> pci_unregister_driver+0x3b/0x90
>
> This looks pretty much like what Ionut is trying to fix in
> v1: https://lore.kernel.org/linux-pci/20260214193235.262219-3-ionut.nechita@windriver.com/T/
> v2: https://lore.kernel.org/linux-pci/20260219212648.82606-1-ionut.nechita@windriver.com/T/
>
> Maybe try giving those patches a spin. I think one easy way to hit this sort
> of thing is to try unbinding a PF that has 1 or more VFs attached to it from
> some device driver. The "trick" is that SR-IOV has to be active.
Thanks for the pointer. Will try it.

Thanks,
Dragos