* [PATCHv2] nvme-pci: fix stuck reset on concurrent DPC and HP
@ 2025-03-07 23:26 Keith Busch
2025-03-08 7:51 ` Nilay Shroff
2025-03-10 13:23 ` Christoph Hellwig
0 siblings, 2 replies; 3+ messages in thread
From: Keith Busch @ 2025-03-07 23:26 UTC (permalink / raw)
To: linux-nvme; +Cc: hch, sagi, nilay, Keith Busch
From: Keith Busch <kbusch@kernel.org>
The PCIe DPC handling has the nvme driver quiesce the device, attempt to
restart it, then wait for that restart to complete.
The DPC event also toggles the PCIe link. If the slot doesn't have
out-of-band presence detection, this will trigger a pciehp
re-enumeration.
The DPC's error handling that calls nvme_error_resume is holding the
device lock while this happens. This lock prevents pciehp's request to
disconnect the driver from proceeding.
Meanwhile the nvme's reset_work can't make forward progress because its
device isn't there anymore with admin IO, and the timeout handler won't
do anything to fix it because the device is undergoing error handling.
End result: deadlocked.
Fix this by having the timeout handler disable the nvme queueus for a
disconnected PCIe device. We're relying on an IO timeout to unblock
this, which is a minute by default.
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
v1->v2:
Leveraged the state machine to make the patch that much simpler.
drivers/nvme/host/pci.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 640590b217282..710f3dfef3663 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1411,9 +1411,12 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req)
struct nvme_dev *dev = nvmeq->dev;
struct request *abort_req;
struct nvme_command cmd = { };
+ struct pci_dev *pdev = to_pci_dev(dev->dev);
u32 csts = readl(dev->bar + NVME_REG_CSTS);
u8 opcode;
+ if (pci_dev_is_disconnected(pdev))
+ nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING);
if (nvme_state_terminal(&dev->ctrl))
goto disable;
@@ -1421,7 +1424,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req)
* the recovery mechanism will surely fail.
*/
mb();
- if (pci_channel_offline(to_pci_dev(dev->dev)))
+ if (pci_channel_offline(pdev))
return BLK_EH_RESET_TIMER;
/*
--
2.47.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCHv2] nvme-pci: fix stuck reset on concurrent DPC and HP
2025-03-07 23:26 [PATCHv2] nvme-pci: fix stuck reset on concurrent DPC and HP Keith Busch
@ 2025-03-08 7:51 ` Nilay Shroff
2025-03-10 13:23 ` Christoph Hellwig
1 sibling, 0 replies; 3+ messages in thread
From: Nilay Shroff @ 2025-03-08 7:51 UTC (permalink / raw)
To: Keith Busch, linux-nvme; +Cc: hch, sagi, Keith Busch
On 3/8/25 4:56 AM, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> The PCIe DPC handling has the nvme driver quiesce the device, attempt to
> restart it, then wait for that restart to complete.
>
> The DPC event also toggles the PCIe link. If the slot doesn't have
> out-of-band presence detection, this will trigger a pciehp
> re-enumeration.
>
> The DPC's error handling that calls nvme_error_resume is holding the
> device lock while this happens. This lock prevents pciehp's request to
> disconnect the driver from proceeding.
>
> Meanwhile the nvme's reset_work can't make forward progress because its
> device isn't there anymore with admin IO, and the timeout handler won't
> do anything to fix it because the device is undergoing error handling.
>
> End result: deadlocked.
>
> Fix this by having the timeout handler disable the nvme queueus for a
> disconnected PCIe device. We're relying on an IO timeout to unblock
> this, which is a minute by default.
>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
Loos good to me:
Tested-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCHv2] nvme-pci: fix stuck reset on concurrent DPC and HP
2025-03-07 23:26 [PATCHv2] nvme-pci: fix stuck reset on concurrent DPC and HP Keith Busch
2025-03-08 7:51 ` Nilay Shroff
@ 2025-03-10 13:23 ` Christoph Hellwig
1 sibling, 0 replies; 3+ messages in thread
From: Christoph Hellwig @ 2025-03-10 13:23 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-nvme, hch, sagi, nilay, Keith Busch
On Fri, Mar 07, 2025 at 03:26:18PM -0800, Keith Busch wrote:
> struct request *abort_req;
> struct nvme_command cmd = { };
> + struct pci_dev *pdev = to_pci_dev(dev->dev);
> u32 csts = readl(dev->bar + NVME_REG_CSTS);
> u8 opcode;
>
> + if (pci_dev_is_disconnected(pdev))
> + nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING);
Can you add a comment here (based on the commit message) explaining the
logic here, as it is anything but obvious?
Otherwise this looks good to me.
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-03-10 14:49 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-07 23:26 [PATCHv2] nvme-pci: fix stuck reset on concurrent DPC and HP Keith Busch
2025-03-08 7:51 ` Nilay Shroff
2025-03-10 13:23 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox