* [PATCH] nvme-pci: serialize polling interrupt queue with shutdown
From: Keith Busch @ 2026-05-14 14:45 UTC (permalink / raw)
To: linux-nvme, hch; +Cc: Keith Busch, Bjorn Helgaas
From: Keith Busch <kbusch@kernel.org>
Polling an interrupt driven completion queue temporarily disables the
irq. If this occurs concurrently with another thread disabling the
device, the irq vector may have been freed, which makes it available for
reuse. Re-enabling the irq after polling the queue may then reference a
stale irq.
Fix this race by ensuring nvme_poll_irqdisable() cannot run
concurrently with nvme_dev_disable(), and skip polling the completion
queue if the queue has already been disabled.
Reported-by: Bjorn Helgaas <helgaas@kernel.org>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
drivers/nvme/host/pci.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 139a10cd687f9..34845d73cb3ab 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1885,8 +1885,12 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req)
 	 */
 	if (test_bit(NVMEQ_POLLED, &nvmeq->flags))
 		nvme_poll(req->mq_hctx, NULL);
-	else
-		nvme_poll_irqdisable(nvmeq);
+	else {
+		mutex_lock(&dev->shutdown_lock);
+		if (test_bit(NVMEQ_ENABLED, &nvmeq->flags))
+			nvme_poll_irqdisable(nvmeq);
+		mutex_unlock(&dev->shutdown_lock);
+	}
 
 	if (blk_mq_rq_state(req) != MQ_RQ_IN_FLIGHT) {
 		dev_warn(dev->ctrl.device,
--
2.53.0-Meta
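The locking pattern in the hunk above can be modeled in user space. The sketch below is not driver code: struct model_queue, the model_* functions, and the pthread mutex are hypothetical stand-ins for dev->shutdown_lock, the NVMEQ_ENABLED flag, nvme_dev_disable(), and the patched branch of nvme_timeout(). It only illustrates that the enabled check and the poll happen under the same lock that the disable path takes.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the driver's queue state. */
struct model_queue {
	pthread_mutex_t shutdown_lock;	/* models dev->shutdown_lock */
	bool enabled;			/* models the NVMEQ_ENABLED flag */
	int polls;			/* how many times the queue was polled */
};

#define MODEL_QUEUE_INIT { PTHREAD_MUTEX_INITIALIZER, true, 0 }

/* Models nvme_poll_irqdisable(): only safe while the irq is still valid. */
static void model_poll_irqdisable(struct model_queue *q)
{
	q->polls++;
}

/* Models the shutdown path: the queue is marked disabled under the lock. */
static void model_dev_disable(struct model_queue *q)
{
	pthread_mutex_lock(&q->shutdown_lock);
	q->enabled = false;	/* past this point the irq may be freed */
	pthread_mutex_unlock(&q->shutdown_lock);
}

/* Models the patched timeout branch. Returns true if the queue was polled. */
static bool model_timeout_poll(struct model_queue *q)
{
	bool polled = false;

	pthread_mutex_lock(&q->shutdown_lock);
	if (q->enabled) {
		model_poll_irqdisable(q);
		polled = true;
	}
	pthread_mutex_unlock(&q->shutdown_lock);
	return polled;
}
```

Because the check and the poll sit inside one critical section, the disable path cannot slip in between them, which is the window the commit message describes.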
* Re: [PATCH] nvme-pci: serialize polling interrupt queue with shutdown
From: Christoph Hellwig @ 2026-05-15 4:29 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-nvme, hch, Keith Busch, Bjorn Helgaas
On Thu, May 14, 2026 at 07:45:44AM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> Polling an interrupt driven completion queue temporarily disables the
> irq. If this occurs concurrently with another thread disabling the
> device, the irq vector may have been freed, which makes it available for
> reuse. Re-enabling the irq after polling the queue may then reference a
> stale irq.
>
> Fix this race by ensuring nvme_poll_irqdisable() cannot run
> concurrently with nvme_dev_disable(), and skip polling the completion
> queue if the queue has already been disabled.
Do we need the same change in nvme_suspend_queue? I.e., should the check
and locking be moved into nvme_poll_irqdisable?
* Re: [PATCH] nvme-pci: serialize polling interrupt queue with shutdown
From: Keith Busch @ 2026-05-15 13:29 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Keith Busch, linux-nvme, Bjorn Helgaas
On Fri, May 15, 2026 at 06:29:41AM +0200, Christoph Hellwig wrote:
> On Thu, May 14, 2026 at 07:45:44AM -0700, Keith Busch wrote:
> > From: Keith Busch <kbusch@kernel.org>
> >
> > Polling an interrupt driven completion queue temporarily disables the
> > irq. If this occurs concurrently with another thread disabling the
> > device, the irq vector may have been freed, which makes it available for
> > reuse. Re-enabling the irq after polling the queue may then reference a
> > stale irq.
> >
> > Fix this race by ensuring nvme_poll_irqdisable() cannot run
> > concurrently with nvme_dev_disable(), and skip polling the completion
> > queue if the queue has already been disabled.
>
> Do we need the same change in nvme_suspend_queue? I.e., should the check
> and locking be moved into nvme_poll_irqdisable?
nvme_suspend_queue is called from only one place, and that caller already
holds the same lock, so the change isn't necessary there. And we can't do
the locking within nvme_poll_irqdisable since nvme_dev_disable calls it
with the lock already held too. I can add lockdep asserts to make the
expectations clear, though.
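To illustrate the point about asserting rather than locking inside the poll helper, here is a minimal user-space sketch. Everything in it is hypothetical: struct tracked_lock, its acquire/release helpers, poll_locked(), and disable_and_poll() are stand-ins, and the held flag is a much simplified analogue of what the kernel's lockdep_assert_held() checks (it records that the lock is held, not which thread holds it).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical lock wrapper that remembers whether it is held. */
struct tracked_lock {
	pthread_mutex_t mutex;
	bool held;	/* written only while the mutex is owned */
};

#define TRACKED_LOCK_INIT { PTHREAD_MUTEX_INITIALIZER, false }

static void tracked_lock_acquire(struct tracked_lock *l)
{
	pthread_mutex_lock(&l->mutex);
	l->held = true;
}

static void tracked_lock_release(struct tracked_lock *l)
{
	l->held = false;
	pthread_mutex_unlock(&l->mutex);
}

/* Stand-in for the poll helper: documents the lock, does not take it. */
static int poll_locked(struct tracked_lock *l, int *completions)
{
	assert(l->held);	/* simplified analogue of lockdep_assert_held() */
	return ++*completions;
}

/* Stand-in for the disable path, which polls with the lock already held. */
static int disable_and_poll(struct tracked_lock *l, int *completions)
{
	int seen;

	tracked_lock_acquire(l);
	/* If poll_locked() acquired the non-recursive lock itself, this
	 * call would block on a lock its own caller already owns. */
	seen = poll_locked(l, completions);
	tracked_lock_release(l);
	return seen;
}
```

The assertion keeps the helper callable from both paths while still making the locking expectation explicit to every caller.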