linux-block.vger.kernel.org archive mirror
* [PATCH 0/6] NVMe related fixes
@ 2017-01-04 22:41 Keith Busch
From: Keith Busch @ 2017-01-04 22:41 UTC
  To: linux-nvme, linux-block, Jens Axboe,
	Christoph Hellwig, Thomas Gleixner
  Cc: Marc Merlin, Keith Busch

I've been looking into an old regression originally reported here:

  http://lists.infradead.org/pipermail/linux-nvme/2016-August/005699.html

The root cause is that blk-mq's CPU hotplug notifier gets stuck indefinitely
during suspend, waiting on requests that entered a stopped hardware context;
that hardware context will not be restarted until suspend completes.

I originally set out to unwind the requests and block on reentry,
but blk-mq doesn't support doing that: once a request enters a hardware
context, it needs to complete on that context. Since the context won't be
starting again, we need to do _something_ with those entered requests,
and unfortunately ending them in error is the simplest way to resolve
the deadlock.
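
To make the reasoning above concrete, here is a rough userspace sketch of
the situation. It is a toy model under stated assumptions, not the code in
this series: every name in it (toy_queue, toy_hctx, toy_fail_queued, and so
on) is hypothetical, and the single usage counter only loosely stands in
for how a freeze waits for entered requests to complete.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_QUEUED 8

/* Toy stand-in for a hardware context; NOT the kernel's structure. */
struct toy_hctx {
	bool stopped;            /* a stopped context dispatches nothing */
	int queued[MAX_QUEUED];  /* ids of requests parked on the context */
	int nr_queued;
};

/* Toy stand-in for a request queue (hypothetical, for illustration). */
struct toy_queue {
	int usage;               /* requests entered but not yet completed */
	struct toy_hctx hctx;
};

/* A request enters the queue and is parked on the hardware context. */
static bool toy_enter(struct toy_queue *q, int id)
{
	if (q->hctx.nr_queued == MAX_QUEUED)
		return false;
	q->hctx.queued[q->hctx.nr_queued++] = id;
	q->usage++;
	return true;
}

/* Normal dispatch: completes parked requests unless the context is stopped. */
static int toy_run_hctx(struct toy_queue *q)
{
	int done = 0;

	if (q->hctx.stopped)
		return 0;
	while (q->hctx.nr_queued > 0) {
		q->hctx.nr_queued--;
		q->usage--;
		done++;
	}
	return done;
}

/* A freeze can only finish once every entered request has completed. */
static bool toy_freeze_would_finish(const struct toy_queue *q)
{
	return q->usage == 0;
}

/* The workaround: end everything parked on a stopped context in error. */
static int toy_fail_queued(struct toy_queue *q)
{
	int failed = 0;

	while (q->hctx.stopped && q->hctx.nr_queued > 0) {
		int id = q->hctx.queued[--q->hctx.nr_queued];

		printf("request %d ended in error\n", id);
		q->usage--;
		failed++;
	}
	return failed;
}

/* Walk through the scenario; returns the number of requests failed. */
static int toy_demo(void)
{
	struct toy_queue q = { .usage = 0, .hctx = { .stopped = true } };
	int failed;

	toy_enter(&q, 1);
	toy_enter(&q, 2);

	/* Stopped context: nothing dispatches, so the freeze cannot finish. */
	assert(toy_run_hctx(&q) == 0);
	assert(!toy_freeze_would_finish(&q));

	/* Ending the parked requests in error lets the freeze make progress. */
	failed = toy_fail_queued(&q);
	assert(toy_freeze_would_finish(&q));
	return failed;
}
```

Calling toy_demo() from a main() walks through the stuck state and then the
error-completion path. The model deliberately has no way to pull a request
back out of the context and replay it later, mirroring the constraint that
an entered request has to complete on the context it entered.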

Alternatively, it would have been nice to avoid freezing at all by
leveraging the new blk_mq_quiesce_queue, but that wouldn't work when the
queue map needs to be redone...

Any feedback appreciated. Thanks!

Keith Busch (6):
  irq/affinity: Assign all online CPUs to vectors
  irq/affinity: Assign offline CPUs a vector
  nvme/pci: Start queues after tagset is updated
  blk-mq: Update queue map when changing queue count
  blk-mq: Fix freeze deadlock
  blk-mq: Remove unused variable

 block/blk-mq.c          | 86 +++++++++++++++++++++++++++++++++++++++++--------
 drivers/nvme/host/pci.c |  2 +-
 kernel/irq/affinity.c   | 17 ++++++++--
 3 files changed, 87 insertions(+), 18 deletions(-)

-- 
2.5.5

Thread overview: 22+ messages
2017-01-04 22:41 [PATCH 0/6] NVMe related fixes Keith Busch
2017-01-04 22:41 ` [PATCH 1/6] irq/affinity: Assign all online CPUs to vectors Keith Busch
2017-01-13 20:21   ` Sagi Grimberg
2017-01-23 18:30   ` Christoph Hellwig
2017-01-04 22:41 ` [PATCH 2/6] irq/affinity: Assign offline CPUs a vector Keith Busch
2017-01-08 10:01   ` Christoph Hellwig
2017-01-13 20:26     ` Sagi Grimberg
2017-01-04 22:41 ` [PATCH 3/6] nvme/pci: Start queues after tagset is updated Keith Busch
2017-01-13 20:38   ` Sagi Grimberg
2017-01-23 18:31   ` Christoph Hellwig
2017-01-04 22:41 ` [PATCH 4/6] blk-mq: Update queue map when changing queue count Keith Busch
2017-01-13 20:39   ` Sagi Grimberg
2017-01-23 18:32   ` Christoph Hellwig
2017-01-04 22:41 ` [PATCH 5/6] blk-mq: Fix queue freeze deadlock Keith Busch
2017-01-05  7:33   ` Bart Van Assche
2017-01-17 17:53     ` Keith Busch
2017-01-13 21:05   ` Sagi Grimberg
2017-01-17 18:00     ` Keith Busch
2017-01-19  7:54       ` Sagi Grimberg
2017-01-04 22:41 ` [PATCH 6/6] blk-mq: Remove unused variable Keith Busch
2017-01-08 10:02   ` Christoph Hellwig
2017-01-13 21:05   ` Sagi Grimberg
