linux-block.vger.kernel.org archive mirror
* [PATCH] blk-mq: Fix cpu indexing error in blk_mq_alloc_request_hctx()
@ 2019-10-23 17:57 James Smart
  2019-10-24  9:28 ` Ming Lei
  2019-10-25 20:51 ` Sagi Grimberg
  0 siblings, 2 replies; 10+ messages in thread
From: James Smart @ 2019-10-23 17:57 UTC
  To: linux-block; +Cc: James Smart, Shagun Agrawal, Christoph Hellwig

During the following test scenario:
- Offline a cpu
- Load the lpfc driver, which auto-discovers NVMe devices. For a new
  nvme device, the lpfc/nvme_fc transport can request up to
  num_online_cpus() worth of nr_hw_queues. The target in
  this case allowed at least that many nvme queues.
the system encountered the following crash:

 BUG: unable to handle kernel paging request at 00003659d33953a8
 ...
 Workqueue: nvme-wq nvme_fc_connect_ctrl_work [nvme_fc]
 RIP: 0010:blk_mq_get_request+0x21d/0x3c0
 ...
 blk_mq_alloc_request_hctx+0xef/0x140
 nvme_alloc_request+0x32/0x80 [nvme_core]
 __nvme_submit_sync_cmd+0x4a/0x1c0 [nvme_core]
 nvmf_connect_io_queue+0x130/0x1a0 [nvme_fabrics]
 nvme_fc_connect_io_queues+0x285/0x2b0 [nvme_fc]
 nvme_fc_create_association+0x8ea/0x9c0 [nvme_fc]
 nvme_fc_connect_ctrl_work+0x19/0x50 [nvme_fc]
 ...

A commit a while ago simplified queue mapping and replaced the use of
cpumask_first() with cpumask_first_and(). The issue is that if
cpumask_first_and() does not find any _intersecting_ cpus, it returns
nr_cpu_ids. nr_cpu_ids isn't a valid index for the per_cpu_ptr() lookup
done in __blk_mq_get_ctx().

Considered reverting back to cpumask_first(), but instead followed the
logic in blk_mq_first_mapped_cpu(): check the result against nr_cpu_ids
and fall back to cpumask_first() only when no intersecting cpu is found.

Fixes: 20e4d8139319 ("blk-mq: simplify queue mapping & schedule with each possisble CPU")
Signed-off-by: Shagun Agrawal <shagun.agrawal@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
CC: Christoph Hellwig <hch@lst.de>
---
 block/blk-mq.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8538dc415499..0b06b4ea57f1 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -461,6 +461,8 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 		return ERR_PTR(-EXDEV);
 	}
 	cpu = cpumask_first_and(alloc_data.hctx->cpumask, cpu_online_mask);
+	if (cpu >= nr_cpu_ids)
+		cpu = cpumask_first(alloc_data.hctx->cpumask);
 	alloc_data.ctx = __blk_mq_get_ctx(q, cpu);
 
 	rq = blk_mq_get_request(q, NULL, &alloc_data);
-- 
2.13.7


Thread overview: 10+ messages
2019-10-23 17:57 [PATCH] blk-mq: Fix cpu indexing error in blk_mq_alloc_request_hctx() James Smart
2019-10-24  9:28 ` Ming Lei
2019-10-24 13:02   ` Jens Axboe
2019-10-24 18:53     ` James Smart
2019-10-25  7:22       ` Ming Lei
2019-10-25 20:26         ` Sagi Grimberg
2019-10-25 22:20           ` Ming Lei
2019-10-25 22:33             ` Sagi Grimberg
2019-10-27  7:23               ` Ming Lei
2019-10-25 20:51 ` Sagi Grimberg