linux-scsi.vger.kernel.org archive mirror
From: Keith Busch <keith.busch@intel.com>
To: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Jens Axboe <axboe@fb.com>, Christoph Hellwig <hch@lst.de>,
	James Bottomley <jejb@linux.vnet.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Mike Snitzer <snitzer@redhat.com>,
	Doug Ledford <dledford@redhat.com>,
	Ming Lin <ming.l@ssi.samsung.com>,
	Laurence Oberman <loberman@redhat.com>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>
Subject: Re: [PATCH v3 0/11] Fix race conditions related to stopping block layer queues
Date: Thu, 20 Oct 2016 10:52:24 -0400
Message-ID: <20161020145224.GA2771@localhost.localdomain>
In-Reply-To: <25418d7a-7e66-3b99-7532-669f7ebd58a6@sandisk.com>

On Wed, Oct 19, 2016 at 04:51:18PM -0700, Bart Van Assche wrote:
> 
> I assume that line 498 in blk-mq.c corresponds to BUG_ON(blk_queued_rq(rq))?
> Anyway, it seems to me like this is a bug in the NVMe code and also that
> this bug is completely unrelated to my patch series. In nvme_complete_rq() I
> see that blk_mq_requeue_request() is called. I don't think this is allowed
> from the context of nvme_cancel_request() because blk_mq_requeue_request()
> assumes that a request has already been removed from the request list.
> However, neither blk_mq_tagset_busy_iter() nor nvme_cancel_request() remove
> a request from the request list before nvme_complete_rq() is called. I think
> this is what triggers the BUG_ON() statement in blk_mq_requeue_request().
> Have you noticed that e.g. the scsi-mq code only calls
> blk_mq_requeue_request() after __blk_mq_end_request() has finished? Have you
> considered to follow the same approach in nvme_cancel_request()?

Both nvme and scsi requeue through their mq_ops 'complete' callback, so
nvme is similarly waiting for __blk_mq_end_request before requesting to
requeue. The problem, I think, is nvme's IO cancelling path is observing
active requests that it's requeuing from the queue_rq path.

Patch [11/11] kicks the requeue list unconditionally. This restarts queues
the driver had just quiesced a moment before, restarting those requests,
but the driver isn't ready to handle them. When the driver ultimately
unbinds from the device, it requeues those requests a second time.

Either the requeuing can't kick the requeue work when quiesced, or the
shutdown needs to quiesce even when it hasn't restarted the queues.
Either patch below appears to fix the issue.
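
To make the first option concrete, here is a rough userspace sketch of
the idea. Everything prefixed toy_ is made up for illustration and is
not kernel code; it only assumes that kicking the requeue list restarts
a stopped queue, as described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a request queue; not the kernel's struct request_queue. */
struct toy_queue {
	bool stopped;     /* set when the driver quiesces the queue */
	int requeue_len;  /* requests parked on the requeue list */
	int dispatched;   /* requests handed back to the driver */
};

/* Models the requeue work: restarts the queue and dispatches what was parked. */
static void toy_kick_requeue_list(struct toy_queue *q)
{
	q->stopped = false;
	q->dispatched += q->requeue_len;
	q->requeue_len = 0;
}

/* Models blk_mq_requeue_request(req, kick) for a single request. */
static void toy_requeue(struct toy_queue *q, bool kick)
{
	q->requeue_len++;
	if (kick)
		toy_kick_requeue_list(q);
}

/*
 * Start from a quiesced queue and requeue one request.
 * gate_kick_on_stopped == false models the unconditional kick,
 * gate_kick_on_stopped == true models kicking only when not stopped.
 * Returns whether the queue is still stopped afterwards.
 */
static bool toy_stays_quiesced(bool gate_kick_on_stopped)
{
	struct toy_queue q = { .stopped = true, .requeue_len = 0, .dispatched = 0 };
	bool kick = gate_kick_on_stopped ? !q.stopped : true;

	toy_requeue(&q, kick);
	assert(q.requeue_len + q.dispatched == 1);  /* the request is never lost */
	return q.stopped;
}
```

With the unconditional kick the toy queue comes back to life and the
request is dispatched to a driver that isn't ready for it; with the
gated kick the request just sits on the requeue list until the queue is
restarted.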

---
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index ccd9cc5..078530c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -201,7 +201,7 @@ static struct nvme_ns *nvme_get_ns_from_disk(struct gendisk *disk)
 
 void nvme_requeue_req(struct request *req)
 {
-	blk_mq_requeue_request(req, true);
+	blk_mq_requeue_request(req, !blk_mq_queue_stopped(req->q));
 }
 EXPORT_SYMBOL_GPL(nvme_requeue_req);
--

--- 
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 4b30fa2..a05da98 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1681,10 +1681,9 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 	del_timer_sync(&dev->watchdog_timer);
 
 	mutex_lock(&dev->shutdown_lock);
-	if (pci_is_enabled(to_pci_dev(dev->dev))) {
-		nvme_stop_queues(&dev->ctrl);
+	nvme_stop_queues(&dev->ctrl);
+	if (pci_is_enabled(to_pci_dev(dev->dev)))
 		csts = readl(dev->bar + NVME_REG_CSTS);
-	}
 
 	queues = dev->online_queues - 1;
 	for (i = dev->queue_count - 1; i > 0; i--)
--
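
For the second option, the change in ordering can be pictured with
another toy model. Again, the toy_ names are invented for illustration,
and the scenario (device already disabled, queues restarted since by a
requeue kick) is my assumption about how the failure is reached.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the device state; not the driver's struct nvme_dev. */
struct toy_dev {
	bool pci_enabled;     /* models pci_is_enabled() */
	bool queues_stopped;  /* models the state left by nvme_stop_queues() */
	bool csts_read;       /* whether the (toy) status register was read */
};

/* Before the patch: queues are stopped only while the PCI device is enabled. */
static void toy_disable_before(struct toy_dev *dev)
{
	if (dev->pci_enabled) {
		dev->queues_stopped = true;
		dev->csts_read = true;
	}
}

/* After the patch: always stop the queues; only the register read is gated. */
static void toy_disable_after(struct toy_dev *dev)
{
	dev->queues_stopped = true;
	if (dev->pci_enabled)
		dev->csts_read = true;
}

/*
 * Assumed scenario: the device was already disabled and the queues have
 * been restarted since. Returns true if this disable leaves them stopped.
 */
static bool toy_second_disable_stops_queues(bool patched)
{
	struct toy_dev dev = {
		.pci_enabled = false,
		.queues_stopped = false,
		.csts_read = false,
	};

	if (patched)
		toy_disable_after(&dev);
	else
		toy_disable_before(&dev);
	assert(!dev.csts_read);  /* register is never touched while PCI is disabled */
	return dev.queues_stopped;
}
```

In this model the unpatched path leaves the queues running on the
second disable, while the patched path stops them regardless of the PCI
state.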

