From: swise@opengridcomputing.com (Steve Wise)
Subject: [PATCH 9/9] [RFC] nvme: Fix a race condition
Date: Tue, 27 Sep 2016 11:56:16 -0500
Message-ID: <016a01d218e0$0fe7dc00$2fb79400$@opengridcomputing.com>
In-Reply-To: <9d8e0f32-6703-cf23-d424-bcecb65c2a26@sandisk.com>

> On 09/27/2016 09:31 AM, Steve Wise wrote:
> >> @@ -2079,11 +2075,15 @@ EXPORT_SYMBOL_GPL(nvme_kill_queues);
> >>  void nvme_stop_queues(struct nvme_ctrl *ctrl)
> >>  {
> >>  	struct nvme_ns *ns;
> >> +	struct request_queue *q;
> >>
> >>  	mutex_lock(&ctrl->namespaces_mutex);
> >>  	list_for_each_entry(ns, &ctrl->namespaces, list) {
> >> -		blk_mq_cancel_requeue_work(ns->queue);
> >> -		blk_mq_stop_hw_queues(ns->queue);
> >> +		q = ns->queue;
> >> +		blk_quiesce_queue(q);
> >> +		blk_mq_cancel_requeue_work(q);
> >> +		blk_mq_stop_hw_queues(q);
> >> +		blk_resume_queue(q);
> >>  	}
> >>  	mutex_unlock(&ctrl->namespaces_mutex);
> >
> > Hey Bart, should nvme_stop_queues() really be resuming the blk queue?
> 
> Hello Steve,
> 
> Would you perhaps prefer that blk_resume_queue(q) is called from
> nvme_start_queues()? I think that would make the NVMe code harder to
> review. 

I'm still learning the blk code (and the nvme code :)), but I would think
blk_resume_queue() would cause requests to start being submitted on the NVMe
queues, which I believe shouldn't happen while they are stopped.  I'm currently
debugging a problem where requests are submitted to the nvme-rdma driver while
it has supposedly stopped all the nvme and blk-mq queues.  I tried your series
at Christoph's request to see if it resolved my problem, but it didn't.

> The above code won't cause any unexpected side effects if an
> NVMe namespace is removed after nvme_stop_queues() has been called and
> before nvme_start_queues() is called. Moving the blk_resume_queue(q)
> call into nvme_start_queues() will only work as expected if no
> namespaces are added nor removed between the nvme_stop_queues() and
> nvme_start_queues() calls. I'm not familiar enough with the NVMe code to
> know whether or not this change is safe ...
> 

I'll have to look and see if new namespaces can be added/deleted while an nvme
controller is in the RECONNECTING state.  In the meantime, I'm going to move
the blk_resume_queue() call to nvme_start_queues() and see if it helps my problem.

Christoph:  Thoughts?

Steve.
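
For readers following the thread, the variant discussed above (quiesce in
nvme_stop_queues(), resume only in nvme_start_queues()) can be sketched as a
small user-space model. This is not kernel code and not the patch itself: the
structs are stand-ins, the namespace list is a plain array with no
namespaces_mutex, blk_mq_cancel_requeue_work() is left out, and the
quiesce_depth counter is an assumption about how blk_quiesce_queue() and
blk_resume_queue() pair up, since patch 7/9 is not shown in this message.
queue_can_dispatch() is a hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-ins; NOT the kernel definitions. */
struct request_queue {
	int  quiesce_depth;	/* assumed: > 0 means dispatch is blocked */
	bool hw_stopped;	/* models stopped hardware queues */
};

struct nvme_ns {
	struct request_queue *queue;
};

struct nvme_ctrl {
	struct nvme_ns *namespaces;	/* array instead of a locked list */
	size_t nr_namespaces;
};

static void blk_quiesce_queue(struct request_queue *q)
{
	assert(q != NULL);
	q->quiesce_depth++;
}

/* Returns false if there was no matching blk_quiesce_queue() call. */
static bool blk_resume_queue(struct request_queue *q)
{
	assert(q != NULL);
	if (q->quiesce_depth == 0)
		return false;
	q->quiesce_depth--;
	return true;
}

static void blk_mq_stop_hw_queues(struct request_queue *q)
{
	q->hw_stopped = true;
}

static void blk_mq_start_stopped_hw_queues(struct request_queue *q, bool async)
{
	(void)async;		/* ignored in this model */
	q->hw_stopped = false;
}

/* Hypothetical helper: may a new request reach the driver? */
static bool queue_can_dispatch(const struct request_queue *q)
{
	return q->quiesce_depth == 0 && !q->hw_stopped;
}

/* Variant being tried: quiesce and stop, but do not resume here. */
static void nvme_stop_queues(struct nvme_ctrl *ctrl)
{
	for (size_t i = 0; i < ctrl->nr_namespaces; i++) {
		struct request_queue *q = ctrl->namespaces[i].queue;

		blk_quiesce_queue(q);
		blk_mq_stop_hw_queues(q);
	}
}

/* Resume and restart; returns how many resumes had no matching quiesce. */
static size_t nvme_start_queues(struct nvme_ctrl *ctrl)
{
	size_t unbalanced = 0;

	for (size_t i = 0; i < ctrl->nr_namespaces; i++) {
		struct request_queue *q = ctrl->namespaces[i].queue;

		if (!blk_resume_queue(q))
			unbalanced++;
		blk_mq_start_stopped_hw_queues(q, true);
	}
	return unbalanced;
}
```

In this model a queue stays blocked between the stop and start calls, which is
the behaviour wanted above. A namespace that appears between the two calls is
resumed without ever having been quiesced, so nvme_start_queues() reports it
as unbalanced; that is the add/remove concern raised in the quoted reply.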

Thread overview: 31+ messages
2016-09-26 18:25 [PATCH 0/9] Introduce blk_quiesce_queue() and blk_resume_queue() Bart Van Assche
2016-09-26 18:26 ` [PATCH 1/9] blk-mq: Introduce blk_mq_queue_stopped() Bart Van Assche
2016-09-27  6:20   ` Hannes Reinecke
2016-09-27  7:38   ` Johannes Thumshirn
2016-09-26 18:26 ` [PATCH 2/9] dm: Fix a race condition related to stopping and starting queues Bart Van Assche
2016-09-27  6:21   ` Hannes Reinecke
2016-09-27  7:47   ` Johannes Thumshirn
2016-09-26 18:27 ` [PATCH 3/9] [RFC] nvme: Use BLK_MQ_S_STOPPED instead of QUEUE_FLAG_STOPPED in blk-mq code Bart Van Assche
2016-09-26 18:27 ` [PATCH 4/9] block: Move blk_freeze_queue() and blk_unfreeze_queue() code Bart Van Assche
2016-09-27  6:26   ` Hannes Reinecke
2016-09-27  7:52     ` Johannes Thumshirn
2016-09-26 18:27 ` [PATCH 5/9] block: Extend blk_freeze_queue_start() to the non-blk-mq path Bart Van Assche
2016-09-27  7:50   ` Johannes Thumshirn
2016-09-27 13:22   ` Ming Lei
2016-09-27 14:42     ` Bart Van Assche
2016-09-27 15:55       ` Bart Van Assche
2016-09-26 18:28 ` [PATCH 6/9] block: Rename mq_freeze_wq and mq_freeze_depth Bart Van Assche
2016-09-27  7:51   ` Johannes Thumshirn
2016-09-26 18:28 ` [PATCH 7/9] blk-mq: Introduce blk_quiesce_queue() and blk_resume_queue() Bart Van Assche
2016-09-26 18:28 ` [PATCH 8/9] SRP transport: Port srp_wait_for_queuecommand() to scsi-mq Bart Van Assche
2016-09-26 18:28 ` [PATCH 9/9] [RFC] nvme: Fix a race condition Bart Van Assche
2016-09-27 16:31   ` Steve Wise
2016-09-27 16:43     ` Bart Van Assche
2016-09-27 16:56       ` James Bottomley
2016-09-27 17:09         ` Bart Van Assche
2016-09-28 14:23           ` Steve Wise
2016-09-27 16:56       ` Steve Wise [this message]
2016-09-26 18:33 ` [PATCH 0/9] Introduce blk_quiesce_queue() and blk_resume_queue() Mike Snitzer
2016-09-26 18:46   ` Bart Van Assche
2016-09-26 22:26   ` Bart Van Assche
2016-10-11 16:27 ` Laurence Oberman
