All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] NVMe: Re-attach waitqueue following device reset
@ 2014-10-31 19:09 Sam Bradshaw
  2014-11-04 14:55 ` Keith Busch
  0 siblings, 1 reply; 4+ messages in thread
From: Sam Bradshaw @ 2014-10-31 19:09 UTC (permalink / raw)


Bios queued to sq_cong prior to and during a device reset may not 
get re-issued or may get issued very slowly following the completion 
of the reset sequence.  The waitqueue was detached as part of the 
sequence and never re-attached, forcing all new make_request() bios 
to be queued and the queue to be flushed out without the help of the 
completion path waking the kthread.

This patch re-attaches the waitqueue as part of the reset process.

Signed-off-by: Sam Bradshaw <sbradshaw at micron.com>
---
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index b85a2a0..8133ee8 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -2329,6 +2329,23 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	nvme_free_queues(dev, nr_io_queues + 1);
 	nvme_assign_io_queues(dev);
 
+	/* Re-attach the waitqueue */
+	for (i = dev->queue_count - 1; i >= 1; i--) {
+		struct nvme_queue *nvmeq = raw_nvmeq(dev, i);
+
+		if (nvmeq) {
+			spin_lock_irq(&nvmeq->q_lock);
+			init_waitqueue_head(&nvmeq->sq_full);
+			init_waitqueue_entry(&nvmeq->sq_cong_wait,
+								nvme_thread);
+			if (bio_list_peek(&nvmeq->sq_cong) ||
+						!list_empty(&nvmeq->iod_bio))
+				add_wait_queue(&nvmeq->sq_full,
+							&nvmeq->sq_cong_wait);
+			spin_unlock_irq(&nvmeq->q_lock);
+		}
+	}
+
 	return 0;
 
  free_queues:

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH] NVMe: Re-attach waitqueue following device reset
  2014-10-31 19:09 [PATCH] NVMe: Re-attach waitqueue following device reset Sam Bradshaw
@ 2014-11-04 14:55 ` Keith Busch
  2014-11-04 21:48   ` Sam Bradshaw (sbradshaw)
  0 siblings, 1 reply; 4+ messages in thread
From: Keith Busch @ 2014-11-04 14:55 UTC (permalink / raw)


On Fri, 31 Oct 2014, Sam Bradshaw wrote:
> Bios queued to sq_cong prior to and during a device reset may not
> get re-issued or may get issued very slowly following the completion
> of the reset sequence.  The waitqueue was detached as part of the
> sequence and never re-attached, forcing all new make_request() bios
> to be queued and the queue to be flushed out without the help of the
> completion path waking the kthread.

Hi Sam,

Are you using the latest from linux-nvme tree? I could have sworn this
bug was fixed in commit:

http://git.infradead.org/users/willy/linux-nvme.git/commitdiff/3559b4d2159aaae42f0fc0fa22e40312c45d4606

Please let me know if there's still a gap.

> This patch re-attaches the waitqueue as part of the reset process.
>
> Signed-off-by: Sam Bradshaw <sbradshaw at micron.com>
> ---
> diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
> index b85a2a0..8133ee8 100644
> --- a/drivers/block/nvme-core.c
> +++ b/drivers/block/nvme-core.c
> @@ -2329,6 +2329,23 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
> 	nvme_free_queues(dev, nr_io_queues + 1);
> 	nvme_assign_io_queues(dev);
>
> +	/* Re-attach the waitqueue */
> +	for (i = dev->queue_count - 1; i >= 1; i--) {
> +		struct nvme_queue *nvmeq = raw_nvmeq(dev, i);
> +
> +		if (nvmeq) {
> +			spin_lock_irq(&nvmeq->q_lock);
> +			init_waitqueue_head(&nvmeq->sq_full);
> +			init_waitqueue_entry(&nvmeq->sq_cong_wait,
> +								nvme_thread);
> +			if (bio_list_peek(&nvmeq->sq_cong) ||
> +						!list_empty(&nvmeq->iod_bio))
> +				add_wait_queue(&nvmeq->sq_full,
> +							&nvmeq->sq_cong_wait);
> +			spin_unlock_irq(&nvmeq->q_lock);
> +		}
> +	}
> +
> 	return 0;
>
>  free_queues:

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH] NVMe: Re-attach waitqueue following device reset
  2014-11-04 14:55 ` Keith Busch
@ 2014-11-04 21:48   ` Sam Bradshaw (sbradshaw)
  2014-11-04 23:25     ` Busch, Keith
  0 siblings, 1 reply; 4+ messages in thread
From: Sam Bradshaw (sbradshaw) @ 2014-11-04 21:48 UTC (permalink / raw)




> -----Original Message-----
> From: Keith Busch [mailto:keith.busch at intel.com]
> Sent: Tuesday, November 04, 2014 6:55 AM
> To: Sam Bradshaw (sbradshaw)
> Cc: linux-nvme at lists.infradead.org
> Subject: Re: [PATCH] NVMe: Re-attach waitqueue following device reset
> 
> On Fri, 31 Oct 2014, Sam Bradshaw wrote:
> > Bios queued to sq_cong prior to and during a device reset may not get
> > re-issued or may get issued very slowly following the completion of
> > the reset sequence.  The waitqueue was detached as part of the
> > sequence and never re-attached, forcing all new make_request() bios
> to
> > be queued and the queue to be flushed out without the help of the
> > completion path waking the kthread.
> 
> Hi Sam,
> 
> Are you using the latest from linux-nvme tree? I could have sworn this
> bug was fixed in commit:
> 
> http://git.infradead.org/users/willy/linux-
> nvme.git/commitdiff/3559b4d2159aaae42f0fc0fa22e40312c45d4606
> 
> Please let me know if there's still a gap.

Yes, I was apparently testing against an old tree.  The commit above
fixes the problem I was observing.

I still see queued bio's getting lost if a successful device reset
is followed by a failed resume.  I'm working on a patch for that case.

-Sam

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH] NVMe: Re-attach waitqueue following device reset
  2014-11-04 21:48   ` Sam Bradshaw (sbradshaw)
@ 2014-11-04 23:25     ` Busch, Keith
  0 siblings, 0 replies; 4+ messages in thread
From: Busch, Keith @ 2014-11-04 23:25 UTC (permalink / raw)


> I still see queued bio's getting lost if a successful device reset
> is followed by a failed resume.  I'm working on a patch for that case.

Oh, that doesn't sound good. If the device fails to resume, the driver requests the pci layer remove it. That then calls the driver's .remove which clears the bio lists after the queue is freed.

This sequence may deadlock when bios are queued if you don't have this commit:

http://git.infradead.org/users/willy/linux-nvme.git/commitdiff/74bae7e0dd70f3eee05d577d3d25fbcd54af4228

But if you do have that commit, then something else is amiss.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2014-11-04 23:25 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-10-31 19:09 [PATCH] NVMe: Re-attach waitqueue following device reset Sam Bradshaw
2014-11-04 14:55 ` Keith Busch
2014-11-04 21:48   ` Sam Bradshaw (sbradshaw)
2014-11-04 23:25     ` Busch, Keith

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.