* [PATCH 1/7] nvme: delete dev from dev_list in nvme_reset
From: Christoph Hellwig @ 2015-10-02 17:58 UTC
Device resets need to delete the device from the device list before
kicking off the reset and re-probe; otherwise the device gets added
to the list twice. nvme_reset is the only call site missing this
deletion at the moment, and this patch adds it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/block/nvme-core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index c4a9a80..91f2b20 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -3093,6 +3093,7 @@ static int nvme_reset(struct nvme_dev *dev)
spin_lock(&dev_list_lock);
if (!work_pending(&dev->reset_work)) {
+ list_del_init(&dev->node);
queue_work(nvme_workq, &dev->reset_work);
ret = 0;
}
--
1.9.1
* [PATCH 1/7] nvme: delete dev from dev_list in nvme_reset
From: Keith Busch @ 2015-10-02 21:13 UTC
On Fri, 2 Oct 2015, Christoph Hellwig wrote:
> Device resets need to delete the device from the device list before
> kicking off the reset and re-probe; otherwise the device gets added
> to the list twice. nvme_reset is the only call site missing this
> deletion at the moment, and this patch adds it.
A moment of panic when I saw this, but we're 'ok'. The reset handler calls
nvme_dev_shutdown() first, and that removes the device from the list,
so we won't actually see a controller added to the list twice.
Still, there's no reason to poll a device that's about to reset, so it's
a good change.
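
For reference, a simplified sketch of the removal path Keith describes,
as it looked in the 4.2-era driver (function bodies trimmed to the parts
relevant here; not the literal source):

static void nvme_dev_list_remove(struct nvme_dev *dev)
{
	spin_lock(&dev_list_lock);
	list_del_init(&dev->node);	/* drop from the polled dev_list */
	spin_unlock(&dev_list_lock);
	/* ... also stops nvme_kthread if the list went empty ... */
}

static void nvme_dev_shutdown(struct nvme_dev *dev)
{
	nvme_dev_list_remove(dev);	/* first thing: off the list */
	/* ... suspend/delete queues, disable the controller ... */
}

So by the time the reset work re-probes, the device is already off the
list and the re-add in the probe path cannot create a duplicate entry.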
Reviewed-by: Keith Busch <keith.busch@intel.com>
* [PATCH 1/7] nvme: delete dev from dev_list in nvme_reset
From: Christoph Hellwig @ 2015-10-03 10:01 UTC
On Fri, Oct 02, 2015 at 09:13:22PM +0000, Keith Busch wrote:
> A moment of panic when I saw this, but we're 'ok'. The reset handler calls
> nvme_dev_shutdown() first, and that removes the device from the list,
> so we won't actually see a controller added to the list twice.
Oh. So we get a double list_del there. I thought list debugging would
complain about this. Anyway, let's make the deletion uniform so that
any fixes can be applied to all callers in the same way.
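
As an aside, a short sketch (editorial illustration, not from the patch) of
why the double deletion slips past list debugging: list_del_init() leaves
the entry linked to itself, so repeating it is a harmless no-op, whereas
plain list_del() poisons the pointers and a repeat would be caught:

	/* what the nvme code does: */
	list_del_init(&dev->node);	/* unlink and re-init: node points at itself */
	list_del_init(&dev->node);	/* no-op: "unlinks" a self-linked node */

	/* what would happen with plain list_del(): */
	list_del(&dev->node);		/* next/prev set to LIST_POISON1/2 */
	list_del(&dev->node);		/* CONFIG_DEBUG_LIST splat (or an oops) */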
* [PATCH 2/7] nvme: merge nvme_dev_reset into nvme_reset_failed_dev
From: Christoph Hellwig @ 2015-10-02 17:58 UTC
And give the resulting function a more descriptive name.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/block/nvme-core.c | 12 +++---------
1 file changed, 3 insertions(+), 9 deletions(-)
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index 91f2b20..e8a4218f 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -84,7 +84,6 @@ static wait_queue_head_t nvme_kthread_wait;
static struct class *nvme_class;
-static void nvme_reset_failed_dev(struct work_struct *ws);
static int nvme_reset(struct nvme_dev *dev);
static int nvme_process_cq(struct nvme_queue *nvmeq);
@@ -3057,8 +3056,9 @@ static void nvme_dead_ctrl(struct nvme_dev *dev)
}
}
-static void nvme_dev_reset(struct nvme_dev *dev)
+static void nvme_reset_work(struct work_struct *ws)
{
+ struct nvme_dev *dev = container_of(ws, struct nvme_dev, reset_work);
bool in_probe = work_busy(&dev->probe_work);
nvme_dev_shutdown(dev);
@@ -3078,12 +3078,6 @@ static void nvme_dev_reset(struct nvme_dev *dev)
schedule_work(&dev->probe_work);
}
-static void nvme_reset_failed_dev(struct work_struct *ws)
-{
- struct nvme_dev *dev = container_of(ws, struct nvme_dev, reset_work);
- nvme_dev_reset(dev);
-}
-
static int nvme_reset(struct nvme_dev *dev)
{
int ret = -EBUSY;
@@ -3146,7 +3140,7 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
goto free;
INIT_LIST_HEAD(&dev->namespaces);
- INIT_WORK(&dev->reset_work, nvme_reset_failed_dev);
+ INIT_WORK(&dev->reset_work, nvme_reset_work);
dev->dev = get_device(&pdev->dev);
pci_set_drvdata(pdev, dev);
result = nvme_set_instance(dev);
--
1.9.1
* [PATCH 3/7] nvme: factor reset code into a common helper
From: Christoph Hellwig @ 2015-10-02 17:58 UTC
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/block/nvme-core.c | 48 +++++++++++++++++++++++------------------------
1 file changed, 24 insertions(+), 24 deletions(-)
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index e8a4218f..d69b831 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -84,6 +84,7 @@ static wait_queue_head_t nvme_kthread_wait;
static struct class *nvme_class;
+static int __nvme_reset(struct nvme_dev *dev);
static int nvme_reset(struct nvme_dev *dev);
static int nvme_process_cq(struct nvme_queue *nvmeq);
@@ -1278,17 +1279,13 @@ static void nvme_abort_req(struct request *req)
struct nvme_command cmd;
if (!nvmeq->qid || cmd_rq->aborted) {
- unsigned long flags;
-
- spin_lock_irqsave(&dev_list_lock, flags);
- if (work_busy(&dev->reset_work))
- goto out;
- list_del_init(&dev->node);
- dev_warn(dev->dev, "I/O %d QID %d timeout, reset controller\n",
- req->tag, nvmeq->qid);
- queue_work(nvme_workq, &dev->reset_work);
- out:
- spin_unlock_irqrestore(&dev_list_lock, flags);
+ spin_lock(&dev_list_lock);
+ if (!__nvme_reset(dev)) {
+ dev_warn(dev->dev,
+ "I/O %d QID %d timeout, reset controller\n",
+ req->tag, nvmeq->qid);
+ }
+ spin_unlock(&dev_list_lock);
return;
}
@@ -2083,13 +2080,11 @@ static int nvme_kthread(void *data)
if ((dev->subsystem && (csts & NVME_CSTS_NSSRO)) ||
csts & NVME_CSTS_CFS) {
- if (work_busy(&dev->reset_work))
- continue;
- list_del_init(&dev->node);
- dev_warn(dev->dev,
- "Failed status: %x, reset controller\n",
- readl(&dev->bar->csts));
- queue_work(nvme_workq, &dev->reset_work);
+ if (!__nvme_reset(dev)) {
+ dev_warn(dev->dev,
+ "Failed status: %x, reset controller\n",
+ readl(&dev->bar->csts));
+ }
continue;
}
for (i = 0; i < dev->queue_count; i++) {
@@ -3078,19 +3073,24 @@ static void nvme_reset_work(struct work_struct *ws)
schedule_work(&dev->probe_work);
}
+static int __nvme_reset(struct nvme_dev *dev)
+{
+ if (work_pending(&dev->reset_work))
+ return -EBUSY;
+ list_del_init(&dev->node);
+ queue_work(nvme_workq, &dev->reset_work);
+ return 0;
+}
+
static int nvme_reset(struct nvme_dev *dev)
{
- int ret = -EBUSY;
+ int ret;
if (!dev->admin_q || blk_queue_dying(dev->admin_q))
return -ENODEV;
spin_lock(&dev_list_lock);
- if (!work_pending(&dev->reset_work)) {
- list_del_init(&dev->node);
- queue_work(nvme_workq, &dev->reset_work);
- ret = 0;
- }
+ ret = __nvme_reset(dev);
spin_unlock(&dev_list_lock);
if (!ret) {
--
1.9.1
* [PATCH 4/7] nvme: merge nvme_dev_start, nvme_dev_resume and nvme_async_probe
From: Christoph Hellwig @ 2015-10-02 17:58 UTC
And give the resulting function a sensible name. This keeps all the
error handling in a single place and will allow for further improvements
to it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/block/nvme-core.c | 56 +++++++++++++++++++----------------------------
1 file changed, 22 insertions(+), 34 deletions(-)
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index d69b831..4660bfa 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -87,6 +87,7 @@ static struct class *nvme_class;
static int __nvme_reset(struct nvme_dev *dev);
static int nvme_reset(struct nvme_dev *dev);
static int nvme_process_cq(struct nvme_queue *nvmeq);
+static void nvme_dead_ctrl(struct nvme_dev *dev);
struct async_cmd_info {
struct kthread_work work;
@@ -2950,14 +2951,16 @@ static void nvme_set_irq_hints(struct nvme_dev *dev)
}
}
-static int nvme_dev_start(struct nvme_dev *dev)
+
+static void nvme_probe_work(struct work_struct *work)
{
- int result;
+ struct nvme_dev *dev = container_of(work, struct nvme_dev, probe_work);
bool start_thread = false;
+ int result;
result = nvme_dev_map(dev);
if (result)
- return result;
+ goto out;
result = nvme_configure_admin_queue(dev);
if (result)
@@ -2994,7 +2997,18 @@ static int nvme_dev_start(struct nvme_dev *dev)
nvme_set_irq_hints(dev);
dev->event_limit = 1;
- return result;
+
+ if (dev->online_queues < 2) {
+ dev_warn(dev->dev, "IO queues not created\n");
+ nvme_free_queues(dev, 1);
+ nvme_dev_remove(dev);
+ } else {
+ nvme_unfreeze_queues(dev);
+ nvme_dev_add(dev);
+ nvme_set_irq_hints(dev);
+ }
+
+ return;
free_tags:
nvme_dev_remove_admin(dev);
@@ -3006,7 +3020,9 @@ static int nvme_dev_start(struct nvme_dev *dev)
nvme_dev_list_remove(dev);
unmap:
nvme_dev_unmap(dev);
- return result;
+ out:
+ if (!work_busy(&dev->reset_work))
+ nvme_dead_ctrl(dev);
}
static int nvme_remove_dead_ctrl(void *arg)
@@ -3020,25 +3036,6 @@ static int nvme_remove_dead_ctrl(void *arg)
return 0;
}
-static int nvme_dev_resume(struct nvme_dev *dev)
-{
- int ret;
-
- ret = nvme_dev_start(dev);
- if (ret)
- return ret;
- if (dev->online_queues < 2) {
- dev_warn(dev->dev, "IO queues not created\n");
- nvme_free_queues(dev, 1);
- nvme_dev_remove(dev);
- } else {
- nvme_unfreeze_queues(dev);
- nvme_dev_add(dev);
- nvme_set_irq_hints(dev);
- }
- return 0;
-}
-
static void nvme_dead_ctrl(struct nvme_dev *dev)
{
dev_warn(dev->dev, "Device failed to resume\n");
@@ -3117,7 +3114,6 @@ static ssize_t nvme_sysfs_reset(struct device *dev,
}
static DEVICE_ATTR(reset_controller, S_IWUSR, NULL, nvme_sysfs_reset);
-static void nvme_async_probe(struct work_struct *work);
static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
int node, result = -ENOMEM;
@@ -3168,7 +3164,7 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
INIT_LIST_HEAD(&dev->node);
INIT_WORK(&dev->scan_work, nvme_dev_scan);
- INIT_WORK(&dev->probe_work, nvme_async_probe);
+ INIT_WORK(&dev->probe_work, nvme_probe_work);
schedule_work(&dev->probe_work);
return 0;
@@ -3188,14 +3184,6 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
return result;
}
-static void nvme_async_probe(struct work_struct *work)
-{
- struct nvme_dev *dev = container_of(work, struct nvme_dev, probe_work);
-
- if (nvme_dev_resume(dev) && !work_busy(&dev->reset_work))
- nvme_dead_ctrl(dev);
-}
-
static void nvme_reset_notify(struct pci_dev *pdev, bool prepare)
{
struct nvme_dev *dev = pci_get_drvdata(pdev);
--
1.9.1
* [PATCH 4/7] nvme: merge nvme_dev_start, nvme_dev_resume and nvme_async_probe
From: Keith Busch @ 2015-10-02 21:27 UTC
On Fri, 2 Oct 2015, Christoph Hellwig wrote:
> And give the resulting function a sensible name. This keeps all the
> error handling in a single place and will allow for further improvements
> to it.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
This, and the next patch by extension, conflict with
linux-block/for-linus. We had to move where the irq affinity hint
is set because the scanning now occurs in yet another work queue.

The simplification is great though; feel free to add my review if you
spin another.
Reviewed-by: Keith Busch <keith.busch@intel.com>
* [PATCH 4/7] nvme: merge nvme_dev_start, nvme_dev_resume and nvme_async_probe
From: Christoph Hellwig @ 2015-10-03 10:02 UTC
On Fri, Oct 02, 2015 at 09:27:28PM +0000, Keith Busch wrote:
> On Fri, 2 Oct 2015, Christoph Hellwig wrote:
>> And give the resulting function a sensible name. This keeps all the
>> error handling in a single place and will allow for further improvements
>> to it.
>>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>
> This, and the next patch by extension, conflict with
> linux-block/for-linus. We had to move where the irq affinity hint
> is set because the scanning now occurs in yet another work queue.
>
> The simplification is great though; feel free to add my review if you
> spin another.
Thanks - I'll rebase everything on top of Jens' tree.
* [PATCH 5/7] nvme: remove duplicate call to nvme_set_irq_hints
From: Christoph Hellwig @ 2015-10-02 17:58 UTC
We already call this function just a few lines earlier in the same path.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/block/nvme-core.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index 4660bfa..c35ead3 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -3005,7 +3005,6 @@ static void nvme_probe_work(struct work_struct *work)
} else {
nvme_unfreeze_queues(dev);
nvme_dev_add(dev);
- nvme_set_irq_hints(dev);
}
return;
--
1.9.1
* [PATCH 6/7] nvme: properly handle partially initialized queues in nvme_create_io_queues
From: Christoph Hellwig @ 2015-10-02 17:58 UTC
This avoids having to clean up later in a seemingly unrelated place.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/block/nvme-core.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index c35ead3..48688d6 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -2191,6 +2191,13 @@ static void nvme_alloc_ns(struct nvme_dev *dev, unsigned nsid)
kfree(ns);
}
+/*
+ * Create I/O queues. Failing to create an I/O queue is not an issue;
+ * we can continue with fewer than the desired number of queues, and
+ * even a controller without I/O queues can still be used to issue
+ * admin commands. This might be useful to upgrade a buggy firmware,
+ * for example.
+ */
static void nvme_create_io_queues(struct nvme_dev *dev)
{
unsigned i;
@@ -2200,8 +2207,10 @@ static void nvme_create_io_queues(struct nvme_dev *dev)
break;
for (i = dev->online_queues; i <= dev->queue_count - 1; i++)
- if (nvme_create_queue(dev->queues[i], i))
+ if (nvme_create_queue(dev->queues[i], i)) {
+ nvme_free_queues(dev, i);
break;
+ }
}
static int set_queue_count(struct nvme_dev *dev, int count)
@@ -2998,9 +3007,12 @@ static void nvme_probe_work(struct work_struct *work)
dev->event_limit = 1;
+ /*
+ * Keep the controller around but remove all namespaces if we don't have
+ * any working I/O queue.
+ */
if (dev->online_queues < 2) {
dev_warn(dev->dev, "IO queues not created\n");
- nvme_free_queues(dev, 1);
nvme_dev_remove(dev);
} else {
nvme_unfreeze_queues(dev);
--
1.9.1
* [PATCH 6/7] nvme: properly handle partially initialized queues in nvme_create_io_queues
From: Keith Busch @ 2015-10-02 21:36 UTC
On Fri, 2 Oct 2015, Christoph Hellwig wrote:
> This avoids having to clean up later in a seemingly unrelated place.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Somewhat related to this area of code, but not a new concern: I think we're
in trouble if the controller resets after we have an active request_queue
but then allows fewer IO queues than the blk-mq tagset was originally
configured with. On the other hand, we don't have a way to provide
additional IO queues to the tagset after a reset either, so I think we
need a new blk-mq API to notify it of changes to the number of h/w contexts.
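
For illustration, the shape such an API might take (hypothetical sketch;
neither the name nor the signature existed in blk-mq at this point, though
mainline later grew a helper along these lines):

/* hypothetical prototype, not an existing blk-mq interface in 4.2: */
void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
				int nr_hw_queues);

/* after a reset that came back with a different I/O queue count
 * (online_queues counts the admin queue, hence the -1): */
	if (dev->online_queues - 1 != dev->tagset.nr_hw_queues)
		blk_mq_update_nr_hw_queues(&dev->tagset,
					   dev->online_queues - 1);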
This part looks good.
Reviewed-by: Keith Busch <keith.busch@intel.com>
* [PATCH 6/7] nvme: properly handle partially initialized queues in nvme_create_io_queues
From: Christoph Hellwig @ 2015-10-03 10:03 UTC
On Fri, Oct 02, 2015 at 09:36:26PM +0000, Keith Busch wrote:
> On Fri, 2 Oct 2015, Christoph Hellwig wrote:
>> This avoids having to clean up later in a seemingly unrelated place.
>>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>
> Somewhat related to this area of code, but not a new concern: I think we're
> in trouble if the controller resets after we have an active request_queue
> but then allows fewer IO queues than the blk-mq tagset was originally
> configured with. On the other hand, we don't have a way to provide
> additional IO queues to the tagset after a reset either, so I think we
> need a new blk-mq API to notify it of changes to the number of h/w contexts.
Yes, that's indeed an issue. I think we need to mark the controller dead
in this case until we have such an API.
* [PATCH 7/7] nvme: remove failed controllers directly from probe/reset context
From: Christoph Hellwig @ 2015-10-02 17:58 UTC
Now that all probing and resets happen from workqueue context, we can remove
the controller directly without any risk of deadlocking.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/block/nvme-core.c | 37 +++++++++++--------------------------
1 file changed, 11 insertions(+), 26 deletions(-)
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index 48688d6..39c40bb 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -87,7 +87,6 @@ static struct class *nvme_class;
static int __nvme_reset(struct nvme_dev *dev);
static int nvme_reset(struct nvme_dev *dev);
static int nvme_process_cq(struct nvme_queue *nvmeq);
-static void nvme_dead_ctrl(struct nvme_dev *dev);
struct async_cmd_info {
struct kthread_work work;
@@ -2960,6 +2959,15 @@ static void nvme_set_irq_hints(struct nvme_dev *dev)
}
}
+static void nvme_remove_dead_ctrl(struct nvme_dev *dev)
+{
+ struct pci_dev *pdev = to_pci_dev(dev->dev);
+
+ dev_warn(dev->dev, "Device failed to start.\n");
+
+ if (pci_get_drvdata(pdev))
+ pci_stop_and_remove_bus_device_locked(pdev);
+}
static void nvme_probe_work(struct work_struct *work)
{
@@ -3033,30 +3041,7 @@ static void nvme_probe_work(struct work_struct *work)
nvme_dev_unmap(dev);
out:
if (!work_busy(&dev->reset_work))
- nvme_dead_ctrl(dev);
-}
-
-static int nvme_remove_dead_ctrl(void *arg)
-{
- struct nvme_dev *dev = (struct nvme_dev *)arg;
- struct pci_dev *pdev = to_pci_dev(dev->dev);
-
- if (pci_get_drvdata(pdev))
- pci_stop_and_remove_bus_device_locked(pdev);
- kref_put(&dev->kref, nvme_free_dev);
- return 0;
-}
-
-static void nvme_dead_ctrl(struct nvme_dev *dev)
-{
- dev_warn(dev->dev, "Device failed to resume\n");
- kref_get(&dev->kref);
- if (IS_ERR(kthread_run(nvme_remove_dead_ctrl, dev, "nvme%d",
- dev->instance))) {
- dev_err(dev->dev,
- "Failed to start controller remove task\n");
- kref_put(&dev->kref, nvme_free_dev);
- }
+ nvme_remove_dead_ctrl(dev);
}
static void nvme_reset_work(struct work_struct *ws)
@@ -3073,7 +3058,7 @@ static void nvme_reset_work(struct work_struct *ws)
/* Fail this device if reset occurred during probe to avoid
* infinite initialization loops. */
if (in_probe) {
- nvme_dead_ctrl(dev);
+ nvme_remove_dead_ctrl(dev);
return;
}
/* Schedule device resume asynchronously so the reset work is available
--
1.9.1
* [PATCH 7/7] nvme: remove failed controllers directly from probe/reset context
From: Keith Busch @ 2015-10-02 21:54 UTC
On Fri, 2 Oct 2015, Christoph Hellwig wrote:
> Now that all probing and resets happen from workqueue context we can remove
> the controller directly without any risk of deadlocking.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
I don't know about this one. If you call pci_stop_and_remove_bus_device(),
that will get to the driver's ->remove() callback in the same context. The
first thing this driver does is flush the probe and reset work, and
I don't think a work queue can flush itself.
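
Sketched as a call chain (simplified from the 4.2-era code paths involved),
the deadlock would look like:

/*
 * nvme_reset_work()				<- running on nvme_workq
 *   -> nvme_remove_dead_ctrl()
 *     -> pci_stop_and_remove_bus_device_locked()
 *       -> nvme_remove()			<- same thread, same work item
 *         -> flush_work(&dev->reset_work)	<- waits for the work item
 *						   executing this very chain
 */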
* [PATCH 7/7] nvme: remove failed controllers directly from probe/reset context
From: Christoph Hellwig @ 2015-10-03 10:03 UTC
On Fri, Oct 02, 2015 at 09:54:20PM +0000, Keith Busch wrote:
> On Fri, 2 Oct 2015, Christoph Hellwig wrote:
>> Now that all probing and resets happen from workqueue context we can remove
>> the controller directly without any risk of deadlocking.
>>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>
> I don't know about this one. If you call pci_stop_and_remove_bus_device(),
> that will get to the driver's ->remove() callback in the same context. The
> first thing this driver does is flush the probe and reset work, and
> I don't think a work queue can flush itself.
True. I actually had another patch that ensures shutdown always runs in
nvme_wq when I developed this patch, but I dropped it due to conflicts.
I'll drop this patch for now.