[PATCHSET 0/3] Split queue lock into submission/completion locks

linux-nvme.lists.infradead.org archive mirror

* [PATCHSET 0/3] Split queue lock into submission/completion locks
@ 2018-05-17 15:02 Jens Axboe
  2018-05-17 15:02 ` [PATCH 1/3] nvme: remove ->cq_vector == -1 check in nvme_queue_rq() Jens Axboe
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Jens Axboe @ 2018-05-17 15:02 UTC (permalink / raw)


This series is on top of the previous work from yesterday, including
Christoph's patch. It splits nvmeq->q_lock into two locks:

1) ->sq_lock, that protects the submission side of things
2) ->cq_lock, that protects the completion side

This also means that we can drop the IRQ safe nature of the submission
side locking.

 drivers/nvme/host/pci.c | 41 +++++++++++++++++++----------------------
 1 file changed, 19 insertions(+), 22 deletions(-)
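
As a rough userspace illustration of the split described above, the sketch below models a queue with one lock for the submission side and one for the completion side, with the completion lock pushed onto its own cache line. The toy_* names, the 64-byte cache line size, and the C11 atomic_flag spinlock are assumptions made for this example only; the driver uses spinlock_t and ____cacheline_aligned_in_smp.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define CACHELINE 64	/* assumed cache line size, for illustration */

/* Minimal test-and-set spinlock standing in for the kernel's spinlock_t. */
typedef struct { atomic_flag held; } toy_spinlock;

static void toy_lock_init(toy_spinlock *l)
{
	atomic_flag_clear(&l->held);
}

static void toy_lock(toy_spinlock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		;	/* spin until the holder releases the lock */
}

static void toy_unlock(toy_spinlock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical stand-in for struct nvme_queue, reduced to the two sides. */
struct toy_queue {
	toy_spinlock sq_lock;	/* protects sq_tail */
	unsigned sq_tail;
	/* Completion state starts on its own cache line. */
	_Alignas(CACHELINE) toy_spinlock cq_lock;	/* protects cq_head */
	unsigned cq_head;
};

static void toy_queue_init(struct toy_queue *q)
{
	toy_lock_init(&q->sq_lock);
	toy_lock_init(&q->cq_lock);
	q->sq_tail = 0;
	q->cq_head = 0;
}

/* Submission path: only sq_lock is taken. */
static unsigned toy_submit(struct toy_queue *q)
{
	unsigned tail;

	toy_lock(&q->sq_lock);
	tail = ++q->sq_tail;
	toy_unlock(&q->sq_lock);
	return tail;
}

/* Completion path: only cq_lock is taken. */
static unsigned toy_complete(struct toy_queue *q)
{
	unsigned head;

	toy_lock(&q->cq_lock);
	head = ++q->cq_head;
	toy_unlock(&q->cq_lock);
	return head;
}
```

Because the two paths never take each other's lock, a submitter and a completer on different CPUs do not contend in this model, and keeping the locks on separate cache lines is meant to avoid false sharing between them.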


* [PATCH 1/3] nvme: remove ->cq_vector == -1 check in nvme_queue_rq()
  2018-05-17 15:02 [PATCHSET 0/3] Split queue lock into submission/completion locks Jens Axboe
@ 2018-05-17 15:02 ` Jens Axboe
  2018-05-17 15:30   ` Christoph Hellwig
  2018-05-17 15:32   ` Keith Busch
  2018-05-17 15:02 ` [PATCH 2/3] nvme: split the nvme queue lock into submission and completion locks Jens Axboe
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 16+ messages in thread
From: Jens Axboe @ 2018-05-17 15:02 UTC (permalink / raw)


We only clear this after calling nvme_suspend_queue(), which must
have called nvme_stop_queues() first. The latter ensures that no
more IO is queued, or in progress of being queued, against this
hardware queue.

Signed-off-by: Jens Axboe <axboe at kernel.dk>
---
 drivers/nvme/host/pci.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 5277afc6e7b5..4ed3583ad3bc 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -887,11 +887,6 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 	blk_mq_start_request(req);
 
 	spin_lock_irq(&nvmeq->q_lock);
-	if (unlikely(nvmeq->cq_vector < 0)) {
-		ret = BLK_STS_IOERR;
-		spin_unlock_irq(&nvmeq->q_lock);
-		goto out_cleanup_iod;
-	}
 	__nvme_submit_cmd(nvmeq, &cmnd);
 	spin_unlock_irq(&nvmeq->q_lock);
 	return BLK_STS_OK;
-- 
2.7.4


* [PATCH 2/3] nvme: split the nvme queue lock into submission and completion locks
  2018-05-17 15:02 [PATCHSET 0/3] Split queue lock into submission/completion locks Jens Axboe
  2018-05-17 15:02 ` [PATCH 1/3] nvme: remove ->cq_vector == -1 check in nvme_queue_rq() Jens Axboe
@ 2018-05-17 15:02 ` Jens Axboe
  2018-05-17 15:33   ` Christoph Hellwig
  2018-05-17 15:02 ` [PATCH 3/3] nvme: drop IRQ disabling on submission queue lock Jens Axboe
  2018-05-17 15:22 ` [PATCHSET 0/3] Split queue lock into submission/completion locks Jens Axboe
  3 siblings, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2018-05-17 15:02 UTC (permalink / raw)


This is now feasible. We protect the submission queue ring with
->sq_lock, and the completion side with ->cq_lock.

Signed-off-by: Jens Axboe <axboe at kernel.dk>
---
 drivers/nvme/host/pci.c | 36 +++++++++++++++++++-----------------
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 4ed3583ad3bc..ae982edfa4f3 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -146,9 +146,10 @@ static inline struct nvme_dev *to_nvme_dev(struct nvme_ctrl *ctrl)
 struct nvme_queue {
 	struct device *q_dmadev;
 	struct nvme_dev *dev;
-	spinlock_t q_lock;
+	spinlock_t sq_lock;
 	struct nvme_command *sq_cmds;
 	struct nvme_command __iomem *sq_cmds_io;
+	spinlock_t cq_lock ____cacheline_aligned_in_smp;
 	volatile struct nvme_completion *cqes;
 	struct blk_mq_tags **tags;
 	dma_addr_t sq_dma_addr;
@@ -886,9 +887,9 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 
 	blk_mq_start_request(req);
 
-	spin_lock_irq(&nvmeq->q_lock);
+	spin_lock_irq(&nvmeq->sq_lock);
 	__nvme_submit_cmd(nvmeq, &cmnd);
-	spin_unlock_irq(&nvmeq->q_lock);
+	spin_unlock_irq(&nvmeq->sq_lock);
 	return BLK_STS_OK;
 out_cleanup_iod:
 	nvme_free_iod(dev, req);
@@ -992,9 +993,9 @@ static irqreturn_t nvme_irq(int irq, void *data)
 	struct nvme_queue *nvmeq = data;
 	u16 start, end;
 
-	spin_lock(&nvmeq->q_lock);
+	spin_lock(&nvmeq->cq_lock);
 	nvme_process_cq(nvmeq, &start, &end);
-	spin_unlock(&nvmeq->q_lock);
+	spin_unlock(&nvmeq->cq_lock);
 
 	return nvme_complete_cqes(nvmeq, start, end);
 }
@@ -1014,9 +1015,9 @@ static int __nvme_poll(struct nvme_queue *nvmeq, unsigned int tag)
 	if (!nvme_cqe_valid(nvmeq, nvmeq->cq_head, nvmeq->cq_phase))
 		return 0;
 
-	spin_lock_irq(&nvmeq->q_lock);
+	spin_lock_irq(&nvmeq->cq_lock);
 	nvme_process_cq(nvmeq, &start, &end);
-	spin_unlock_irq(&nvmeq->q_lock);
+	spin_unlock_irq(&nvmeq->cq_lock);
 
 	while (start != end) {
 		if (nvme_handle_cqe(nvmeq, start, tag))
@@ -1045,9 +1046,9 @@ static void nvme_pci_submit_async_event(struct nvme_ctrl *ctrl)
 	c.common.opcode = nvme_admin_async_event;
 	c.common.command_id = NVME_AQ_BLK_MQ_DEPTH;
 
-	spin_lock_irq(&nvmeq->q_lock);
+	spin_lock_irq(&nvmeq->sq_lock);
 	__nvme_submit_cmd(nvmeq, &c);
-	spin_unlock_irq(&nvmeq->q_lock);
+	spin_unlock_irq(&nvmeq->sq_lock);
 }
 
 static int adapter_delete_queue(struct nvme_dev *dev, u8 opcode, u16 id)
@@ -1313,15 +1314,15 @@ static int nvme_suspend_queue(struct nvme_queue *nvmeq)
 {
 	int vector;
 
-	spin_lock_irq(&nvmeq->q_lock);
+	spin_lock_irq(&nvmeq->cq_lock);
 	if (nvmeq->cq_vector == -1) {
-		spin_unlock_irq(&nvmeq->q_lock);
+		spin_unlock_irq(&nvmeq->cq_lock);
 		return 1;
 	}
 	vector = nvmeq->cq_vector;
 	nvmeq->dev->online_queues--;
 	nvmeq->cq_vector = -1;
-	spin_unlock_irq(&nvmeq->q_lock);
+	spin_unlock_irq(&nvmeq->cq_lock);
 
 	if (!nvmeq->qid && nvmeq->dev->ctrl.admin_q)
 		blk_mq_quiesce_queue(nvmeq->dev->ctrl.admin_q);
@@ -1398,7 +1399,8 @@ static int nvme_alloc_queue(struct nvme_dev *dev, int qid, int depth)
 
 	nvmeq->q_dmadev = dev->dev;
 	nvmeq->dev = dev;
-	spin_lock_init(&nvmeq->q_lock);
+	spin_lock_init(&nvmeq->sq_lock);
+	spin_lock_init(&nvmeq->cq_lock);
 	nvmeq->cq_head = 0;
 	nvmeq->cq_phase = 1;
 	nvmeq->q_db = &dev->dbs[qid * 2 * dev->db_stride];
@@ -1434,7 +1436,7 @@ static void nvme_init_queue(struct nvme_queue *nvmeq, u16 qid)
 {
 	struct nvme_dev *dev = nvmeq->dev;
 
-	spin_lock_irq(&nvmeq->q_lock);
+	spin_lock_irq(&nvmeq->cq_lock);
 	nvmeq->sq_tail = 0;
 	nvmeq->cq_head = 0;
 	nvmeq->cq_phase = 1;
@@ -1442,7 +1444,7 @@ static void nvme_init_queue(struct nvme_queue *nvmeq, u16 qid)
 	memset((void *)nvmeq->cqes, 0, CQ_SIZE(nvmeq->q_depth));
 	nvme_dbbuf_init(dev, nvmeq, qid);
 	dev->online_queues++;
-	spin_unlock_irq(&nvmeq->q_lock);
+	spin_unlock_irq(&nvmeq->cq_lock);
 }
 
 static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
@@ -1997,10 +1999,10 @@ static void nvme_del_cq_end(struct request *req, blk_status_t error)
 		 * and the I/O queue q_lock should always
 		 * nest inside the AQ one.
 		 */
-		spin_lock_irqsave_nested(&nvmeq->q_lock, flags,
+		spin_lock_irqsave_nested(&nvmeq->cq_lock, flags,
 					SINGLE_DEPTH_NESTING);
 		nvme_process_cq(nvmeq, &start, &end);
-		spin_unlock_irqrestore(&nvmeq->q_lock, flags);
+		spin_unlock_irqrestore(&nvmeq->cq_lock, flags);
 
 		nvme_complete_cqes(nvmeq, start, end);
 	}
-- 
2.7.4
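
The nvme_irq() hunk in the patch above keeps the pattern of scanning the completion queue under the lock and finishing the requests after the lock is dropped. A minimal userspace sketch of that pattern follows, assuming a phase-bit ring; the toy_* names, the ring depth, and the atomic_flag lock are hypothetical and are not the driver's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define Q_DEPTH 8	/* assumed ring depth for the example */

struct toy_cqe {
	uint16_t command_id;
	uint16_t phase;		/* flipped by the device on each pass */
};

struct toy_cq {
	atomic_flag cq_lock;	/* stands in for ->cq_lock */
	struct toy_cqe cqes[Q_DEPTH];
	uint16_t cq_head;
	uint16_t cq_phase;	/* phase value that marks a new entry */
};

static void toy_cq_init(struct toy_cq *cq)
{
	atomic_flag_clear(&cq->cq_lock);
	for (int i = 0; i < Q_DEPTH; i++) {
		cq->cqes[i].command_id = 0;
		cq->cqes[i].phase = 0;
	}
	cq->cq_head = 0;
	cq->cq_phase = 1;
}

/* Simulates the device writing a completion entry into a slot. */
static void toy_device_post(struct toy_cq *cq, uint16_t slot,
			    uint16_t id, uint16_t phase)
{
	assert(slot < Q_DEPTH);
	cq->cqes[slot].command_id = id;
	cq->cqes[slot].phase = phase;
}

/* Under the lock: only advance the head and report the range [start, end). */
static void toy_process_cq(struct toy_cq *cq, uint16_t *start, uint16_t *end)
{
	*start = cq->cq_head;
	while (cq->cqes[cq->cq_head].phase == cq->cq_phase) {
		if (++cq->cq_head == Q_DEPTH) {
			cq->cq_head = 0;
			cq->cq_phase = !cq->cq_phase;
		}
	}
	*end = cq->cq_head;
}

/* Interrupt-handler shape: short critical section, completion work outside. */
static unsigned toy_irq(struct toy_cq *cq)
{
	uint16_t start, end;
	unsigned handled = 0;

	while (atomic_flag_test_and_set_explicit(&cq->cq_lock,
						 memory_order_acquire))
		;	/* spin */
	toy_process_cq(cq, &start, &end);
	atomic_flag_clear_explicit(&cq->cq_lock, memory_order_release);

	/* The per-request completion callbacks would run here, unlocked. */
	while (start != end) {
		handled++;
		if (++start == Q_DEPTH)
			start = 0;
	}
	return handled;
}
```

The point of the shape is that the lock covers only the head and phase bookkeeping, so the time spent completing requests does not extend the critical section.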


* [PATCH 3/3] nvme: drop IRQ disabling on submission queue lock
  2018-05-17 15:02 [PATCHSET 0/3] Split queue lock into submission/completion locks Jens Axboe
  2018-05-17 15:02 ` [PATCH 1/3] nvme: remove ->cq_vector == -1 check in nvme_queue_rq() Jens Axboe
  2018-05-17 15:02 ` [PATCH 2/3] nvme: split the nvme queue lock into submission and completion locks Jens Axboe
@ 2018-05-17 15:02 ` Jens Axboe
  2018-05-17 15:33   ` Christoph Hellwig
  2018-05-17 15:22 ` [PATCHSET 0/3] Split queue lock into submission/completion locks Jens Axboe
  3 siblings, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2018-05-17 15:02 UTC (permalink / raw)


Since we aren't sharing the lock for completions now, we don't
have to make it IRQ safe.

Signed-off-by: Jens Axboe <axboe at kernel.dk>
---
 drivers/nvme/host/pci.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index ae982edfa4f3..31b8e1808b3c 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -887,9 +887,9 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 
 	blk_mq_start_request(req);
 
-	spin_lock_irq(&nvmeq->sq_lock);
+	spin_lock(&nvmeq->sq_lock);
 	__nvme_submit_cmd(nvmeq, &cmnd);
-	spin_unlock_irq(&nvmeq->sq_lock);
+	spin_unlock(&nvmeq->sq_lock);
 	return BLK_STS_OK;
 out_cleanup_iod:
 	nvme_free_iod(dev, req);
@@ -1046,9 +1046,9 @@ static void nvme_pci_submit_async_event(struct nvme_ctrl *ctrl)
 	c.common.opcode = nvme_admin_async_event;
 	c.common.command_id = NVME_AQ_BLK_MQ_DEPTH;
 
-	spin_lock_irq(&nvmeq->sq_lock);
+	spin_lock(&nvmeq->sq_lock);
 	__nvme_submit_cmd(nvmeq, &c);
-	spin_unlock_irq(&nvmeq->sq_lock);
+	spin_unlock(&nvmeq->sq_lock);
 }
 
 static int adapter_delete_queue(struct nvme_dev *dev, u8 opcode, u16 id)
-- 
2.7.4
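
To make the reasoning in the patch above concrete: a lock that an interrupt handler also takes must be acquired with interrupts disabled, otherwise the handler can fire on the CPU that already holds the lock and spin forever. The sketch below models that with a trylock in place of a real interrupt. All names are hypothetical, and a failed trylock here stands for what would be a self-deadlock in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct toy_locks {
	atomic_flag sq_lock;
	atomic_flag cq_lock;
};

static void toy_locks_init(struct toy_locks *l)
{
	atomic_flag_clear(&l->sq_lock);
	atomic_flag_clear(&l->cq_lock);
}

/* Returns true if the lock was free and is now held by the caller. */
static bool toy_trylock(atomic_flag *l)
{
	return !atomic_flag_test_and_set_explicit(l, memory_order_acquire);
}

static void toy_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/*
 * Models an interrupt arriving on the same CPU while the submission
 * path holds submit_lock with interrupts left enabled. irq_lock is the
 * lock the handler needs. Returns true if the handler could take it,
 * false if it would have spun on a lock its own CPU holds.
 */
static bool toy_irq_during_submit(atomic_flag *submit_lock,
				  atomic_flag *irq_lock)
{
	bool held = toy_trylock(submit_lock);
	bool handler_ran;

	assert(held);
	handler_ran = toy_trylock(irq_lock);	/* the "interrupt" */
	if (handler_ran)
		toy_unlock(irq_lock);
	toy_unlock(submit_lock);
	return handler_ran;
}
```

Calling toy_irq_during_submit() with sq_lock and cq_lock models the split layout, where the handler proceeds; passing the same lock for both arguments models the old shared q_lock, where it cannot.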


* [PATCHSET 0/3] Split queue lock into submission/completion locks
  2018-05-17 15:02 [PATCHSET 0/3] Split queue lock into submission/completion locks Jens Axboe
                   ` (2 preceding siblings ...)
  2018-05-17 15:02 ` [PATCH 3/3] nvme: drop IRQ disabling on submission queue lock Jens Axboe
@ 2018-05-17 15:22 ` Jens Axboe
  3 siblings, 0 replies; 16+ messages in thread
From: Jens Axboe @ 2018-05-17 15:22 UTC (permalink / raw)


On 5/17/18 9:02 AM, Jens Axboe wrote:
> This series is on top of the previous work from yesterday, including
> Christoph's patch. It splits nvmeq->q_lock into two locks:
> 
> 1) ->sq_lock, that protects the submission side of things
> 2) ->cq_lock, that protects the completion side
> 
> This also means that we can drop the IRQ safe nature of the submission
> side locking.

Series also here:

http://git.kernel.dk/cgit/linux-block/log/?h=nvme-4.18

NOT on top of Christoph's change, as it seems to regress performance for me.

Quick test on polling on an xpoint device with 4 threads shows a nice
improvement for me:

Current master:
   read: IOPS=519k, BW=253MiB/s (266MB/s)(2532MiB/10001msec)
    clat (usec): min=5, max=630, avg= 7.38, stdev= 3.16
     lat (usec): min=5, max=630, avg= 7.41, stdev= 3.16
    clat percentiles (nsec):
     |  1.00th=[ 6560],  5.00th=[ 6688], 10.00th=[ 6688], 20.00th=[ 6752],
     | 30.00th=[ 6816], 40.00th=[ 6880], 50.00th=[ 6944], 60.00th=[ 7072],
     | 70.00th=[ 7136], 80.00th=[ 7328], 90.00th=[ 7712], 95.00th=[ 8256],
     | 99.00th=[18048], 99.50th=[19072], 99.90th=[36096], 99.95th=[36608],
     | 99.99th=[41216]

and with this series:
   read: IOPS=507k, BW=248MiB/s (260MB/s)(2476MiB/10001msec)
    clat (usec): min=6, max=640, avg= 7.56, stdev= 3.19
     lat (usec): min=6, max=640, avg= 7.59, stdev= 3.19
    clat percentiles (nsec):
     |  1.00th=[ 6688],  5.00th=[ 6816], 10.00th=[ 6880], 20.00th=[ 6944],
     | 30.00th=[ 7008], 40.00th=[ 7072], 50.00th=[ 7136], 60.00th=[ 7200],
     | 70.00th=[ 7328], 80.00th=[ 7520], 90.00th=[ 7904], 95.00th=[ 8384],
     | 99.00th=[18048], 99.50th=[18816], 99.90th=[36096], 99.95th=[36608],
     | 99.99th=[42752]

For polling, looking at CPU utilization usually means seeing if we decrease
system time and shift that to app time instead. For master, it's
5.49/94.52% usr/sys, and with the patchset it's 5.71/94.27%.
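
For reading the fio summaries above, the IOPS and bandwidth figures are related by the block size. The helper below is a small sketch of that arithmetic; the 512-byte block size is an assumption inferred from the reported numbers, not something stated in the test description, and the function name is made up for the example.

```c
#include <assert.h>

/* Convert an IOPS figure to MiB/s for a given block size in bytes. */
static double iops_to_mib_per_sec(double iops, double block_bytes)
{
	assert(block_bytes > 0.0);
	return iops * block_bytes / (1024.0 * 1024.0);
}
```

With 512-byte reads, 519k IOPS works out to roughly 253 MiB/s and 507k IOPS to roughly 248 MiB/s, which lines up with the BW values in the two runs.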

-- 
Jens Axboe


* [PATCH 1/3] nvme: remove ->cq_vector == -1 check in nvme_queue_rq()
  2018-05-17 15:02 ` [PATCH 1/3] nvme: remove ->cq_vector == -1 check in nvme_queue_rq() Jens Axboe
@ 2018-05-17 15:30   ` Christoph Hellwig
  2018-05-17 15:32   ` Keith Busch
  1 sibling, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2018-05-17 15:30 UTC (permalink / raw)


Looks good.  I somehow remember seeing an equivalent patch on the
list a while ago, though..

Reviewed-by: Christoph Hellwig <hch at lst.de>


* [PATCH 1/3] nvme: remove ->cq_vector == -1 check in nvme_queue_rq()
  2018-05-17 15:02 ` [PATCH 1/3] nvme: remove ->cq_vector == -1 check in nvme_queue_rq() Jens Axboe
  2018-05-17 15:30   ` Christoph Hellwig
@ 2018-05-17 15:32   ` Keith Busch
  2018-05-17 15:36     ` Jens Axboe
  1 sibling, 1 reply; 16+ messages in thread
From: Keith Busch @ 2018-05-17 15:32 UTC (permalink / raw)


On Thu, May 17, 2018 at 09:02:15AM -0600, Jens Axboe wrote:
> We only clear this after calling nvme_suspend_queue(), which must
> have called nvme_stop_queues() first. The latter ensures that no
> more IO is queued, or in progress of being queued, against this
> hardware queue.
> 
> Signed-off-by: Jens Axboe <axboe at kernel.dk>
> ---
>  drivers/nvme/host/pci.c | 5 -----
>  1 file changed, 5 deletions(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 5277afc6e7b5..4ed3583ad3bc 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -887,11 +887,6 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
>  	blk_mq_start_request(req);
>  
>  	spin_lock_irq(&nvmeq->q_lock);
> -	if (unlikely(nvmeq->cq_vector < 0)) {
> -		ret = BLK_STS_IOERR;
> -		spin_unlock_irq(&nvmeq->q_lock);
> -		goto out_cleanup_iod;
> -	}
>  	__nvme_submit_cmd(nvmeq, &cmnd);
>  	spin_unlock_irq(&nvmeq->q_lock);
>  	return BLK_STS_OK;

Unfortunately, we are still relying on this to drain entered requests on
a dying queue: we restart them to flush out requests so that they complete
with an error. :(

There's probably a better way to handle this.


* [PATCH 2/3] nvme: split the nvme queue lock into submission and completion locks
  2018-05-17 15:02 ` [PATCH 2/3] nvme: split the nvme queue lock into submission and completion locks Jens Axboe
@ 2018-05-17 15:33   ` Christoph Hellwig
  2018-05-17 15:33     ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2018-05-17 15:33 UTC (permalink / raw)


> -	spin_lock_irq(&nvmeq->q_lock);
> +	spin_lock_irq(&nvmeq->sq_lock);
>  	__nvme_submit_cmd(nvmeq, &cmnd);
> -	spin_unlock_irq(&nvmeq->q_lock);
> +	spin_unlock_irq(&nvmeq->sq_lock);

I don't think we need to disable irqs here anymore.


* [PATCH 3/3] nvme: drop IRQ disabling on submission queue lock
  2018-05-17 15:02 ` [PATCH 3/3] nvme: drop IRQ disabling on submission queue lock Jens Axboe
@ 2018-05-17 15:33   ` Christoph Hellwig
  0 siblings, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2018-05-17 15:33 UTC (permalink / raw)


On Thu, May 17, 2018 at 09:02:17AM -0600, Jens Axboe wrote:
> Since we aren't sharing the lock for completions now, we don't
> have to make it IRQ safe.

Ah, you did this as a separate patch..

Reviewed-by: Christoph Hellwig <hch at lst.de>

for both.


* [PATCH 2/3] nvme: split the nvme queue lock into submission and completion locks
  2018-05-17 15:33   ` Christoph Hellwig
@ 2018-05-17 15:33     ` Jens Axboe
  0 siblings, 0 replies; 16+ messages in thread
From: Jens Axboe @ 2018-05-17 15:33 UTC (permalink / raw)


On 5/17/18 9:33 AM, Christoph Hellwig wrote:
>> -	spin_lock_irq(&nvmeq->q_lock);
>> +	spin_lock_irq(&nvmeq->sq_lock);
>>  	__nvme_submit_cmd(nvmeq, &cmnd);
>> -	spin_unlock_irq(&nvmeq->q_lock);
>> +	spin_unlock_irq(&nvmeq->sq_lock);
> 
> I don't think we need to disable irqs here anymore.

See patch #3 :-)

I didn't want to do that in the same order.

-- 
Jens Axboe


* [PATCH 1/3] nvme: remove ->cq_vector == -1 check in nvme_queue_rq()
  2018-05-17 15:32   ` Keith Busch
@ 2018-05-17 15:36     ` Jens Axboe
  2018-05-17 15:42       ` Jens Axboe
  2018-05-17 15:44       ` Keith Busch
  0 siblings, 2 replies; 16+ messages in thread
From: Jens Axboe @ 2018-05-17 15:36 UTC (permalink / raw)


On 5/17/18 9:32 AM, Keith Busch wrote:
> On Thu, May 17, 2018 at 09:02:15AM -0600, Jens Axboe wrote:
>> We only clear this after calling nvme_suspend_queue(), which must
>> have called nvme_stop_queues() first. The latter ensures that no
>> more IO is queued, or in progress of being queued, against this
>> hardware queue.
>>
>> Signed-off-by: Jens Axboe <axboe at kernel.dk>
>> ---
>>  drivers/nvme/host/pci.c | 5 -----
>>  1 file changed, 5 deletions(-)
>>
>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>> index 5277afc6e7b5..4ed3583ad3bc 100644
>> --- a/drivers/nvme/host/pci.c
>> +++ b/drivers/nvme/host/pci.c
>> @@ -887,11 +887,6 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
>>  	blk_mq_start_request(req);
>>  
>>  	spin_lock_irq(&nvmeq->q_lock);
>> -	if (unlikely(nvmeq->cq_vector < 0)) {
>> -		ret = BLK_STS_IOERR;
>> -		spin_unlock_irq(&nvmeq->q_lock);
>> -		goto out_cleanup_iod;
>> -	}
>>  	__nvme_submit_cmd(nvmeq, &cmnd);
>>  	spin_unlock_irq(&nvmeq->q_lock);
>>  	return BLK_STS_OK;
> 
> Unfortunately, we are still relying on this to drain entered requests on
> a dying queue: we restart them to flush out requests so that they complete
> with an error. :(
> 
> There's probably a better way to handle this.

I'd suggest we just move it to the top and get it out of the way instead,
and ensure that setting ->cq_vector to -1 includes an mb(). Then we can just
make it:

if (unlikely(nvmeq->cq_vector < 0))
        return BLK_STS_IOERR;
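
A userspace sketch of this idea is below: the vector is read without any lock at the top of the submission function, and the store of -1 is made a fully ordered store so it is published before teardown continues. This is only a model under stated assumptions. The toy_* names are hypothetical, a C11 seq_cst store stands in for the store plus mb(), and it is not the commit referenced later in the thread.

```c
#include <assert.h>
#include <stdatomic.h>

enum toy_status { TOY_STS_OK = 0, TOY_STS_IOERR = 1 };

struct toy_queue {
	atomic_int cq_vector;	/* -1 once the queue has been suspended */
	unsigned submitted;
};

static void toy_queue_start(struct toy_queue *q, int vector)
{
	assert(vector >= 0);
	atomic_init(&q->cq_vector, vector);
	q->submitted = 0;
}

/* Teardown side: publish the dead state with a fully ordered store. */
static void toy_suspend_queue(struct toy_queue *q)
{
	atomic_store_explicit(&q->cq_vector, -1, memory_order_seq_cst);
}

/* Submission side: bail out before any per-request setup is done. */
static enum toy_status toy_queue_rq(struct toy_queue *q)
{
	if (atomic_load_explicit(&q->cq_vector, memory_order_acquire) < 0)
		return TOY_STS_IOERR;

	q->submitted++;		/* stands in for the real submission work */
	return TOY_STS_OK;
}
```

Checking first means the error return happens before anything has been allocated or mapped for the request, so there is nothing to unwind on that path.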

-- 
Jens Axboe


* [PATCH 1/3] nvme: remove ->cq_vector == -1 check in nvme_queue_rq()
  2018-05-17 15:36     ` Jens Axboe
@ 2018-05-17 15:42       ` Jens Axboe
  2018-05-17 15:52         ` Keith Busch
  2018-05-17 15:44       ` Keith Busch
  1 sibling, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2018-05-17 15:42 UTC (permalink / raw)


On 5/17/18 9:36 AM, Jens Axboe wrote:
> On 5/17/18 9:32 AM, Keith Busch wrote:
>> On Thu, May 17, 2018 at 09:02:15AM -0600, Jens Axboe wrote:
>>> We only clear this after calling nvme_suspend_queue(), which must
>>> have called nvme_stop_queues() first. The latter ensures that no
>>> more IO is queued, or in progress of being queued, against this
>>> hardware queue.
>>>
>>> Signed-off-by: Jens Axboe <axboe at kernel.dk>
>>> ---
>>>  drivers/nvme/host/pci.c | 5 -----
>>>  1 file changed, 5 deletions(-)
>>>
>>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>>> index 5277afc6e7b5..4ed3583ad3bc 100644
>>> --- a/drivers/nvme/host/pci.c
>>> +++ b/drivers/nvme/host/pci.c
>>> @@ -887,11 +887,6 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
>>>  	blk_mq_start_request(req);
>>>  
>>>  	spin_lock_irq(&nvmeq->q_lock);
>>> -	if (unlikely(nvmeq->cq_vector < 0)) {
>>> -		ret = BLK_STS_IOERR;
>>> -		spin_unlock_irq(&nvmeq->q_lock);
>>> -		goto out_cleanup_iod;
>>> -	}
>>>  	__nvme_submit_cmd(nvmeq, &cmnd);
>>>  	spin_unlock_irq(&nvmeq->q_lock);
>>>  	return BLK_STS_OK;
>>
>> Unfortunately, we are still relying on this to drain entered requests on
>> a dying queue: we restart them to flush out requests so that they complete
>> with an error. :(
>>
>> There's probably a better way to handle this.
> 
> I'd suggest we just move it to the top and get it out of the way instead,
> and ensure that setting ->cq_vector to -1 includes an mb(). Then we can just
> make it:
> 
> if (unlikely(nvmeq->cq_vector < 0))
>         return BLK_STS_IOERR;

How about this:

http://git.kernel.dk/cgit/linux-block/commit/?h=nvme-4.18&id=9913686cb779a046924441cdcac275aa24147122

-- 
Jens Axboe


* [PATCH 1/3] nvme: remove ->cq_vector == -1 check in nvme_queue_rq()
  2018-05-17 15:36     ` Jens Axboe
  2018-05-17 15:42       ` Jens Axboe
@ 2018-05-17 15:44       ` Keith Busch
  1 sibling, 0 replies; 16+ messages in thread
From: Keith Busch @ 2018-05-17 15:44 UTC (permalink / raw)


On Thu, May 17, 2018 at 09:36:58AM -0600, Jens Axboe wrote:
> I'd suggest we just move it to the top and get it out of the way instead,
> and ensure that setting ->cq_vector to -1 includes an mb(). Then we can just
> make it:
> 
> if (unlikely(nvmeq->cq_vector < 0))
>         return BLK_STS_IOERR;

Yes, I like that!


* [PATCH 1/3] nvme: remove ->cq_vector == -1 check in nvme_queue_rq()
  2018-05-17 15:42       ` Jens Axboe
@ 2018-05-17 15:52         ` Keith Busch
  2018-05-17 15:52           ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread
From: Keith Busch @ 2018-05-17 15:52 UTC (permalink / raw)


On Thu, May 17, 2018 at 09:42:58AM -0600, Jens Axboe wrote:
> How about this:
> 
> http://git.kernel.dk/cgit/linux-block/commit/?h=nvme-4.18&id=9913686cb779a046924441cdcac275aa24147122

Not only better, this also fixes a memory leak: we never called the
dma_unmap_sg() on this error out case! Let's consider this patch
separately.


* [PATCH 1/3] nvme: remove ->cq_vector == -1 check in nvme_queue_rq()
  2018-05-17 15:52         ` Keith Busch
@ 2018-05-17 15:52           ` Jens Axboe
  2018-05-17 16:00             ` Christoph Hellwig
  0 siblings, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2018-05-17 15:52 UTC (permalink / raw)


On 5/17/18 9:52 AM, Keith Busch wrote:
> On Thu, May 17, 2018 at 09:42:58AM -0600, Jens Axboe wrote:
>> How about this:
>>
>> http://git.kernel.dk/cgit/linux-block/commit/?h=nvme-4.18&id=9913686cb779a046924441cdcac275aa24147122
> 
> Not only better, this also fixes a memory leak: we never called the
> dma_unmap_sg() on this error out case! Let's consider this patch
> separately.

Heh, didn't even notice that.

Christoph, let me know if you want me to post the whole series for your
reshuffling. It's what's on:

http://git.kernel.dk/cgit/linux-block/log/?h=nvme-4.18

now.

-- 
Jens Axboe


* [PATCH 1/3] nvme: remove ->cq_vector == -1 check in nvme_queue_rq()
  2018-05-17 15:52           ` Jens Axboe
@ 2018-05-17 16:00             ` Christoph Hellwig
  0 siblings, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2018-05-17 16:00 UTC (permalink / raw)


On Thu, May 17, 2018 at 09:52:48AM -0600, Jens Axboe wrote:
> Christoph, let me know if you want me to post the whole series for your
> reshuffling. It's what's on:
> 
> http://git.kernel.dk/cgit/linux-block/log/?h=nvme-4.18

I'll cherry pick that in.

