All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] dm-mpath: Improve handling of busy paths
@ 2017-09-20 18:12 Bart Van Assche
  2017-09-20 18:56 ` Mike Snitzer
  2017-09-20 22:36 ` [PATCH] " Ming Lei
  0 siblings, 2 replies; 8+ messages in thread
From: Bart Van Assche @ 2017-09-20 18:12 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Jens Axboe, Hannes Reinecke, Ming Lei, dm-devel, Bart Van Assche,
	Christoph Hellwig

Instead of retrying request allocation after a delay if a path is
busy, wait until the underlying path has completed a request. This
patch avoids that submission of requests to busy paths is delayed and
hence creates more opportunities for merging sequential I/O requests.

Reported-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ming Lei <ming.lei@redhat.com>
---
 drivers/md/dm-mpath.c | 10 +++++++++-
 drivers/md/dm-rq.c    |  3 ++-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 11f273d2f018..b60ef655fa0c 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -495,7 +495,13 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
 
 	bdev = pgpath->path.dev->bdev;
 	q = bdev_get_queue(bdev);
-	clone = blk_get_request(q, rq->cmd_flags | REQ_NOMERGE, GFP_ATOMIC);
+	/*
+	 * For blk-mq this code is called from inside dm_mq_queue_rq() and
+	 * sleeping is allowed due to BLK_MQ_F_BLOCKING.  For the legacy block
+	 * layer this code is called from workqueue context. Hence unlocking
+	 * and reacquiring the queue lock is not necessary.
+	 */
+	clone = blk_get_request(q, rq->cmd_flags | REQ_NOMERGE, GFP_NOIO);
 	if (IS_ERR(clone)) {
 		/* EBUSY, ENODEV or EWOULDBLOCK: requeue */
 		bool queue_dying = blk_queue_dying(q);
@@ -504,6 +510,8 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
 		if (queue_dying) {
 			atomic_inc(&m->pg_init_in_progress);
 			activate_or_offline_path(pgpath);
+		} else {
+			WARN_ON_ONCE(true);
 		}
 		return DM_MAPIO_DELAY_REQUEUE;
 	}
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index eadfcfd106ff..f25a3cdb7f84 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -789,7 +789,8 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t)
 	md->tag_set->ops = &dm_mq_ops;
 	md->tag_set->queue_depth = dm_get_blk_mq_queue_depth();
 	md->tag_set->numa_node = md->numa_node_id;
-	md->tag_set->flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_SG_MERGE;
+	md->tag_set->flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_SG_MERGE |
+		BLK_MQ_F_BLOCKING;
 	md->tag_set->nr_hw_queues = dm_get_blk_mq_nr_hw_queues();
 	md->tag_set->driver_data = md;
 
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: dm-mpath: Improve handling of busy paths
  2017-09-20 18:12 [PATCH] dm-mpath: Improve handling of busy paths Bart Van Assche
@ 2017-09-20 18:56 ` Mike Snitzer
  2017-09-20 20:36   ` Bart Van Assche
  2017-09-20 22:36 ` [PATCH] " Ming Lei
  1 sibling, 1 reply; 8+ messages in thread
From: Mike Snitzer @ 2017-09-20 18:56 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, dm-devel, Ming Lei, Christoph Hellwig,
	Hannes Reinecke

On Wed, Sep 20 2017 at  2:12pm -0400,
Bart Van Assche <bart.vanassche@wdc.com> wrote:

> Instead of retrying request allocation after a delay if a path is
> busy, wait until the underlying path has completed a request. This
> patch avoids that submission of requests to busy paths is delayed and
> hence creates more opportunities for merging sequential I/O requests.
> 
> Reported-by: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Hannes Reinecke <hare@suse.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/md/dm-mpath.c | 10 +++++++++-
>  drivers/md/dm-rq.c    |  3 ++-
>  2 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index 11f273d2f018..b60ef655fa0c 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -495,7 +495,13 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
>  
>  	bdev = pgpath->path.dev->bdev;
>  	q = bdev_get_queue(bdev);
> -	clone = blk_get_request(q, rq->cmd_flags | REQ_NOMERGE, GFP_ATOMIC);
> +	/*
> +	 * For blk-mq this code is called from inside dm_mq_queue_rq() and
> +	 * sleeping is allowed due to BLK_MQ_F_BLOCKING.  For the legacy block
> +	 * layer this code is called from workqueue context. Hence unlocking
> +	 * and reacquiring the queue lock is not necessary.
> +	 */
> +	clone = blk_get_request(q, rq->cmd_flags | REQ_NOMERGE, GFP_NOIO);
>  	if (IS_ERR(clone)) {
>  		/* EBUSY, ENODEV or EWOULDBLOCK: requeue */
>  		bool queue_dying = blk_queue_dying(q);

It concerns me that we'd basically be allowing one or more really poorly
behaving underlying path(s) to sabotage the entire dm-multipath device
by blocking waiting for a request/tag on potentially pathologically
saturated and/or substandard path(s) (of particular concern when using
the round-robin path selector).

> @@ -504,6 +510,8 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
>  		if (queue_dying) {
>  			atomic_inc(&m->pg_init_in_progress);
>  			activate_or_offline_path(pgpath);
> +		} else {
> +			WARN_ON_ONCE(true);
>  		}
>  		return DM_MAPIO_DELAY_REQUEUE;
>  	}

There is no reason for this WARN_ON_ONCE to be part of this patch.
Really not seeing the point of it at all though.

> diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> index eadfcfd106ff..f25a3cdb7f84 100644
> --- a/drivers/md/dm-rq.c
> +++ b/drivers/md/dm-rq.c
> @@ -789,7 +789,8 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t)
>  	md->tag_set->ops = &dm_mq_ops;
>  	md->tag_set->queue_depth = dm_get_blk_mq_queue_depth();
>  	md->tag_set->numa_node = md->numa_node_id;
> -	md->tag_set->flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_SG_MERGE;
> +	md->tag_set->flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_SG_MERGE |
> +		BLK_MQ_F_BLOCKING;
>  	md->tag_set->nr_hw_queues = dm_get_blk_mq_nr_hw_queues();
>  	md->tag_set->driver_data = md;
>  
> -- 
> 2.14.1
> 

This type of change needs to be carefully tested.  It is very
fundamental and takes us further and further away from avoiding
deadlocks/stalls.

Have you run this change through your IB SRP test harness?

I can run it through the mptest suite but that falls way short of
real-world destructive multipath testing in the face of heavy IO.

Mike

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: dm-mpath: Improve handling of busy paths
  2017-09-20 18:56 ` Mike Snitzer
@ 2017-09-20 20:36   ` Bart Van Assche
  0 siblings, 0 replies; 8+ messages in thread
From: Bart Van Assche @ 2017-09-20 20:36 UTC (permalink / raw)
  To: snitzer@redhat.com
  Cc: axboe@kernel.dk, dm-devel@redhat.com, ming.lei@redhat.com,
	hch@lst.de, hare@suse.com

On Wed, 2017-09-20 at 14:56 -0400, Mike Snitzer wrote:
> On Wed, Sep 20 2017 at  2:12pm -0400,
> Bart Van Assche <bart.vanassche@wdc.com> wrote:
> >  	bdev = pgpath->path.dev->bdev;
> >  	q = bdev_get_queue(bdev);
> > -	clone = blk_get_request(q, rq->cmd_flags | REQ_NOMERGE, GFP_ATOMIC);
> > +	/*
> > +	 * For blk-mq this code is called from inside dm_mq_queue_rq() and
> > +	 * sleeping is allowed due to BLK_MQ_F_BLOCKING.  For the legacy block
> > +	 * layer this code is called from workqueue context. Hence unlocking
> > +	 * and reacquiring the queue lock is not necessary.
> > +	 */
> > +	clone = blk_get_request(q, rq->cmd_flags | REQ_NOMERGE, GFP_NOIO);
> >  	if (IS_ERR(clone)) {
> >  		/* EBUSY, ENODEV or EWOULDBLOCK: requeue */
> >  		bool queue_dying = blk_queue_dying(q);
> 
> It concerns me that we'd basically be allowing one or more really poorly
> behaving underlying path(s) to sabotage the entire dm-multipath device
> by blocking waiting for a request/tag on potentially pathologically
> saturated and/or substandard path(s) (of particular concern when using
> the round-robin path selector).

That's a valid concern, but if different paths are known to have different
latency characteristics, shouldn't users choose another path selector than
round-robin, e.g. the service-time path selector?

Additionally, shouldn't substandard paths be handled at a higher level than
dm-path? My proposal if we really want flaky paths to be avoided automatically
is to modify multipathd such that the path state can have three possible
values instead of two (up/down/flaky instead of up/down) and to let multipathd
recognize flaky paths.

> > @@ -504,6 +510,8 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
> >  		if (queue_dying) {
> >  			atomic_inc(&m->pg_init_in_progress);
> >  			activate_or_offline_path(pgpath);
> > +		} else {
> > +			WARN_ON_ONCE(true);
> >  		}
> >  		return DM_MAPIO_DELAY_REQUEUE;
> >  	}
> 
> There is no reason for this WARN_ON_ONCE to be part of this patch.
> Really not seeing the point of it at all though.

OK, I will leave this change out. The only purpose of that WARN_ON_ONCE(true)
statement is to verify that request allocation only fails for dying paths and
not in any other case.

> This type of change needs to be carefully tested.  It is very
> fundamental and takes us further and further away from avoiding
> deadlocks/stalls.
> 
> Have you run this change through your IB SRP test harness?

Yes, this patch survives several srp-test runs. The command I used to test
this patch is as follows:

    srp-test/run_tests -f xfs -e deadline

This means that legacy dm on top of legacy SCSI, dm-mq on top of scsi-mq and
legacy dm on top of scsi-mq have been tested.

> I can run it through the mptest suite but that falls way short of
> real-world destructive multipath testing in the face of heavy IO.

Thanks, performing additional testing with the mptest suite is definitely
welcome.

Bart.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] dm-mpath: Improve handling of busy paths
  2017-09-20 18:12 [PATCH] dm-mpath: Improve handling of busy paths Bart Van Assche
  2017-09-20 18:56 ` Mike Snitzer
@ 2017-09-20 22:36 ` Ming Lei
  2017-09-20 23:26   ` Bart Van Assche
  1 sibling, 1 reply; 8+ messages in thread
From: Ming Lei @ 2017-09-20 22:36 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, dm-devel, Christoph Hellwig, Mike Snitzer,
	Hannes Reinecke

On Wed, Sep 20, 2017 at 11:12:12AM -0700, Bart Van Assche wrote:
> Instead of retrying request allocation after a delay if a path is
> busy, wait until the underlying path has completed a request. This
> patch avoids that submission of requests to busy paths is delayed and
> hence creates more opportunities for merging sequential I/O requests.

Could you provide some test data with this change?

> 
> Reported-by: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Hannes Reinecke <hare@suse.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/md/dm-mpath.c | 10 +++++++++-
>  drivers/md/dm-rq.c    |  3 ++-
>  2 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index 11f273d2f018..b60ef655fa0c 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -495,7 +495,13 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
>  
>  	bdev = pgpath->path.dev->bdev;
>  	q = bdev_get_queue(bdev);
> -	clone = blk_get_request(q, rq->cmd_flags | REQ_NOMERGE, GFP_ATOMIC);
> +	/*
> +	 * For blk-mq this code is called from inside dm_mq_queue_rq() and
> +	 * sleeping is allowed due to BLK_MQ_F_BLOCKING.  For the legacy block
> +	 * layer this code is called from workqueue context. Hence unlocking
> +	 * and reacquiring the queue lock is not necessary.
> +	 */
> +	clone = blk_get_request(q, rq->cmd_flags | REQ_NOMERGE, GFP_NOIO);

Actually with GFP_ATOMIC, dispatch in dm-rq can't move on and no request
will be dequeued from IO scheduler queue if this allocation fails, that
means IO merge is still working at that time if the patchset of
'blk-mq-sched: improve SCSI-MQ performance' is applied.

As my test done, the sequential I/O performance is still not good
even though the patchset is applied.

That isn't strange because queue depth of IO scheduler queue can be
much bigger than either q->queue_depth or .cmd_per_lun, that means
the underlying queue has been busy for a while before the allocation
fails.

So this approach may not work well in theory.


-- 
Ming

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] dm-mpath: Improve handling of busy paths
  2017-09-20 22:36 ` [PATCH] " Ming Lei
@ 2017-09-20 23:26   ` Bart Van Assche
  2017-09-21  1:41     ` Ming Lei
  0 siblings, 1 reply; 8+ messages in thread
From: Bart Van Assche @ 2017-09-20 23:26 UTC (permalink / raw)
  To: ming.lei@redhat.com
  Cc: axboe@kernel.dk, dm-devel@redhat.com, hch@lst.de,
	snitzer@redhat.com, hare@suse.com

On Thu, 2017-09-21 at 06:36 +0800, Ming Lei wrote:
> Actually with GFP_ATOMIC, dispatch in dm-rq can't move on and no request
> will be dequeued from IO scheduler queue if this allocation fails, that
> means IO merge is still working at that time if the patchset of
> 'blk-mq-sched: improve SCSI-MQ performance' is applied.
> 
> As my test done, the sequential I/O performance is still not good
> even though the patchset is applied.
> 
> That isn't strange because queue depth of IO scheduler queue can be
> much bigger than either q->queue_depth or .cmd_per_lun, that means
> the underlying queue has been busy for a while before the allocation
> fails.

Hello Ming,

This patch is intended as an alternative for your patch series "dm-mpath:
improve I/O schedule". I wanted to show that it is possible to reduce
dm-mpath request submission latency if the underlying driver returns
"busy" frequently without touching the "return DM_MAPIO_DELAY_REQUEUE"
statement in multipath_clone_and_map(). I will provide measurement data as
soon as I have the time to run more measurements.

Bart.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] dm-mpath: Improve handling of busy paths
  2017-09-20 23:26   ` Bart Van Assche
@ 2017-09-21  1:41     ` Ming Lei
  2017-09-21 15:53       ` Bart Van Assche
  0 siblings, 1 reply; 8+ messages in thread
From: Ming Lei @ 2017-09-21  1:41 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: axboe@kernel.dk, dm-devel@redhat.com, hch@lst.de,
	snitzer@redhat.com, hare@suse.com

On Wed, Sep 20, 2017 at 11:26:09PM +0000, Bart Van Assche wrote:
> On Thu, 2017-09-21 at 06:36 +0800, Ming Lei wrote:
> > Actually with GFP_ATOMIC, dispatch in dm-rq can't move on and no request
> > will be dequeued from IO scheduler queue if this allocation fails, that
> > means IO merge is still working at that time if the patchset of
> > 'blk-mq-sched: improve SCSI-MQ performance' is applied.
> > 
> > As my test done, the sequential I/O performance is still not good
> > even though the patchset is applied.
> > 
> > That isn't strange because queue depth of IO scheduler queue can be
> > much bigger than either q->queue_depth or .cmd_per_lun, that means
> > the underlying queue has been busy for a while before the allocation
> > fails.
> 
> Hello Ming,
> 
> This patch is intended as an alternative for your patch series "dm-mpath:
> improve I/O schedule". I wanted to show that it is possible to reduce

Hi Bart,

As I explained, this patch can't fix the I/O merge issue since it is easy
to trigger queue busy before running out of requests, that is why I
changes the 'nr_request' in the patch 5 of 'dm-mpath: improve I/O schedule'.

> dm-mpath request submission latency if the underlying driver returns
> "busy" frequently without touching the "return DM_MAPIO_DELAY_REQUEUE"

I already explained, the DM_MAPIO_DELAY_REQUEUE can be changed
to DM_MAPIO_REQUEUE at least for dm-rq-mq via SCHED_RESTART, even
for dm-rq-sq, it should be possible but need to make sure there
is in-flight requests because we run queue in rq_completed().

-- 
Ming

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] dm-mpath: Improve handling of busy paths
  2017-09-21  1:41     ` Ming Lei
@ 2017-09-21 15:53       ` Bart Van Assche
  2017-09-21 22:58         ` Ming Lei
  0 siblings, 1 reply; 8+ messages in thread
From: Bart Van Assche @ 2017-09-21 15:53 UTC (permalink / raw)
  To: ming.lei@redhat.com
  Cc: axboe@kernel.dk, dm-devel@redhat.com, hch@lst.de,
	snitzer@redhat.com, hare@suse.com

On Thu, 2017-09-21 at 09:41 +0800, Ming Lei wrote:
> On Wed, Sep 20, 2017 at 11:26:09PM +0000, Bart Van Assche wrote:
> > dm-mpath request submission latency if the underlying driver returns
> > "busy" frequently without touching the "return DM_MAPIO_DELAY_REQUEUE"
> 
> I already explained, the DM_MAPIO_DELAY_REQUEUE can be changed
> to DM_MAPIO_REQUEUE at least for dm-rq-mq via SCHED_RESTART, even
> for dm-rq-sq, it should be possible but need to make sure there
> is in-flight requests because we run queue in rq_completed().

Sorry but I think that would be wrong because it would result in the dm-mpath
driver to busy-wait for a request if blk_get_request(..., GFP_ATOMIC) fails
either due to all tags having been allocated or because the path is dying.
Have you had a look at commit 1c23484c355e ("dm-mpath: do not lock up a CPU
with requeuing activity")? Do you understand the purpose of that change?

> As I explained, this patch can't fix the I/O merge issue since it is easy
> to trigger queue busy before running out of requests, that is why I
> changes the 'nr_request' in the patch 5 of 'dm-mpath: improve I/O schedule'.

Sorry but I don't think that *any* value of nr_requests can prevent that the
dm-mpath driver submits requests against a busy path. If multiple LUNs are
associated with the same SCSI host and I/O is ongoing against multiple LUNs
concurrently then it is not possible to choose a value for nr_requests that
prevents that a request is queued against a busy SCSI device. How about using
something like the (untested) patch below to prevent that requests are queued
against a busy SCSI path?


[PATCH] scsi_lld_busy(): Improve busy detection

To achieve optimal I/O scheduling it is important that the
dm-mpath driver returns "busy" if the path it will submit a
request to is busy. Hence also check whether the target and
the host are busy from inside scsi_lld_busy(). Note:
dm_mq_queue_rq() calls scsi_lld_busy() as follows:

	if (ti->type->busy && ti->type->busy(ti))
		return BLK_STS_RESOURCE;

---
 drivers/scsi/scsi_lib.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f6097b89d5d3..db68605f98ea 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1534,7 +1534,9 @@ static int scsi_lld_busy(struct request_queue *q)
 	 * multiple queues, congestion of host/starget needs to be handled
 	 * in SCSI layer.
 	 */
-	if (scsi_host_in_recovery(shost) || scsi_device_is_busy(sdev))
+	if (scsi_host_in_recovery(shost) || scsi_device_is_busy(sdev) ||
+	    scsi_target_is_busy(scsi_target(sdev)) ||
+	    scsi_host_is_busy(shost))
 		return 1;
 
 	return 0;

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] dm-mpath: Improve handling of busy paths
  2017-09-21 15:53       ` Bart Van Assche
@ 2017-09-21 22:58         ` Ming Lei
  0 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2017-09-21 22:58 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: axboe@kernel.dk, dm-devel@redhat.com, hch@lst.de,
	snitzer@redhat.com, hare@suse.com

On Thu, Sep 21, 2017 at 03:53:45PM +0000, Bart Van Assche wrote:
> On Thu, 2017-09-21 at 09:41 +0800, Ming Lei wrote:
> > On Wed, Sep 20, 2017 at 11:26:09PM +0000, Bart Van Assche wrote:
> > > dm-mpath request submission latency if the underlying driver returns
> > > "busy" frequently without touching the "return DM_MAPIO_DELAY_REQUEUE"
> > 
> > I already explained, the DM_MAPIO_DELAY_REQUEUE can be changed
> > to DM_MAPIO_REQUEUE at least for dm-rq-mq via SCHED_RESTART, even
> > for dm-rq-sq, it should be possible but need to make sure there
> > is in-flight requests because we run queue in rq_completed().
> 
> Sorry but I think that would be wrong because it would result in the dm-mpath
> driver to busy-wait for a request if blk_get_request(..., GFP_ATOMIC) fails
> either due to all tags having been allocated or because the path is dying.
> Have you had a look at commit 1c23484c355e ("dm-mpath: do not lock up a CPU
> with requeuing activity")? Do you understand the purpose of that change?

This issue shouldn't be related with 1c23484c355e ("dm-mpath: do not lock
up a CPU with requeuing activity"), but we still can return DM_MAPIO_DELAY_REQUEUE
when queue is dying.

Also I don't think 1c23484c355e makes sense and still like a workaround,
when one underlying queue is dying, the I/O to be requeued should be
switched to another path in the following dispatch, and finally the I/O
should be failed if all paths are down. So how can the requeue activity
lock up a CPU when get_request() returns NULL and queue is dying?

Actually the big performance issue is in that DM_MAPIO_DELAY_REQUEUE is
returned when get_request() returns NULL, this way will make .queue_rq()
return BLK_STS_OK, and lie to block layer, and actually BLK_STS_RESOURCE
should have been returned. Then I/O merge becomes hard to do.

> 
> > As I explained, this patch can't fix the I/O merge issue since it is easy
> > to trigger queue busy before running out of requests, that is why I
> > changes the 'nr_request' in the patch 5 of 'dm-mpath: improve I/O schedule'.
> 
> Sorry but I don't think that *any* value of nr_requests can prevent that the
> dm-mpath driver submits requests against a busy path. If multiple LUNs are
> associated with the same SCSI host and I/O is ongoing against multiple LUNs
> concurrently then it is not possible to choose a value for nr_requests that
> prevents that a request is queued against a busy SCSI device. How about using
> something like the (untested) patch below to prevent that requests are queued
> against a busy SCSI path?

Actually the 'nr_requests' is set for each path/underlying queue in my
patch, not per dm queue.

> 
> 
> [PATCH] scsi_lld_busy(): Improve busy detection
> 
> To achieve optimal I/O scheduling it is important that the
> dm-mpath driver returns "busy" if the path it will submit a
> request to is busy. Hence also check whether the target and
> the host are busy from inside scsi_lld_busy(). Note:
> dm_mq_queue_rq() calls scsi_lld_busy() as follows:
> 
> 	if (ti->type->busy && ti->type->busy(ti))
> 		return BLK_STS_RESOURCE;

OK, looks good to call pgpath_busy() in multipath_clone_and_map()
before allocating request, and make sure BLK_STS_RESOURCE is
returned to block layer.

-- 
Ming

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-09-21 22:58 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-09-20 18:12 [PATCH] dm-mpath: Improve handling of busy paths Bart Van Assche
2017-09-20 18:56 ` Mike Snitzer
2017-09-20 20:36   ` Bart Van Assche
2017-09-20 22:36 ` [PATCH] " Ming Lei
2017-09-20 23:26   ` Bart Van Assche
2017-09-21  1:41     ` Ming Lei
2017-09-21 15:53       ` Bart Van Assche
2017-09-21 22:58         ` Ming Lei

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.