From: Mike Snitzer <snitzer@redhat.com>
To: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Jens Axboe <axboe@kernel.dk>,
dm-devel@redhat.com, Ming Lei <ming.lei@redhat.com>,
Christoph Hellwig <hch@lst.de>, Hannes Reinecke <hare@suse.com>
Subject: Re: dm-mpath: Improve handling of busy paths
Date: Wed, 20 Sep 2017 14:56:37 -0400 [thread overview]
Message-ID: <20170920185637.GA30126@redhat.com> (raw)
In-Reply-To: <20170920181212.5532-1-bart.vanassche@wdc.com>
On Wed, Sep 20 2017 at 2:12pm -0400,
Bart Van Assche <bart.vanassche@wdc.com> wrote:
> Instead of retrying request allocation after a delay if a path is
> busy, wait until the underlying path has completed a request. This
> patch avoids that submission of requests to busy paths is delayed and
> hence creates more opportunities for merging sequential I/O requests.
>
> Reported-by: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Hannes Reinecke <hare@suse.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Ming Lei <ming.lei@redhat.com>
> ---
> drivers/md/dm-mpath.c | 10 +++++++++-
> drivers/md/dm-rq.c | 3 ++-
> 2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index 11f273d2f018..b60ef655fa0c 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -495,7 +495,13 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
>
> bdev = pgpath->path.dev->bdev;
> q = bdev_get_queue(bdev);
> - clone = blk_get_request(q, rq->cmd_flags | REQ_NOMERGE, GFP_ATOMIC);
> + /*
> + * For blk-mq this code is called from inside dm_mq_queue_rq() and
> + * sleeping is allowed due to BLK_MQ_F_BLOCKING. For the legacy block
> + * layer this code is called from workqueue context. Hence unlocking
> + * and reacquiring the queue lock is not necessary.
> + */
> + clone = blk_get_request(q, rq->cmd_flags | REQ_NOMERGE, GFP_NOIO);
> if (IS_ERR(clone)) {
> /* EBUSY, ENODEV or EWOULDBLOCK: requeue */
> bool queue_dying = blk_queue_dying(q);
It concerns me that we'd basically be allowing one or more really poorly
behaving underlying path(s) to sabotage the entire dm-multipath device
by blocking waiting for a request/tag on potentially pathologically
saturated and/or substandard path(s) (of particular concern when using
the round-robin path selector).
> @@ -504,6 +510,8 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
> if (queue_dying) {
> atomic_inc(&m->pg_init_in_progress);
> activate_or_offline_path(pgpath);
> + } else {
> + WARN_ON_ONCE(true);
> }
> return DM_MAPIO_DELAY_REQUEUE;
> }
There is no reason for this WARN_ON_ONCE to be part of this patch.
Really not seeing the point of it at all though.
> diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> index eadfcfd106ff..f25a3cdb7f84 100644
> --- a/drivers/md/dm-rq.c
> +++ b/drivers/md/dm-rq.c
> @@ -789,7 +789,8 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t)
> md->tag_set->ops = &dm_mq_ops;
> md->tag_set->queue_depth = dm_get_blk_mq_queue_depth();
> md->tag_set->numa_node = md->numa_node_id;
> - md->tag_set->flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_SG_MERGE;
> + md->tag_set->flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_SG_MERGE |
> + BLK_MQ_F_BLOCKING;
> md->tag_set->nr_hw_queues = dm_get_blk_mq_nr_hw_queues();
> md->tag_set->driver_data = md;
>
> --
> 2.14.1
>
This type of change needs to be carefully tested. It is very
fundamental and takes us further and further away from avoiding
deadlocks/stalls.
Have you run this change through your IB SRP test harness?
I can run it through the mptest suite but that falls way short of
real-world destructive multipath testing in the face of heavy IO.
Mike
next prev parent reply other threads:[~2017-09-20 18:56 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-09-20 18:12 [PATCH] dm-mpath: Improve handling of busy paths Bart Van Assche
2017-09-20 18:56 ` Mike Snitzer [this message]
2017-09-20 20:36 ` Bart Van Assche
2017-09-20 22:36 ` [PATCH] " Ming Lei
2017-09-20 23:26 ` Bart Van Assche
2017-09-21 1:41 ` Ming Lei
2017-09-21 15:53 ` Bart Van Assche
2017-09-21 22:58 ` Ming Lei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170920185637.GA30126@redhat.com \
--to=snitzer@redhat.com \
--cc=axboe@kernel.dk \
--cc=bart.vanassche@wdc.com \
--cc=dm-devel@redhat.com \
--cc=hare@suse.com \
--cc=hch@lst.de \
--cc=ming.lei@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.