From: Ming Lei <ming.lei@redhat.com>
To: Bart Van Assche <Bart.VanAssche@wdc.com>
Cc: "hch@infradead.org" <hch@infradead.org>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
"snitzer@redhat.com" <snitzer@redhat.com>,
"axboe@kernel.dk" <axboe@kernel.dk>,
"dm-devel@redhat.com" <dm-devel@redhat.com>,
"loberman@redhat.com" <loberman@redhat.com>
Subject: Re: [PATCH 4/5] blk-mq: introduce blk_get_request_notify
Date: Tue, 23 Jan 2018 09:29:11 +0800 [thread overview]
Message-ID: <20180123012910.GC4411@ming.t460p> (raw)
In-Reply-To: <1516641182.2545.26.camel@sandisk.com>
On Mon, Jan 22, 2018 at 05:13:03PM +0000, Bart Van Assche wrote:
> On Mon, 2018-01-22 at 11:35 +0800, Ming Lei wrote:
> > DM-MPATH need to allocate request from underlying queue, but when the
> > allocation fails, there is no way to make underlying queue's RESTART
> > to restart DM's queue.
> >
> > This patch introduces blk_get_request_notify() for this purpose, and
> > caller need to pass 'wait_queue_entry_t' to this function, and make
> > sure it is initialized well, so after the current allocation fails,
> > DM will get notified when there is request available from underlying
> > queue.
>
> Please mention that this is only a partial solution because the case when
> e.g. blk_insert_cloned_request() returns BLK_STS_RESOURCE is not handled.
I don't think it is necessary to mention that in comment log, because
one patch won't deal with all things.
But definitely there will be one patch which deals with that in V2,
otherwise the performance issue can't be solved completely.
> This could help for drivers that support a very low queue depth (lpfc) but
> probably won't be that useful for other drivers.
Up to now, it is one dm-mpath specific performance issue under one
specific situation: IO is submitted to dm-mpath device and the
underlying queue at the same time, and the underlying queue depth is
very low, such as 1.
>
> > + /*
> > + * If caller requires notification when tag is available, add
> > + * wait entry of 'data->notifier' to the wait queue.
> > + */
> > + if (data->flags & BLK_MQ_REQ_NOWAIT) {
> > + bool added = false;
> > +
> > + spin_lock_irq(&ws->wait.lock);
> > + if (list_empty(&data->notifier->entry))
> > + __add_wait_queue(&ws->wait, data->notifier);
> > + else
> > + added = true;
> > + spin_unlock_irq(&ws->wait.lock);
> > +
> > + if (added)
> > + return BLK_MQ_TAG_FAIL;
> > +
> > + tag = __blk_mq_get_tag(data, bt);
> > + if (tag != -1)
> > + goto found_tag;
> > + return BLK_MQ_TAG_FAIL;
> > + }
>
> Sorry but I don't like this approach. Adding "data->notifier" to the wait
> queue creates a link between two request queues, e.g. a dm-mpath queue and
> one of the paths that is a member of that dm-mpath queue. This creates the
> potential for ugly races between e.g. "data->notifier" being triggered and
> removal of the dm-mpath queue.
I thought no such race because the dm request is still in dispatch list before
the 'notifier' is handled. And now looks this isn't true, but this issue can be
dealt easily by call blk_queue_enter_live() before the allocation, and
release them all once the notifier is triggered.
Anyway this interface need to be well documented.
>
> > diff --git a/block/blk-mq.h b/block/blk-mq.h
> > index 88c558f71819..bec2f675f8f1 100644
> > --- a/block/blk-mq.h
> > +++ b/block/blk-mq.h
> > @@ -160,6 +160,7 @@ struct blk_mq_alloc_data {
> > struct request_queue *q;
> > blk_mq_req_flags_t flags;
> > unsigned int shallow_depth;
> > + wait_queue_entry_t *notifier;
>
> If others would agree with the approach of this patch please use another name
> than "notifier". In the context of the Linux kernel a notifier is an instance
> of struct notifier_block. The above "notifier" member is not a notifier but a
> wait queue entry.
OK, I will change to 'wait', similar with 'dispatch_wait'.
--
Ming
next prev parent reply other threads:[~2018-01-23 1:29 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-22 3:35 [PATCH 0/5] blk-mq & dm: fix IO hang and deal with one performance issue Ming Lei
2018-01-22 3:35 ` [PATCH 1/5] blk-mq: introduce BLK_STS_DEV_RESOURCE Ming Lei
2018-01-22 16:32 ` Christoph Hellwig
2018-01-22 16:49 ` Bart Van Assche
2018-01-23 0:57 ` Ming Lei
2018-01-23 16:17 ` Bart Van Assche
2018-01-23 16:26 ` Ming Lei
2018-01-23 16:37 ` Bart Van Assche
2018-01-23 16:41 ` Ming Lei
2018-01-23 16:47 ` Bart Van Assche
2018-01-23 16:49 ` Ming Lei
2018-01-23 16:54 ` Bart Van Assche
2018-01-23 16:59 ` Ming Lei
2018-01-23 22:01 ` Bart Van Assche
2018-01-24 2:31 ` Ming Lei
2018-01-22 3:35 ` [PATCH 2/5] dm-rq: handle dispatch exception in dm_dispatch_clone_request() Ming Lei
2018-01-22 3:35 ` [PATCH 3/5] dm-rq: return BLK_STS_* from map_request() Ming Lei
2018-01-22 5:35 ` Ming Lei
2018-01-22 3:35 ` [PATCH 4/5] blk-mq: introduce blk_get_request_notify Ming Lei
2018-01-22 10:19 ` Ming Lei
2018-01-22 17:13 ` Bart Van Assche
2018-01-23 1:29 ` Ming Lei [this message]
2018-01-22 3:35 ` [PATCH 5/5] dm-mpath: use blk_mq_alloc_request_notify for allocating blk-mq req Ming Lei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180123012910.GC4411@ming.t460p \
--to=ming.lei@redhat.com \
--cc=Bart.VanAssche@wdc.com \
--cc=axboe@kernel.dk \
--cc=dm-devel@redhat.com \
--cc=hch@infradead.org \
--cc=linux-block@vger.kernel.org \
--cc=loberman@redhat.com \
--cc=snitzer@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).