From: Ming Lei <ming.lei@redhat.com>
To: John Garry <john.garry@huawei.com>
Cc: "axboe@kernel.dk" <axboe@kernel.dk>,
"jejb@linux.ibm.com" <jejb@linux.ibm.com>,
"martin.petersen@oracle.com" <martin.petersen@oracle.com>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
"hare@suse.com" <hare@suse.com>,
"bvanassche@acm.org" <bvanassche@acm.org>,
"chenxiang (M)" <chenxiang66@hisilicon.com>
Subject: Re: [PATCH RFC V2 3/5] blk-mq: Facilitate a shared sbitmap per tagset
Date: Mon, 25 Nov 2019 11:00:39 +0800 [thread overview]
Message-ID: <20191125030039.GA29462@ming.t460p> (raw)
In-Reply-To: <db93e0ba-118a-a4f6-41a8-064353568ef7@huawei.com>
On Thu, Nov 21, 2019 at 10:24:16AM +0000, John Garry wrote:
> > > int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
> > > {
> > > + struct blk_mq_tag_set *tag_set = q->tag_set;
> > > struct blk_mq_hw_ctx *hctx;
> > > struct elevator_queue *eq;
> > > unsigned int i;
> > > @@ -537,6 +538,19 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
> > > blk_mq_debugfs_register_sched_hctx(q, hctx);
> > > }
> > > + if (blk_mq_is_sbitmap_shared(tag_set)) {
> > > + if (!blk_mq_init_sched_shared_sbitmap(tag_set, q->nr_requests)) {
> > > + ret = -ENOMEM;
> > > + goto err;
> > > + }
> > > + queue_for_each_hw_ctx(q, hctx, i) {
> > > + struct blk_mq_tags *tags = hctx->sched_tags;
> > > +
> > > + tags->pbitmap_tags = &tag_set->sched_shared_bitmap_tags;
> > > + tags->pbreserved_tags = &tag_set->sched_shared_breserved_tags;
> >
> > This kind of sharing is wrong, sched tags should be request queue wide
> > instead of tagset wide, and each request queue has its own & independent
> > scheduler queue.
>
> Right, so if we get get a scheduler tag we still need to get a driver tag,
> and this would be the "shared" tag.
>
> That makes things simpler then.
>
> >
> > > + }
> > > + }
> > > +
> > > return 0;
> > > err:
> > > diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> > > index 42792942b428..6625bebb46c3 100644
> > > --- a/block/blk-mq-tag.c
> > > +++ b/block/blk-mq-tag.c
> > > @@ -35,9 +35,9 @@ bool __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
> > > */
> > > void blk_mq_tag_wakeup_all(struct blk_mq_tags *tags, bool include_reserve)
> > > {
> > > - sbitmap_queue_wake_all(&tags->bitmap_tags);
> > > + sbitmap_queue_wake_all(tags->pbitmap_tags);
> > > if (include_reserve)
> > > - sbitmap_queue_wake_all(&tags->breserved_tags);
> > > + sbitmap_queue_wake_all(tags->pbreserved_tags);
> > > }
>
> [...]
>
>
> > > mutex_init(&set->tag_list_lock);
> > > INIT_LIST_HEAD(&set->tag_list);
> > > @@ -3137,6 +3151,7 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
> > > {
> > > struct blk_mq_tag_set *set = q->tag_set;
> > > struct blk_mq_hw_ctx *hctx;
> > > + bool sched_tags = false;
> > > int i, ret;
> > > if (!set)
> > > @@ -3160,6 +3175,7 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
> > > ret = blk_mq_tag_update_depth(hctx, &hctx->tags, nr,
> > > false);
> > > } else {
> > > + sched_tags = true;
> > > ret = blk_mq_tag_update_depth(hctx, &hctx->sched_tags,
> > > nr, true);
> > > }
> > > @@ -3169,8 +3185,41 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
> > > q->elevator->type->ops.depth_updated(hctx);
> > > }
> > > - if (!ret)
> > > + /*
> > > + * if ret is 0, all queues should have been updated to the same depth
> > > + * if not, then maybe some have been updated - yuk, need to handle this for shared sbitmap...
> > > + * if some are updated, we should probably roll back the change altogether. FIXME
> > > + */
> > > + if (!ret) {
> > > + if (blk_mq_is_sbitmap_shared(set)) {
> > > + if (sched_tags) {
> > > + sbitmap_queue_free(&set->sched_shared_bitmap_tags);
> > > + sbitmap_queue_free(&set->sched_shared_breserved_tags);
> > > + if (!blk_mq_init_sched_shared_sbitmap(set, nr))
> > > + return -ENOMEM; /* fixup error handling */
> > > +
> > > + queue_for_each_hw_ctx(q, hctx, i) {
> > > + hctx->sched_tags->pbitmap_tags = &set->sched_shared_bitmap_tags;
> > > + hctx->sched_tags->pbreserved_tags = &set->sched_shared_breserved_tags;
> > > + }
> > > + } else {
> > > + sbitmap_queue_free(&set->shared_bitmap_tags);
> > > + sbitmap_queue_free(&set->shared_breserved_tags);
> > > + if (!blk_mq_init_shared_sbitmap(set))
> > > + return -ENOMEM; /* fixup error handling */
> >
> > No, we can't re-allocate driver tags here which are shared by all LUNs. > And you should see that 'can_grow' is set as false for driver tags
> > in blk_mq_update_nr_requests(), which can only touch per-request-queue
> > data, not tagset wide data.
>
> Yeah, I see that. We should just resize for driver tags bitmap.
>
> Personally I think the mainline code is a little loose here, as if we could
> grow driver tags, then blk_mq_tagset.tags would be out-of-sync with the
> hctx->tags. Maybe that should be made more explicit in the code.
>
> BTW, do you have anything to say about this (modified slightly) comment:
>
> /*
> * if ret != 0, q->nr_requests would not be updated, yet the depth
> * for some hctx sched tags may have changed - is that the right thing
> * to do?
> */
In theory, your concern is right, but so far we only support same
depth of hctx for either sched tags or driver tags, so not an issue
so far.
Thanks,
Ming
next prev parent reply other threads:[~2019-11-25 3:01 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-19 14:27 [PATCH RFC v2 0/5] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs John Garry
2019-11-19 14:27 ` [PATCH RFC v2 1/5] blk-mq: Remove some unused function arguments John Garry
2019-11-19 14:27 ` [PATCH RFC V2 2/5] blk-mq: rename BLK_MQ_F_TAG_SHARED as BLK_MQ_F_TAG_QUEUE_SHARED John Garry
2019-11-19 14:27 ` [PATCH RFC V2 3/5] blk-mq: Facilitate a shared sbitmap per tagset John Garry
2019-11-21 8:55 ` Ming Lei
2019-11-21 10:24 ` John Garry
2019-11-25 3:00 ` Ming Lei [this message]
2019-11-19 14:27 ` [PATCH RFC V2 4/5] scsi: Add template flag 'host_tagset' John Garry
2019-11-19 14:27 ` [PATCH RFC V2 5/5] scsi: hisi_sas: Switch v3 hw to MQ John Garry
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191125030039.GA29462@ming.t460p \
--to=ming.lei@redhat.com \
--cc=axboe@kernel.dk \
--cc=bvanassche@acm.org \
--cc=chenxiang66@hisilicon.com \
--cc=hare@suse.com \
--cc=jejb@linux.ibm.com \
--cc=john.garry@huawei.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=martin.petersen@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox