From: Kashyap Desai <kashyap.desai@broadcom.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, linux-scsi@vger.kernel.org,
"Martin K . Petersen" <martin.petersen@oracle.com>,
James Bottomley <James.Bottomley@hansenpartnership.com>,
Bart Van Assche <bvanassche@acm.org>,
Hannes Reinecke <hare@suse.com>,
John Garry <john.garry@huawei.com>,
Don Brace <don.brace@microsemi.com>,
Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>,
Christoph Hellwig <hch@lst.de>
Subject: RE: [PATCH 8/9] scsi: megaraid: convert private reply queue to blk-mq hw queue
Date: Fri, 7 Jun 2019 15:15:04 +0530
Message-ID: <287b092778c749c8b101641f9d48ab44@mail.gmail.com>
In-Reply-To: <f24109eb867deae8cb262466ecc70b09@mail.gmail.com>
>
> >
> > Please drop the patch in my last email, and apply the following patch
> > and see if we can make a difference:
>
> Ming,
>
> I dropped the earlier patch and applied the patch below. Now I am
> getting the expected performance (3.0M IOPS).
> The patch below fixes the performance issue. See the perf report after
> applying it:
>
> 8.52% [kernel] [k] sbitmap_any_bit_set
> 4.19% [kernel] [k] blk_mq_run_hw_queue
> 3.76% [megaraid_sas] [k] complete_cmd_fusion
> 3.24% [kernel] [k] scsi_queue_rq
> 2.53% [megaraid_sas] [k] megasas_build_ldio_fusion
> 2.34% [megaraid_sas] [k] megasas_build_and_issue_cmd_fusion
> 2.18% [kernel] [k] entry_SYSCALL_64
> 1.85% [kernel] [k] syscall_return_via_sysret
> 1.78% [kernel] [k] blk_mq_run_hw_queues
> 1.59% [kernel] [k] gup_pmd_range
> 1.49% [kernel] [k] _raw_spin_lock_irqsave
> 1.24% [kernel] [k] scsi_dec_host_busy
> 1.23% [kernel] [k] blk_mq_free_request
> 1.23% [kernel] [k] blk_mq_get_request
> 0.96% [kernel] [k] __slab_free
> 0.91% [kernel] [k] aio_complete
> 0.90% [kernel] [k] __sched_text_start
> 0.89% [megaraid_sas] [k] megasas_queue_command
> 0.85% [kernel] [k] __fget
> 0.84% [kernel] [k] scsi_mq_get_budget
>
> I will do some more testing and update the results.
Ming, I did testing on an AMD dual-socket server (AMD EPYC 7601 32-Core
Processor). The system has 128 logical cores in total.
Without the patch, performance can go up to 2.8M IOPS. See the perf top
output below.
7.37% [megaraid_sas] [k] complete_cmd_fusion
2.51% [kernel] [k] copy_user_generic_string
2.48% [kernel] [k] read_tsc
2.10% fio [.] thread_main
2.06% [kernel] [k] gup_pgd_range
1.98% [kernel] [k] __get_user_4
1.92% [kernel] [k] entry_SYSCALL_64
1.58% [kernel] [k] scsi_queue_rq
1.55% [megaraid_sas] [k] megasas_queue_command
1.52% [kernel] [k] irq_entries_start
1.43% fio [.] get_io_u
1.39% [kernel] [k] blkdev_direct_IO
1.34% [kernel] [k] __audit_syscall_exit
1.31% [megaraid_sas] [k] megasas_build_and_issue_cmd_fusion
1.27% [kernel] [k] syscall_slow_exit_work
1.23% [kernel] [k] io_submit_one
1.20% [kernel] [k] do_syscall_64
1.17% fio [.] td_io_queue
1.16% [kernel] [k] lookup_ioctx
1.14% [kernel] [k] kmem_cache_alloc
1.10% [megaraid_sas] [k] megasas_build_ldio_fusion
1.07% [kernel] [k] __memset
1.06% [kernel] [k] __virt_addr_valid
0.98% [kernel] [k] blk_mq_get_request
0.94% [kernel] [k] note_interrupt
0.91% [kernel] [k] __get_user_8
0.91% [kernel] [k] aio_read_events
0.85% [kernel] [k] __put_user_4
0.78% fio [.] fio_libaio_commit
0.74% [megaraid_sas] [k] MR_BuildRaidContext
0.70% [kernel] [k] __x64_sys_io_submit
0.69% fio [.] utime_since_now
With your patch, performance can go up to 1.7M IOPS. See the perf top
output below.
23.01% [kernel] [k] sbitmap_any_bit_set
6.42% [kernel] [k] blk_mq_run_hw_queue
4.44% [megaraid_sas] [k] complete_cmd_fusion
4.23% [kernel] [k] blk_mq_run_hw_queues
1.80% [kernel] [k] read_tsc
1.60% [kernel] [k] copy_user_generic_string
1.33% fio [.] thread_main
1.27% [kernel] [k] irq_entries_start
1.22% [kernel] [k] gup_pgd_range
1.20% [kernel] [k] __get_user_4
1.20% [kernel] [k] entry_SYSCALL_64
1.07% [kernel] [k] scsi_queue_rq
0.88% fio [.] get_io_u
0.87% [megaraid_sas] [k] megasas_queue_command
0.86% [kernel] [k] blkdev_direct_IO
0.85% fio [.] td_io_queue
0.80% [kernel] [k] note_interrupt
0.76% [kernel] [k] lookup_ioctx
0.76% [kernel] [k] do_syscall_64
0.75% [megaraid_sas] [k] megasas_build_and_issue_cmd_fusion
0.74% [megaraid_sas] [k] megasas_build_ldio_fusion
0.72% [kernel] [k] kmem_cache_alloc
0.71% [kernel] [k] __audit_syscall_exit
0.67% [kernel] [k] __virt_addr_valid
0.65% [kernel] [k] blk_mq_get_request
0.64% [kernel] [k] __memset
0.62% [kernel] [k] syscall_slow_exit_work
0.60% [kernel] [k] io_submit_one
0.59% [kernel] [k] ktime_get
0.58% fio [.] fio_libaio_commit
0.57% [kernel] [k] aio_read_events
0.54% [kernel] [k] __get_user_8
0.53% [kernel] [k] aio_complete_rw
0.51% [kernel] [k] kmem_cache_free
With your patch plus the logical CPU count reduced to 64 (via CPU
hotplug), performance can go up to 2.2M IOPS. See the perf top output
below.
9.56% [kernel] [k] sbitmap_any_bit_set
4.62% [megaraid_sas] [k] complete_cmd_fusion
3.02% [kernel] [k] blk_mq_run_hw_queue
2.15% [kernel] [k] copy_user_generic_string
2.13% [kernel] [k] blk_mq_run_hw_queues
2.09% [kernel] [k] read_tsc
1.66% [kernel] [k] __get_user_4
1.59% [kernel] [k] entry_SYSCALL_64
1.57% [kernel] [k] gup_pgd_range
1.55% fio [.] thread_main
1.51% [kernel] [k] scsi_queue_rq
1.31% [kernel] [k] __memset
1.21% [megaraid_sas] [k] megasas_build_and_issue_cmd_fusion
1.16% [megaraid_sas] [k] megasas_queue_command
1.13% fio [.] get_io_u
1.12% [kernel] [k] blk_mq_get_request
1.07% [kernel] [k] blkdev_direct_IO
1.06% [kernel] [k] __put_user_4
1.05% fio [.] td_io_queue
1.02% [kernel] [k] syscall_slow_exit_work
1.00% [megaraid_sas] [k] megasas_build_ldio_fusion
In summary, part of the performance drop appears to be correlated with
the number of hctx created in the block layer; see the sketch below. I
can provide more details and can test a follow-up patch.
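
For context on the sbitmap_any_bit_set() samples dominating the profiles
above: blk_mq_run_hw_queues() walks every hw queue and calls
blk_mq_run_hw_queue(), which tests each hctx's pending-context bitmap via
sbitmap_any_bit_set(). With one hctx per reply queue, every restart scans
nr_hw_queues bitmaps, so the fixed cost grows with the hctx count. The
standalone user-space sketch below models only that scan pattern; it is
illustrative and shares no code with the kernel.

/*
 * Hypothetical user-space model (not kernel code) of the restart scan:
 * one bitmap test per hw queue per restart, so total scans grow
 * linearly with nr_hw_queues even when only one queue has work.
 */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define CTX_WORDS 2			/* words in a per-hctx ctx bitmap */

struct hw_ctx {
	unsigned long ctx_map[CTX_WORDS];	/* pending sw contexts */
};

/* Stand-in for sbitmap_any_bit_set() on one hctx. */
static bool hctx_has_pending(const struct hw_ctx *hctx)
{
	for (int i = 0; i < CTX_WORDS; i++)
		if (hctx->ctx_map[i])
			return true;
	return false;
}

/* Stand-in for blk_mq_run_hw_queues(): scan every hw queue. */
static unsigned long run_hw_queues(const struct hw_ctx *hctxs,
				   unsigned int nr_hw_queues,
				   unsigned long *nr_dispatched)
{
	unsigned long scans = 0;

	for (unsigned int i = 0; i < nr_hw_queues; i++) {
		scans++;			/* one bitmap test */
		if (hctx_has_pending(&hctxs[i]))
			(*nr_dispatched)++;	/* kernel would run this hctx */
	}
	return scans;
}

int main(void)
{
	const unsigned int counts[] = { 64, 128 };	/* the two test configs */

	for (int c = 0; c < 2; c++) {
		unsigned int nr = counts[c];
		struct hw_ctx *hctxs = calloc(nr, sizeof(*hctxs));
		unsigned long scans = 0, dispatched = 0;

		if (!hctxs)
			return 1;
		hctxs[0].ctx_map[0] = 1;	/* only one queue has work */
		for (int iter = 0; iter < 1000000; iter++)
			scans += run_hw_queues(hctxs, nr, &dispatched);
		printf("nr_hw_queues=%3u: %lu bitmap scans for %lu dispatches\n",
		       nr, scans, dispatched);
		free(hctxs);
	}
	return 0;
}

Built with a plain cc, the 128-queue case does twice the bitmap scans of
the 64-queue case for the same dispatch count, which matches the
direction of the IOPS numbers above.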
Kashyap
>
> Kashyap
>
> >
> > diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
> > index 3d6780504dcb..69d6bffcc8ff 100644
> > --- a/block/blk-mq-debugfs.c
> > +++ b/block/blk-mq-debugfs.c
> > @@ -627,6 +627,9 @@ static int hctx_active_show(void *data, struct seq_file *m)
> > {
> > struct blk_mq_hw_ctx *hctx = data;
> >
> > + if (hctx->flags & BLK_MQ_F_HOST_TAGS)
> > + hctx = blk_mq_master_hctx(hctx);
> > +
> > seq_printf(m, "%d\n", atomic_read(&hctx->nr_active));
> > return 0;
> > }
> > diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> > index 309ec5079f3f..58ef83a34fda 100644
> > --- a/block/blk-mq-tag.c
> > +++ b/block/blk-mq-tag.c
> > @@ -30,6 +30,9 @@ bool blk_mq_has_free_tags(struct blk_mq_tags *tags)
> > */
> > bool __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
> > {
> > + if (hctx->flags & BLK_MQ_F_HOST_TAGS)
> > + hctx = blk_mq_master_hctx(hctx);
> > +
> > if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state) &&
> > !test_and_set_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
> > atomic_inc(&hctx->tags->active_queues);
> > @@ -55,6 +58,9 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
> > {
> > struct blk_mq_tags *tags = hctx->tags;
> >
> > + if (hctx->flags & BLK_MQ_F_HOST_TAGS)
> > + hctx = blk_mq_master_hctx(hctx);
> > +
> > if (!test_and_clear_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
> > return;
> >
> > @@ -74,6 +80,10 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
> >
> > if (!hctx || !(hctx->flags & BLK_MQ_F_TAG_SHARED))
> > return true;
> > +
> > + if (hctx->flags & BLK_MQ_F_HOST_TAGS)
> > + hctx = blk_mq_master_hctx(hctx);
> > +
> > if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
> > return true;
> >
> > diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
> > index 61deab0b5a5a..84e9b46ffc78 100644
> > --- a/block/blk-mq-tag.h
> > +++ b/block/blk-mq-tag.h
> > @@ -36,11 +36,22 @@ extern void blk_mq_tag_wakeup_all(struct blk_mq_tags *tags, bool);
> > void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_iter_fn *fn,
> > 		void *priv);
> >
> > +static inline struct blk_mq_hw_ctx *blk_mq_master_hctx(
> > + struct blk_mq_hw_ctx *hctx)
> > +{
> > + return hctx->queue->queue_hw_ctx[0];
> > +}
> > +
> > +
> > static inline struct sbq_wait_state *bt_wait_ptr(struct sbitmap_queue *bt,
> > 					struct blk_mq_hw_ctx *hctx)
> > {
> > if (!hctx)
> > return &bt->ws[0];
> > +
> > + if (hctx->flags & BLK_MQ_F_HOST_TAGS)
> > + hctx = blk_mq_master_hctx(hctx);
> > +
> > return sbq_wait_ptr(bt, &hctx->wait_index);
> > }
> >
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index 49d73d979cb3..4196ed3b0085 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -303,7 +303,7 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
> > } else {
> > if (data->hctx->flags & BLK_MQ_F_TAG_SHARED) {
> > rq_flags = RQF_MQ_INFLIGHT;
> > - atomic_inc(&data->hctx->nr_active);
> > + blk_mq_inc_nr_active(data->hctx);
> > }
> > rq->tag = tag;
> > rq->internal_tag = -1;
> > @@ -517,7 +517,7 @@ void blk_mq_free_request(struct request *rq)
> >
> > ctx->rq_completed[rq_is_sync(rq)]++;
> > if (rq->rq_flags & RQF_MQ_INFLIGHT)
> > - atomic_dec(&hctx->nr_active);
> > + blk_mq_dec_nr_active(hctx);
> >
> > if (unlikely(laptop_mode && !blk_rq_is_passthrough(rq)))
> > laptop_io_completion(q->backing_dev_info);
> > @@ -1064,7 +1064,7 @@ bool blk_mq_get_driver_tag(struct request *rq)
> > if (rq->tag >= 0) {
> > if (shared) {
> > rq->rq_flags |= RQF_MQ_INFLIGHT;
> > - atomic_inc(&data.hctx->nr_active);
> > + blk_mq_inc_nr_active(data.hctx);
> > }
> > data.hctx->tags->rqs[rq->tag] = rq;
> > }
> > diff --git a/block/blk-mq.h b/block/blk-mq.h
> > index 633a5a77ee8b..f1279b8c2289 100644
> > --- a/block/blk-mq.h
> > +++ b/block/blk-mq.h
> > @@ -193,6 +193,20 @@ unsigned int blk_mq_in_flight(struct request_queue *q, struct hd_struct *part);
> > void blk_mq_in_flight_rw(struct request_queue *q, struct hd_struct *part,
> > 			unsigned int inflight[2]);
> >
> > +static inline void blk_mq_inc_nr_active(struct blk_mq_hw_ctx *hctx)
> > +{
> > + if (hctx->flags & BLK_MQ_F_HOST_TAGS)
> > + hctx = blk_mq_master_hctx(hctx);
> > + atomic_inc(&hctx->nr_active);
> > +}
> > +
> > +static inline void blk_mq_dec_nr_active(struct blk_mq_hw_ctx *hctx)
> > +{
> > + if (hctx->flags & BLK_MQ_F_HOST_TAGS)
> > + hctx = blk_mq_master_hctx(hctx);
> > + atomic_dec(&hctx->nr_active);
> > +}
> > +
> > static inline void blk_mq_put_dispatch_budget(struct blk_mq_hw_ctx *hctx)
> > {
> > struct request_queue *q = hctx->queue;
> > @@ -218,7 +232,7 @@ static inline void __blk_mq_put_driver_tag(struct blk_mq_hw_ctx *hctx,
> >
> > if (rq->rq_flags & RQF_MQ_INFLIGHT) {
> > rq->rq_flags &= ~RQF_MQ_INFLIGHT;
> > - atomic_dec(&hctx->nr_active);
> > + blk_mq_dec_nr_active(hctx);
> > }
> > }
> >
> > Thanks,
> > Ming
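
As an aside on the mechanism in the quoted patch: whenever
BLK_MQ_F_HOST_TAGS is set, the per-hctx shared-tag state (the nr_active
counter, the TAG_ACTIVE bit, the sbitmap wait index) is redirected to hw
queue 0 through blk_mq_master_hctx(). The minimal user-space sketch below
mimics only the nr_active redirection; the names echo the patch, but this
is illustrative code, not the kernel implementation.

/*
 * Hypothetical sketch of the master-hctx redirection in the patch:
 * per-queue active counters funnel into hw queue 0 so the shared-tag
 * fairness logic sees one hostwide count.
 */
#include <stdatomic.h>
#include <stdio.h>

#define F_HOST_TAGS (1u << 0)		/* stands in for BLK_MQ_F_HOST_TAGS */

struct hw_ctx {
	unsigned int flags;
	struct hw_ctx *master;		/* stands in for queue_hw_ctx[0] */
	atomic_int nr_active;
};

/* Analogue of blk_mq_master_hctx(). */
static struct hw_ctx *master_hctx(struct hw_ctx *hctx)
{
	return hctx->master;
}

/* Analogue of blk_mq_inc_nr_active() from the patch. */
static void inc_nr_active(struct hw_ctx *hctx)
{
	if (hctx->flags & F_HOST_TAGS)
		hctx = master_hctx(hctx);
	atomic_fetch_add(&hctx->nr_active, 1);
}

/* Analogue of blk_mq_dec_nr_active() from the patch. */
static void dec_nr_active(struct hw_ctx *hctx)
{
	if (hctx->flags & F_HOST_TAGS)
		hctx = master_hctx(hctx);
	atomic_fetch_sub(&hctx->nr_active, 1);
}

int main(void)
{
	struct hw_ctx hctx0 = { .flags = F_HOST_TAGS };
	struct hw_ctx hctx1 = { .flags = F_HOST_TAGS };

	hctx0.master = &hctx0;
	hctx1.master = &hctx0;

	/* Requests issued on either hw queue count against hctx0. */
	inc_nr_active(&hctx0);
	inc_nr_active(&hctx1);
	printf("hostwide nr_active = %d\n", atomic_load(&hctx0.nr_active));
	dec_nr_active(&hctx1);
	dec_nr_active(&hctx0);
	return 0;
}

One trade-off the sketch makes visible: with many submitting CPUs, every
increment now lands on hw queue 0's counter, so hostwide accounting
concentrates cache-line traffic that per-queue counters would have spread
out.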