From: Kashyap Desai <kashyap.desai@broadcom.com>
To: John Garry <john.garry@huawei.com>,
axboe@kernel.dk, jejb@linux.ibm.com, martin.petersen@oracle.com,
ming.lei@redhat.com, bvanassche@acm.org, hare@suse.de,
don.brace@microsemi.com, Sumit Saxena <sumit.saxena@broadcom.com>,
hch@infradead.org,
Shivasharan Srikanteshwara
<shivasharan.srikanteshwara@broadcom.com>
Cc: chenxiang66@hisilicon.com, linux-block@vger.kernel.org,
linux-scsi@vger.kernel.org, esc.storagedev@microsemi.com,
Hannes Reinecke <hare@suse.com>
Subject: RE: [PATCH RFC v6 08/10] megaraid_sas: switch fusion adapters to MQ
Date: Tue, 7 Apr 2020 16:44:42 +0530
Message-ID: <a1f0399e2e85b2244a9ae40e4a2f1089@mail.gmail.com>
In-Reply-To: <1583409280-158604-9-git-send-email-john.garry@huawei.com>
> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> @@ -373,24 +373,24 @@ megasas_get_msix_index(struct megasas_instance *instance,
> {
> int sdev_busy;
>
> - /* nr_hw_queue = 1 for MegaRAID */
> - struct blk_mq_hw_ctx *hctx =
> - scmd->device->request_queue->queue_hw_ctx[0];
> + struct blk_mq_hw_ctx *hctx = scmd->request->mq_hctx;
Hi John,
There is an outstanding patch which will eventually remove device_busy
from the sdev. Once that lands, we may have to track per-scsi-device
outstanding commands within the driver to keep this interface working.
For my testing I used the line below, since the interface is still
available:

sdev_busy = atomic_read(&scmd->device->device_busy);
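
Once device_busy is gone, the driver could keep its own counter. A
minimal sketch of that idea, assuming a hypothetical per-device private
struct (mega_sdev_priv and the helpers below are made up for
illustration, not existing megaraid_sas code):

#include <linux/atomic.h>
#include <scsi/scsi_cmnd.h>
#include <scsi/scsi_device.h>

/* Hypothetical per-device driver data, e.g. hung off sdev->hostdata
 * from the driver's slave_alloc() callback. */
struct mega_sdev_priv {
	atomic_t outstanding;	/* commands issued but not yet completed */
};

/* Submission path (e.g. queuecommand): bump the per-device count */
static inline int mega_sdev_busy_inc(struct scsi_cmnd *scmd)
{
	struct mega_sdev_priv *priv = scmd->device->hostdata;

	return atomic_inc_return(&priv->outstanding);
}

/* Completion path: drop the count */
static inline void mega_sdev_busy_dec(struct scsi_cmnd *scmd)
{
	struct mega_sdev_priv *priv = scmd->device->hostdata;

	atomic_dec(&priv->outstanding);
}

megasas_get_msix_index() could then read atomic_read(&priv->outstanding)
instead of device_busy.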
We have done some testing to measure the performance impact on SAS SSD
and HDD setups. Here are my findings.

My test setup: two-socket Intel Skylake/Lewisburg/Purley.
Output of numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 36 37 38 39 40 41
42 43 44 45 46 47 48 49 50 51 52 53
node 0 size: 31820 MB
node 0 free: 21958 MB
node 1 cpus: 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 54 55
56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
node 1 size: 32247 MB
node 1 free: 21068 MB
node distances:
node 0 1
0: 10 21
1: 21 10
64 HDD setup -

With higher QD and io scheduler = mq-deadline, shared host tags do not
scale well. If I use io scheduler = none, I see a consistent 2.0M IOPS.
This issue is seen only with the RFC; without the RFC, mq-deadline
scales up to 2.0M IOPS.
Perf top result with the RFC (1.4M IOPS) -
78.20% [kernel] [k] native_queued_spin_lock_slowpath
1.46% [kernel] [k] sbitmap_any_bit_set
1.14% [kernel] [k] blk_mq_run_hw_queue
0.90% [kernel] [k] _mix_pool_bytes
0.63% [kernel] [k] _raw_spin_lock
0.57% [kernel] [k] blk_mq_run_hw_queues
0.56% [megaraid_sas] [k] complete_cmd_fusion
0.54% [megaraid_sas] [k] megasas_build_and_issue_cmd_fusion
0.50% [kernel] [k] dd_has_work
0.38% [kernel] [k] _raw_spin_lock_irqsave
0.36% [kernel] [k] gup_pgd_range
0.35% [megaraid_sas] [k] megasas_build_ldio_fusion
0.31% [kernel] [k] io_submit_one
0.29% [kernel] [k] hctx_lock
0.26% [kernel] [k] try_to_grab_pending
0.24% [kernel] [k] scsi_queue_rq
0.22% fio [.] __fio_gettime
0.22% [kernel] [k] insert_work
0.20% [kernel] [k] native_irq_return_iret
Perf top without the RFC driver (2.0M IOPS) -
58.40% [kernel] [k] native_queued_spin_lock_slowpath
2.06% [kernel] [k] _mix_pool_bytes
1.38% [kernel] [k] _raw_spin_lock_irqsave
0.97% [kernel] [k] _raw_spin_lock
0.91% [kernel] [k] scsi_queue_rq
0.82% [kernel] [k] __sbq_wake_up
0.77% [kernel] [k] _raw_spin_unlock_irqrestore
0.74% [kernel] [k] scsi_mq_get_budget
0.61% [kernel] [k] gup_pgd_range
0.58% [kernel] [k] aio_complete_rw
0.52% [kernel] [k] elv_rb_add
0.50% [kernel] [k] llist_add_batch
0.50% [kernel] [k] native_irq_return_iret
0.48% [kernel] [k] blk_rq_map_sg
0.48% fio [.] __fio_gettime
0.47% [kernel] [k] blk_mq_get_tag
0.44% [kernel] [k] blk_mq_dispatch_rq_list
0.40% fio [.] io_u_queued_complete
0.39% fio [.] get_io_u
If you want me to test any follow-up patch, please let me know. BTW, we
also want to provide a module parameter so that users can switch back to
the older nr_hw_queue = 1 mode. I will work on that part.
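
As a rough sketch of what that could look like (the parameter name
host_tagset_enable and the helper below are hypothetical, not a
committed interface):

#include <linux/module.h>
#include <linux/moduleparam.h>
#include <scsi/scsi_host.h>

/* Hypothetical knob: 1 = shared host tagset MQ (this RFC), 0 = legacy */
static int host_tagset_enable = 1;
module_param(host_tagset_enable, int, 0444);
MODULE_PARM_DESC(host_tagset_enable,
		 "Shared host tagset enable/disable (default: enabled)");

/* During probe, before scsi_add_host(): pick the queue mode */
static void megasas_set_queue_mode(struct Scsi_Host *shost, int msix_vectors)
{
	if (host_tagset_enable && msix_vectors > 1) {
		/* one hw queue per MSI-x vector, tags shared host-wide */
		shost->nr_hw_queues = msix_vectors;
	} else {
		/* fall back to the older nr_hw_queue = 1 behaviour */
		shost->nr_hw_queues = 1;
	}
}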
24 SSD setup -

Performance with and without the RFC is almost the same. There is one
specific drop, but that is a generic kernel issue, not related to the
RFC; we can discuss it separately. The 5.6 kernel does not scale well
when the application keeps heavy outstanding I/O.

Example - with the 24 SSD setup, BS = 8K and QD = 128 gives 1.73M IOPS,
which is the h/w max, but at QD = 256 it gives 1.4M IOPS. It looks like
there is some overhead in finding free tags at the sdev or shost level,
which leads to the drop in IOPS.
Kashyap