From: Sagi Grimberg <sagig@dev.mellanox.co.il>
To: Mike Snitzer <snitzer@redhat.com>
Cc: "axboe@kernel.dk" <axboe@kernel.dk>,
Christoph Hellwig <hch@infradead.org>,
"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
"keith.busch@intel.com" <keith.busch@intel.com>,
device-mapper development <dm-devel@redhat.com>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
Bart Van Assche <bart.vanassche@sandisk.com>
Subject: Re: [RFC PATCH] dm: fix excessive dm-mq context switching
Date: Mon, 8 Feb 2016 14:21:59 +0200
Message-ID: <56B88867.4060308@dev.mellanox.co.il>
In-Reply-To: <20160207172055.GA6477@redhat.com>
>> The perf report is very similar to the one that started this effort..
>>
>> I'm afraid we'll need to resolve the per-target m->lock in order
>> to scale with NUMA...
>
> Could be. Just for testing, you can try the 2 topmost commits I've put
> here (once applied both __multipath_map and multipath_busy won't have
> _any_ locking.. again, very much test-only):
>
> http://git.kernel.org/cgit/linux/kernel/git/snitzer/linux.git/log/?h=devel2
Hi Mike,
So I still don't see the IOPS scale as I expected. With these two
patches applied I see ~670K IOPS, and the perf output is now different:
it no longer indicates any clear lock contention.
--
- 4.67% fio [kernel.kallsyms] [k] blk_account_io_start
- blk_account_io_start
- 56.05% blk_insert_cloned_request
map_request
dm_mq_queue_rq
__blk_mq_run_hw_queue
blk_mq_run_hw_queue
blk_mq_insert_requests
blk_mq_flush_plug_list
blk_flush_plug_list
blk_finish_plug
do_io_submit
SyS_io_submit
entry_SYSCALL_64_fastpath
+ io_submit
- 43.94% blk_mq_bio_to_request
blk_mq_make_request
generic_make_request
submit_bio
do_blockdev_direct_IO
__blockdev_direct_IO
blkdev_direct_IO
generic_file_read_iter
blkdev_read_iter
aio_run_iocb
io_submit_one
do_io_submit
SyS_io_submit
entry_SYSCALL_64_fastpath
+ io_submit
- 2.52% fio [dm_mod] [k] dm_mq_queue_rq
- dm_mq_queue_rq
- 99.16% __blk_mq_run_hw_queue
blk_mq_run_hw_queue
blk_mq_insert_requests
blk_mq_flush_plug_list
blk_flush_plug_list
blk_finish_plug
do_io_submit
SyS_io_submit
entry_SYSCALL_64_fastpath
+ io_submit
+ 0.84% blk_mq_run_hw_queue
- 2.46% fio [kernel.kallsyms] [k] blk_mq_hctx_mark_pending
- blk_mq_hctx_mark_pending
- 99.79% blk_mq_insert_requests
blk_mq_flush_plug_list
blk_flush_plug_list
blk_finish_plug
do_io_submit
SyS_io_submit
entry_SYSCALL_64_fastpath
+ io_submit
- 2.07% ksoftirqd/6 [kernel.kallsyms] [k] blk_mq_run_hw_queues
- blk_mq_run_hw_queues
- 99.70% rq_completed
dm_done
dm_softirq_done
blk_done_softirq
+ __do_softirq
+ 2.06% ksoftirqd/0 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 2.02% ksoftirqd/9 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 2.00% ksoftirqd/20 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 2.00% ksoftirqd/12 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.99% ksoftirqd/11 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.97% ksoftirqd/18 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.96% ksoftirqd/1 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.95% ksoftirqd/14 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.95% ksoftirqd/13 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.94% ksoftirqd/5 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.94% ksoftirqd/8 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.93% ksoftirqd/2 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.92% ksoftirqd/21 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.92% ksoftirqd/17 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.92% ksoftirqd/7 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.91% ksoftirqd/23 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.84% ksoftirqd/4 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.81% ksoftirqd/19 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.76% ksoftirqd/3 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.76% ksoftirqd/16 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.75% ksoftirqd/15 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.74% ksoftirqd/22 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.72% ksoftirqd/10 [kernel.kallsyms] [k] blk_mq_run_hw_queues
+ 1.38% perf [kernel.kallsyms] [k] copy_user_generic_string
+ 1.20% fio [kernel.kallsyms] [k] enqueue_task_fair
+ 1.18% fio [kernel.kallsyms] [k] part_round_stats
+ 1.08% fio [kernel.kallsyms] [k] enqueue_entity
+ 1.07% fio [kernel.kallsyms] [k] _raw_spin_lock
+ 1.02% fio [kernel.kallsyms] [k] __blk_mq_run_hw_queue
+ 0.79% fio [dm_multipath] [k] multipath_busy
+ 0.57% fio [kernel.kallsyms] [k] insert_work
+ 0.54% fio [kernel.kallsyms] [k] blk_flush_plug_list
--
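A flat profile like the one above is easier to read once the per-thread
entries are added up per symbol. The following is only a sketch of that
aggregation, not part of the original report: the function name
overhead_by_symbol, the regular expression, and the three-entry SAMPLE are
illustrative assumptions, and it presumes the percentages are self overhead
(no --children), so that they can be summed. Applied to the 24 ksoftirqd
entries listed above, the blk_mq_run_hw_queues total should come to roughly
45.8%.

```python
import re
from collections import defaultdict

# Matches a top-level perf entry such as
#   "+ 2.06% ksoftirqd/0 [kernel.kallsyms] [k] blk_mq_run_hw_queues"
# \s+ also spans a newline, so entries whose symbol was wrapped onto the
# next line by the mail client still match. Nested call-chain lines have
# no "[dso] [k]" part and are therefore skipped.
ENTRY = re.compile(
    r"^[+-]\s+(\d+\.\d+)%\s+(\S+)\s+\[[^\]]+\]\s+\[k\]\s+(\S+)",
    re.MULTILINE,
)


def overhead_by_symbol(report, comm_prefix=""):
    """Sum overhead percentages per symbol for commands starting with comm_prefix."""
    totals = defaultdict(float)
    for pct, comm, symbol in ENTRY.findall(report):
        if comm.startswith(comm_prefix):
            totals[symbol] += float(pct)
    return dict(totals)


# Hypothetical excerpt: three entries copied from the report above.
SAMPLE = """\
- 2.07% ksoftirqd/6 [kernel.kallsyms] [k]
blk_mq_run_hw_queues
+ 2.06% ksoftirqd/0 [kernel.kallsyms] [k]
blk_mq_run_hw_queues
+ 1.20% fio [kernel.kallsyms] [k] enqueue_task_fair
"""

if __name__ == "__main__":
    for symbol, pct in overhead_by_symbol(SAMPLE, "ksoftirqd/").items():
        print(f"{pct:6.2f}%  {symbol}")
```

Passing an empty prefix aggregates over all commands, which helps when
comparing the fio submission path against the softirq completion path.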