From: Hannes Reinecke <hare@suse.de>
To: Mike Snitzer <snitzer@redhat.com>
Cc: axboe@kernel.dk, "keith.busch@intel.com" <keith.busch@intel.com>,
	Sagi Grimberg <sagig@dev.mellanox.co.il>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	Christoph Hellwig <hch@infradead.org>,
	device-mapper development <dm-devel@redhat.com>,
	linux-block@vger.kernel.org,
	Bart Van Assche <bart.vanassche@sandisk.com>
Subject: Re: dm-multipath low performance with blk-mq
Date: Mon, 1 Feb 2016 07:46:59 +0100
Message-ID: <56AEFF63.7050606@suse.de>
In-Reply-To: <20160130191238.GA18686@redhat.com>

On 01/30/2016 08:12 PM, Mike Snitzer wrote:
> On Sat, Jan 30 2016 at  3:52am -0500,
> Hannes Reinecke <hare@suse.de> wrote:
> 
>> On 01/30/2016 12:35 AM, Mike Snitzer wrote:
>>>
>>> Your test above is prone to exhaust the dm-mpath blk-mq tags (128)
>>> because 24 threads * 32 easily exceeds 128 (by a factor of 6).
>>>
>>> I found that we were context switching (via bt_get's io_schedule)
>>> waiting for tags to become available.
>>>
>>> This is embarrassing but, until Jens told me today, I was oblivious to
>>> the fact that the number of blk-mq's tags per hw_queue was defined by
>>> tag_set.queue_depth.
>>>
>>> Previously request-based DM's blk-mq support had:
>>> md->tag_set.queue_depth = BLKDEV_MAX_RQ; (again: 128)
>>>
>>> Now I have a patch that allows tuning queue_depth via dm_mod module
>>> parameter.  And I'll likely bump the default to 4096 or something (doing
>>> so eliminated blocking in bt_get).
>>>
>>> But eliminating the tags bottleneck only raised my read IOPs from ~600K
>>> to ~800K (using 1 hw_queue for both null_blk and dm-mpath).
>>>
>>> When I raise nr_hw_queues to 4 for null_blk (keeping dm-mq at 1) I see a
>>> whole lot more context switching due to request-based DM's use of
>>> ksoftirqd (and kworkers) for request completion.
>>>
>>> So I'm moving on to optimizing the completion path.  But at least some
>>> progress was made, more to come...
>>>
>>
>> Would you mind sharing your patches?
> 
> I'm still working through this.  I'll hopefully have a handful of
> RFC-level changes by end of day Monday.  But could take longer.
> 
> One change that I already shared in a previous mail is:
> http://git.kernel.org/cgit/linux/kernel/git/snitzer/linux.git/commit/?h=devel2&id=99ebcaf36d9d1fa3acec98492c36664d57ba8fbd
> 
>> We're currently doing tests with a high-performance FC setup
>> (16G FC with all-flash storage), and are still 20% short of the
>> announced backend performance.
>>
>> Just as a side note: we're currently getting 550k IOPs.
>> With unpatched dm-mpath.
> 
> What is your test workload?  If you can share I'll be sure to factor it
> into my testing.
> 
That's a plain random read via fio, using 8 LUNs on the target.

>> So nearly on par with your null_blk setup, but with real hardware.
>> (Which in itself is pretty cool. You should get faster RAM :-)
> 
> You've misunderstood what I said my null_blk (RAM) performance is.
> 
> My null_blk test gets ~1900K read IOPs.  But dm-mpath on top only gets
> between 600K and 1000K IOPs depending on $FIO_QUEUE_DEPTH and if I
> use multiple $NULL_BLK_HW_QUEUES.
> 
Right.
We're using two 16G FC links, each talking to 4 LUNs.
With dm-mpath on top. The FC HBAs have a hardware queue depth
of roughly 2000, so we might need to tweak the queue depth of the
multipath devices, too.
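
As a rough illustration of the tag arithmetic quoted above (a sketch
only: the function names are made up for this example, and the 2000
figure is the approximate HBA queue depth mentioned here, not a
measured value):

```python
def outstanding_ios(threads: int, iodepth: int) -> int:
    """Upper bound on I/Os a fio run can keep in flight."""
    return threads * iodepth


def oversubscription(threads: int, iodepth: int, tag_depth: int) -> float:
    """How many times the in-flight I/O bound exceeds the available tags."""
    return outstanding_ios(threads, iodepth) / tag_depth


# Figures from the quoted test: 24 threads, queue depth 32, 128 dm-mq tags.
print(outstanding_ios(24, 32))        # 768
print(oversubscription(24, 32, 128))  # 6.0

# Approximate HBA queue depth (~2000) relative to the 128-tag default.
print(2000 / 128)                     # 15.625
```

So with the old default of 128 tags the multipath device exposes
only a small fraction of what the HBAs underneath could queue.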


I'll have a look at your patches.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
