From: snitzer@redhat.com (Mike Snitzer)
Subject: dm-multipath low performance with blk-mq
Date: Thu, 4 Feb 2016 08:54:20 -0500
Message-ID: <20160204135420.GA18227@redhat.com>
In-Reply-To: <56B2F5BC.1010700@suse.de>
On Thu, Feb 04 2016 at  1:54am -0500,
Hannes Reinecke <hare@suse.de> wrote:
> On 02/03/2016 07:24 PM, Mike Snitzer wrote:
> > On Wed, Feb 03 2016 at  1:04pm -0500,
> > Mike Snitzer <snitzer@redhat.com> wrote:
> >  
> >> I'm still not clear on where the considerable performance loss is coming
> >> from (on null_blk device I see ~1900K read IOPs but I'm still only
> >> seeing ~1000K read IOPs when blk-mq DM-multipath is layered ontop).
> >> What is very much apparent is: layering dm-mq multipath ontop of null_blk
> >> results in a HUGE amount of additional context switches.  I can only
> >> infer that the request completion for this stacked device (blk-mq queue
> >> ontop of blk-mq queue, with 2 completions: 1 for clone completing on
> >> underlying device and 1 for original request completing) is the reason
> >> for all the extra context switches.
> > 
> > Starts to explain, certainly not the "reason"; that is still very much
> > TBD...
> > 
> >> Here are pictures of 'perf report' for perf data collected using
> >> 'perf record -ag -e cs'.
> >>
> >> Against null_blk:
> >> http://people.redhat.com/msnitzer/perf-report-cs-null_blk.png
> > 
> > if dm-mq nr_hw_queues=1 and null_blk nr_hw_queues=1
> >   cpu          : usr=25.53%, sys=74.40%, ctx=1970, majf=0, minf=474
> > if dm-mq nr_hw_queues=1 and null_blk nr_hw_queues=4
> >   cpu          : usr=26.79%, sys=73.15%, ctx=2067, majf=0, minf=479
> > 
> >> Against dm-mpath ontop of the same null_blk:
> >> http://people.redhat.com/msnitzer/perf-report-cs-dm_mq.png
> > 
> > if dm-mq nr_hw_queues=1 and null_blk nr_hw_queues=1
> >   cpu          : usr=11.07%, sys=33.90%, ctx=667784, majf=0, minf=466
> > if dm-mq nr_hw_queues=1 and null_blk nr_hw_queues=4
> >   cpu          : usr=15.22%, sys=48.44%, ctx=2314901, majf=0, minf=466
> > 
> > So yeah, the percentages reflected in these respective images didn't do
> > the huge increase in context switches justice... we _must_ figure out
> > why we're seeing so many context switches with dm-mq.
> > 
> Well, the most obvious one being that you're using 1 dm-mq queue vs
> 4 null_blk queues.
> So you will have to do an additional context switch for 75% of
> the total I/Os submitted.
Right, that case is certainly prone to more context switches.  But I'm
initially most concerned about the case where both have only 1 queue.
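To put a number on the "75%" estimate above, here is a toy model in Python.
It is only a sketch of the reasoning, not of how blk-mq actually maps
queues: the function name and the assumption that I/O spreads evenly over
the queues, with only min(dm, lower) of them lining up one-to-one, are
mine.

```python
def cross_queue_fraction(dm_queues: int, lower_queues: int) -> float:
    """Fraction of I/Os that cross to a non-aligned hw queue.

    Toy assumption (not taken from the kernel): I/O is spread evenly
    over the larger set of queues, and only min(dm, lower) of those
    queues line up one-to-one with a queue in the other layer.
    """
    if dm_queues < 1 or lower_queues < 1:
        raise ValueError("queue counts must be >= 1")
    aligned = min(dm_queues, lower_queues)
    return 1.0 - aligned / max(dm_queues, lower_queues)


# 1 dm-mq queue on top of 4 null_blk queues
print(cross_queue_fraction(1, 4))
# 1 dm-mq queue on top of 1 null_blk queue
print(cross_queue_fraction(1, 1))
```

Under this model the 1-on-4 case crosses queues for three quarters of the
I/Os, while the 1-on-1 case should cross for none.  The measured ctx count
for 1-on-1 is still 667784, so queue misalignment alone cannot account for
it.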
> Have you tested with 4 dm-mq hw queues?
Yes, it makes performance worse.  This is likely rooted in the dm-mpath IO
path not being lockless.  But I'm also concerned about whether the
clone, sent to the underlying path, is completing on a different CPU
than dm-mq's original request.
I'll be using ftrace to try to dig into the various aspects of this
(perf, as I know how to use it, isn't giving me enough precision in its
reporting).
> To avoid context switches we would have to align the dm-mq queues to
> the underlying blk-mq layout for the paths.
Right, we need to take more care (how remains TBD).  But for now I'm
just going to focus on the case where both dm-mq and null_blk have
nr_hw_queues=1.  As you can see, even in that config the number of context
switches goes from 1970 to 667784 (and system CPU utilization drops from
74.40% to 33.90%) once dm-mq w/ 1 hw_queue is stacked on top of the
null_blk device.
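For anyone who wants to compare runs without eyeballing the numbers, a
small Python sketch that pulls usr/sys/ctx out of the "cpu :" summary lines
quoted earlier (the format matches fio's output) and computes the ratio.
The regex and helper name are my own; the two input lines are copied from
the nr_hw_queues=1 results above.

```python
import re

# Matches the usr/sys/ctx fields of a "cpu : usr=..%, sys=..%, ctx=N" line.
CPU_LINE = re.compile(r"usr=([\d.]+)%, sys=([\d.]+)%, ctx=(\d+)")


def parse_cpu(line: str) -> dict:
    """Return usr%, sys% and context-switch count from one summary line."""
    m = CPU_LINE.search(line)
    if m is None:
        raise ValueError(f"not a cpu summary line: {line!r}")
    return {"usr": float(m.group(1)),
            "sys": float(m.group(2)),
            "ctx": int(m.group(3))}


# Both runs use nr_hw_queues=1 for dm-mq and null_blk.
null_blk = parse_cpu(
    "cpu          : usr=25.53%, sys=74.40%, ctx=1970, majf=0, minf=474")
dm_mq = parse_cpu(
    "cpu          : usr=11.07%, sys=33.90%, ctx=667784, majf=0, minf=466")

ratio = dm_mq["ctx"] / null_blk["ctx"]
print(f"context switches: {null_blk['ctx']} -> {dm_mq['ctx']} ({ratio:.0f}x)")
print(f"sys CPU: {null_blk['sys']:.2f}% -> {dm_mq['sys']:.2f}%")
```

That works out to roughly a 339x increase in context switches for the
single-queue stacked case.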
Once we understand the source of all the additional context switching
for this simpler stacked configuration we can look closer at
scaling as we add more underlying paths.
> And we need to look at making the main submission path lockless;
> I was wondering if we really need to take the lock if we don't
> switch priority groups; maybe we can establish a similar algorithm
> blk-mq does; if we were to have a queue per valid path in any given
> priority group we should be able to run lockless and only take the
> lock if we need to switch priority groups.
I'd like to explore this further with you once I come back up from this
frustrating deep dive on "what is causing all these context switches!?"
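As a starting point for that discussion, here is a structural sketch in
Python of the "only take the lock when switching priority groups" idea.
It is not dm-mpath code and Python cannot demonstrate real lock-free
behavior; every class and method name is hypothetical.  It only shows
where the lock would and would not be taken: path selection inside the
current group reads a snapshot of the path list, and the lock is reserved
for the group switch.

```python
import itertools
import threading


class PriorityGroup:
    def __init__(self, paths):
        self.paths = list(paths)         # replaced wholesale, never mutated
        self._next = itertools.count()   # round-robin cursor

    def pick_path(self):
        usable = self.paths              # snapshot; no lock on the fast path
        if not usable:
            return None
        return usable[next(self._next) % len(usable)]


class Multipath:
    def __init__(self, groups):
        self.groups = list(groups)
        self.current = self.groups[0]
        self.lock = threading.Lock()
        self.lock_acquisitions = 0       # instrumentation for this sketch

    def map_io(self):
        path = self.current.pick_path()  # fast path: no lock taken
        if path is not None:
            return path
        return self._switch_group()      # slow path: current group is empty

    def fail_path(self, path):
        # Copy-on-write update so readers holding the old list stay valid.
        for pg in self.groups:
            pg.paths = [p for p in pg.paths if p != path]

    def _switch_group(self):
        with self.lock:
            self.lock_acquisitions += 1
            for pg in self.groups:
                path = pg.pick_path()
                if path is not None:
                    self.current = pg
                    return path
            return None                  # no usable path in any group


mp = Multipath([PriorityGroup(["sda", "sdb"]), PriorityGroup(["sdc"])])
picks = [mp.map_io() for _ in range(4)]
mp.fail_path("sda")
mp.fail_path("sdb")
after_failover = mp.map_io()
print(picks, after_failover, mp.lock_acquisitions)
```

In this sketch the lock is taken once, at failover to the second group,
and never during round-robin selection within a group.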
 
> But anyway, I'll be looking at your patches.
Thanks.  Sadly, none of the patches are going to fix the performance
problems, but I do think they are a step forward.