dm-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@redhat.com>
To: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@infradead.org>,
	Sagi Grimberg <sagig@dev.mellanox.co.il>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	"keith.busch@intel.com" <keith.busch@intel.com>,
	dm-devel@redhat.com, Bart Van Assche <bart.vanassche@sandisk.com>
Subject: Re: dm-multipath low performance with blk-mq
Date: Tue, 26 Jan 2016 09:47:13 -0500	[thread overview]
Message-ID: <20160126144713.GA24239@redhat.com> (raw)
In-Reply-To: <56A77C21.90605@suse.de>

[you're killing me.. you nuked all CCs again]

On Tue, Jan 26 2016 at  9:01am -0500,
Hannes Reinecke <hare@suse.de> wrote:

> On 01/26/2016 02:29 PM, Mike Snitzer wrote:
> > On Mon, Jan 25 2016 at  6:37pm -0500,
> > Benjamin Marzinski <bmarzins@redhat.com> wrote:
> > 
> >> On Mon, Jan 25, 2016 at 04:40:16PM -0500, Mike Snitzer wrote:
> >>> On Tue, Jan 19 2016 at  5:45P -0500,
> >>> Mike Snitzer <snitzer@redhat.com> wrote:
> >>
> >> I don't think this is going to help __multipath_map() without some
> >> configuration changes.  Now that we're running on already merged
> >> requests instead of bios, the m->repeat_count is almost always set to 1,
> >> so we call the path_selector every time, which means that we'll always
> >> need the write lock. Bumping up the number of IOs we send before calling
> >> the path selector again will give this patch a change to do some good
> >> here.
> >>
> >> To do that you need to set:
> >>
> >> 	rr_min_io_rq <something_bigger_than_one>
> >>
> >> in the defaults section of /etc/multipath.conf and then reload the
> >> multipathd service.
> >>
> >> The patch should hopefully help in multipath_busy() regardless of the
> >> the rr_min_io_rq setting.
> > 
> > This patch, while generic, is meant to help the blk-mq case.  A blk-mq
> > request_queue doesn't have an elevator so the requests will not have
> > seen merging.
> > 
> > But yes, implied in the patch is the requirement to increase
> > m->repeat_count via multipathd's rr_min_io_rq (I'll backfill a proper
> > header once it is tested).
> > 
> But that would defeat load balancing, would it not?
> IE when you want to do load balancing you would constantly change
> paths, thereby always taking the write lock.
> Which would render the patch pointless.

Increasing m->repeat_count slightly for blk-mq could be beneficial
considering there isn't an elevator.  I do concede the need for finding
the sweet-spot (not too small, not too large so as to starve load
balancing) is less than ideal.  But it needs testing.

This initial m->lock conversion from spinlock_t to rwlock_t is just the
first step on addressing the locking bottlenecks we've not had a need to
look at until now.  Could be the rwlock_t also gets replaced with a more
complex locking model.

More work is possible to make path switching lockless.  Not yet clear
(to me) on how to approach it.  And yes, the work gets incrementally
more challenging (percpu, rcu, whatever... that code is "harder",
especially when refactoring existing code with legacy requirements).

> I was rather wondering if we could expose all active paths as
> hardware contexts and let blk-mq do the I/O mapping.
> That way we would only have to take the write lock if we have to
> choose a new pgpath/priority group ie in the case the active
> priority group goes down.

Training blk-mq to be multipath aware (priority groups, etc) is a
entirely new tangent that is one rabbit hole after another.

Yeah I know you want to throw away everything.  I'm not holding you back
from doing anything but I've told you I want incremental dm-multipath
improvements until it is clear there is no more room for improvement.

  reply	other threads:[~2016-01-26 14:47 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <569CD4D6.2040908@dev.mellanox.co.il>
2016-01-19 10:37 ` dm-multipath low performance with blk-mq Sagi Grimberg
2016-01-19 22:45   ` Mike Snitzer
2016-01-25 21:40     ` Mike Snitzer
2016-01-25 23:37       ` Benjamin Marzinski
2016-01-26 13:29         ` Mike Snitzer
2016-01-26 14:01           ` Hannes Reinecke
2016-01-26 14:47             ` Mike Snitzer [this message]
2016-01-26 14:56               ` Christoph Hellwig
2016-01-26 15:27                 ` Mike Snitzer
2016-01-26 15:57             ` Benjamin Marzinski
2016-01-27 11:14           ` Sagi Grimberg
2016-01-27 17:48             ` Mike Snitzer
2016-01-27 17:51               ` Jens Axboe
2016-01-27 18:16                 ` Mike Snitzer
2016-01-27 18:26                   ` Jens Axboe
2016-01-27 19:14                     ` Mike Snitzer
2016-01-27 19:50                       ` Jens Axboe
2016-01-27 17:56               ` Sagi Grimberg
2016-01-27 18:42                 ` Mike Snitzer
2016-01-27 19:49                   ` Jens Axboe
2016-01-27 20:45                     ` Mike Snitzer
2016-01-29 23:35                 ` Mike Snitzer
2016-01-30  8:52                   ` Hannes Reinecke
2016-01-30 19:12                     ` Mike Snitzer
2016-02-01  6:46                       ` Hannes Reinecke
2016-02-03 18:04                         ` Mike Snitzer
2016-02-03 18:24                           ` Mike Snitzer
2016-02-03 19:22                             ` Mike Snitzer
2016-02-04  6:54                             ` Hannes Reinecke
2016-02-04 13:54                               ` Mike Snitzer
2016-02-04 13:58                                 ` Hannes Reinecke
2016-02-04 14:09                                   ` Mike Snitzer
2016-02-04 14:32                                     ` Hannes Reinecke
2016-02-04 14:44                                       ` Mike Snitzer
2016-02-05 15:13                                 ` [RFC PATCH] dm: fix excessive dm-mq context switching Mike Snitzer
2016-02-05 18:05                                   ` Mike Snitzer
2016-02-05 19:19                                     ` Mike Snitzer
2016-02-07 15:41                                       ` Sagi Grimberg
2016-02-07 16:07                                         ` Mike Snitzer
2016-02-07 16:42                                           ` Sagi Grimberg
2016-02-07 16:37                                         ` Bart Van Assche
2016-02-07 16:43                                           ` Sagi Grimberg
2016-02-07 16:53                                             ` Mike Snitzer
2016-02-07 16:54                                             ` Sagi Grimberg
2016-02-07 17:20                                               ` Mike Snitzer
2016-02-08 12:21                                                 ` Sagi Grimberg
2016-02-08 14:34                                                   ` Mike Snitzer
2016-02-09  7:50                                                 ` Hannes Reinecke
2016-02-09 14:55                                                   ` Mike Snitzer
2016-02-09 15:32                                                     ` Hannes Reinecke
2016-02-10  0:45                                                       ` Mike Snitzer
2016-02-11  1:50                                                         ` RCU-ified dm-mpath for testing/review Mike Snitzer
2016-02-11  3:35                                                           ` Mike Snitzer
2016-02-11 15:34                                                           ` Mike Snitzer
2016-02-12 15:18                                                             ` Hannes Reinecke
2016-02-12 15:26                                                               ` Mike Snitzer
2016-02-12 16:04                                                                 ` Hannes Reinecke
2016-02-12 18:00                                                                   ` Mike Snitzer
2016-02-15  6:47                                                                     ` Hannes Reinecke
2016-01-26  1:49       ` dm-multipath low performance with blk-mq Benjamin Marzinski
2016-01-26 16:03       ` Mike Snitzer
2016-01-26 16:44         ` Christoph Hellwig
2016-01-27  2:09           ` Mike Snitzer
2016-01-27 11:10             ` Sagi Grimberg
2016-01-26 21:40         ` Benjamin Marzinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160126144713.GA24239@redhat.com \
    --to=snitzer@redhat.com \
    --cc=bart.vanassche@sandisk.com \
    --cc=dm-devel@redhat.com \
    --cc=hare@suse.de \
    --cc=hch@infradead.org \
    --cc=keith.busch@intel.com \
    --cc=linux-nvme@lists.infradead.org \
    --cc=sagig@dev.mellanox.co.il \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).