dm-devel.redhat.com archive mirror
From: Mike Snitzer <snitzer@redhat.com>
To: axboe@kernel.dk, Hannes Reinecke <hare@suse.de>,
	Sagi Grimberg <sagig@dev.mellanox.co.il>,
	Christoph Hellwig <hch@infradead.org>
Cc: "keith.busch@intel.com" <keith.busch@intel.com>,
	linux-block@vger.kernel.org,
	device-mapper development <dm-devel@redhat.com>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	Bart Van Assche <bart.vanassche@sandisk.com>
Subject: Re: [RFC PATCH] dm: fix excessive dm-mq context switching
Date: Fri, 5 Feb 2016 14:19:10 -0500
Message-ID: <20160205191909.GA25982@redhat.com>
In-Reply-To: <20160205180515.GA25808@redhat.com>

On Fri, Feb 05 2016 at  1:05pm -0500,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Fri, Feb 05 2016 at 10:13am -0500,
> Mike Snitzer <snitzer@redhat.com> wrote:
>  
> > Following is RFC because it really speaks to dm-mq _needing_ a variant
> > of blk_mq_complete_request() that supports partial completions.  Not
> > supporting partial completions really isn't an option for DM multipath.
> > 
> > From: Mike Snitzer <snitzer@redhat.com>
> > Date: Fri, 5 Feb 2016 08:49:01 -0500
> > Subject: [RFC PATCH] dm: fix excessive dm-mq context switching
> > 
> > Request-based DM's blk-mq support (dm-mq) was reported to be 50% slower
> > than if an underlying null_blk device were used directly.  The biggest
> > reason for this drop in performance is that blk_insert_clone_request()
> > was calling blk_mq_insert_request() with @async=true.  This forced the
> > use of kblockd_schedule_delayed_work_on() to run the queues which
> > ushered in ping-ponging between process context (fio in this case) and
> > kblockd's kworker to submit the cloned request.  The ftrace
> > function_graph tracer showed:
> > 
> >   kworker-2013  =>   fio-12190
> >   fio-12190    =>  kworker-2013
> >   ...
> >   kworker-2013  =>   fio-12190
> >   fio-12190    =>  kworker-2013
> >   ...
> > 
> > Fixing blk_mq_insert_request() to _not_ use kblockd to submit the cloned
> > requests isn't enough to eliminate the observed context switches.
> > 
> > In addition to this dm-mq specific blk-core fix, there were 2 DM core
> > fixes to dm-mq that (when paired with the blk-core fix) completely
> > eliminate the observed context switching:
> > 
> > 1)  don't blk_mq_run_hw_queues in blk-mq request completion
> > 
> >     Motivated by desire to reduce overhead of dm-mq, punting to kblockd
> >     just increases context switches.
> > 
> >     In my testing against a really fast null_blk device there was no benefit
> >     to running blk_mq_run_hw_queues() on completion (and no other blk-mq
> >     driver does this).  So hopefully this change doesn't induce the need for
> >     yet another revert like commit 621739b00e16ca2d !
> > 
> > 2)  use blk_mq_complete_request() in dm_complete_request()
> > 
> >     blk_complete_request() doesn't offer the traditional q->mq_ops vs
> >     .request_fn branching pattern that other historic block interfaces
> >     do (e.g. blk_get_request).  Using blk_mq_complete_request() for
> >     blk-mq requests is important for performance but it doesn't handle
> >     partial completions -- which is a pretty big problem given the
> >     potential for partial completions with DM multipath due to path
> >     failure(s).  As such this makes this entire patch only RFC-worthy.
> 
> > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > index c683f6d..a618477 100644
> > --- a/drivers/md/dm.c
> > +++ b/drivers/md/dm.c
> > @@ -1344,7 +1340,10 @@ static void dm_complete_request(struct request *rq, int error)
> >  	struct dm_rq_target_io *tio = tio_from_request(rq);
> >  
> >  	tio->error = error;
> > -	blk_complete_request(rq);
> > +	if (!rq->q->mq_ops)
> > +		blk_complete_request(rq);
> > +	else
> > +		blk_mq_complete_request(rq, rq->errors);
> >  }
> >  
> >  /*
> 
> Looking closer, DM is very likely OK just using blk_mq_complete_request.
> 
> blk_complete_request() also doesn't provide native partial completion
> support (it relies on the driver to do it, which DM core does):
> 
> /**
>  * blk_complete_request - end I/O on a request
>  * @req:      the request being processed
>  *
>  * Description:
>  *     Ends all I/O on a request. It does not handle partial completions,
>  *     unless the driver actually implements this in its completion callback
>  *     through requeueing. The actual completion happens out-of-order,
>  *     through a softirq handler. The user must have registered a completion
>  *     callback through blk_queue_softirq_done().
>  **/
> 
> blk_mq_complete_request() is effectively implemented in a comparable
> fashion to blk_complete_request(), and DM core already provides
> partial completion support: dm.c:end_clone_bio() triggers requeueing
> of the request via dm-mpath.c:multipath_end_io()'s return of
> DM_ENDIO_REQUEUE.
> 
> So I'm thinking I can drop the "RFC" for this patch and run with
> it.. once I get Jens' feedback (hopefully) confirming my understanding.
> 
> Jens, please advise.  If you're comfortable providing your Acked-by I
> can get this fix in for 4.5-rc4 or so...
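
To illustrate the point quoted above (the completion helper only ends
the current attempt, and partial completion is the driver's job via
requeueing), here is a minimal userspace sketch.  It is not kernel code:
model_request, model_end_io(), model_remaining() and the ENDIO_* values
are hypothetical stand-ins, and the byte accounting is a deliberately
simplified assumption about what end_clone_bio() and multipath_end_io()
achieve together.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum endio_result { ENDIO_DONE, ENDIO_REQUEUE };

struct model_request {
	unsigned int total_bytes;	/* size of the original request */
	unsigned int done_bytes;	/* bytes completed so far */
	unsigned int requeues;		/* number of times it was requeued */
};

/*
 * Models a driver-side completion callback.  The generic completion
 * helper knows nothing about partial progress; the driver accounts for
 * the bytes that succeeded and asks for a requeue of the remainder.
 */
static enum endio_result model_end_io(struct model_request *rq,
				      unsigned int bytes_ok)
{
	assert(rq != NULL);
	assert(bytes_ok <= rq->total_bytes - rq->done_bytes);

	rq->done_bytes += bytes_ok;
	if (rq->done_bytes < rq->total_bytes) {
		rq->requeues++;		/* e.g. a path failed mid-request */
		return ENDIO_REQUEUE;
	}
	return ENDIO_DONE;
}

/* Bytes still outstanding after the attempts so far. */
static unsigned int model_remaining(const struct model_request *rq)
{
	return rq->total_bytes - rq->done_bytes;
}
```

In this model a 4096-byte request that completes 1024 bytes before a
path failure is requeued with 3072 bytes outstanding, and finishes on
the next attempt without the completion helper ever needing to know
about the split.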

FYI, here is the latest revised patch:
https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.6&id=a5b835282422ec41991c1dbdb88daa4af7d166d2

(revised patch header and fixed a thinko in the dm.c:rq_completed()
change from the RFC patch I posted earlier)
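
For readers following the dm_complete_request() hunk quoted above, the
q->mq_ops vs .request_fn branching can be sketched in userspace as
below.  This is only a model under stated assumptions: the model_*
types and the completion_path enum are hypothetical, and the function
merely reports which kernel helper would be called rather than calling
it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct model_mq_ops { int unused; };

struct model_queue {
	const struct model_mq_ops *mq_ops;	/* NULL models a .request_fn queue */
};

struct model_rq {
	struct model_queue *q;
	int error;				/* models tio->error */
};

enum completion_path { PATH_LEGACY_SOFTIRQ, PATH_BLK_MQ };

/*
 * Mirrors the shape of the patched dm_complete_request(): record the
 * error, then pick the completion helper based on the queue type.
 */
static enum completion_path model_complete_request(struct model_rq *rq,
						   int error)
{
	assert(rq != NULL && rq->q != NULL);

	rq->error = error;
	if (!rq->q->mq_ops)
		return PATH_LEGACY_SOFTIRQ;	/* blk_complete_request() would run here */
	return PATH_BLK_MQ;			/* blk_mq_complete_request() would run here */
}
```

The branch is written out by the caller because, as noted in the patch
header, blk_complete_request() does not do this dispatch itself.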
