linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@redhat.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: Paolo Valente <paolo.valente@linaro.org>,
	Bart Van Assche <Bart.VanAssche@wdc.com>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	dm-devel@redhat.com
Subject: Re: dm mpath: switch IO scheduler of underlying paths to "none" [was: Re: BFQ + dm-mpath]
Date: Fri, 8 Sep 2017 13:07:28 -0400	[thread overview]
Message-ID: <20170908170727.GA29318@redhat.com> (raw)
In-Reply-To: <113c15ac-75f5-ff5b-695d-920dc6d5f708@kernel.dk>

On Fri, Sep 08 2017 at 12:48pm -0400,
Jens Axboe <axboe@kernel.dk> wrote:

> On 09/08/2017 10:41 AM, Mike Snitzer wrote:
> > On Fri, Sep 08 2017 at  5:13P -0400,
> > Paolo Valente <paolo.valente@linaro.org> wrote:
> > 
> >>
> >>> Il giorno 07 set 2017, alle ore 17:52, Mike Snitzer <snitzer@redhat.com> ha scritto:
> >>>
> >>> On Tue, Sep 05 2017 at 10:15am -0400,
> >>> Bart Van Assche <Bart.VanAssche@wdc.com> wrote:
> >>>
> >>>> On Tue, 2017-09-05 at 09:56 +0200, Paolo Valente wrote:
> >>>>> Ok, my suspects seem confirmed: the path dm_mq_queue_rq -> map_request
> >>>>> -> setup_clone -> blk_rq_prep_clone creates a cloned request without
> >>>>> invoking e->type->ops.mq.prepare_request for the target elevator e.
> >>>>> The cloned request is therefore not initialized for the scheduler, but
> >>>>> it is however inserted into the scheduler by
> >>>>> blk_mq_sched_insert_request.  This seems an error for any scheduler
> >>>>> that needs to initialize fields in the incoming request, or in general
> >>>>> to take some preliminary action.
> >>>>>
> >>>>> Am I missing something here?
> >>>>
> >>>> (+Mike Snitzer)
> >>>>
> >>>> Mike, do you perhaps have the time to look into this memory leak?
> >>>
> >>> It isn't a memory leak, it is missing initialization in the case of
> >>> cloned requests (if I'm understanding Paolo correctly).
> >>>
> >>
> >> Exactly!
> >>
> >>> But cloned requests shouldn't be going through the scheduler.  Only the
> >>> original requests should.
> >>>
> >>> Commit bd166ef18 ("blk-mq-sched: add framework for MQ capable IO
> >>> schedulers") switched from blk_mq_insert_request() to
> >>> blk_mq_sched_insert_request() and in doing so it opened dm-mpath up to
> >>> this bug.
> >>>
> >>> Could be we need to take steps to ensure the block layer still
> >>> supports bypassing the elevator by using direct insertion?
> >>>
> >>> Or blk_mq_sched_insert_request() needs updating to check if
> >>> e->type->ops.mq.prepare_request were actually performed and to fallback
> >>> to the !elevator case if not..
> >>>
> >>> Not sure on the fix, but I can look closer if others (like Jens or
> >>> Paolo) don't have quicker suggestions.
> >>>
> >>
> >> No quick suggestion from me :(
> >>
> >> Thanks for analyzing this bug,
> > 
> > Please see the following untested patch.  All
> > testing/review/comments/acks appreciated.
> > 
> > I elected to use elevator_change() rather than fiddle with adding a new
> > blk-mq elevator hook (e.g. ->request_prepared) to verify that each
> > blk-mq elevator enabled request did in fact get prepared.
> > 
> > Bart, please test this patch and reply with your review/feedback.
> > 
> > Jens, if you're OK with this solution please reply with your Ack and
> > I'll send it to Linus along with the rest of the handful of DM changes I
> > have for 4.14.
> 
> I am not - we used to have this elevator change functionality from
> inside the kernel, and finally got rid of it when certain drivers killed
> it. I don't want to be bringing it back.

Fine.

> Sounds like we have two issues here. One is that we run into issues with
> stacking IO schedulers, and the other is that we'd rather not have
> multiple schedulers in play for a stacked setup.
> 
> Maybe it'd be cleaner to have the dm-mq side of things not insert
> through the scheduler, but rather just FIFO on the target end?

That was how blk_insert_cloned_request() was before.  From the patch
header (you may have missed it):

"Commit bd166ef18 ("blk-mq-sched: add framework for MQ capable IO
schedulers") switched blk_insert_cloned_request() from using
blk_mq_insert_request() to blk_mq_sched_insert_request().  Which
incorrectly added elevator machinery into a call chain that isn't
supposed to have any."

So shouldn't blk_insert_cloned_request() be made to _not_ use
blk_mq_sched_insert_request()?  We'd need a new block interface
established, or equivalent open-coded in blk_insert_cloned_request(),
to handle direct dispatch to an mq request_queue's queue(s).

Mike

  reply	other threads:[~2017-09-08 17:07 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-24 23:16 BFQ + dm-mpath Bart Van Assche
2017-08-30 16:31 ` Paolo Valente
2017-09-05  7:56   ` Paolo Valente
2017-09-05 14:15     ` Bart Van Assche
2017-09-07 15:52       ` Mike Snitzer
2017-09-08  9:13         ` Paolo Valente
2017-09-08 16:41           ` [PATCH] dm mpath: switch IO scheduler of underlying paths to "none" [was: Re: BFQ + dm-mpath] Mike Snitzer
2017-09-08 16:48             ` Jens Axboe
2017-09-08 17:07               ` Mike Snitzer [this message]
2017-09-08 19:58                 ` Mike Snitzer
2017-09-08 20:28                   ` Jens Axboe
2017-09-08 21:42                     ` [PATCH] block: directly insert blk-mq request from blk_insert_cloned_request() Mike Snitzer
2017-09-08 21:50                       ` Jens Axboe
2017-09-08 22:03                         ` Mike Snitzer
2017-09-11 16:16                           ` [PATCH v2] " Mike Snitzer
2017-09-11 20:51                             ` Jens Axboe
2017-09-11 21:13                               ` Mike Snitzer
2017-09-11 21:27                                 ` Jens Axboe
2017-09-11 21:51                                   ` Mike Snitzer
2017-09-11 22:30                                     ` Mike Snitzer
2017-09-11 22:43                                       ` Jens Axboe
2017-09-14 15:57                                   ` Ming Lei
2017-09-14 16:30                                     ` Jens Axboe
2017-09-14 16:33                                       ` Ming Lei
2017-09-14 16:34                                         ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170908170727.GA29318@redhat.com \
    --to=snitzer@redhat.com \
    --cc=Bart.VanAssche@wdc.com \
    --cc=axboe@kernel.dk \
    --cc=dm-devel@redhat.com \
    --cc=linux-block@vger.kernel.org \
    --cc=paolo.valente@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).