Re: [PATCH] dm rq: Avoid that request processing stalls sporadically

public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed

From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: Bart Van Assche <Bart.VanAssche@wdc.com>,
	"dm-devel@redhat.com" <dm-devel@redhat.com>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"snitzer@redhat.com" <snitzer@redhat.com>
Subject: Re: [PATCH] dm rq: Avoid that request processing stalls sporadically
Date: Fri, 19 Jan 2018 14:36:48 +0800	[thread overview]
Message-ID: <20180119063647.GA25369@ming.t460p> (raw)
In-Reply-To: <91af852d-cb1b-f306-d899-d72de4023d8d@kernel.dk>

On Thu, Jan 18, 2018 at 05:49:10PM -0700, Jens Axboe wrote:
> On 1/18/18 5:35 PM, Ming Lei wrote:
> > On Thu, Jan 18, 2018 at 05:24:29PM -0700, Jens Axboe wrote:
> >> On 1/18/18 5:18 PM, Ming Lei wrote:
> >>> On Fri, Jan 19, 2018 at 12:14:24AM +0000, Bart Van Assche wrote:
> >>>> On Fri, 2018-01-19 at 08:11 +0800, Ming Lei wrote:
> >>>>> On Thu, Jan 18, 2018 at 08:37:07AM -0800, Bart Van Assche wrote:
> >>>>>> diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> >>>>>> index f16096af879a..c59c59cfd2a5 100644
> >>>>>> --- a/drivers/md/dm-rq.c
> >>>>>> +++ b/drivers/md/dm-rq.c
> >>>>>> @@ -761,6 +761,7 @@ static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
> >>>>>>  		/* Undo dm_start_request() before requeuing */
> >>>>>>  		rq_end_stats(md, rq);
> >>>>>>  		rq_completed(md, rq_data_dir(rq), false);
> >>>>>> +		blk_mq_delay_run_hw_queue(hctx, 100/*ms*/);
> >>>>>>  		return BLK_STS_RESOURCE;
> >>>>>>  	}
> >>>>>>  
> >>>>>
> >>>>> Nak.
> >>>>
> >>>> This patch fixes a regression that was introduced by you. You should know
> >>>> that regressions are not acceptable. If you don't agree with this patch,
> >>>> please fix the root cause.
> >>>
> >>> Yesterday I sent a patch, did you test that?
> >>
> >> That patch isn't going to be of much help. It might prevent you from
> >> completely stalling, but it might take you tens of seconds to get there.
> > 
> > Yeah, that is true, and why I marked it as RFC, but the case is so rare to
> > happen.
> 
> You don't know that. If the resource is very limited, or someone else
> is gobbling up all of it, then it may trigger quite often. Granted,
> in that case, you need some way of signaling the freeing of those
> resources to make it efficient.
> 
> >> On top of that, it's a rolling timer that gets added when IO is queued,
> >> and re-added if IO is pending when it fires. If the latter case is not
> >> true, then it won't get armed again. So if IO fails to queue without
> >> any other IO pending, you're pretty much in the same situation as if
> > 
> > No IO pending detection may take a bit time, we can do it after
> > BLK_STS_RESOURCE is returned and SCHED_RESTART is set. I have done
> > this way in the following patch, what do you think of it?
> > 
> > https://github.com/ming1/linux/commit/f866ce0b97b0ae22d033881da7eb07706fd458b4
> 
> I think it's overly complicated. As I wrote in a lengthier email earlier
> today, just ensure that we have a unique return code from the driver for
> when it aborts queuing an IO due to a resource shortage that isn't tied
> to it's own consumption of it. When we get that return, do a delayed
> queue run after X amount of time to ensure we always try again. If you
> want to get fancy (later on), you could track the frequency of such
> returns and complain if it's too high.

Suppose the new introduced return code is BLK_STS_NO_DEV_RESOURCE.

This way may degrade performance, for example, DM-MPATH returns
BLK_STS_NO_DEV_RESOURCE when blk_get_request() returns NULL, blk-mq
handles it by blk_mq_delay_run_hw_queue(10ms). Then blk_mq_sched_restart()
is just triggered when one DM-MPATH request is completed, that means one
request of underlying queue is completed too, but blk_mq_sched_restart()
still can't make progress during the 10ms.

That means we should only run blk_mq_delay_run_hw_queue(10ms/100ms) when
the queue is idle.

I will post out the patch in github for discussion.

Thanks,
Ming

     prev parent reply	other threads:[~2018-01-19  6:36 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-18 16:37 [PATCH] dm rq: Avoid that request processing stalls sporadically Bart Van Assche
2018-01-18 16:50 ` Mike Snitzer
2018-01-18 17:13   ` Bart Van Assche
2018-01-19  0:26     ` Ming Lei
2018-01-19  0:33       ` Bart Van Assche
2018-01-19  0:11 ` [PATCH] " Ming Lei
2018-01-19  0:14   ` Bart Van Assche
2018-01-19  0:18     ` Ming Lei
2018-01-19  0:21       ` Bart Van Assche
2018-01-19  0:24       ` Jens Axboe
2018-01-19  0:35         ` Ming Lei
2018-01-19  0:49           ` Jens Axboe
2018-01-19  6:36             ` Ming Lei [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180119063647.GA25369@ming.t460p \
    --to=ming.lei@redhat.com \
    --cc=Bart.VanAssche@wdc.com \
    --cc=axboe@kernel.dk \
    --cc=dm-devel@redhat.com \
    --cc=linux-block@vger.kernel.org \
    --cc=snitzer@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox