Re: [PATCH 2/6] blk-mq: replace timeout synchronization with a RCU and generation based scheme

From: Peter Zijlstra <peterz@infradead.org>
To: Bart Van Assche <Bart.VanAssche@wdc.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"kernel-team@fb.com" <kernel-team@fb.com>,
	"oleg@redhat.com" <oleg@redhat.com>, "hch@lst.de" <hch@lst.de>,
	"axboe@kernel.dk" <axboe@kernel.dk>,
	"jianchao.w.wang@oracle.com" <jianchao.w.wang@oracle.com>,
	"osandov@fb.com" <osandov@fb.com>,
	"tj@kernel.org" <tj@kernel.org>
Subject: Re: [PATCH 2/6] blk-mq: replace timeout synchronization with a RCU and generation based scheme
Date: Thu, 14 Dec 2017 22:54:04 +0100
Message-ID: <20171214215404.GK3326@worktop>
In-Reply-To: <1513287766.2475.73.camel@wdc.com>

On Thu, Dec 14, 2017 at 09:42:48PM +0000, Bart Van Assche wrote:
> On Thu, 2017-12-14 at 21:20 +0100, Peter Zijlstra wrote:
> > On Thu, Dec 14, 2017 at 06:51:11PM +0000, Bart Van Assche wrote:
> > > On Tue, 2017-12-12 at 11:01 -0800, Tejun Heo wrote:
> > > > +	write_seqcount_begin(&rq->gstate_seq);
> > > > +	blk_mq_rq_update_state(rq, MQ_RQ_IN_FLIGHT);
> > > > +	blk_add_timer(rq);
> > > > +	write_seqcount_end(&rq->gstate_seq);
> > > 
> > > My understanding is that both write_seqcount_begin() and write_seqcount_end()
> > > trigger a write memory barrier. Is a seqcount really faster than a spinlock?
> > 
> > Yes lots, no atomic operations and no waiting.
> > 
> > The only constraint for write_seqlock is that there must not be any
> > concurrency.
> > 
> > But now that I look at this again, TJ, why can't the below happen?
> > 
> > > 	write_seqcount_begin();
> > > 	blk_mq_rq_update_state(rq, IN_FLIGHT);
> > > 	blk_add_timer(rq);
> > > 	<timer-irq>
> > > 		read_seqcount_begin()
> > > 			while (seq & 1)
> > > 				cpu_relax();
> > > 		// live-lock
> > > 	</timer-irq>
> > > 	write_seqcount_end();
> 
> Hello Peter,
> 
> Some time ago the block layer was changed to handle timeouts in thread context
> instead of interrupt context. See also commit 287922eb0b18 ("block: defer
> timeouts to a workqueue").

That only makes it a little better:

	Task-A					Worker

	write_seqcount_begin()
	blk_mq_rq_update_state(rq, IN_FLIGHT)
	blk_add_timer(rq)
	<timer>
		schedule_work()
	</timer>
	<context-switch to worker>
						read_seqcount_begin()
							while(seq & 1)
								cpu_relax();


Now normally this isn't fatal because Worker will simply spin its entire
time slice away and we'll eventually schedule our Task-A back in, which
will complete the seqcount and things will work.

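But if, for some reason, our Worker were to have an RT priority higher than
our Task-A, we'd be up some creek without no paddles: Worker would never
yield the CPU, so Task-A could never get back in to finish the write
section.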
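
To make the two outcomes concrete, here is a single-threaded user-space
toy, not kernel code. Everything in it is made up for illustration:
toy_seqcount, toy_read_begin() and worker_makes_progress() are
hypothetical names, and the spin budget merely stands in for Worker's
time slice.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a sequence counter; NOT the kernel's seqcount_t. */
struct toy_seqcount {
	unsigned int seq;	/* odd while a write is in progress */
};

static void toy_write_begin(struct toy_seqcount *s) { s->seq++; }
static void toy_write_end(struct toy_seqcount *s) { s->seq++; }

/*
 * Reader side: spin while the count is odd. 'budget' stands in for the
 * time slice; returns false if it runs out before the writer finishes.
 */
static bool toy_read_begin(const struct toy_seqcount *s, unsigned int budget)
{
	while (budget > 0) {
		if (!(s->seq & 1))
			return true;
		budget--;	/* the kernel would cpu_relax() here */
	}
	return false;
}

/*
 * Task-A is switched out in the middle of its write section. If the
 * scheduler can run Task-A again (worker_outranks_writer == false),
 * the write completes and Worker's next attempt succeeds. If Worker
 * has the higher RT priority, Task-A never runs and every slice is
 * spent spinning.
 */
static bool worker_makes_progress(bool worker_outranks_writer, int slices)
{
	struct toy_seqcount sc = { 0 };

	toy_write_begin(&sc);			/* Task-A, then context switch */
	for (int i = 0; i < slices; i++) {
		if (toy_read_begin(&sc, 1000))	/* Worker spins for one slice */
			return true;
		if (!worker_outranks_writer)
			toy_write_end(&sc);	/* Task-A scheduled back in */
	}
	return false;
}
```

With equal priorities Worker burns one slice and succeeds on the next;
with Worker outranking Task-A, no number of slices helps.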

We don't happen to have preemption or IRQs off here? That would fix
things nicely.
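
A rough user-space sketch of why that would help, again a toy and not
kernel code: toy_cpu, toy_timer_fires() and worker_can_get_stuck() are
hypothetical names, and the preempt counter only imitates what
preempt_disable()/preempt_enable() do in the kernel. It models the
preemption case only; with IRQs off the timer would not fire inside the
section at all. Whether the real call site already runs with either
disabled is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_cpu {
	unsigned int seq;	/* toy sequence count, odd = write in progress */
	int preempt_count;	/* > 0 means context switches are deferred */
	bool resched_pending;	/* a switch was requested while deferred */
	bool worker_saw_odd;	/* Worker ran inside the write section */
};

/* Worker's first look at the sequence count once it gets the CPU. */
static void toy_run_worker(struct toy_cpu *cpu)
{
	if (cpu->seq & 1)
		cpu->worker_saw_odd = true;	/* it would spin from here */
}

/* Timer fires: switch to Worker now, or remember the request for later. */
static void toy_timer_fires(struct toy_cpu *cpu)
{
	if (cpu->preempt_count > 0)
		cpu->resched_pending = true;
	else
		toy_run_worker(cpu);
}

/* Dropping the last preempt reference runs any deferred switch. */
static void toy_preempt_enable(struct toy_cpu *cpu)
{
	if (--cpu->preempt_count == 0 && cpu->resched_pending) {
		cpu->resched_pending = false;
		toy_run_worker(cpu);
	}
}

/* Returns true if Worker ever observed an odd (in-progress) count. */
static bool worker_can_get_stuck(bool disable_preemption)
{
	struct toy_cpu cpu = { 0 };

	if (disable_preemption)
		cpu.preempt_count++;
	cpu.seq++;			/* write section begins */
	toy_timer_fires(&cpu);		/* timer expires mid-write */
	cpu.seq++;			/* write section ends */
	if (disable_preemption)
		toy_preempt_enable(&cpu);
	return cpu.worker_saw_odd;
}
```

With the switch deferred until after the write section, Worker only ever
sees an even count and has nothing to spin on.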
