From: Mike Snitzer <snitzer@redhat.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: Tejun Heo <tj@kernel.org>, Jeff Moyer <jmoyer@redhat.com>,
	linux-kernel@vger.kernel.org, Jens Axboe <jaxboe@fusionio.com>,
	dm-devel@redhat.com
Subject: Re: block: properly handle flush/fua requests in blk_insert_cloned_request
Date: Tue, 9 Aug 2011 14:55:31 -0400
Message-ID: <20110809185531.GC13293@redhat.com>
In-Reply-To: <20110809175237.GA978@redhat.com>

On Tue, Aug 09 2011 at  1:52pm -0400,
Vivek Goyal <vgoyal@redhat.com> wrote:

> On Tue, Aug 09, 2011 at 01:43:47PM -0400, Mike Snitzer wrote:
> > On Tue, Aug 09 2011 at 12:13pm -0400,
> > Tejun Heo <tj@kernel.org> wrote:
> > 
> > > Hello,
> > > 
> > > On Tue, Aug 09, 2011 at 11:53:51AM -0400, Jeff Moyer wrote:
> > > > Tejun Heo <tj@kernel.org> writes:
> > > > > I'm a bit confused.  We still need ELEVATOR_INSERT_FLUSH fix for
> > > > > insertion paths, right?  Or is blk_insert_cloned_request() supposed to
> > > > > be used only by request based dm which lives under the elevator?  If so,
> > > > > it would be great to make that explicit in the comment.  Maybe just
> > > > > renaming it to blk_insert_dm_cloned_request() would be better as it
> > > > > wouldn't be safe for other cases anyway.
> > > > 
> > > > request-based dm is the only caller at present.  I'm not a fan of
> > > > renaming the function, but I'm more than willing to comment it.
> > > 
> > > I'm still confused and don't think the patch is correct (you can't
> > > turn off REQ_FUA without decomposing it to data + post flush).
> > > 
> > > Going through flush machinery twice is okay and I think is the right
> > > thing to do.  At the upper queue, the request is decomposed to member
> > > requests.  After decomposition, it's either REQ_FLUSH w/o data or data
> > > request w/ or w/o REQ_FUA.  When the decomposed request reaches lower
> > > queue, the lower queue will then either short-circuit it, execute
> > > as-is or decompose data w/ REQ_FUA into data + REQ_FLUSH sequence.
> > > 
> > > AFAICS, the breakages are...
> > > 
> > > * ELEVATOR_INSERT_FLUSH not used properly from insert paths.
> > > 
> > > * Short circuit not kicking in for the dm requests. (the above and the
> > >   policy patch should solve this, right?)
> > > 
> > > * BUG(!rq->bio || ...) in blk_insert_flush().  I think we can lift
> > >   this restriction for empty REQ_FLUSH but also dm can just send down
> > >   requests with empty bio.
> > 
> > [cc'ing dm-devel]
> > 
> > All of these issues have come to light because DM was not setting
> > flush_flags based on the underlying device(s).  Now fixed in v3.1-rc1:
> > ed8b752 dm table: set flush capability based on underlying devices
> > 
> > Given that commit, and that request-based DM is beneath the elevator, it
> > seems any additional effort to have DM flushes re-enter the flush
> > machinery is unnecessary.
> > 
> > We expect:
> > 1) flushes to have gone through the flush machinery
> > 2) no FLUSH/FUA should be entering underlying queues if not supported
> > 
> > I think it best to just document the expectation that any FLUSH/FUA
> > request that enters blk_insert_cloned_request() will already match the
> > queue that the request is being sent to.  One way to document it is to
> > change Jeff's flag stripping into pure BUG_ON()s, e.g.:
> > 
> > ---
> >  block/blk-core.c |    8 ++++++++
> >  1 files changed, 8 insertions(+), 0 deletions(-)
> > 
> > diff --git a/block/blk-core.c b/block/blk-core.c
> > index b627558..201bb27 100644
> > --- a/block/blk-core.c
> > +++ b/block/blk-core.c
> > @@ -1710,6 +1710,14 @@ int blk_insert_cloned_request(struct request_queue *q, struct request *rq)
> >  	    should_fail_request(&rq->rq_disk->part0, blk_rq_bytes(rq)))
> >  		return -EIO;
> >  
> > +	/*
> > +	 * All FLUSH/FUA requests are expected to have gone through the
> > +	 * flush machinery.  If a request's cmd_flags doesn't match the
> > +	 * flush_flags of the underlying request_queue it is a bug.
> > +	 */
> > +	BUG_ON((rq->cmd_flags & REQ_FLUSH) && !(q->flush_flags & REQ_FLUSH));
> > +	BUG_ON((rq->cmd_flags & REQ_FUA) && !(q->flush_flags & REQ_FUA));
> > +
> 
> Actually this makes sense and is simple. :-) Is BUG_ON() too harsh?  How
> about WARN_ONCE() variants?  To me the system continues to work, so a
> warning is probably good enough.

Sure, WARN_ONCE() is fine by me.
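
To make the difference concrete, here is a rough userspace sketch of what
the WARN_ONCE() variant of the check above could look like.  It is not the
kernel patch: the struct definitions, the flag bit values, and the helpers
warn_once() and check_cloned_flush_flags() are hypothetical stand-ins made
up for illustration.  The only behaviour carried over from the kernel macro
is that the warning is printed once per call site while the condition is
still reported on every call.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flag bits; the real kernel values differ. */
#define REQ_FLUSH (1u << 0)
#define REQ_FUA   (1u << 1)

/* Minimal mock types carrying only the fields the check reads. */
struct request_queue { unsigned int flush_flags; };
struct request { unsigned int cmd_flags; };

/*
 * Userspace approximation of WARN_ONCE(): print the message the first
 * time cond is true for this call site, and always return cond.
 */
static bool warn_once(bool cond, bool *warned, const char *msg)
{
	if (cond && !*warned) {
		*warned = true;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
	return cond;
}

/*
 * Returns how many of the request's FLUSH/FUA flags are not supported
 * by the queue (0 means the request matches the queue).  Unlike
 * BUG_ON(), a mismatch only warns and lets the caller carry on.
 */
int check_cloned_flush_flags(const struct request_queue *q,
			     const struct request *rq)
{
	static bool warned_flush, warned_fua;
	int mismatches = 0;

	mismatches += warn_once((rq->cmd_flags & REQ_FLUSH) &&
				!(q->flush_flags & REQ_FLUSH),
				&warned_flush,
				"REQ_FLUSH sent to queue without flush support");
	mismatches += warn_once((rq->cmd_flags & REQ_FUA) &&
				!(q->flush_flags & REQ_FUA),
				&warned_fua,
				"REQ_FUA sent to queue without FUA support");
	return mismatches;
}
```

Whether the caller should then fail the request or just pass it down is a
separate policy question that this sketch leaves open.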

Seems Tejun wants a more involved fix though.
