From: Mike Snitzer <snitzer@redhat.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: axboe@kernel.dk, linux-block@vger.kernel.org,
	dm-devel@redhat.com, NeilBrown <neilb@suse.com>
Subject: Re: [PATCH 3/4] dm: fix missing bio_split() pattern code in __split_and_process_bio()
Date: Mon, 21 Jan 2019 22:35:11 -0500
Message-ID: <20190122033510.GA7621@redhat.com>
In-Reply-To: <20190122031758.GA7574@redhat.com>

On Mon, Jan 21 2019 at 10:17pm -0500,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Mon, Jan 21 2019 at  9:46pm -0500,
> Ming Lei <ming.lei@redhat.com> wrote:
> 
> > On Mon, Jan 21, 2019 at 11:02:04AM -0500, Mike Snitzer wrote:
> > > On Sun, Jan 20 2019 at 10:21P -0500,
> > > Ming Lei <ming.lei@redhat.com> wrote:
> > > 
> > > > On Sat, Jan 19, 2019 at 01:05:05PM -0500, Mike Snitzer wrote:
> > > > > Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> > > > > cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> > > > > bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> > > > > recursing via generic_make_request().
> > > > > 
> > > > > Also add trace_block_split() because it provides useful context about
> > > > > bio splits in blktrace.
> > > > > 
> > > > > Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> > > > > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > > > > Cc: stable@vger.kernel.org # 4.16+
> > > > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > > > ---
> > > > >  drivers/md/dm.c | 2 ++
> > > > >  1 file changed, 2 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > > index fbadda68e23b..6e29c2d99b99 100644
> > > > > --- a/drivers/md/dm.c
> > > > > +++ b/drivers/md/dm.c
> > > > > @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> > > > >  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> > > > >  				part_stat_unlock();
> > > > >  
> > > > > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> > > > >  				bio_chain(b, bio);
> > > > > +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
> > > > >  				ret = generic_make_request(bio);
> > > > >  				break;
> > > > >  			}
> > > > 
> > > > In theory, BIO_QUEUE_ENTERED is only required when __split_and_process_bio() is
> > > > called from generic_make_request(). However, it may be called from dm_wq_work(),
> > > > this way might cause trouble on operation to q->q_usage_counter.
> > > 
> > > Good point, I've tweaked this patch to clear BIO_QUEUE_ENTERED in
> > > dm_make_request().
> > > 
> > > And to Neil's point: yes, these changes really do need to be made
> > > common since it appears all bio_split() callers do go on to call
> > > generic_make_request().
> > > 
> > > Anyway, here is the updated patch that is now staged in linux-next:
> > > 
> > > From: Mike Snitzer <snitzer@redhat.com>
> > > Date: Fri, 18 Jan 2019 01:21:11 -0500
> > > Subject: [PATCH v2] dm: fix missing bio_split() pattern code in __split_and_process_bio()
> > > 
> > > Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> > > cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> > > bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> > > recursing via generic_make_request().
> > > 
> > > Also add trace_block_split() because it provides useful context about
> > > bio splits in blktrace.
> > > 
> > > Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> > > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > > Cc: stable@vger.kernel.org # 4.16+
> > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > ---
> > >  drivers/md/dm.c | 9 +++++++++
> > >  1 file changed, 9 insertions(+)
> > > 
> > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > index fbadda68e23b..25884f833a32 100644
> > > --- a/drivers/md/dm.c
> > > +++ b/drivers/md/dm.c
> > > @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> > >  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> > >  				part_stat_unlock();
> > >  
> > > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> > >  				bio_chain(b, bio);
> > > +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
> > >  				ret = generic_make_request(bio);
> > >  				break;
> > >  			}
> > > @@ -1734,6 +1736,13 @@ static blk_qc_t dm_make_request(struct request_queue *q, struct bio *bio)
> > >  
> > >  	map = dm_get_live_table(md, &srcu_idx);
> > >  
> > > +	/*
> > > +	 * Clear the bio-reentered-generic_make_request() flag,
> > > +	 * will be set again as needed if bio needs to be split.
> > > +	 */
> > > +	if (bio_flagged(bio, BIO_QUEUE_ENTERED))
> > > +		bio_clear_flag(bio, BIO_QUEUE_ENTERED);
> > > +
> > >  	/* if we're suspended, we have to queue this io for later */
> > >  	if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))) {
> > >  		dm_put_live_table(md, srcu_idx);
> > > -- 
> > > 2.15.0
> > > 
> > 
> > Hi Mike,
> > 
> > I'd suggest fixing this kind of issue in the following way, so that
> > we can avoid touching this flag from drivers:
> > 
> > diff --git a/block/blk-core.c b/block/blk-core.c
> > index 3c5f61ceeb67..e70103560ac2 100644
> > --- a/block/blk-core.c
> > +++ b/block/blk-core.c
> > @@ -1024,6 +1024,8 @@ blk_qc_t generic_make_request(struct bio *bio)
> >  		else
> >  			bio_io_error(bio);
> >  		return ret;
> > +	} else {
> > +		bio_set_flag(bio, BIO_QUEUE_ENTERED);
> >  	}
> >  
> >  	if (!generic_make_request_checks(bio))
> > @@ -1074,6 +1076,8 @@ blk_qc_t generic_make_request(struct bio *bio)
> >  			if (blk_queue_enter(q, flags) < 0) {
> >  				enter_succeeded = false;
> >  				q = NULL;
> > +			} else {
> > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> >  			}
> >  		}
> >  
> > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > index b990853f6de7..8777e286bd3f 100644
> > --- a/block/blk-merge.c
> > +++ b/block/blk-merge.c
> > @@ -339,16 +339,6 @@ void blk_queue_split(struct request_queue *q, struct bio **bio)
> >  		/* there isn't chance to merge the splitted bio */
> >  		split->bi_opf |= REQ_NOMERGE;
> >  
> > -		/*
> > -		 * Since we're recursing into make_request here, ensure
> > -		 * that we mark this bio as already having entered the queue.
> > -		 * If not, and the queue is going away, we can get stuck
> > -		 * forever on waiting for the queue reference to drop. But
> > -		 * that will never happen, as we're already holding a
> > -		 * reference to it.
> > -		 */
> > -		bio_set_flag(*bio, BIO_QUEUE_ENTERED);
> > -
> >  		bio_chain(split, *bio);
> >  		trace_block_split(q, split, (*bio)->bi_iter.bi_sector);
> >  		generic_make_request(*bio);
> > 
> 
> Not opposed to this.

But thinking further: when you have a stack of cascading
q->make_request_fn it could easily be that work done by the next layer
down ends up causing the bio to recurse to generic_make_request() but
not directly (e.g. via dm_wq_work)... yet BIO_QUEUE_ENTERED will still
be set when it really isn't appropriate.

Getting too cute with setting bio flags but not clearing them on
different device boundaries could render the flags useless (or worse:
incorrect).
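
The concern above can be illustrated with a toy userspace model.  This
is only a sketch, not kernel code: every toy_* name is made up for
illustration, and the entry check is a simplified assumption about how
a flagged bio skips the freeze check on the premise that its submitter
already holds a reference on that same queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for illustration only; these are NOT the kernel's types. */
#define TOY_BIO_QUEUE_ENTERED (1u << 0)

struct toy_queue {
	bool frozen;	/* queue is being frozen or torn down */
	int usage;	/* stand-in for q->q_usage_counter */
};

struct toy_bio {
	unsigned int flags;
};

/*
 * Model of the queue-entry decision: a bio flagged as "already entered"
 * takes a reference without checking the freeze state, on the assumption
 * that its submitter already holds one on this same queue.
 */
static bool toy_queue_enter(struct toy_queue *q, struct toy_bio *bio)
{
	if (bio->flags & TOY_BIO_QUEUE_ENTERED) {
		q->usage++;		/* unconditional enter */
		return true;
	}
	if (q->frozen)
		return false;		/* a real caller would block or fail */
	q->usage++;
	return true;
}

/* Hypothetical boundary hook: drop the flag before another device sees it. */
static void toy_cross_device_boundary(struct toy_bio *bio)
{
	bio->flags &= ~TOY_BIO_QUEUE_ENTERED;
}

/* True if a bio flagged by an upper device gets into a frozen lower queue. */
static bool toy_stale_flag_enters_frozen_queue(bool clear_at_boundary)
{
	struct toy_queue lower = { .frozen = true, .usage = 0 };
	struct toy_bio bio = { .flags = TOY_BIO_QUEUE_ENTERED };

	if (clear_at_boundary)
		toy_cross_device_boundary(&bio);
	return toy_queue_enter(&lower, &bio);
}
```

In this model a flag left over from the upper device lets the bio into
the frozen lower queue, while clearing it at the boundary makes the
lower queue apply its normal check.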

I'm not out to engage in a focused audit/churn in this area that
becomes a slippery slope during the rest of 5.0-rcX.

That is why I was going for a local DM change for 5.0 and, in parallel,
work on the more generic fixes for 5.1.

So I'm back to preferring that...

But if you, Jens or others feel strongly about it I'm open to
discussing it further.

I think we also need to set REQ_NOMERGE on the split bio (like
blk_queue_split() is doing).  Again, a comprehensive cleanup and
consolidation of the bio_split+generic_make_request pattern is needed.
MD has a lot of it, DM has it, and then there is blk_queue_split().
Basically blk_queue_split()'s bio_split+bio_chain+generic_make_request
and all the flags that get set in between should be factored out for
all to use.
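
A rough shape for such a shared helper, as a userspace toy rather than
a proposed kernel interface: the toy_* types, flag values and the
toy_split_and_resubmit() name are all hypothetical, and the steps only
mirror the order seen in the blk_queue_split() hunk quoted above
(split, REQ_NOMERGE on the split, BIO_QUEUE_ENTERED on the remainder,
chain, resubmit the remainder).

```c
#include <assert.h>
#include <stddef.h>

/* Toy flag values; they live in separate fields, so both can be bit 0. */
#define TOY_REQ_NOMERGE		(1u << 0)	/* kept in toy_bio.opf */
#define TOY_BIO_QUEUE_ENTERED	(1u << 0)	/* kept in toy_bio.flags */

/* Toy stand-in for a bio; NOT the kernel's struct bio. */
struct toy_bio {
	unsigned long sector;
	unsigned int nr_sectors;
	unsigned int opf;
	unsigned int flags;
	struct toy_bio *parent;	/* set by the "chain" step */
};

static int toy_resubmitted;	/* counts stand-in resubmissions */

/* Stand-in for recursing via generic_make_request(). */
static void toy_resubmit(struct toy_bio *bio)
{
	(void)bio;
	toy_resubmitted++;
}

/*
 * Carve the first 'sectors' off 'bio' into caller-provided 'split',
 * apply the flags, chain split to the remainder and resubmit the
 * remainder. The caller goes on to process 'split'.
 */
static void toy_split_and_resubmit(struct toy_bio *bio, unsigned int sectors,
				   struct toy_bio *split)
{
	assert(sectors > 0 && sectors < bio->nr_sectors);

	/* "bio_split": front part goes to split, bio keeps the rest */
	split->sector = bio->sector;
	split->nr_sectors = sectors;
	split->flags = 0;
	bio->sector += sectors;
	bio->nr_sectors -= sectors;

	/* no point trying to merge the split bio */
	split->opf = bio->opf | TOY_REQ_NOMERGE;

	/* remainder re-enters a queue its submitter already holds */
	bio->flags |= TOY_BIO_QUEUE_ENTERED;

	/* "bio_chain": split's completion is tied to the remainder */
	split->parent = bio;

	toy_resubmit(bio);
}
```

Keeping the flag handling inside one helper would mean callers such as
DM and MD could not forget a step, which is the consolidation argued
for above.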

Mike
