[PATCH] blk-mq: fix schedule-while-atomic with scheduler attached

public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH] blk-mq: fix schedule-while-atomic with scheduler attached
@ 2017-04-20 21:13 Jens Axboe
  2017-04-20 21:30 ` Omar Sandoval
  0 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2017-04-20 21:13 UTC (permalink / raw)
  To: linux-block@vger.kernel.org; +Cc: Christoph Hellwig, Omar Sandoval

We must have dropped the ctx before we call
blk_mq_sched_insert_request() with can_block=true, otherwise we risk
that a flush request can block on insertion if we are currently out of
tags.

[   47.667190] BUG: scheduling while atomic: jbd2/sda2-8/2089/0x00000002
[   47.674493] Modules linked in: x86_pkg_temp_thermal btrfs xor zlib_deflate raid6_pq sr_mod cdre
[   47.690572] Preemption disabled at:
[   47.690584] [<ffffffff81326c7c>] blk_mq_sched_get_request+0x6c/0x280
[   47.701764] CPU: 1 PID: 2089 Comm: jbd2/sda2-8 Not tainted 4.11.0-rc7+ #271
[   47.709630] Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.3.4 11/09/2016
[   47.718081] Call Trace:
[   47.720903]  dump_stack+0x4f/0x73
[   47.724694]  ? blk_mq_sched_get_request+0x6c/0x280
[   47.730137]  __schedule_bug+0x6c/0xc0
[   47.734314]  __schedule+0x559/0x780
[   47.738302]  schedule+0x3b/0x90
[   47.741899]  io_schedule+0x11/0x40
[   47.745788]  blk_mq_get_tag+0x167/0x2a0
[   47.750162]  ? remove_wait_queue+0x70/0x70
[   47.754901]  blk_mq_get_driver_tag+0x92/0xf0
[   47.759758]  blk_mq_sched_insert_request+0x134/0x170
[   47.765398]  ? blk_account_io_start+0xd0/0x270
[   47.770679]  blk_mq_make_request+0x1b2/0x850
[   47.775766]  generic_make_request+0xf7/0x2d0
[   47.780860]  submit_bio+0x5f/0x120
[   47.784979]  ? submit_bio+0x5f/0x120
[   47.789631]  submit_bh_wbc.isra.46+0x10d/0x130
[   47.794902]  submit_bh+0xb/0x10
[   47.798719]  journal_submit_commit_record+0x190/0x210
[   47.804686]  ? _raw_spin_unlock+0x13/0x30
[   47.809480]  jbd2_journal_commit_transaction+0x180a/0x1d00
[   47.815925]  kjournald2+0xb6/0x250
[   47.820022]  ? kjournald2+0xb6/0x250
[   47.824328]  ? remove_wait_queue+0x70/0x70
[   47.829223]  kthread+0x10e/0x140
[   47.833147]  ? commit_timeout+0x10/0x10
[   47.837742]  ? kthread_create_on_node+0x40/0x40
[   47.843122]  ret_from_fork+0x29/0x40

Fixes: a4d907b6a33b ("blk-mq: streamline blk_mq_make_request")
Signed-off-by: Jens Axboe <axboe@fb.com>

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9d7645f24b05..323eed50d615 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1634,8 +1634,10 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
 		return cookie;
 	} else if (q->elevator) {
+		blk_mq_put_ctx(data.ctx);
 		blk_mq_bio_to_request(rq, bio);
 		blk_mq_sched_insert_request(rq, false, true, true, true);
+		return cookie;
 	} else if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio))
 		blk_mq_run_hw_queue(data.hctx, true);
 

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] blk-mq: fix schedule-while-atomic with scheduler attached
  2017-04-20 21:13 [PATCH] blk-mq: fix schedule-while-atomic with scheduler attached Jens Axboe
@ 2017-04-20 21:30 ` Omar Sandoval
  2017-04-20 21:41   ` Jens Axboe
  2017-04-20 22:39   ` Jens Axboe
  0 siblings, 2 replies; 8+ messages in thread
From: Omar Sandoval @ 2017-04-20 21:30 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block@vger.kernel.org, Christoph Hellwig

On Thu, Apr 20, 2017 at 03:13:43PM -0600, Jens Axboe wrote:
> We must have dropped the ctx before we call
> blk_mq_sched_insert_request() with can_block=true, otherwise we risk
> that a flush request can block on insertion if we are currently out of
> tags.
> 
> [   47.667190] BUG: scheduling while atomic: jbd2/sda2-8/2089/0x00000002
> [   47.674493] Modules linked in: x86_pkg_temp_thermal btrfs xor zlib_deflate raid6_pq sr_mod cdre
> [   47.690572] Preemption disabled at:
> [   47.690584] [<ffffffff81326c7c>] blk_mq_sched_get_request+0x6c/0x280
> [   47.701764] CPU: 1 PID: 2089 Comm: jbd2/sda2-8 Not tainted 4.11.0-rc7+ #271
> [   47.709630] Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.3.4 11/09/2016
> [   47.718081] Call Trace:
> [   47.720903]  dump_stack+0x4f/0x73
> [   47.724694]  ? blk_mq_sched_get_request+0x6c/0x280
> [   47.730137]  __schedule_bug+0x6c/0xc0
> [   47.734314]  __schedule+0x559/0x780
> [   47.738302]  schedule+0x3b/0x90
> [   47.741899]  io_schedule+0x11/0x40
> [   47.745788]  blk_mq_get_tag+0x167/0x2a0
> [   47.750162]  ? remove_wait_queue+0x70/0x70
> [   47.754901]  blk_mq_get_driver_tag+0x92/0xf0
> [   47.759758]  blk_mq_sched_insert_request+0x134/0x170
> [   47.765398]  ? blk_account_io_start+0xd0/0x270
> [   47.770679]  blk_mq_make_request+0x1b2/0x850
> [   47.775766]  generic_make_request+0xf7/0x2d0
> [   47.780860]  submit_bio+0x5f/0x120
> [   47.784979]  ? submit_bio+0x5f/0x120
> [   47.789631]  submit_bh_wbc.isra.46+0x10d/0x130
> [   47.794902]  submit_bh+0xb/0x10
> [   47.798719]  journal_submit_commit_record+0x190/0x210
> [   47.804686]  ? _raw_spin_unlock+0x13/0x30
> [   47.809480]  jbd2_journal_commit_transaction+0x180a/0x1d00
> [   47.815925]  kjournald2+0xb6/0x250
> [   47.820022]  ? kjournald2+0xb6/0x250
> [   47.824328]  ? remove_wait_queue+0x70/0x70
> [   47.829223]  kthread+0x10e/0x140
> [   47.833147]  ? commit_timeout+0x10/0x10
> [   47.837742]  ? kthread_create_on_node+0x40/0x40
> [   47.843122]  ret_from_fork+0x29/0x40
> 
> Fixes: a4d907b6a33b ("blk-mq: streamline blk_mq_make_request")
> Signed-off-by: Jens Axboe <axboe@fb.com>
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 9d7645f24b05..323eed50d615 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1634,8 +1634,10 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>  		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
>  		return cookie;
>  	} else if (q->elevator) {
> +		blk_mq_put_ctx(data.ctx);
>  		blk_mq_bio_to_request(rq, bio);
>  		blk_mq_sched_insert_request(rq, false, true, true, true);
> +		return cookie;
>  	} else if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio))
>  		blk_mq_run_hw_queue(data.hctx, true);
>  
> 

I'm confused, the first thing we check in make_request is:

	if (unlikely(is_flush_fua)) {
		blk_mq_bio_to_request(rq, bio);
		if (q->elevator) {
			blk_mq_sched_insert_request(rq, false, true, true,
					true);
		} else {
			blk_insert_flush(rq);
			blk_mq_run_hw_queue(data.hctx, true);
		}
	}

and can_block doesn't do anything in the !flush case, so shouldn't it be
changed in that one instead?

Alternatively, blk_mq_get_tag() knows how to drop the ctx before it
sleeps, but blk_mq_get_driver_tag() doesn't set data.ctx. I'm not sure
if that'll do what we want, and this is making my head hurt.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] blk-mq: fix schedule-while-atomic with scheduler attached
  2017-04-20 21:30 ` Omar Sandoval
@ 2017-04-20 21:41   ` Jens Axboe
  2017-04-20 21:47     ` Omar Sandoval
  2017-04-20 22:39   ` Jens Axboe
  1 sibling, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2017-04-20 21:41 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-block@vger.kernel.org, Christoph Hellwig

On 04/20/2017 03:30 PM, Omar Sandoval wrote:
> On Thu, Apr 20, 2017 at 03:13:43PM -0600, Jens Axboe wrote:
>> We must have dropped the ctx before we call
>> blk_mq_sched_insert_request() with can_block=true, otherwise we risk
>> that a flush request can block on insertion if we are currently out of
>> tags.
>>
>> [   47.667190] BUG: scheduling while atomic: jbd2/sda2-8/2089/0x00000002
>> [   47.674493] Modules linked in: x86_pkg_temp_thermal btrfs xor zlib_deflate raid6_pq sr_mod cdre
>> [   47.690572] Preemption disabled at:
>> [   47.690584] [<ffffffff81326c7c>] blk_mq_sched_get_request+0x6c/0x280
>> [   47.701764] CPU: 1 PID: 2089 Comm: jbd2/sda2-8 Not tainted 4.11.0-rc7+ #271
>> [   47.709630] Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.3.4 11/09/2016
>> [   47.718081] Call Trace:
>> [   47.720903]  dump_stack+0x4f/0x73
>> [   47.724694]  ? blk_mq_sched_get_request+0x6c/0x280
>> [   47.730137]  __schedule_bug+0x6c/0xc0
>> [   47.734314]  __schedule+0x559/0x780
>> [   47.738302]  schedule+0x3b/0x90
>> [   47.741899]  io_schedule+0x11/0x40
>> [   47.745788]  blk_mq_get_tag+0x167/0x2a0
>> [   47.750162]  ? remove_wait_queue+0x70/0x70
>> [   47.754901]  blk_mq_get_driver_tag+0x92/0xf0
>> [   47.759758]  blk_mq_sched_insert_request+0x134/0x170
>> [   47.765398]  ? blk_account_io_start+0xd0/0x270
>> [   47.770679]  blk_mq_make_request+0x1b2/0x850
>> [   47.775766]  generic_make_request+0xf7/0x2d0
>> [   47.780860]  submit_bio+0x5f/0x120
>> [   47.784979]  ? submit_bio+0x5f/0x120
>> [   47.789631]  submit_bh_wbc.isra.46+0x10d/0x130
>> [   47.794902]  submit_bh+0xb/0x10
>> [   47.798719]  journal_submit_commit_record+0x190/0x210
>> [   47.804686]  ? _raw_spin_unlock+0x13/0x30
>> [   47.809480]  jbd2_journal_commit_transaction+0x180a/0x1d00
>> [   47.815925]  kjournald2+0xb6/0x250
>> [   47.820022]  ? kjournald2+0xb6/0x250
>> [   47.824328]  ? remove_wait_queue+0x70/0x70
>> [   47.829223]  kthread+0x10e/0x140
>> [   47.833147]  ? commit_timeout+0x10/0x10
>> [   47.837742]  ? kthread_create_on_node+0x40/0x40
>> [   47.843122]  ret_from_fork+0x29/0x40
>>
>> Fixes: a4d907b6a33b ("blk-mq: streamline blk_mq_make_request")
>> Signed-off-by: Jens Axboe <axboe@fb.com>
>>
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index 9d7645f24b05..323eed50d615 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -1634,8 +1634,10 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>>  		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
>>  		return cookie;
>>  	} else if (q->elevator) {
>> +		blk_mq_put_ctx(data.ctx);
>>  		blk_mq_bio_to_request(rq, bio);
>>  		blk_mq_sched_insert_request(rq, false, true, true, true);
>> +		return cookie;
>>  	} else if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio))
>>  		blk_mq_run_hw_queue(data.hctx, true);
>>  
>>
> 
> I'm confused, the first thing we check in make_request is:
> 
> 	if (unlikely(is_flush_fua)) {
> 		blk_mq_bio_to_request(rq, bio);
> 		if (q->elevator) {
> 			blk_mq_sched_insert_request(rq, false, true, true,
> 					true);
> 		} else {
> 			blk_insert_flush(rq);
> 			blk_mq_run_hw_queue(data.hctx, true);
> 		}
> 	}
> 
> and can_block doesn't do anything in the !flush case, so shouldn't it be
> changed in that one instead?

It should be changed in both cases. But I do wonder why we're triggering
the bottom one, and not the top one, since this is clearly a flush.

> Alternatively, blk_mq_get_tag() knows how to drop the ctx before it
> sleeps, but blk_mq_get_driver_tag() doesn't set data.ctx. I'm not sure
> if that'll do what we want, and this is making my head hurt.

How about the below? Drop the ctx in all the cases. We only really want
to hold on to it for the case where we insert into the software queues.
But as long as we guarantee that all cases in that massive if/else
drops it, we should be fine, and we can skip the various places that
do manual returns to avoid having to hit the put_ctx/return cookie
case at the end.


diff --git a/block/blk-mq.c b/block/blk-mq.c
index dd9e8d6d471d..992f09772f8a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1571,6 +1571,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 
 	plug = current->plug;
 	if (unlikely(is_flush_fua)) {
+		blk_mq_put_ctx(data.ctx);
 		blk_mq_bio_to_request(rq, bio);
 		if (q->elevator) {
 			blk_mq_sched_insert_request(rq, false, true, true,
@@ -1582,6 +1583,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 	} else if (plug && q->nr_hw_queues == 1) {
 		struct request *last = NULL;
 
+		blk_mq_put_ctx(data.ctx);
 		blk_mq_bio_to_request(rq, bio);
 
 		/*
@@ -1626,20 +1628,19 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 		if (same_queue_rq)
 			blk_mq_try_issue_directly(data.hctx, same_queue_rq,
 					&cookie);
-
-		return cookie;
 	} else if (q->nr_hw_queues > 1 && is_sync) {
 		blk_mq_put_ctx(data.ctx);
 		blk_mq_bio_to_request(rq, bio);
 		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
-		return cookie;
 	} else if (q->elevator) {
+		blk_mq_put_ctx(data.ctx);
 		blk_mq_bio_to_request(rq, bio);
 		blk_mq_sched_insert_request(rq, false, true, true, true);
-	} else if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio))
+	} else if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio)) {
+		blk_mq_put_ctx(data.ctx);
 		blk_mq_run_hw_queue(data.hctx, true);
+	}
 
-	blk_mq_put_ctx(data.ctx);
 	return cookie;
 }
 

-- 
Jens Axboe

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] blk-mq: fix schedule-while-atomic with scheduler attached
  2017-04-20 21:41   ` Jens Axboe
@ 2017-04-20 21:47     ` Omar Sandoval
  2017-04-20 21:50       ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Omar Sandoval @ 2017-04-20 21:47 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block@vger.kernel.org, Christoph Hellwig

On Thu, Apr 20, 2017 at 03:41:21PM -0600, Jens Axboe wrote:
> On 04/20/2017 03:30 PM, Omar Sandoval wrote:
> > On Thu, Apr 20, 2017 at 03:13:43PM -0600, Jens Axboe wrote:
> >> We must have dropped the ctx before we call
> >> blk_mq_sched_insert_request() with can_block=true, otherwise we risk
> >> that a flush request can block on insertion if we are currently out of
> >> tags.
> >>
> >> [   47.667190] BUG: scheduling while atomic: jbd2/sda2-8/2089/0x00000002
> >> [   47.674493] Modules linked in: x86_pkg_temp_thermal btrfs xor zlib_deflate raid6_pq sr_mod cdre
> >> [   47.690572] Preemption disabled at:
> >> [   47.690584] [<ffffffff81326c7c>] blk_mq_sched_get_request+0x6c/0x280
> >> [   47.701764] CPU: 1 PID: 2089 Comm: jbd2/sda2-8 Not tainted 4.11.0-rc7+ #271
> >> [   47.709630] Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.3.4 11/09/2016
> >> [   47.718081] Call Trace:
> >> [   47.720903]  dump_stack+0x4f/0x73
> >> [   47.724694]  ? blk_mq_sched_get_request+0x6c/0x280
> >> [   47.730137]  __schedule_bug+0x6c/0xc0
> >> [   47.734314]  __schedule+0x559/0x780
> >> [   47.738302]  schedule+0x3b/0x90
> >> [   47.741899]  io_schedule+0x11/0x40
> >> [   47.745788]  blk_mq_get_tag+0x167/0x2a0
> >> [   47.750162]  ? remove_wait_queue+0x70/0x70
> >> [   47.754901]  blk_mq_get_driver_tag+0x92/0xf0
> >> [   47.759758]  blk_mq_sched_insert_request+0x134/0x170
> >> [   47.765398]  ? blk_account_io_start+0xd0/0x270
> >> [   47.770679]  blk_mq_make_request+0x1b2/0x850
> >> [   47.775766]  generic_make_request+0xf7/0x2d0
> >> [   47.780860]  submit_bio+0x5f/0x120
> >> [   47.784979]  ? submit_bio+0x5f/0x120
> >> [   47.789631]  submit_bh_wbc.isra.46+0x10d/0x130
> >> [   47.794902]  submit_bh+0xb/0x10
> >> [   47.798719]  journal_submit_commit_record+0x190/0x210
> >> [   47.804686]  ? _raw_spin_unlock+0x13/0x30
> >> [   47.809480]  jbd2_journal_commit_transaction+0x180a/0x1d00
> >> [   47.815925]  kjournald2+0xb6/0x250
> >> [   47.820022]  ? kjournald2+0xb6/0x250
> >> [   47.824328]  ? remove_wait_queue+0x70/0x70
> >> [   47.829223]  kthread+0x10e/0x140
> >> [   47.833147]  ? commit_timeout+0x10/0x10
> >> [   47.837742]  ? kthread_create_on_node+0x40/0x40
> >> [   47.843122]  ret_from_fork+0x29/0x40
> >>
> >> Fixes: a4d907b6a33b ("blk-mq: streamline blk_mq_make_request")
> >> Signed-off-by: Jens Axboe <axboe@fb.com>
> >>
> >> diff --git a/block/blk-mq.c b/block/blk-mq.c
> >> index 9d7645f24b05..323eed50d615 100644
> >> --- a/block/blk-mq.c
> >> +++ b/block/blk-mq.c
> >> @@ -1634,8 +1634,10 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
> >>  		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
> >>  		return cookie;
> >>  	} else if (q->elevator) {
> >> +		blk_mq_put_ctx(data.ctx);
> >>  		blk_mq_bio_to_request(rq, bio);
> >>  		blk_mq_sched_insert_request(rq, false, true, true, true);
> >> +		return cookie;
> >>  	} else if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio))
> >>  		blk_mq_run_hw_queue(data.hctx, true);
> >>  
> >>
> > 
> > I'm confused, the first thing we check in make_request is:
> > 
> > 	if (unlikely(is_flush_fua)) {
> > 		blk_mq_bio_to_request(rq, bio);
> > 		if (q->elevator) {
> > 			blk_mq_sched_insert_request(rq, false, true, true,
> > 					true);
> > 		} else {
> > 			blk_insert_flush(rq);
> > 			blk_mq_run_hw_queue(data.hctx, true);
> > 		}
> > 	}
> > 
> > and can_block doesn't do anything in the !flush case, so shouldn't it be
> > changed in that one instead?
> 
> It should be changed in both cases. But I do wonder why we're triggering
> the bottom one, and not the top one, since this is clearly a flush.
> 
> > Alternatively, blk_mq_get_tag() knows how to drop the ctx before it
> > sleeps, but blk_mq_get_driver_tag() doesn't set data.ctx. I'm not sure
> > if that'll do what we want, and this is making my head hurt.
> 
> How about the below? Drop the ctx in all the cases. We only really want
> to hold on to it for the case where we insert into the software queues.

So do we want to hold on to it or not when we insert into the software
queues? Because this change looks like it drops it before we do the
insert.

> But as long as we guarantee that all cases in that massive if/else
> drops it, we should be fine, and we can skip the various places that
> do manual returns to avoid having to hit the put_ctx/return cookie
> case at the end.
> 
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index dd9e8d6d471d..992f09772f8a 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1571,6 +1571,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>  
>  	plug = current->plug;
>  	if (unlikely(is_flush_fua)) {
> +		blk_mq_put_ctx(data.ctx);
>  		blk_mq_bio_to_request(rq, bio);
>  		if (q->elevator) {
>  			blk_mq_sched_insert_request(rq, false, true, true,
> @@ -1582,6 +1583,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>  	} else if (plug && q->nr_hw_queues == 1) {
>  		struct request *last = NULL;
>  
> +		blk_mq_put_ctx(data.ctx);
>  		blk_mq_bio_to_request(rq, bio);
>  
>  		/*
> @@ -1626,20 +1628,19 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>  		if (same_queue_rq)
>  			blk_mq_try_issue_directly(data.hctx, same_queue_rq,
>  					&cookie);
> -
> -		return cookie;
>  	} else if (q->nr_hw_queues > 1 && is_sync) {
>  		blk_mq_put_ctx(data.ctx);
>  		blk_mq_bio_to_request(rq, bio);
>  		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
> -		return cookie;
>  	} else if (q->elevator) {
> +		blk_mq_put_ctx(data.ctx);
>  		blk_mq_bio_to_request(rq, bio);
>  		blk_mq_sched_insert_request(rq, false, true, true, true);

If the scheduler doesn't have ->insert_request(), this'll insert into
the software queue.

> -	} else if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio))

This also inserts into the software queue.

> +	} else if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio)) {
> +		blk_mq_put_ctx(data.ctx);
>  		blk_mq_run_hw_queue(data.hctx, true);
> +	}
>  
> -	blk_mq_put_ctx(data.ctx);
>  	return cookie;
>  }
>  
> 
> -- 
> Jens Axboe
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] blk-mq: fix schedule-while-atomic with scheduler attached
  2017-04-20 21:47     ` Omar Sandoval
@ 2017-04-20 21:50       ` Jens Axboe
  0 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2017-04-20 21:50 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-block@vger.kernel.org, Christoph Hellwig

On 04/20/2017 03:47 PM, Omar Sandoval wrote:
> On Thu, Apr 20, 2017 at 03:41:21PM -0600, Jens Axboe wrote:
>> On 04/20/2017 03:30 PM, Omar Sandoval wrote:
>>> On Thu, Apr 20, 2017 at 03:13:43PM -0600, Jens Axboe wrote:
>>>> We must have dropped the ctx before we call
>>>> blk_mq_sched_insert_request() with can_block=true, otherwise we risk
>>>> that a flush request can block on insertion if we are currently out of
>>>> tags.
>>>>
>>>> [   47.667190] BUG: scheduling while atomic: jbd2/sda2-8/2089/0x00000002
>>>> [   47.674493] Modules linked in: x86_pkg_temp_thermal btrfs xor zlib_deflate raid6_pq sr_mod cdre
>>>> [   47.690572] Preemption disabled at:
>>>> [   47.690584] [<ffffffff81326c7c>] blk_mq_sched_get_request+0x6c/0x280
>>>> [   47.701764] CPU: 1 PID: 2089 Comm: jbd2/sda2-8 Not tainted 4.11.0-rc7+ #271
>>>> [   47.709630] Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.3.4 11/09/2016
>>>> [   47.718081] Call Trace:
>>>> [   47.720903]  dump_stack+0x4f/0x73
>>>> [   47.724694]  ? blk_mq_sched_get_request+0x6c/0x280
>>>> [   47.730137]  __schedule_bug+0x6c/0xc0
>>>> [   47.734314]  __schedule+0x559/0x780
>>>> [   47.738302]  schedule+0x3b/0x90
>>>> [   47.741899]  io_schedule+0x11/0x40
>>>> [   47.745788]  blk_mq_get_tag+0x167/0x2a0
>>>> [   47.750162]  ? remove_wait_queue+0x70/0x70
>>>> [   47.754901]  blk_mq_get_driver_tag+0x92/0xf0
>>>> [   47.759758]  blk_mq_sched_insert_request+0x134/0x170
>>>> [   47.765398]  ? blk_account_io_start+0xd0/0x270
>>>> [   47.770679]  blk_mq_make_request+0x1b2/0x850
>>>> [   47.775766]  generic_make_request+0xf7/0x2d0
>>>> [   47.780860]  submit_bio+0x5f/0x120
>>>> [   47.784979]  ? submit_bio+0x5f/0x120
>>>> [   47.789631]  submit_bh_wbc.isra.46+0x10d/0x130
>>>> [   47.794902]  submit_bh+0xb/0x10
>>>> [   47.798719]  journal_submit_commit_record+0x190/0x210
>>>> [   47.804686]  ? _raw_spin_unlock+0x13/0x30
>>>> [   47.809480]  jbd2_journal_commit_transaction+0x180a/0x1d00
>>>> [   47.815925]  kjournald2+0xb6/0x250
>>>> [   47.820022]  ? kjournald2+0xb6/0x250
>>>> [   47.824328]  ? remove_wait_queue+0x70/0x70
>>>> [   47.829223]  kthread+0x10e/0x140
>>>> [   47.833147]  ? commit_timeout+0x10/0x10
>>>> [   47.837742]  ? kthread_create_on_node+0x40/0x40
>>>> [   47.843122]  ret_from_fork+0x29/0x40
>>>>
>>>> Fixes: a4d907b6a33b ("blk-mq: streamline blk_mq_make_request")
>>>> Signed-off-by: Jens Axboe <axboe@fb.com>
>>>>
>>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>>> index 9d7645f24b05..323eed50d615 100644
>>>> --- a/block/blk-mq.c
>>>> +++ b/block/blk-mq.c
>>>> @@ -1634,8 +1634,10 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>>>>  		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
>>>>  		return cookie;
>>>>  	} else if (q->elevator) {
>>>> +		blk_mq_put_ctx(data.ctx);
>>>>  		blk_mq_bio_to_request(rq, bio);
>>>>  		blk_mq_sched_insert_request(rq, false, true, true, true);
>>>> +		return cookie;
>>>>  	} else if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio))
>>>>  		blk_mq_run_hw_queue(data.hctx, true);
>>>>  
>>>>
>>>
>>> I'm confused, the first thing we check in make_request is:
>>>
>>> 	if (unlikely(is_flush_fua)) {
>>> 		blk_mq_bio_to_request(rq, bio);
>>> 		if (q->elevator) {
>>> 			blk_mq_sched_insert_request(rq, false, true, true,
>>> 					true);
>>> 		} else {
>>> 			blk_insert_flush(rq);
>>> 			blk_mq_run_hw_queue(data.hctx, true);
>>> 		}
>>> 	}
>>>
>>> and can_block doesn't do anything in the !flush case, so shouldn't it be
>>> changed in that one instead?
>>
>> It should be changed in both cases. But I do wonder why we're triggering
>> the bottom one, and not the top one, since this is clearly a flush.
>>
>>> Alternatively, blk_mq_get_tag() knows how to drop the ctx before it
>>> sleeps, but blk_mq_get_driver_tag() doesn't set data.ctx. I'm not sure
>>> if that'll do what we want, and this is making my head hurt.
>>
>> How about the below? Drop the ctx in all the cases. We only really want
>> to hold on to it for the case where we insert into the software queues.
> 
> So do we want to hold on to it or not when we insert into the software
> queues? Because this change looks like it drops it before we do the
> insert.

See below, just an optimization.

>> But as long as we guarantee that all cases in that massive if/else
>> drops it, we should be fine, and we can skip the various places that
>> do manual returns to avoid having to hit the put_ctx/return cookie
>> case at the end.
>>
>>
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index dd9e8d6d471d..992f09772f8a 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -1571,6 +1571,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>>  
>>  	plug = current->plug;
>>  	if (unlikely(is_flush_fua)) {
>> +		blk_mq_put_ctx(data.ctx);
>>  		blk_mq_bio_to_request(rq, bio);
>>  		if (q->elevator) {
>>  			blk_mq_sched_insert_request(rq, false, true, true,
>> @@ -1582,6 +1583,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>>  	} else if (plug && q->nr_hw_queues == 1) {
>>  		struct request *last = NULL;
>>  
>> +		blk_mq_put_ctx(data.ctx);
>>  		blk_mq_bio_to_request(rq, bio);
>>  
>>  		/*
>> @@ -1626,20 +1628,19 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>>  		if (same_queue_rq)
>>  			blk_mq_try_issue_directly(data.hctx, same_queue_rq,
>>  					&cookie);
>> -
>> -		return cookie;
>>  	} else if (q->nr_hw_queues > 1 && is_sync) {
>>  		blk_mq_put_ctx(data.ctx);
>>  		blk_mq_bio_to_request(rq, bio);
>>  		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
>> -		return cookie;
>>  	} else if (q->elevator) {
>> +		blk_mq_put_ctx(data.ctx);
>>  		blk_mq_bio_to_request(rq, bio);
>>  		blk_mq_sched_insert_request(rq, false, true, true, true);
> 
> If the scheduler doesn't have ->insert_request(), this'll insert into
> the software queue.

I forgot that Kyber uses the software queues. But we can't hold on to
it here, we just have to hope that we don't preempt for that case, and
insert on the "wrong" software queue. I'm guessing this should happen
very rarely, if ever, so it's not a huge deal.

>> -	} else if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio))
> 
> This also inserts into the software queue.

Right, and we hold on to it here.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] blk-mq: fix schedule-while-atomic with scheduler attached
  2017-04-20 21:30 ` Omar Sandoval
  2017-04-20 21:41   ` Jens Axboe
@ 2017-04-20 22:39   ` Jens Axboe
  2017-04-20 22:41     ` Omar Sandoval
  1 sibling, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2017-04-20 22:39 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-block@vger.kernel.org, Christoph Hellwig

On 04/20/2017 03:30 PM, Omar Sandoval wrote:
> On Thu, Apr 20, 2017 at 03:13:43PM -0600, Jens Axboe wrote:
>> We must have dropped the ctx before we call
>> blk_mq_sched_insert_request() with can_block=true, otherwise we risk
>> that a flush request can block on insertion if we are currently out of
>> tags.
>>
>> [   47.667190] BUG: scheduling while atomic: jbd2/sda2-8/2089/0x00000002
>> [   47.674493] Modules linked in: x86_pkg_temp_thermal btrfs xor zlib_deflate raid6_pq sr_mod cdre
>> [   47.690572] Preemption disabled at:
>> [   47.690584] [<ffffffff81326c7c>] blk_mq_sched_get_request+0x6c/0x280
>> [   47.701764] CPU: 1 PID: 2089 Comm: jbd2/sda2-8 Not tainted 4.11.0-rc7+ #271
>> [   47.709630] Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.3.4 11/09/2016
>> [   47.718081] Call Trace:
>> [   47.720903]  dump_stack+0x4f/0x73
>> [   47.724694]  ? blk_mq_sched_get_request+0x6c/0x280
>> [   47.730137]  __schedule_bug+0x6c/0xc0
>> [   47.734314]  __schedule+0x559/0x780
>> [   47.738302]  schedule+0x3b/0x90
>> [   47.741899]  io_schedule+0x11/0x40
>> [   47.745788]  blk_mq_get_tag+0x167/0x2a0
>> [   47.750162]  ? remove_wait_queue+0x70/0x70
>> [   47.754901]  blk_mq_get_driver_tag+0x92/0xf0
>> [   47.759758]  blk_mq_sched_insert_request+0x134/0x170
>> [   47.765398]  ? blk_account_io_start+0xd0/0x270
>> [   47.770679]  blk_mq_make_request+0x1b2/0x850
>> [   47.775766]  generic_make_request+0xf7/0x2d0
>> [   47.780860]  submit_bio+0x5f/0x120
>> [   47.784979]  ? submit_bio+0x5f/0x120
>> [   47.789631]  submit_bh_wbc.isra.46+0x10d/0x130
>> [   47.794902]  submit_bh+0xb/0x10
>> [   47.798719]  journal_submit_commit_record+0x190/0x210
>> [   47.804686]  ? _raw_spin_unlock+0x13/0x30
>> [   47.809480]  jbd2_journal_commit_transaction+0x180a/0x1d00
>> [   47.815925]  kjournald2+0xb6/0x250
>> [   47.820022]  ? kjournald2+0xb6/0x250
>> [   47.824328]  ? remove_wait_queue+0x70/0x70
>> [   47.829223]  kthread+0x10e/0x140
>> [   47.833147]  ? commit_timeout+0x10/0x10
>> [   47.837742]  ? kthread_create_on_node+0x40/0x40
>> [   47.843122]  ret_from_fork+0x29/0x40
>>
>> Fixes: a4d907b6a33b ("blk-mq: streamline blk_mq_make_request")
>> Signed-off-by: Jens Axboe <axboe@fb.com>
>>
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index 9d7645f24b05..323eed50d615 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -1634,8 +1634,10 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>>  		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
>>  		return cookie;
>>  	} else if (q->elevator) {
>> +		blk_mq_put_ctx(data.ctx);
>>  		blk_mq_bio_to_request(rq, bio);
>>  		blk_mq_sched_insert_request(rq, false, true, true, true);
>> +		return cookie;
>>  	} else if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio))
>>  		blk_mq_run_hw_queue(data.hctx, true);
>>  
>>
> 
> I'm confused, the first thing we check in make_request is:
> 
> 	if (unlikely(is_flush_fua)) {
> 		blk_mq_bio_to_request(rq, bio);
> 		if (q->elevator) {
> 			blk_mq_sched_insert_request(rq, false, true, true,
> 					true);
> 		} else {
> 			blk_insert_flush(rq);
> 			blk_mq_run_hw_queue(data.hctx, true);
> 		}
> 	}
> 
> and can_block doesn't do anything in the !flush case, so shouldn't it be
> changed in that one instead?

Just to get closure on this issue, the two cases ends up being folded
into one. So we're really triggering the first case, but it's a jump
to the 2nd one.

Both cases should still be fixed up, the 2nd patch I sent out should be
fine.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] blk-mq: fix schedule-while-atomic with scheduler attached
  2017-04-20 22:39   ` Jens Axboe
@ 2017-04-20 22:41     ` Omar Sandoval
  2017-04-20 22:42       ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Omar Sandoval @ 2017-04-20 22:41 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block@vger.kernel.org, Christoph Hellwig

On Thu, Apr 20, 2017 at 04:39:10PM -0600, Jens Axboe wrote:
> On 04/20/2017 03:30 PM, Omar Sandoval wrote:
> > On Thu, Apr 20, 2017 at 03:13:43PM -0600, Jens Axboe wrote:
> >> We must have dropped the ctx before we call
> >> blk_mq_sched_insert_request() with can_block=true, otherwise we risk
> >> that a flush request can block on insertion if we are currently out of
> >> tags.
> >>
> >> [   47.667190] BUG: scheduling while atomic: jbd2/sda2-8/2089/0x00000002
> >> [   47.674493] Modules linked in: x86_pkg_temp_thermal btrfs xor zlib_deflate raid6_pq sr_mod cdre
> >> [   47.690572] Preemption disabled at:
> >> [   47.690584] [<ffffffff81326c7c>] blk_mq_sched_get_request+0x6c/0x280
> >> [   47.701764] CPU: 1 PID: 2089 Comm: jbd2/sda2-8 Not tainted 4.11.0-rc7+ #271
> >> [   47.709630] Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.3.4 11/09/2016
> >> [   47.718081] Call Trace:
> >> [   47.720903]  dump_stack+0x4f/0x73
> >> [   47.724694]  ? blk_mq_sched_get_request+0x6c/0x280
> >> [   47.730137]  __schedule_bug+0x6c/0xc0
> >> [   47.734314]  __schedule+0x559/0x780
> >> [   47.738302]  schedule+0x3b/0x90
> >> [   47.741899]  io_schedule+0x11/0x40
> >> [   47.745788]  blk_mq_get_tag+0x167/0x2a0
> >> [   47.750162]  ? remove_wait_queue+0x70/0x70
> >> [   47.754901]  blk_mq_get_driver_tag+0x92/0xf0
> >> [   47.759758]  blk_mq_sched_insert_request+0x134/0x170
> >> [   47.765398]  ? blk_account_io_start+0xd0/0x270
> >> [   47.770679]  blk_mq_make_request+0x1b2/0x850
> >> [   47.775766]  generic_make_request+0xf7/0x2d0
> >> [   47.780860]  submit_bio+0x5f/0x120
> >> [   47.784979]  ? submit_bio+0x5f/0x120
> >> [   47.789631]  submit_bh_wbc.isra.46+0x10d/0x130
> >> [   47.794902]  submit_bh+0xb/0x10
> >> [   47.798719]  journal_submit_commit_record+0x190/0x210
> >> [   47.804686]  ? _raw_spin_unlock+0x13/0x30
> >> [   47.809480]  jbd2_journal_commit_transaction+0x180a/0x1d00
> >> [   47.815925]  kjournald2+0xb6/0x250
> >> [   47.820022]  ? kjournald2+0xb6/0x250
> >> [   47.824328]  ? remove_wait_queue+0x70/0x70
> >> [   47.829223]  kthread+0x10e/0x140
> >> [   47.833147]  ? commit_timeout+0x10/0x10
> >> [   47.837742]  ? kthread_create_on_node+0x40/0x40
> >> [   47.843122]  ret_from_fork+0x29/0x40
> >>
> >> Fixes: a4d907b6a33b ("blk-mq: streamline blk_mq_make_request")
> >> Signed-off-by: Jens Axboe <axboe@fb.com>
> >>
> >> diff --git a/block/blk-mq.c b/block/blk-mq.c
> >> index 9d7645f24b05..323eed50d615 100644
> >> --- a/block/blk-mq.c
> >> +++ b/block/blk-mq.c
> >> @@ -1634,8 +1634,10 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
> >>  		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
> >>  		return cookie;
> >>  	} else if (q->elevator) {
> >> +		blk_mq_put_ctx(data.ctx);
> >>  		blk_mq_bio_to_request(rq, bio);
> >>  		blk_mq_sched_insert_request(rq, false, true, true, true);
> >> +		return cookie;
> >>  	} else if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio))
> >>  		blk_mq_run_hw_queue(data.hctx, true);
> >>  
> >>
> > 
> > I'm confused, the first thing we check in make_request is:
> > 
> > 	if (unlikely(is_flush_fua)) {
> > 		blk_mq_bio_to_request(rq, bio);
> > 		if (q->elevator) {
> > 			blk_mq_sched_insert_request(rq, false, true, true,
> > 					true);
> > 		} else {
> > 			blk_insert_flush(rq);
> > 			blk_mq_run_hw_queue(data.hctx, true);
> > 		}
> > 	}
> > 
> > and can_block doesn't do anything in the !flush case, so shouldn't it be
> > changed in that one instead?
> 
> Just to get closure on this issue, the two cases ends up being folded
> into one. So we're really triggering the first case, but it's a jump
> to the 2nd one.
> 
> Both cases should still be fixed up, the 2nd patch I sent out should be
> fine.

You can add

Reviewed-by: Omar Sandoval <osandov@fb.com>

for the 2nd patch.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] blk-mq: fix schedule-while-atomic with scheduler attached
  2017-04-20 22:41     ` Omar Sandoval
@ 2017-04-20 22:42       ` Jens Axboe
  0 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2017-04-20 22:42 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-block@vger.kernel.org, Christoph Hellwig

On 04/20/2017 04:41 PM, Omar Sandoval wrote:
> On Thu, Apr 20, 2017 at 04:39:10PM -0600, Jens Axboe wrote:
>> On 04/20/2017 03:30 PM, Omar Sandoval wrote:
>>> On Thu, Apr 20, 2017 at 03:13:43PM -0600, Jens Axboe wrote:
>>>> We must have dropped the ctx before we call
>>>> blk_mq_sched_insert_request() with can_block=true, otherwise we risk
>>>> that a flush request can block on insertion if we are currently out of
>>>> tags.
>>>>
>>>> [   47.667190] BUG: scheduling while atomic: jbd2/sda2-8/2089/0x00000002
>>>> [   47.674493] Modules linked in: x86_pkg_temp_thermal btrfs xor zlib_deflate raid6_pq sr_mod cdre
>>>> [   47.690572] Preemption disabled at:
>>>> [   47.690584] [<ffffffff81326c7c>] blk_mq_sched_get_request+0x6c/0x280
>>>> [   47.701764] CPU: 1 PID: 2089 Comm: jbd2/sda2-8 Not tainted 4.11.0-rc7+ #271
>>>> [   47.709630] Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.3.4 11/09/2016
>>>> [   47.718081] Call Trace:
>>>> [   47.720903]  dump_stack+0x4f/0x73
>>>> [   47.724694]  ? blk_mq_sched_get_request+0x6c/0x280
>>>> [   47.730137]  __schedule_bug+0x6c/0xc0
>>>> [   47.734314]  __schedule+0x559/0x780
>>>> [   47.738302]  schedule+0x3b/0x90
>>>> [   47.741899]  io_schedule+0x11/0x40
>>>> [   47.745788]  blk_mq_get_tag+0x167/0x2a0
>>>> [   47.750162]  ? remove_wait_queue+0x70/0x70
>>>> [   47.754901]  blk_mq_get_driver_tag+0x92/0xf0
>>>> [   47.759758]  blk_mq_sched_insert_request+0x134/0x170
>>>> [   47.765398]  ? blk_account_io_start+0xd0/0x270
>>>> [   47.770679]  blk_mq_make_request+0x1b2/0x850
>>>> [   47.775766]  generic_make_request+0xf7/0x2d0
>>>> [   47.780860]  submit_bio+0x5f/0x120
>>>> [   47.784979]  ? submit_bio+0x5f/0x120
>>>> [   47.789631]  submit_bh_wbc.isra.46+0x10d/0x130
>>>> [   47.794902]  submit_bh+0xb/0x10
>>>> [   47.798719]  journal_submit_commit_record+0x190/0x210
>>>> [   47.804686]  ? _raw_spin_unlock+0x13/0x30
>>>> [   47.809480]  jbd2_journal_commit_transaction+0x180a/0x1d00
>>>> [   47.815925]  kjournald2+0xb6/0x250
>>>> [   47.820022]  ? kjournald2+0xb6/0x250
>>>> [   47.824328]  ? remove_wait_queue+0x70/0x70
>>>> [   47.829223]  kthread+0x10e/0x140
>>>> [   47.833147]  ? commit_timeout+0x10/0x10
>>>> [   47.837742]  ? kthread_create_on_node+0x40/0x40
>>>> [   47.843122]  ret_from_fork+0x29/0x40
>>>>
>>>> Fixes: a4d907b6a33b ("blk-mq: streamline blk_mq_make_request")
>>>> Signed-off-by: Jens Axboe <axboe@fb.com>
>>>>
>>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>>> index 9d7645f24b05..323eed50d615 100644
>>>> --- a/block/blk-mq.c
>>>> +++ b/block/blk-mq.c
>>>> @@ -1634,8 +1634,10 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>>>>  		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
>>>>  		return cookie;
>>>>  	} else if (q->elevator) {
>>>> +		blk_mq_put_ctx(data.ctx);
>>>>  		blk_mq_bio_to_request(rq, bio);
>>>>  		blk_mq_sched_insert_request(rq, false, true, true, true);
>>>> +		return cookie;
>>>>  	} else if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio))
>>>>  		blk_mq_run_hw_queue(data.hctx, true);
>>>>  
>>>>
>>>
>>> I'm confused, the first thing we check in make_request is:
>>>
>>> 	if (unlikely(is_flush_fua)) {
>>> 		blk_mq_bio_to_request(rq, bio);
>>> 		if (q->elevator) {
>>> 			blk_mq_sched_insert_request(rq, false, true, true,
>>> 					true);
>>> 		} else {
>>> 			blk_insert_flush(rq);
>>> 			blk_mq_run_hw_queue(data.hctx, true);
>>> 		}
>>> 	}
>>>
>>> and can_block doesn't do anything in the !flush case, so shouldn't it be
>>> changed in that one instead?
>>
>> Just to get closure on this issue, the two cases ends up being folded
>> into one. So we're really triggering the first case, but it's a jump
>> to the 2nd one.
>>
>> Both cases should still be fixed up, the 2nd patch I sent out should be
>> fine.
> 
> You can add
> 
> Reviewed-by: Omar Sandoval <osandov@fb.com>
> 
> for the 2nd patch.

Thanks Omar, queued up for 4.12.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-04-20 22:42 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-04-20 21:13 [PATCH] blk-mq: fix schedule-while-atomic with scheduler attached Jens Axboe
2017-04-20 21:30 ` Omar Sandoval
2017-04-20 21:41   ` Jens Axboe
2017-04-20 21:47     ` Omar Sandoval
2017-04-20 21:50       ` Jens Axboe
2017-04-20 22:39   ` Jens Axboe
2017-04-20 22:41     ` Omar Sandoval
2017-04-20 22:42       ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox