* [PATCH 1/6] blk-mq: Make blk_mq_quiesce_queue() wait for all .queue_rq() calls
[not found] <20170427155437.23228-1-bart.vanassche@sandisk.com>
@ 2017-04-27 15:54 ` Bart Van Assche
2017-04-28 2:00 ` Ming Lei
0 siblings, 1 reply; 5+ messages in thread
From: Bart Van Assche @ 2017-04-27 15:54 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Bart Van Assche, Hannes Reinecke, Omar Sandoval,
Ming Lei, stable
blk_mq_quiesce_queue() callers, e.g. elevator_switch_mq(), assume
that no .queue_rq() calls occur while switching to another I/O
scheduler. This patch fixes the following kernel crash if another
I/O scheduler than "none" is the default scheduler:
general protection fault: 0000 [#1] SMP
RIP: 0010:__lock_acquire+0xfe/0x1280
Call Trace:
lock_acquire+0xd5/0x1c0
_raw_spin_lock+0x2a/0x40
dd_dispatch_request+0x29/0x1e0
blk_mq_sched_dispatch_requests+0x139/0x190
__blk_mq_run_hw_queue+0x12d/0x1c0
blk_mq_run_work_fn+0xd/0x10
process_one_work+0x206/0x6a0
worker_thread+0x49/0x4a0
kthread+0x107/0x140
ret_from_fork+0x2e/0x40
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: <stable@vger.kernel.org>
---
block/blk-mq.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index b75ef2392db7..3b3420f76b5a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1224,8 +1224,9 @@ EXPORT_SYMBOL(blk_mq_queue_stopped);
void blk_mq_stop_hw_queue(struct blk_mq_hw_ctx *hctx)
{
- cancel_work(&hctx->run_work);
- cancel_delayed_work(&hctx->delay_work);
+ cancel_work_sync(&hctx->run_work);
+ cancel_delayed_work_sync(&hctx->delay_work);
+ cancel_delayed_work_sync(&hctx->delayed_run_work);
set_bit(BLK_MQ_S_STOPPED, &hctx->state);
}
EXPORT_SYMBOL(blk_mq_stop_hw_queue);
--
2.12.2
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 1/6] blk-mq: Make blk_mq_quiesce_queue() wait for all .queue_rq() calls
2017-04-27 15:54 ` [PATCH 1/6] blk-mq: Make blk_mq_quiesce_queue() wait for all .queue_rq() calls Bart Van Assche
@ 2017-04-28 2:00 ` Ming Lei
2017-04-28 3:48 ` Ming Lei
2017-04-28 15:35 ` Bart Van Assche
0 siblings, 2 replies; 5+ messages in thread
From: Ming Lei @ 2017-04-28 2:00 UTC (permalink / raw)
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Hannes Reinecke, Omar Sandoval, stable
Hi Bart,
On Thu, Apr 27, 2017 at 08:54:32AM -0700, Bart Van Assche wrote:
> blk_mq_quiesce_queue() callers, e.g. elevator_switch_mq(), assume
> that no .queue_rq() calls occur while switching to another I/O
I think we should call blk_mq_freeze_queue_wait() instead
of blk_quiesce_queue() in elevator_switch_mq(), otherwise
it is easy to cause use-after-free.
> scheduler. This patch fixes the following kernel crash if another
> I/O scheduler than "none" is the default scheduler:
>
> general protection fault: 0000 [#1] SMP
> RIP: 0010:__lock_acquire+0xfe/0x1280
> Call Trace:
> lock_acquire+0xd5/0x1c0
> _raw_spin_lock+0x2a/0x40
> dd_dispatch_request+0x29/0x1e0
> blk_mq_sched_dispatch_requests+0x139/0x190
> __blk_mq_run_hw_queue+0x12d/0x1c0
> blk_mq_run_work_fn+0xd/0x10
> process_one_work+0x206/0x6a0
> worker_thread+0x49/0x4a0
> kthread+0x107/0x140
> ret_from_fork+0x2e/0x40
>
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: Hannes Reinecke <hare@suse.com>
> Cc: Omar Sandoval <osandov@fb.com>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: <stable@vger.kernel.org>
> ---
> block/blk-mq.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index b75ef2392db7..3b3420f76b5a 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1224,8 +1224,9 @@ EXPORT_SYMBOL(blk_mq_queue_stopped);
>
> void blk_mq_stop_hw_queue(struct blk_mq_hw_ctx *hctx)
> {
> - cancel_work(&hctx->run_work);
> - cancel_delayed_work(&hctx->delay_work);
> + cancel_work_sync(&hctx->run_work);
> + cancel_delayed_work_sync(&hctx->delay_work);
Could you explain it a bit why we need the sync version?
> + cancel_delayed_work_sync(&hctx->delayed_run_work);
More introduced, more bugs may come, :-)
So I suggest to unity both .run_work and .dealyed_run_work
into one work, just as what Jens did in the following link:
http://marc.info/?t=149183989800010&r=1&w=2
Thanks,
Ming
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/6] blk-mq: Make blk_mq_quiesce_queue() wait for all .queue_rq() calls
2017-04-28 2:00 ` Ming Lei
@ 2017-04-28 3:48 ` Ming Lei
2017-04-28 15:35 ` Bart Van Assche
1 sibling, 0 replies; 5+ messages in thread
From: Ming Lei @ 2017-04-28 3:48 UTC (permalink / raw)
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Hannes Reinecke, Omar Sandoval, stable
Hi,
On Fri, Apr 28, 2017 at 10:00:44AM +0800, Ming Lei wrote:
> Hi Bart,
>
> On Thu, Apr 27, 2017 at 08:54:32AM -0700, Bart Van Assche wrote:
> > blk_mq_quiesce_queue() callers, e.g. elevator_switch_mq(), assume
> > that no .queue_rq() calls occur while switching to another I/O
>
> I think we should call blk_mq_freeze_queue_wait() instead
> of blk_quiesce_queue() in elevator_switch_mq(), otherwise
> it is easy to cause use-after-free.
Sorry, my fault, blk_mq_freeze_queue_wait() has been done in
blk_mq_freeze_queue() already, so no new IO can enter queue
any more, and all old pending IOs will be fninished when
blk_mq_freeze_queue() returns.
>
> > scheduler. This patch fixes the following kernel crash if another
> > I/O scheduler than "none" is the default scheduler:
> >
> > general protection fault: 0000 [#1] SMP
> > RIP: 0010:__lock_acquire+0xfe/0x1280
> > Call Trace:
> > lock_acquire+0xd5/0x1c0
> > _raw_spin_lock+0x2a/0x40
> > dd_dispatch_request+0x29/0x1e0
> > blk_mq_sched_dispatch_requests+0x139/0x190
> > __blk_mq_run_hw_queue+0x12d/0x1c0
> > blk_mq_run_work_fn+0xd/0x10
> > process_one_work+0x206/0x6a0
> > worker_thread+0x49/0x4a0
> > kthread+0x107/0x140
> > ret_from_fork+0x2e/0x40
So I am just wondering where the I/O in the above issue is
from? and feels there might be other bug behind the issue.
Also blk_mq_quiesce_queue() in elevator_switch_mq()
shouldn't be necessary.
Thanks,
Ming
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/6] blk-mq: Make blk_mq_quiesce_queue() wait for all .queue_rq() calls
2017-04-28 2:00 ` Ming Lei
2017-04-28 3:48 ` Ming Lei
@ 2017-04-28 15:35 ` Bart Van Assche
2017-04-28 16:02 ` Ming Lei
1 sibling, 1 reply; 5+ messages in thread
From: Bart Van Assche @ 2017-04-28 15:35 UTC (permalink / raw)
To: ming.lei@redhat.com
Cc: osandov@fb.com, linux-block@vger.kernel.org, hare@suse.com,
stable@vger.kernel.org, axboe@kernel.dk
On Fri, 2017-04-28 at 10:00 +0800, Ming Lei wrote:
> On Thu, Apr 27, 2017 at 08:54:32AM -0700, Bart Van Assche wrote:
> > void blk_mq_stop_hw_queue(struct blk_mq_hw_ctx *hctx)
> > {
> > - cancel_work(&hctx->run_work);
> > - cancel_delayed_work(&hctx->delay_work);
> > + cancel_work_sync(&hctx->run_work);
> > + cancel_delayed_work_sync(&hctx->delay_work);
>
> Could you explain it a bit why we need the sync version?
Because the purpose of this patch is to make blk_mq_quiesce_queue() wait for
all .queue_rq() calls.
> So I suggest to unity both .run_work and .dealyed_run_work
> into one work, just as what Jens did in the following link:
>
> http://marc.info/?t=149183989800010&r=1&w=2
That should be done after this patch is upstream otherwise this
patch won't apply to the stable trees.
Bart.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/6] blk-mq: Make blk_mq_quiesce_queue() wait for all .queue_rq() calls
2017-04-28 15:35 ` Bart Van Assche
@ 2017-04-28 16:02 ` Ming Lei
0 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2017-04-28 16:02 UTC (permalink / raw)
To: Bart Van Assche
Cc: osandov@fb.com, linux-block@vger.kernel.org, hare@suse.com,
stable@vger.kernel.org, axboe@kernel.dk
On Fri, Apr 28, 2017 at 03:35:59PM +0000, Bart Van Assche wrote:
> On Fri, 2017-04-28 at 10:00 +0800, Ming Lei wrote:
> > On Thu, Apr 27, 2017 at 08:54:32AM -0700, Bart Van Assche wrote:
> > > void blk_mq_stop_hw_queue(struct blk_mq_hw_ctx *hctx)
> > > {
> > > - cancel_work(&hctx->run_work);
> > > - cancel_delayed_work(&hctx->delay_work);
> > > + cancel_work_sync(&hctx->run_work);
> > > + cancel_delayed_work_sync(&hctx->delay_work);
> >
> > Could you explain it a bit why we need the sync version?
>
> Because the purpose of this patch is to make blk_mq_quiesce_queue() wait for
> all .queue_rq() calls.
If only .queue_rq() from this queue is waited, blk_mq_freeze_queue() is
enough. If you want to wait .queue_rq() from all queues,
blk_mq_quiesce_queue() can't do that too.
Firstly there shouldn't be a 'struct request_queue' parameter passed to this
function, because you want to wait for .queue_rq() from all queues.
Secondaly your current implementation can't guarantee that point:
- only hw queues of this queue are stopped in this function, and
.queue_rq() from other queues still can be run once
synchronize_rcu() returns.
- for BLOCKING case, the SRCU is per hctx.
Thirdly I am still not sure why we want to wait for all .queue_rq(),
could you explain it a bit?
At least for mq scheduler switch, looks all schedule data is per request
queue, and we shouldn't have needed to wait all .queue_rq().
>
> > So I suggest to unity both .run_work and .dealyed_run_work
> > into one work, just as what Jens did in the following link:
> >
> > http://marc.info/?t=149183989800010&r=1&w=2
>
> That should be done after this patch is upstream otherwise this
> patch won't apply to the stable trees.
OK.
Thanks,
Ming
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2017-04-28 16:02 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20170427155437.23228-1-bart.vanassche@sandisk.com>
2017-04-27 15:54 ` [PATCH 1/6] blk-mq: Make blk_mq_quiesce_queue() wait for all .queue_rq() calls Bart Van Assche
2017-04-28 2:00 ` Ming Lei
2017-04-28 3:48 ` Ming Lei
2017-04-28 15:35 ` Bart Van Assche
2017-04-28 16:02 ` Ming Lei
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox