* [PATCH] blk-mq: Fix blk_execute_rq_nowait() handling of dying queues
From: Bart Van Assche @ 2017-04-11 23:58 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, Bart Van Assche, Mike Snitzer, Ming Lei, stable
Although blk_execute_rq_nowait() asks blk_mq_sched_insert_request()
to run the queue, the function that should run the queue
(__blk_mq_delay_run_hw_queue()) skips hardware queues for which
.tags == NULL. Since blk_mq_free_tag_set() clears .tags, this means
that if blk_execute_rq_nowait() is called after the tag set has been
freed, the queued request will never be executed.
In my tests I noticed that every now and then an SG_IO request that
got queued by multipathd on a dm device did not get executed. This
resulted in either a memory leak complaint about the SG_IO code or
the dm device becoming unremovable with e.g. the following state:
$ grep busy= /sys/kernel/debug/block/dm*/mq/*
/sys/kernel/debug/block/dm-0/mq/state:SAME_COMP STACKABLE IO_STAT INIT_DONE POLL REGISTERED, pg_init_in_progress=0, nr_valid_paths=4, flags= RETAIN_ATTACHED_HW_HANDLER, paths: [0:0] active=1 busy=0 dying dead [1:0] active=1 busy=0 dying dead [2:0] active=1 busy=0 dying dead [3:0] active=1 busy=0 dying dead
$ multipath -ll
mpathu (3600140572616d6469736b32000000000) dm-0 ##,##
size=984M features='3 retain_attached_hw_handler queue_mode mq' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=0 status=active
|-+- policy='service-time 0' prio=0 status=undef
|-+- policy='service-time 0' prio=0 status=undef
`-+- policy='service-time 0' prio=0 status=undef
Prevent blk_execute_rq_nowait() from being called to queue a request
onto a dying queue by changing the blk_freeze_queue_start() call
in blk_set_queue_dying() into a blk_freeze_queue() call.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: <stable@vger.kernel.org>
---
block/blk-core.c | 9 +++++----
block/blk-exec.c | 7 +++++--
2 files changed, 10 insertions(+), 6 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 8654aa0cef6d..21314b995887 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -501,11 +501,12 @@ void blk_set_queue_dying(struct request_queue *q)
 	spin_unlock_irq(q->queue_lock);
 
 	/*
-	 * When queue DYING flag is set, we need to block new req
-	 * entering queue, so we call blk_freeze_queue_start() to
-	 * prevent I/O from crossing blk_queue_enter().
+	 * When queue DYING flag is set, we need to block new requests
+	 * from being queued. Hence call blk_freeze_queue() to make
+	 * new blk_queue_enter() calls fail and to wait until all pending
+	 * I/O has finished.
 	 */
-	blk_freeze_queue_start(q);
+	blk_freeze_queue(q);
 
 	if (q->mq_ops)
 		blk_mq_wake_waiters(q);
diff --git a/block/blk-exec.c b/block/blk-exec.c
index 8cd0e9bc8dc8..f7d9bed2cb15 100644
--- a/block/blk-exec.c
+++ b/block/blk-exec.c
@@ -57,10 +57,13 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
 	rq->end_io = done;
 
 	/*
-	 * don't check dying flag for MQ because the request won't
-	 * be reused after dying flag is set
+	 * The blk_freeze_queue() call in blk_set_queue_dying() and the
+	 * test of the "dying" flag in blk_queue_enter() guarantee that
+	 * blk_execute_rq_nowait() won't be called anymore after the "dying"
+	 * flag has been set.
 	 */
 	if (q->mq_ops) {
+		WARN_ON_ONCE(blk_queue_dying(q));
 		blk_mq_sched_insert_request(rq, at_head, true, false, false);
 		return;
 	}
--
2.12.2
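The failure mode in the patch description (a request is inserted, but the run-queue step skips any hardware queue whose .tags is NULL) can be modelled in user space. The following is only a minimal sketch under that assumption: every toy_* name is hypothetical and none of this is kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are not kernel types. */
struct toy_tags {
	int depth;
};

struct toy_hctx {
	struct toy_tags *tags;	/* NULL once the tag set has been freed */
	int pending;		/* requests inserted but not yet dispatched */
	int dispatched;		/* requests handed to the "driver" */
};

/* Models inserting a request into the hardware queue. */
static void toy_insert_request(struct toy_hctx *hctx)
{
	hctx->pending++;
}

/* Models the run-queue step: it does nothing when .tags is NULL. */
static bool toy_run_hw_queue(struct toy_hctx *hctx)
{
	if (!hctx->tags)
		return false;	/* skipped: pending requests stay queued */
	hctx->dispatched += hctx->pending;
	hctx->pending = 0;
	return true;
}

/* Models freeing the tag set, which clears .tags. */
static void toy_free_tag_set(struct toy_hctx *hctx)
{
	hctx->tags = NULL;
}
```

In this model, a request inserted after toy_free_tag_set() stays in "pending" forever, which corresponds to the SG_IO request that never gets executed.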
* Re: [PATCH] blk-mq: Fix blk_execute_rq_nowait() handling of dying queues
From: Ming Lei @ 2017-04-12 5:01 UTC (permalink / raw)
To: Bart Van Assche; +Cc: Jens Axboe, linux-block, Mike Snitzer, stable
On Wed, Apr 12, 2017 at 7:58 AM, Bart Van Assche
<bart.vanassche@sandisk.com> wrote:
> Although blk_execute_rq_nowait() asks blk_mq_sched_insert_request()
> to run the queue, the function that should run the queue
> (__blk_mq_delay_run_hw_queue()) skips hardware queues for which
> .tags == NULL. Since blk_mq_free_tag_set() clears .tags this means
> if blk_execute_rq_nowait() is called after the tag set has been
Just wondering how that can happen, because we usually call
blk_mq_free_tag_set() after blk_cleanup_queue() has completed.
> freed that the request that has been queued will never be executed.
> In my tests I noticed that every now and then an SG_IO request that
> got queued by multipathd on a dm device did not get executed. This
> resulted in either a memory leak complaint about the SG_IO code or
> the dm device becoming unremovable with e.g. the following state:
>
> $ grep busy= /sys/kernel/debug/block/dm*/mq/*
> /sys/kernel/debug/block/dm-0/mq/state:SAME_COMP STACKABLE IO_STAT INIT_DONE POLL REGISTERED, pg_init_in_progress=0, nr_valid_paths=4, flags= RETAIN_ATTACHED_HW_HANDLER, paths: [0:0] active=1 busy=0 dying dead [1:0] active=1 busy=0 dying dead [2:0] active=1 busy=0 dying dead [3:0] active=1 busy=0 dying dead
> $ multipath -ll
> mpathu (3600140572616d6469736b32000000000) dm-0 ##,##
> size=984M features='3 retain_attached_hw_handler queue_mode mq' hwhandler='1 alua' wp=rw
> |-+- policy='service-time 0' prio=0 status=active
> |-+- policy='service-time 0' prio=0 status=undef
> |-+- policy='service-time 0' prio=0 status=undef
> `-+- policy='service-time 0' prio=0 status=undef
>
> Prevent blk_execute_rq_nowait() from being called to queue a request
> onto a dying queue by changing the blk_freeze_queue_start() call
> in blk_set_queue_dying() into a blk_freeze_queue() call.
blk_mq_freeze_queue_wait() only waits for pending I/O to complete, so
could you explain why the _wait() step is required?
In this case, neither blk_freeze_queue_start() nor blk_freeze_queue() can
prevent the rq from entering the queue, because we only hold/check
q_usage_counter before allocating a request, and blk_execute_rq_nowait()
already has the request.
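The point about q_usage_counter can be illustrated with a small user-space model. This is a sketch only, assuming (as described above) that the counter is taken and checked before request allocation and nowhere on the insertion path. All toy_* names are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; not the kernel's request_queue. */
struct toy_queue {
	int usage;	/* stands in for q_usage_counter */
	bool frozen;	/* set by the "freeze start" step */
	int queued;	/* requests inserted for dispatch */
};

/* Gate taken before allocating a request; fails once frozen. */
static bool toy_queue_enter(struct toy_queue *q)
{
	if (q->frozen)
		return false;
	q->usage++;
	return true;
}

/* Reference dropped when the request completes. */
static void toy_queue_exit(struct toy_queue *q)
{
	assert(q->usage > 0);
	q->usage--;
}

/* "Freeze start": only closes the gate for new callers. */
static void toy_freeze_start(struct toy_queue *q)
{
	q->frozen = true;
}

/* The "freeze wait" step could return only once this is true. */
static bool toy_freeze_drained(const struct toy_queue *q)
{
	return q->usage == 0;
}

/* Insertion of an already allocated request: no gate check here. */
static void toy_insert_request(struct toy_queue *q)
{
	q->queued++;
}
```

In this model, a caller that passed toy_queue_enter() before toy_freeze_start() can still call toy_insert_request() afterwards. Adding the wait step changes when the freeze finishes, not whether that insertion is allowed.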
>
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: Mike Snitzer <snitzer@redhat.com>
> Cc: Ming Lei <tom.leiming@gmail.com>
> Cc: <stable@vger.kernel.org>
> ---
> block/blk-core.c | 9 +++++----
> block/blk-exec.c | 7 +++++--
> 2 files changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 8654aa0cef6d..21314b995887 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -501,11 +501,12 @@ void blk_set_queue_dying(struct request_queue *q)
> spin_unlock_irq(q->queue_lock);
>
> /*
> - * When queue DYING flag is set, we need to block new req
> - * entering queue, so we call blk_freeze_queue_start() to
> - * prevent I/O from crossing blk_queue_enter().
> + * When queue DYING flag is set, we need to block new requests
> + * from being queued. Hence call blk_freeze_queue() to make
> + * new blk_queue_enter() calls fail and to wait until all pending
> + * I/O has finished.
> */
> - blk_freeze_queue_start(q);
> + blk_freeze_queue(q);
>
> if (q->mq_ops)
> blk_mq_wake_waiters(q);
> diff --git a/block/blk-exec.c b/block/blk-exec.c
> index 8cd0e9bc8dc8..f7d9bed2cb15 100644
> --- a/block/blk-exec.c
> +++ b/block/blk-exec.c
> @@ -57,10 +57,13 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
> rq->end_io = done;
>
> /*
> - * don't check dying flag for MQ because the request won't
> - * be reused after dying flag is set
> + * The blk_freeze_queue() call in blk_set_queue_dying() and the
> + * test of the "dying" flag in blk_queue_enter() guarantee that
> + * blk_execute_rq_nowait() won't be called anymore after the "dying"
> + * flag has been set.
That can never be guaranteed; see the following case:
1) blk_get_request() is called just before the queue is set as dying in another path
2) the request is allocated successfully and passed to
blk_execute_rq_nowait() even though the queue has been set as dying
Thanks,
Ming Lei
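The two-step case above can be written out as a deterministic user-space sequence. This is a hedged sketch: the toy_* names are hypothetical, and the "warnings" counter merely stands in for the WARN_ON_ONCE() that the patch adds.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; not kernel code. */
struct toy_queue {
	bool dying;	/* stands in for the queue "dying" flag */
	int warnings;	/* times a request was executed on a dying queue */
	int inserted;	/* requests inserted for dispatch */
};

struct toy_request {
	struct toy_queue *q;
};

/* Step 1: allocation is the only place where "dying" is tested. */
static bool toy_get_request(struct toy_queue *q, struct toy_request *rq)
{
	if (q->dying)
		return false;
	rq->q = q;
	return true;
}

/* Runs in "another path" between allocation and execution. */
static void toy_set_queue_dying(struct toy_queue *q)
{
	q->dying = true;
}

/* Step 2: executing an already allocated request. */
static void toy_execute_rq_nowait(struct toy_request *rq)
{
	if (rq->q->dying)
		rq->q->warnings++;	/* the case the added WARN would hit */
	rq->q->inserted++;
}
```

Calling toy_get_request(), then toy_set_queue_dying(), then toy_execute_rq_nowait() leaves warnings == 1 in this model: the allocation-time check passed, yet the request is executed on a queue that is already dying.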
* Re: [PATCH] blk-mq: Fix blk_execute_rq_nowait() handling of dying queues
From: Bart Van Assche @ 2017-04-12 18:24 UTC (permalink / raw)
To: tom.leiming@gmail.com
Cc: linux-block@vger.kernel.org, stable@vger.kernel.org,
axboe@kernel.dk, snitzer@redhat.com
On Wed, 2017-04-12 at 13:01 +0800, Ming Lei wrote:
> On Wed, Apr 12, 2017 at 7:58 AM, Bart Van Assche
> <bart.vanassche@sandisk.com> wrote:
> >
> > diff --git a/block/blk-exec.c b/block/blk-exec.c
> > index 8cd0e9bc8dc8..f7d9bed2cb15 100644
> > --- a/block/blk-exec.c
> > +++ b/block/blk-exec.c
> > @@ -57,10 +57,13 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
> > rq->end_io = done;
> >
> > /*
> > - * don't check dying flag for MQ because the request won't
> > - * be reused after dying flag is set
> > + * The blk_freeze_queue() call in blk_set_queue_dying() and the
> > + * test of the "dying" flag in blk_queue_enter() guarantee that
> > + * blk_execute_rq_nowait() won't be called anymore after the "dying"
> > + * flag has been set.
>
> That can never be guaranteed; see the following case:
>
> 1) blk_get_request() is called just before the queue is set as dying in another path
>
> 2) the request is allocated successfully and passed to
> blk_execute_rq_nowait() even though the queue has been set as dying
Hello Ming,
Shouldn't the blk-mq code guarantee that blk_execute_rq_nowait() won't be
called anymore after the "dying" flag has been set? I think changing the
blk_freeze_queue_start() call into blk_freeze_queue() in blk_set_queue_dying()
is sufficient to achieve this.
Note: after posting this patch I was still able to reproduce the issue
described in the patch description. Although I still think we need the patch
at the start of this e-mail thread, it doesn't fix the issue I described.
Bart.
* Re: [PATCH] blk-mq: Fix blk_execute_rq_nowait() handling of dying queues
From: Ming Lei @ 2017-04-13 1:20 UTC (permalink / raw)
To: Bart Van Assche
Cc: linux-block@vger.kernel.org, stable@vger.kernel.org,
axboe@kernel.dk, snitzer@redhat.com
On Thu, Apr 13, 2017 at 2:24 AM, Bart Van Assche
<Bart.VanAssche@sandisk.com> wrote:
> On Wed, 2017-04-12 at 13:01 +0800, Ming Lei wrote:
>> On Wed, Apr 12, 2017 at 7:58 AM, Bart Van Assche
>> <bart.vanassche@sandisk.com> wrote:
>> >
>> > diff --git a/block/blk-exec.c b/block/blk-exec.c
>> > index 8cd0e9bc8dc8..f7d9bed2cb15 100644
>> > --- a/block/blk-exec.c
>> > +++ b/block/blk-exec.c
>> > @@ -57,10 +57,13 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
>> > rq->end_io = done;
>> >
>> > /*
>> > - * don't check dying flag for MQ because the request won't
>> > - * be reused after dying flag is set
>> > + * The blk_freeze_queue() call in blk_set_queue_dying() and the
>> > + * test of the "dying" flag in blk_queue_enter() guarantee that
>> > + * blk_execute_rq_nowait() won't be called anymore after the "dying"
>> > + * flag has been set.
>>
>> That can never be guaranteed; see the following case:
>>
>> 1) blk_get_request() is called just before the queue is set as dying in another path
>>
>> 2) the request is allocated successfully and passed to
>> blk_execute_rq_nowait() even though the queue has been set as dying
>
> Hello Ming,
>
> Shouldn't the blk-mq code guarantee that blk_execute_rq_nowait() won't be
> called anymore after the "dying" flag has been set? I think changing the
> blk_freeze_queue_start() call into blk_freeze_queue() in blk_set_queue_dying()
> is sufficient to achieve this.
I have explained that this change isn't enough.
>
> Note: after posting this patch I was still able to reproduce the issue
> described in the patch description. Although I still think we need the patch
> at the start of this e-mail thread, it doesn't fix the issue I described.
Since it doesn't fix the issue, I don't recommend applying it.
Thanks,
Ming Lei