All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] dm: fix false alarm in free_rq_clone() for non blk-mq target
@ 2015-05-28  7:59 Junichi Nomura
  2015-05-28 19:14 ` Mike Snitzer
  0 siblings, 1 reply; 5+ messages in thread
From: Junichi Nomura @ 2015-05-28  7:59 UTC (permalink / raw)
  To: device-mapper development, Mike Snitzer

When stacking request-based dm device on non blk-mq device
and device-mapper target could not map the request
(error target is used, multipath target with all paths down, etc),
following warning will show up once:

  WARNING: CPU: 7 PID: 0 at drivers/md/dm.c:1090 free_rq_clone+0x7a/0xf0
  CPU: 7 PID: 0 Comm: swapper/7 Not tainted 4.1.0-rc5+ #1
  ..
  Call Trace:
   <IRQ>  [<ffffffff81558095>] dump_stack+0x45/0x57
   [<ffffffff8105ef3a>] warn_slowpath_common+0x8a/0xc0
   [<ffffffff8105f02a>] warn_slowpath_null+0x1a/0x20
   [<ffffffffa020bf3a>] free_rq_clone+0x7a/0xf0 [dm_mod]
   [<ffffffffa020c7cc>] dm_softirq_done+0x13c/0x250 [dm_mod]
   [<ffffffff812b87a0>] blk_done_softirq+0x80/0xa0
   [<ffffffff81062f44>] __do_softirq+0xf4/0x2d0
   [<ffffffff81063425>] irq_exit+0x125/0x130
   [<ffffffff81037f05>] smp_call_function_single_interrupt+0x35/0x40
   [<ffffffff8156027e>] call_function_single_interrupt+0x6e/0x80
   <EOI>  [<ffffffff8104b576>] ? native_safe_halt+0x6/0x10
   [<ffffffff810b6348>] ? rcu_eqs_enter+0x68/0x90
   [<ffffffff8100d6fe>] default_idle+0x1e/0xc0
   [<ffffffff8100e1df>] arch_cpu_idle+0xf/0x20
   [<ffffffff810a269c>] cpu_startup_entry+0x2fc/0x3a0
   [<ffffffff81038892>] start_secondary+0x182/0x1b0

The warning was added by commit aa6df8dd28c0 ("dm: fix free_rq_clone()
NULL pointer when requeueing unmapped request").

But free_rq_clone() with clone->q == NULL is valid usage for non blk-mq
underlying device. Such a call can happen via dm_kill_unmapped_request().

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 1badfb2..8ecd22b 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1087,14 +1087,13 @@ static void free_rq_clone(struct request *clone, bool must_be_mapped)
 	struct dm_rq_target_io *tio = clone->end_io_data;
 	struct mapped_device *md = tio->md;
 
-	WARN_ON_ONCE(must_be_mapped && !clone->q);
-
 	blk_rq_unprep_clone(clone);
 
-	if (md->type == DM_TYPE_MQ_REQUEST_BASED)
+	if (md->type == DM_TYPE_MQ_REQUEST_BASED) {
 		/* stacked on blk-mq queue(s) */
+		WARN_ON_ONCE(must_be_mapped && !clone->q);
 		tio->ti->type->release_clone_rq(clone);
-	else if (!md->queue->mq_ops)
+	} else if (!md->queue->mq_ops)
 		/* request_fn queue stacked on request_fn queue(s) */
 		free_clone_request(md, clone);
 	/*

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: dm: fix false alarm in free_rq_clone() for non blk-mq target
  2015-05-28  7:59 [PATCH] dm: fix false alarm in free_rq_clone() for non blk-mq target Junichi Nomura
@ 2015-05-28 19:14 ` Mike Snitzer
  2015-05-29  0:29   ` Junichi Nomura
  0 siblings, 1 reply; 5+ messages in thread
From: Mike Snitzer @ 2015-05-28 19:14 UTC (permalink / raw)
  To: Junichi Nomura; +Cc: device-mapper development

On Thu, May 28 2015 at  3:59am -0400,
Junichi Nomura <j-nomura@ce.jp.nec.com> wrote:

> When stacking request-based dm device on non blk-mq device
> and device-mapper target could not map the request
> (error target is used, multipath target with all paths down, etc),
> following warning will show up once:
> 
>   WARNING: CPU: 7 PID: 0 at drivers/md/dm.c:1090 free_rq_clone+0x7a/0xf0
>   CPU: 7 PID: 0 Comm: swapper/7 Not tainted 4.1.0-rc5+ #1
>   ..
>   Call Trace:
>    <IRQ>  [<ffffffff81558095>] dump_stack+0x45/0x57
>    [<ffffffff8105ef3a>] warn_slowpath_common+0x8a/0xc0
>    [<ffffffff8105f02a>] warn_slowpath_null+0x1a/0x20
>    [<ffffffffa020bf3a>] free_rq_clone+0x7a/0xf0 [dm_mod]
>    [<ffffffffa020c7cc>] dm_softirq_done+0x13c/0x250 [dm_mod]
>    [<ffffffff812b87a0>] blk_done_softirq+0x80/0xa0
>    [<ffffffff81062f44>] __do_softirq+0xf4/0x2d0
>    [<ffffffff81063425>] irq_exit+0x125/0x130
>    [<ffffffff81037f05>] smp_call_function_single_interrupt+0x35/0x40
>    [<ffffffff8156027e>] call_function_single_interrupt+0x6e/0x80
>    <EOI>  [<ffffffff8104b576>] ? native_safe_halt+0x6/0x10
>    [<ffffffff810b6348>] ? rcu_eqs_enter+0x68/0x90
>    [<ffffffff8100d6fe>] default_idle+0x1e/0xc0
>    [<ffffffff8100e1df>] arch_cpu_idle+0xf/0x20
>    [<ffffffff810a269c>] cpu_startup_entry+0x2fc/0x3a0
>    [<ffffffff81038892>] start_secondary+0x182/0x1b0
> 
> The warning was added by commit aa6df8dd28c0 ("dm: fix free_rq_clone()
> NULL pointer when requeueing unmapped request").
> 
> But free_rq_clone() with clone->q == NULL is valid usage for non blk-mq
> underlying device. Such a call can happen via dm_kill_unmapped_request().

dm_kill_unmapped_request() is called from the blk-mq error path too
(e.g. if clone_and_map_rq fails with an error).  So this isn't non
blk-mq specific.

dm_kill_unmapped_request() sets REQ_FAILED, which dm_softirq_done()
checks for and will set 'mapped' to false.  I think a proper fix is to
pass 'mapped' into dm_end_request() and then pass it to free_rq_clone().
Like so, what do you think?

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 1badfb2..17f1d0e 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1112,7 +1112,7 @@ static void free_rq_clone(struct request *clone, bool must_be_mapped)
  * Must be called without clone's queue lock held,
  * see end_clone_request() for more details.
  */
-static void dm_end_request(struct request *clone, int error)
+static void dm_end_request(struct request *clone, int error, bool mapped)
 {
 	int rw = rq_data_dir(clone);
 	struct dm_rq_target_io *tio = clone->end_io_data;
@@ -1132,7 +1132,7 @@ static void dm_end_request(struct request *clone, int error)
 			rq->sense_len = clone->sense_len;
 	}
 
-	free_rq_clone(clone, true);
+	free_rq_clone(clone, mapped);
 	if (!rq->q->mq_ops)
 		blk_end_request_all(rq, error);
 	else
@@ -1249,7 +1249,7 @@ static void dm_done(struct request *clone, int error, bool mapped)
 
 	if (r <= 0)
 		/* The target wants to complete the I/O */
-		dm_end_request(clone, r);
+		dm_end_request(clone, r, mapped);
 	else if (r == DM_ENDIO_INCOMPLETE)
 		/* The target will handle the I/O */
 		return;

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: dm: fix false alarm in free_rq_clone() for non blk-mq target
  2015-05-28 19:14 ` Mike Snitzer
@ 2015-05-29  0:29   ` Junichi Nomura
  2015-05-29  3:18     ` Mike Snitzer
  0 siblings, 1 reply; 5+ messages in thread
From: Junichi Nomura @ 2015-05-29  0:29 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: device-mapper development

On 05/29/15 04:14, Mike Snitzer wrote:
> On Thu, May 28 2015 at  3:59am -0400,
> Junichi Nomura <j-nomura@ce.jp.nec.com> wrote:
> 
>> When stacking request-based dm device on non blk-mq device
>> and device-mapper target could not map the request
>> (error target is used, multipath target with all paths down, etc),
>> following warning will show up once:
>>
>>   WARNING: CPU: 7 PID: 0 at drivers/md/dm.c:1090 free_rq_clone+0x7a/0xf0
>>   CPU: 7 PID: 0 Comm: swapper/7 Not tainted 4.1.0-rc5+ #1
>>   ..
>>   Call Trace:
>>    <IRQ>  [<ffffffff81558095>] dump_stack+0x45/0x57
>>    [<ffffffff8105ef3a>] warn_slowpath_common+0x8a/0xc0
>>    [<ffffffff8105f02a>] warn_slowpath_null+0x1a/0x20
>>    [<ffffffffa020bf3a>] free_rq_clone+0x7a/0xf0 [dm_mod]
>>    [<ffffffffa020c7cc>] dm_softirq_done+0x13c/0x250 [dm_mod]
>>    [<ffffffff812b87a0>] blk_done_softirq+0x80/0xa0
>>    [<ffffffff81062f44>] __do_softirq+0xf4/0x2d0
>>    [<ffffffff81063425>] irq_exit+0x125/0x130
>>    [<ffffffff81037f05>] smp_call_function_single_interrupt+0x35/0x40
>>    [<ffffffff8156027e>] call_function_single_interrupt+0x6e/0x80
>>    <EOI>  [<ffffffff8104b576>] ? native_safe_halt+0x6/0x10
>>    [<ffffffff810b6348>] ? rcu_eqs_enter+0x68/0x90
>>    [<ffffffff8100d6fe>] default_idle+0x1e/0xc0
>>    [<ffffffff8100e1df>] arch_cpu_idle+0xf/0x20
>>    [<ffffffff810a269c>] cpu_startup_entry+0x2fc/0x3a0
>>    [<ffffffff81038892>] start_secondary+0x182/0x1b0
>>
>> The warning was added by commit aa6df8dd28c0 ("dm: fix free_rq_clone()
>> NULL pointer when requeueing unmapped request").
>>
>> But free_rq_clone() with clone->q == NULL is valid usage for non blk-mq
>> underlying device. Such a call can happen via dm_kill_unmapped_request().
> 
> dm_kill_unmapped_request() is called from the blk-mq error path too

Yes, but:

> (e.g. if clone_and_map_rq fails with an error).  So this isn't non
> blk-mq specific.

unmapped request does not have clone in the case of blk-mq?
(as the clone_and_map_rq() API suggests)

Then dm_softirq_done(orig) for dm_kill_unmapped_request() will fall
into 'if (!clone)' path.
Thus neither of dm_done(clone), dm_end_request(clone) nor free_rq_clone()
will be called.

> dm_kill_unmapped_request() sets REQ_FAILED, which dm_softirq_done()
> checks for and will set 'mapped' to false.  I think a proper fix is to
> pass 'mapped' into dm_end_request() and then pass it to free_rq_clone().
> Like so, what do you think?

So I don't think it's necessary to extend dm_end_request() with 'mapped'
parameter.

For blk-mq underlying device, we can (and should be able to) assume
unmapped request as not having clone.
It might be better to clear tio->clone in free_rq_clone().
Currently it doesn't seem matter because tio is not reused for non-mq DM
and always re-initialized for mq DM, though.

-- 
Jun'ichi Nomura, NEC Corporation

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 1c62ed8..f252adc 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1087,14 +1087,14 @@ static void free_rq_clone(struct request *clone, bool must_be_mapped)
 	struct dm_rq_target_io *tio = clone->end_io_data;
 	struct mapped_device *md = tio->md;
 
-	WARN_ON_ONCE(must_be_mapped && !clone->q);
-
 	blk_rq_unprep_clone(clone);
 
-	if (md->type == DM_TYPE_MQ_REQUEST_BASED)
+	if (md->type == DM_TYPE_MQ_REQUEST_BASED) {
 		/* stacked on blk-mq queue(s) */
+		WARN_ON_ONCE(must_be_mapped && !clone->q);
 		tio->ti->type->release_clone_rq(clone);
+		tio->clone = NULL;
-	else if (!md->queue->mq_ops)
+	} else if (!md->queue->mq_ops)
 		/* request_fn queue stacked on request_fn queue(s) */
 		free_clone_request(md, clone);
 	/*

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: dm: fix false alarm in free_rq_clone() for non blk-mq target
  2015-05-29  0:29   ` Junichi Nomura
@ 2015-05-29  3:18     ` Mike Snitzer
  2015-05-29  4:17       ` Junichi Nomura
  0 siblings, 1 reply; 5+ messages in thread
From: Mike Snitzer @ 2015-05-29  3:18 UTC (permalink / raw)
  To: Junichi Nomura; +Cc: device-mapper development

On Thu, May 28 2015 at  8:29pm -0400,
Junichi Nomura <j-nomura@ce.jp.nec.com> wrote:

> On 05/29/15 04:14, Mike Snitzer wrote:
> > On Thu, May 28 2015 at  3:59am -0400,
> > Junichi Nomura <j-nomura@ce.jp.nec.com> wrote:
> > 
> >> When stacking request-based dm device on non blk-mq device
> >> and device-mapper target could not map the request
> >> (error target is used, multipath target with all paths down, etc),
> >> following warning will show up once:
> >>
> >>   WARNING: CPU: 7 PID: 0 at drivers/md/dm.c:1090 free_rq_clone+0x7a/0xf0
> >>   CPU: 7 PID: 0 Comm: swapper/7 Not tainted 4.1.0-rc5+ #1
> >>   ..
> >>   Call Trace:
> >>    <IRQ>  [<ffffffff81558095>] dump_stack+0x45/0x57
> >>    [<ffffffff8105ef3a>] warn_slowpath_common+0x8a/0xc0
> >>    [<ffffffff8105f02a>] warn_slowpath_null+0x1a/0x20
> >>    [<ffffffffa020bf3a>] free_rq_clone+0x7a/0xf0 [dm_mod]
> >>    [<ffffffffa020c7cc>] dm_softirq_done+0x13c/0x250 [dm_mod]
> >>    [<ffffffff812b87a0>] blk_done_softirq+0x80/0xa0
> >>    [<ffffffff81062f44>] __do_softirq+0xf4/0x2d0
> >>    [<ffffffff81063425>] irq_exit+0x125/0x130
> >>    [<ffffffff81037f05>] smp_call_function_single_interrupt+0x35/0x40
> >>    [<ffffffff8156027e>] call_function_single_interrupt+0x6e/0x80
> >>    <EOI>  [<ffffffff8104b576>] ? native_safe_halt+0x6/0x10
> >>    [<ffffffff810b6348>] ? rcu_eqs_enter+0x68/0x90
> >>    [<ffffffff8100d6fe>] default_idle+0x1e/0xc0
> >>    [<ffffffff8100e1df>] arch_cpu_idle+0xf/0x20
> >>    [<ffffffff810a269c>] cpu_startup_entry+0x2fc/0x3a0
> >>    [<ffffffff81038892>] start_secondary+0x182/0x1b0
> >>
> >> The warning was added by commit aa6df8dd28c0 ("dm: fix free_rq_clone()
> >> NULL pointer when requeueing unmapped request").
> >>
> >> But free_rq_clone() with clone->q == NULL is valid usage for non blk-mq
> >> underlying device. Such a call can happen via dm_kill_unmapped_request().
> > 
> > dm_kill_unmapped_request() is called from the blk-mq error path too
> 
> Yes, but:
> 
> > (e.g. if clone_and_map_rq fails with an error).  So this isn't non
> > blk-mq specific.
> 
> unmapped request does not have clone in the case of blk-mq?
> (as the clone_and_map_rq() API suggests)
> 
> Then dm_softirq_done(orig) for dm_kill_unmapped_request() will fall
> into 'if (!clone)' path.
> Thus neither of dm_done(clone), dm_end_request(clone) nor free_rq_clone()
> will be called.

Very true, not sure how I overlooked that.

> > dm_kill_unmapped_request() sets REQ_FAILED, which dm_softirq_done()
> > checks for and will set 'mapped' to false.  I think a proper fix is to
> > pass 'mapped' into dm_end_request() and then pass it to free_rq_clone().
> > Like so, what do you think?
> 
> So I don't think it's necessary to extend dm_end_request() with 'mapped'
> parameter.

I'm starting to question the need for 'must_be_mapped' param to
free_rq_clone().  It was motivated by strange reports from Bart's
testing but in reality I don't think it ever actually helped.

It was to act as a canary in the coal mine for the future but I'm now
more inclined to just remove the parameter entirely.  Or do you think
'must_be_mapped' is still useful?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: dm: fix false alarm in free_rq_clone() for non blk-mq target
  2015-05-29  3:18     ` Mike Snitzer
@ 2015-05-29  4:17       ` Junichi Nomura
  0 siblings, 0 replies; 5+ messages in thread
From: Junichi Nomura @ 2015-05-29  4:17 UTC (permalink / raw)
  To: device-mapper development, Mike Snitzer

On 05/29/15 12:18, Mike Snitzer wrote:
>> So I don't think it's necessary to extend dm_end_request() with 'mapped'
>> parameter.
> 
> I'm starting to question the need for 'must_be_mapped' param to
> free_rq_clone().  It was motivated by strange reports from Bart's
> testing but in reality I don't think it ever actually helped.
> 
> It was to act as a canary in the coal mine for the future but I'm now
> more inclined to just remove the parameter entirely.  Or do you think
> 'must_be_mapped' is still useful?

No. I think removing the parameter is good.

-- 
Jun'ichi Nomura, NEC Corporation

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-05-29  4:17 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-05-28  7:59 [PATCH] dm: fix false alarm in free_rq_clone() for non blk-mq target Junichi Nomura
2015-05-28 19:14 ` Mike Snitzer
2015-05-29  0:29   ` Junichi Nomura
2015-05-29  3:18     ` Mike Snitzer
2015-05-29  4:17       ` Junichi Nomura

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.