* Re: bug: using smp_processor_id() in preemptible code in rr_select_path()
[not found] <20160803153545.GE20370@bblock-ThinkPad-W530>
@ 2016-08-05 15:27 ` Mike Snitzer
2016-08-05 15:33 ` Jens Axboe
0 siblings, 1 reply; 10+ messages in thread
From: Mike Snitzer @ 2016-08-05 15:27 UTC (permalink / raw)
To: Benjamin Block; +Cc: dm-devel, linux-kernel, hch, axboe, linux-block
On Wed, Aug 03 2016 at 11:35am -0400,
Benjamin Block <bblock@linux.vnet.ibm.com> wrote:
> Hej Mike,
>
> when running a debug-kernel today with several multipath-devices using
> the round-robin path selector I noticed that the kernel throws these
> warnings here:
>
> BUG: using smp_processor_id() in preemptible [00000000] code: kdmwork-252:0/881
> caller is rr_select_path+0x36/0x108 [dm_round_robin]
> CPU: 1 PID: 881 Comm: kdmwork-252:0 Not tainted 4.7.0-debug #4
> 00000000617679b8 0000000061767a48 0000000000000002 0000000000000000
> 0000000061767ae8 0000000061767a60 0000000061767a60 00000000001145d0
> 0000000000000000 0000000000b962ae 0000000000bb291e 000000000000000b
> 0000000061767aa8 0000000061767a48 0000000000000000 0000000000000000
> 0700000000b962ae 00000000001145d0 0000000061767a48 0000000061767aa8
> Call Trace:
> ([<00000000001144a2>] show_trace+0x8a/0xe0)
> ([<0000000000114586>] show_stack+0x8e/0xf0)
> ([<00000000006c7fdc>] dump_stack+0x9c/0xe0)
> ([<00000000006fbbc0>] check_preemption_disabled+0x108/0x130)
> ([<000003ff80268646>] rr_select_path+0x36/0x108 [dm_round_robin])
> ([<000003ff80259a42>] choose_path_in_pg+0x42/0xc8 [dm_multipath])
> ([<000003ff80259b62>] choose_pgpath+0x9a/0x1a0 [dm_multipath])
> ([<000003ff8025b51a>] __multipath_map.isra.5+0x72/0x228 [dm_multipath])
> ([<000003ff8025b75e>] multipath_map+0x3e/0x50 [dm_multipath])
> ([<000003ff80225eb6>] map_request+0x66/0x458 [dm_mod])
> ([<000003ff802262ec>] map_tio_request+0x44/0x70 [dm_mod])
> ([<000000000016835a>] kthread_worker_fn+0xf2/0x1d8)
> ([<00000000001681da>] kthread+0x112/0x120)
> ([<000000000098378a>] kernel_thread_starter+0x6/0xc)
> ([<0000000000983784>] kernel_thread_starter+0x0/0xc)
> no locks held by kdmwork-252:0/881.
>
>
> There are several more of these warnings, but all have the same
> stack-trace (this is on s390x, but this looks like its only common code)
> - sometimes the process-context is multipath.
>
> Looking at the changes in this function, it rather looks like this is
> caused by changes made in commit
> b0b477c7e0dd93f8916d106018ded1331b81bf61 (dm round robin: use percpu
> 'repeat_count' and 'current_path').
>
> The kernel is a stock v4.7 with some debug options enabled (prominently
> DEBUG_PREEMPT). Need any more info?
As you can see from commit b0b477c7e0dd9 the round-robin path selector
is now using percpu data (pointer) and a percpu_counter.
I'm really not sure how else to access this percpu data.
Cc'ing LKML, linux-block, Jens and hch to cast a wider net in the hopes
of getting an answer to how to fix this.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: bug: using smp_processor_id() in preemptible code in rr_select_path()
2016-08-05 15:27 ` bug: using smp_processor_id() in preemptible code in rr_select_path() Mike Snitzer
@ 2016-08-05 15:33 ` Jens Axboe
2016-08-05 15:42 ` Mike Snitzer
0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2016-08-05 15:33 UTC (permalink / raw)
To: Mike Snitzer, Benjamin Block; +Cc: dm-devel, linux-kernel, hch, linux-block
On 08/05/2016 09:27 AM, Mike Snitzer wrote:
> On Wed, Aug 03 2016 at 11:35am -0400,
> Benjamin Block <bblock@linux.vnet.ibm.com> wrote:
>
>> Hej Mike,
>>
>> when running a debug-kernel today with several multipath-devices using
>> the round-robin path selector I noticed that the kernel throws these
>> warnings here:
>>
>> BUG: using smp_processor_id() in preemptible [00000000] code: kdmwork-252:0/881
>> caller is rr_select_path+0x36/0x108 [dm_round_robin]
>> CPU: 1 PID: 881 Comm: kdmwork-252:0 Not tainted 4.7.0-debug #4
>> 00000000617679b8 0000000061767a48 0000000000000002 0000000000000000
>> 0000000061767ae8 0000000061767a60 0000000061767a60 00000000001145d0
>> 0000000000000000 0000000000b962ae 0000000000bb291e 000000000000000b
>> 0000000061767aa8 0000000061767a48 0000000000000000 0000000000000000
>> 0700000000b962ae 00000000001145d0 0000000061767a48 0000000061767aa8
>> Call Trace:
>> ([<00000000001144a2>] show_trace+0x8a/0xe0)
>> ([<0000000000114586>] show_stack+0x8e/0xf0)
>> ([<00000000006c7fdc>] dump_stack+0x9c/0xe0)
>> ([<00000000006fbbc0>] check_preemption_disabled+0x108/0x130)
>> ([<000003ff80268646>] rr_select_path+0x36/0x108 [dm_round_robin])
>> ([<000003ff80259a42>] choose_path_in_pg+0x42/0xc8 [dm_multipath])
>> ([<000003ff80259b62>] choose_pgpath+0x9a/0x1a0 [dm_multipath])
>> ([<000003ff8025b51a>] __multipath_map.isra.5+0x72/0x228 [dm_multipath])
>> ([<000003ff8025b75e>] multipath_map+0x3e/0x50 [dm_multipath])
>> ([<000003ff80225eb6>] map_request+0x66/0x458 [dm_mod])
>> ([<000003ff802262ec>] map_tio_request+0x44/0x70 [dm_mod])
>> ([<000000000016835a>] kthread_worker_fn+0xf2/0x1d8)
>> ([<00000000001681da>] kthread+0x112/0x120)
>> ([<000000000098378a>] kernel_thread_starter+0x6/0xc)
>> ([<0000000000983784>] kernel_thread_starter+0x0/0xc)
>> no locks held by kdmwork-252:0/881.
>>
>>
>> There are several more of these warnings, but all have the same
>> stack-trace (this is on s390x, but this looks like its only common code)
>> - sometimes the process-context is multipath.
>>
>> Looking at the changes in this function, it rather looks like this is
>> caused by changes made in commit
>> b0b477c7e0dd93f8916d106018ded1331b81bf61 (dm round robin: use percpu
>> 'repeat_count' and 'current_path').
>>
>> The kernel is a stock v4.7 with some debug options enabled (prominently
>> DEBUG_PREEMPT). Need any more info?
>
> As you can see from commit b0b477c7e0dd9 the round-robin path selector
> is now using percpu data (pointer) and a percpu_counter.
>
> I'm really not sure how else to access this percpu data.
>
> Cc'ing LKML, linux-block, Jens and hch to cast a wider net in the hopes
> of getting an answer to how to fix this.
From a quick look, looks like you are using this_cpu_ptr() without
having preemption disabled.
--
Jens Axboe
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: bug: using smp_processor_id() in preemptible code in rr_select_path()
2016-08-05 15:33 ` Jens Axboe
@ 2016-08-05 15:42 ` Mike Snitzer
2016-08-05 15:54 ` Jens Axboe
0 siblings, 1 reply; 10+ messages in thread
From: Mike Snitzer @ 2016-08-05 15:42 UTC (permalink / raw)
To: Jens Axboe; +Cc: Benjamin Block, dm-devel, linux-kernel, hch, linux-block
On Fri, Aug 05 2016 at 11:33P -0400,
Jens Axboe <axboe@kernel.dk> wrote:
> On 08/05/2016 09:27 AM, Mike Snitzer wrote:
> >On Wed, Aug 03 2016 at 11:35am -0400,
> >Benjamin Block <bblock@linux.vnet.ibm.com> wrote:
> >
> >>Hej Mike,
> >>
> >>when running a debug-kernel today with several multipath-devices using
> >>the round-robin path selector I noticed that the kernel throws these
> >>warnings here:
> >>
> >>BUG: using smp_processor_id() in preemptible [00000000] code: kdmwork-252:0/881
> >>caller is rr_select_path+0x36/0x108 [dm_round_robin]
> >>CPU: 1 PID: 881 Comm: kdmwork-252:0 Not tainted 4.7.0-debug #4
> >> 00000000617679b8 0000000061767a48 0000000000000002 0000000000000000
> >> 0000000061767ae8 0000000061767a60 0000000061767a60 00000000001145d0
> >> 0000000000000000 0000000000b962ae 0000000000bb291e 000000000000000b
> >> 0000000061767aa8 0000000061767a48 0000000000000000 0000000000000000
> >> 0700000000b962ae 00000000001145d0 0000000061767a48 0000000061767aa8
> >>Call Trace:
> >>([<00000000001144a2>] show_trace+0x8a/0xe0)
> >>([<0000000000114586>] show_stack+0x8e/0xf0)
> >>([<00000000006c7fdc>] dump_stack+0x9c/0xe0)
> >>([<00000000006fbbc0>] check_preemption_disabled+0x108/0x130)
> >>([<000003ff80268646>] rr_select_path+0x36/0x108 [dm_round_robin])
> >>([<000003ff80259a42>] choose_path_in_pg+0x42/0xc8 [dm_multipath])
> >>([<000003ff80259b62>] choose_pgpath+0x9a/0x1a0 [dm_multipath])
> >>([<000003ff8025b51a>] __multipath_map.isra.5+0x72/0x228 [dm_multipath])
> >>([<000003ff8025b75e>] multipath_map+0x3e/0x50 [dm_multipath])
> >>([<000003ff80225eb6>] map_request+0x66/0x458 [dm_mod])
> >>([<000003ff802262ec>] map_tio_request+0x44/0x70 [dm_mod])
> >>([<000000000016835a>] kthread_worker_fn+0xf2/0x1d8)
> >>([<00000000001681da>] kthread+0x112/0x120)
> >>([<000000000098378a>] kernel_thread_starter+0x6/0xc)
> >>([<0000000000983784>] kernel_thread_starter+0x0/0xc)
> >>no locks held by kdmwork-252:0/881.
> >>
> >>
> >>There are several more of these warnings, but all have the same
> >>stack-trace (this is on s390x, but this looks like its only common code)
> >>- sometimes the process-context is multipath.
> >>
> >>Looking at the changes in this function, it rather looks like this is
> >>caused by changes made in commit
> >>b0b477c7e0dd93f8916d106018ded1331b81bf61 (dm round robin: use percpu
> >>'repeat_count' and 'current_path').
> >>
> >>The kernel is a stock v4.7 with some debug options enabled (prominently
> >>DEBUG_PREEMPT). Need any more info?
> >
> >As you can see from commit b0b477c7e0dd9 the round-robin path selector
> >is now using percpu data (pointer) and a percpu_counter.
> >
> >I'm really not sure how else to access this percpu data.
> >
> >Cc'ing LKML, linux-block, Jens and hch to cast a wider net in the hopes
> >of getting an answer to how to fix this.
>
> From a quick look, looks like you are using this_cpu_ptr() without
> having preemption disabled.
Right, that is what it looked like to me too.
I'm just not sure on what the proper pattern is to fix this.
I'll look closer though.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: bug: using smp_processor_id() in preemptible code in rr_select_path()
2016-08-05 15:42 ` Mike Snitzer
@ 2016-08-05 15:54 ` Jens Axboe
2016-08-05 16:06 ` Mike Snitzer
0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2016-08-05 15:54 UTC (permalink / raw)
To: Mike Snitzer; +Cc: Benjamin Block, dm-devel, linux-kernel, hch, linux-block
On 08/05/2016 09:42 AM, Mike Snitzer wrote:
> On Fri, Aug 05 2016 at 11:33P -0400,
> Jens Axboe <axboe@kernel.dk> wrote:
>
>> On 08/05/2016 09:27 AM, Mike Snitzer wrote:
>>> On Wed, Aug 03 2016 at 11:35am -0400,
>>> Benjamin Block <bblock@linux.vnet.ibm.com> wrote:
>>>
>>>> Hej Mike,
>>>>
>>>> when running a debug-kernel today with several multipath-devices using
>>>> the round-robin path selector I noticed that the kernel throws these
>>>> warnings here:
>>>>
>>>> BUG: using smp_processor_id() in preemptible [00000000] code: kdmwork-252:0/881
>>>> caller is rr_select_path+0x36/0x108 [dm_round_robin]
>>>> CPU: 1 PID: 881 Comm: kdmwork-252:0 Not tainted 4.7.0-debug #4
>>>> 00000000617679b8 0000000061767a48 0000000000000002 0000000000000000
>>>> 0000000061767ae8 0000000061767a60 0000000061767a60 00000000001145d0
>>>> 0000000000000000 0000000000b962ae 0000000000bb291e 000000000000000b
>>>> 0000000061767aa8 0000000061767a48 0000000000000000 0000000000000000
>>>> 0700000000b962ae 00000000001145d0 0000000061767a48 0000000061767aa8
>>>> Call Trace:
>>>> ([<00000000001144a2>] show_trace+0x8a/0xe0)
>>>> ([<0000000000114586>] show_stack+0x8e/0xf0)
>>>> ([<00000000006c7fdc>] dump_stack+0x9c/0xe0)
>>>> ([<00000000006fbbc0>] check_preemption_disabled+0x108/0x130)
>>>> ([<000003ff80268646>] rr_select_path+0x36/0x108 [dm_round_robin])
>>>> ([<000003ff80259a42>] choose_path_in_pg+0x42/0xc8 [dm_multipath])
>>>> ([<000003ff80259b62>] choose_pgpath+0x9a/0x1a0 [dm_multipath])
>>>> ([<000003ff8025b51a>] __multipath_map.isra.5+0x72/0x228 [dm_multipath])
>>>> ([<000003ff8025b75e>] multipath_map+0x3e/0x50 [dm_multipath])
>>>> ([<000003ff80225eb6>] map_request+0x66/0x458 [dm_mod])
>>>> ([<000003ff802262ec>] map_tio_request+0x44/0x70 [dm_mod])
>>>> ([<000000000016835a>] kthread_worker_fn+0xf2/0x1d8)
>>>> ([<00000000001681da>] kthread+0x112/0x120)
>>>> ([<000000000098378a>] kernel_thread_starter+0x6/0xc)
>>>> ([<0000000000983784>] kernel_thread_starter+0x0/0xc)
>>>> no locks held by kdmwork-252:0/881.
>>>>
>>>>
>>>> There are several more of these warnings, but all have the same
>>>> stack-trace (this is on s390x, but this looks like its only common code)
>>>> - sometimes the process-context is multipath.
>>>>
>>>> Looking at the changes in this function, it rather looks like this is
>>>> caused by changes made in commit
>>>> b0b477c7e0dd93f8916d106018ded1331b81bf61 (dm round robin: use percpu
>>>> 'repeat_count' and 'current_path').
>>>>
>>>> The kernel is a stock v4.7 with some debug options enabled (prominently
>>>> DEBUG_PREEMPT). Need any more info?
>>>
>>> As you can see from commit b0b477c7e0dd9 the round-robin path selector
>>> is now using percpu data (pointer) and a percpu_counter.
>>>
>>> I'm really not sure how else to access this percpu data.
>>>
>>> Cc'ing LKML, linux-block, Jens and hch to cast a wider net in the hopes
>>> of getting an answer to how to fix this.
>>
>> From a quick look, looks like you are using this_cpu_ptr() without
>> having preemption disabled.
>
> Right, that is what it looked like to me too.
>
> I'm just not sure on what the proper pattern is to fix this.
>
> I'll look closer though.
I always forget the details (if this confuses lockdep or not), but you
could potentially turn it into:
local_irq_save(flags);
x = this_cpu_ptr();
[...]
spin_lock(&s->lock);
[...]
instead.
--
Jens Axboe
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: bug: using smp_processor_id() in preemptible code in rr_select_path()
2016-08-05 15:54 ` Jens Axboe
@ 2016-08-05 16:06 ` Mike Snitzer
2016-08-05 16:11 ` Jens Axboe
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Mike Snitzer @ 2016-08-05 16:06 UTC (permalink / raw)
To: Jens Axboe, Benjamin Block; +Cc: dm-devel, linux-kernel, hch, linux-block
On Fri, Aug 05 2016 at 11:54am -0400,
Jens Axboe <axboe@kernel.dk> wrote:
> On 08/05/2016 09:42 AM, Mike Snitzer wrote:
> >On Fri, Aug 05 2016 at 11:33P -0400,
> >Jens Axboe <axboe@kernel.dk> wrote:
> >
> >>On 08/05/2016 09:27 AM, Mike Snitzer wrote:
> >>>On Wed, Aug 03 2016 at 11:35am -0400,
> >>>Benjamin Block <bblock@linux.vnet.ibm.com> wrote:
> >>>
> >>>>Hej Mike,
> >>>>
> >>>>when running a debug-kernel today with several multipath-devices using
> >>>>the round-robin path selector I noticed that the kernel throws these
> >>>>warnings here:
> >>>>
> >>>>BUG: using smp_processor_id() in preemptible [00000000] code: kdmwork-252:0/881
> >>>>caller is rr_select_path+0x36/0x108 [dm_round_robin]
> >>>>CPU: 1 PID: 881 Comm: kdmwork-252:0 Not tainted 4.7.0-debug #4
> >>>> 00000000617679b8 0000000061767a48 0000000000000002 0000000000000000
> >>>> 0000000061767ae8 0000000061767a60 0000000061767a60 00000000001145d0
> >>>> 0000000000000000 0000000000b962ae 0000000000bb291e 000000000000000b
> >>>> 0000000061767aa8 0000000061767a48 0000000000000000 0000000000000000
> >>>> 0700000000b962ae 00000000001145d0 0000000061767a48 0000000061767aa8
> >>>>Call Trace:
> >>>>([<00000000001144a2>] show_trace+0x8a/0xe0)
> >>>>([<0000000000114586>] show_stack+0x8e/0xf0)
> >>>>([<00000000006c7fdc>] dump_stack+0x9c/0xe0)
> >>>>([<00000000006fbbc0>] check_preemption_disabled+0x108/0x130)
> >>>>([<000003ff80268646>] rr_select_path+0x36/0x108 [dm_round_robin])
> >>>>([<000003ff80259a42>] choose_path_in_pg+0x42/0xc8 [dm_multipath])
> >>>>([<000003ff80259b62>] choose_pgpath+0x9a/0x1a0 [dm_multipath])
> >>>>([<000003ff8025b51a>] __multipath_map.isra.5+0x72/0x228 [dm_multipath])
> >>>>([<000003ff8025b75e>] multipath_map+0x3e/0x50 [dm_multipath])
> >>>>([<000003ff80225eb6>] map_request+0x66/0x458 [dm_mod])
> >>>>([<000003ff802262ec>] map_tio_request+0x44/0x70 [dm_mod])
> >>>>([<000000000016835a>] kthread_worker_fn+0xf2/0x1d8)
> >>>>([<00000000001681da>] kthread+0x112/0x120)
> >>>>([<000000000098378a>] kernel_thread_starter+0x6/0xc)
> >>>>([<0000000000983784>] kernel_thread_starter+0x0/0xc)
> >>>>no locks held by kdmwork-252:0/881.
> >>>>
> >>>>
> >>>>There are several more of these warnings, but all have the same
> >>>>stack-trace (this is on s390x, but this looks like its only common code)
> >>>>- sometimes the process-context is multipath.
> >>>>
> >>>>Looking at the changes in this function, it rather looks like this is
> >>>>caused by changes made in commit
> >>>>b0b477c7e0dd93f8916d106018ded1331b81bf61 (dm round robin: use percpu
> >>>>'repeat_count' and 'current_path').
> >>>>
> >>>>The kernel is a stock v4.7 with some debug options enabled (prominently
> >>>>DEBUG_PREEMPT). Need any more info?
> >>>
> >>>As you can see from commit b0b477c7e0dd9 the round-robin path selector
> >>>is now using percpu data (pointer) and a percpu_counter.
> >>>
> >>>I'm really not sure how else to access this percpu data.
> >>>
> >>>Cc'ing LKML, linux-block, Jens and hch to cast a wider net in the hopes
> >>>of getting an answer to how to fix this.
> >>
> >>From a quick look, looks like you are using this_cpu_ptr() without
> >>having preemption disabled.
> >
> >Right, that is what it looked like to me too.
> >
> >I'm just not sure on what the proper pattern is to fix this.
> >
> >I'll look closer though.
>
> I always forget the details (if this confuses lockdep or not), but you
> could potentially turn it into:
>
> local_irq_save(flags);
> x = this_cpu_ptr();
> [...]
>
> spin_lock(&s->lock);
> [...]
>
> instead.
Cool, I've coded up the patch (compile tested only).
Benjamin, any chance you could test this against your v4.7 kernel and
report back?
Thanks,
Mike
diff --git a/drivers/md/dm-round-robin.c b/drivers/md/dm-round-robin.c
index 4ace1da..ed446f8 100644
--- a/drivers/md/dm-round-robin.c
+++ b/drivers/md/dm-round-robin.c
@@ -210,14 +210,17 @@ static struct dm_path *rr_select_path(struct path_selector *ps, size_t nr_bytes)
struct path_info *pi = NULL;
struct dm_path *current_path = NULL;
+ local_irq_save(flags);
current_path = *this_cpu_ptr(s->current_path);
if (current_path) {
percpu_counter_dec(&s->repeat_count);
- if (percpu_counter_read_positive(&s->repeat_count) > 0)
+ if (percpu_counter_read_positive(&s->repeat_count) > 0) {
+ local_irq_restore(flags);
return current_path;
+ }
}
- spin_lock_irqsave(&s->lock, flags);
+ spin_lock(&s->lock);
if (!list_empty(&s->valid_paths)) {
pi = list_entry(s->valid_paths.next, struct path_info, list);
list_move_tail(&pi->list, &s->valid_paths);
@@ -225,7 +228,8 @@ static struct dm_path *rr_select_path(struct path_selector *ps, size_t nr_bytes)
set_percpu_current_path(s, pi->path);
current_path = pi->path;
}
- spin_unlock_irqrestore(&s->lock, flags);
+ spin_unlock(&s->lock);
+ local_irq_restore(flags);
return current_path;
}
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: bug: using smp_processor_id() in preemptible code in rr_select_path()
2016-08-05 16:06 ` Mike Snitzer
@ 2016-08-05 16:11 ` Jens Axboe
2016-08-07 15:49 ` Benjamin Block
2016-08-08 16:32 ` [dm-devel] " Benjamin Block
2 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2016-08-05 16:11 UTC (permalink / raw)
To: Mike Snitzer, Benjamin Block; +Cc: dm-devel, linux-kernel, hch, linux-block
On 08/05/2016 10:06 AM, Mike Snitzer wrote:
> On Fri, Aug 05 2016 at 11:54am -0400,
> Jens Axboe <axboe@kernel.dk> wrote:
>
>> On 08/05/2016 09:42 AM, Mike Snitzer wrote:
>>> On Fri, Aug 05 2016 at 11:33P -0400,
>>> Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>>> On 08/05/2016 09:27 AM, Mike Snitzer wrote:
>>>>> On Wed, Aug 03 2016 at 11:35am -0400,
>>>>> Benjamin Block <bblock@linux.vnet.ibm.com> wrote:
>>>>>
>>>>>> Hej Mike,
>>>>>>
>>>>>> when running a debug-kernel today with several multipath-devices using
>>>>>> the round-robin path selector I noticed that the kernel throws these
>>>>>> warnings here:
>>>>>>
>>>>>> BUG: using smp_processor_id() in preemptible [00000000] code: kdmwork-252:0/881
>>>>>> caller is rr_select_path+0x36/0x108 [dm_round_robin]
>>>>>> CPU: 1 PID: 881 Comm: kdmwork-252:0 Not tainted 4.7.0-debug #4
>>>>>> 00000000617679b8 0000000061767a48 0000000000000002 0000000000000000
>>>>>> 0000000061767ae8 0000000061767a60 0000000061767a60 00000000001145d0
>>>>>> 0000000000000000 0000000000b962ae 0000000000bb291e 000000000000000b
>>>>>> 0000000061767aa8 0000000061767a48 0000000000000000 0000000000000000
>>>>>> 0700000000b962ae 00000000001145d0 0000000061767a48 0000000061767aa8
>>>>>> Call Trace:
>>>>>> ([<00000000001144a2>] show_trace+0x8a/0xe0)
>>>>>> ([<0000000000114586>] show_stack+0x8e/0xf0)
>>>>>> ([<00000000006c7fdc>] dump_stack+0x9c/0xe0)
>>>>>> ([<00000000006fbbc0>] check_preemption_disabled+0x108/0x130)
>>>>>> ([<000003ff80268646>] rr_select_path+0x36/0x108 [dm_round_robin])
>>>>>> ([<000003ff80259a42>] choose_path_in_pg+0x42/0xc8 [dm_multipath])
>>>>>> ([<000003ff80259b62>] choose_pgpath+0x9a/0x1a0 [dm_multipath])
>>>>>> ([<000003ff8025b51a>] __multipath_map.isra.5+0x72/0x228 [dm_multipath])
>>>>>> ([<000003ff8025b75e>] multipath_map+0x3e/0x50 [dm_multipath])
>>>>>> ([<000003ff80225eb6>] map_request+0x66/0x458 [dm_mod])
>>>>>> ([<000003ff802262ec>] map_tio_request+0x44/0x70 [dm_mod])
>>>>>> ([<000000000016835a>] kthread_worker_fn+0xf2/0x1d8)
>>>>>> ([<00000000001681da>] kthread+0x112/0x120)
>>>>>> ([<000000000098378a>] kernel_thread_starter+0x6/0xc)
>>>>>> ([<0000000000983784>] kernel_thread_starter+0x0/0xc)
>>>>>> no locks held by kdmwork-252:0/881.
>>>>>>
>>>>>>
>>>>>> There are several more of these warnings, but all have the same
>>>>>> stack-trace (this is on s390x, but this looks like its only common code)
>>>>>> - sometimes the process-context is multipath.
>>>>>>
>>>>>> Looking at the changes in this function, it rather looks like this is
>>>>>> caused by changes made in commit
>>>>>> b0b477c7e0dd93f8916d106018ded1331b81bf61 (dm round robin: use percpu
>>>>>> 'repeat_count' and 'current_path').
>>>>>>
>>>>>> The kernel is a stock v4.7 with some debug options enabled (prominently
>>>>>> DEBUG_PREEMPT). Need any more info?
>>>>>
>>>>> As you can see from commit b0b477c7e0dd9 the round-robin path selector
>>>>> is now using percpu data (pointer) and a percpu_counter.
>>>>>
>>>>> I'm really not sure how else to access this percpu data.
>>>>>
>>>>> Cc'ing LKML, linux-block, Jens and hch to cast a wider net in the hopes
>>>>> of getting an answer to how to fix this.
>>>>
>>> >From a quick look, looks like you are using this_cpu_ptr() without
>>>> having preemption disabled.
>>>
>>> Right, that is what it looked like to me too.
>>>
>>> I'm just not sure on what the proper pattern is to fix this.
>>>
>>> I'll look closer though.
>>
>> I always forget the details (if this confuses lockdep or not), but you
>> could potentially turn it into:
>>
>> local_irq_save(flags);
>> x = this_cpu_ptr();
>> [...]
>>
>> spin_lock(&s->lock);
>> [...]
>>
>> instead.
>
> Cool, I've coded up the patch (compile tested only).
>
> Benjamin, any chance you could test this against your v4.7 kernel and
> report back?
>
> Thanks,
> Mike
>
> diff --git a/drivers/md/dm-round-robin.c b/drivers/md/dm-round-robin.c
> index 4ace1da..ed446f8 100644
> --- a/drivers/md/dm-round-robin.c
> +++ b/drivers/md/dm-round-robin.c
> @@ -210,14 +210,17 @@ static struct dm_path *rr_select_path(struct path_selector *ps, size_t nr_bytes)
> struct path_info *pi = NULL;
> struct dm_path *current_path = NULL;
>
> + local_irq_save(flags);
> current_path = *this_cpu_ptr(s->current_path);
> if (current_path) {
> percpu_counter_dec(&s->repeat_count);
> - if (percpu_counter_read_positive(&s->repeat_count) > 0)
> + if (percpu_counter_read_positive(&s->repeat_count) > 0) {
> + local_irq_restore(flags);
> return current_path;
> + }
> }
>
> - spin_lock_irqsave(&s->lock, flags);
> + spin_lock(&s->lock);
> if (!list_empty(&s->valid_paths)) {
> pi = list_entry(s->valid_paths.next, struct path_info, list);
> list_move_tail(&pi->list, &s->valid_paths);
> @@ -225,7 +228,8 @@ static struct dm_path *rr_select_path(struct path_selector *ps, size_t nr_bytes)
> set_percpu_current_path(s, pi->path);
> current_path = pi->path;
Just leave that as-is, should be fine.
--
Jens Axboe
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: bug: using smp_processor_id() in preemptible code in rr_select_path()
2016-08-05 16:06 ` Mike Snitzer
2016-08-05 16:11 ` Jens Axboe
@ 2016-08-07 15:49 ` Benjamin Block
2016-08-08 16:32 ` [dm-devel] " Benjamin Block
2 siblings, 0 replies; 10+ messages in thread
From: Benjamin Block @ 2016-08-07 15:49 UTC (permalink / raw)
To: Mike Snitzer; +Cc: Jens Axboe, dm-devel, linux-kernel, hch, linux-block
On 12:06 Fri 05 Aug , Mike Snitzer wrote:
> On Fri, Aug 05 2016 at 11:54am -0400,
> Jens Axboe <axboe@kernel.dk> wrote:
>
> > On 08/05/2016 09:42 AM, Mike Snitzer wrote:
> > >On Fri, Aug 05 2016 at 11:33P -0400,
> > >Jens Axboe <axboe@kernel.dk> wrote:
> > >
> > >>On 08/05/2016 09:27 AM, Mike Snitzer wrote:
> > >>>On Wed, Aug 03 2016 at 11:35am -0400,
> > >>>Benjamin Block <bblock@linux.vnet.ibm.com> wrote:
> > >>>
> > >>>>Hej Mike,
> > >>>>
> > >>>>when running a debug-kernel today with several multipath-devices using
> > >>>>the round-robin path selector I noticed that the kernel throws these
> > >>>>warnings here:
> > >>>>
> > >>>>BUG: using smp_processor_id() in preemptible [00000000] code: kdmwork-252:0/881
> > >>>>caller is rr_select_path+0x36/0x108 [dm_round_robin]
> > >>>>CPU: 1 PID: 881 Comm: kdmwork-252:0 Not tainted 4.7.0-debug #4
> > >>>> 00000000617679b8 0000000061767a48 0000000000000002 0000000000000000
> > >>>> 0000000061767ae8 0000000061767a60 0000000061767a60 00000000001145d0
> > >>>> 0000000000000000 0000000000b962ae 0000000000bb291e 000000000000000b
> > >>>> 0000000061767aa8 0000000061767a48 0000000000000000 0000000000000000
> > >>>> 0700000000b962ae 00000000001145d0 0000000061767a48 0000000061767aa8
> > >>>>Call Trace:
> > >>>>([<00000000001144a2>] show_trace+0x8a/0xe0)
> > >>>>([<0000000000114586>] show_stack+0x8e/0xf0)
> > >>>>([<00000000006c7fdc>] dump_stack+0x9c/0xe0)
> > >>>>([<00000000006fbbc0>] check_preemption_disabled+0x108/0x130)
> > >>>>([<000003ff80268646>] rr_select_path+0x36/0x108 [dm_round_robin])
> > >>>>([<000003ff80259a42>] choose_path_in_pg+0x42/0xc8 [dm_multipath])
> > >>>>([<000003ff80259b62>] choose_pgpath+0x9a/0x1a0 [dm_multipath])
> > >>>>([<000003ff8025b51a>] __multipath_map.isra.5+0x72/0x228 [dm_multipath])
> > >>>>([<000003ff8025b75e>] multipath_map+0x3e/0x50 [dm_multipath])
> > >>>>([<000003ff80225eb6>] map_request+0x66/0x458 [dm_mod])
> > >>>>([<000003ff802262ec>] map_tio_request+0x44/0x70 [dm_mod])
> > >>>>([<000000000016835a>] kthread_worker_fn+0xf2/0x1d8)
> > >>>>([<00000000001681da>] kthread+0x112/0x120)
> > >>>>([<000000000098378a>] kernel_thread_starter+0x6/0xc)
> > >>>>([<0000000000983784>] kernel_thread_starter+0x0/0xc)
> > >>>>no locks held by kdmwork-252:0/881.
> > >>>>
> > >>>>
> > >>>>There are several more of these warnings, but all have the same
> > >>>>stack-trace (this is on s390x, but this looks like its only common code)
> > >>>>- sometimes the process-context is multipath.
> > >>>>
> > >>>>Looking at the changes in this function, it rather looks like this is
> > >>>>caused by changes made in commit
> > >>>>b0b477c7e0dd93f8916d106018ded1331b81bf61 (dm round robin: use percpu
> > >>>>'repeat_count' and 'current_path').
> > >>>>
> > >>>>The kernel is a stock v4.7 with some debug options enabled (prominently
> > >>>>DEBUG_PREEMPT). Need any more info?
> > >>>
> > >>>As you can see from commit b0b477c7e0dd9 the round-robin path selector
> > >>>is now using percpu data (pointer) and a percpu_counter.
> > >>>
> > >>>I'm really not sure how else to access this percpu data.
> > >>>
> > >>>Cc'ing LKML, linux-block, Jens and hch to cast a wider net in the hopes
> > >>>of getting an answer to how to fix this.
> > >>
> > >>From a quick look, looks like you are using this_cpu_ptr() without
> > >>having preemption disabled.
> > >
> > >Right, that is what it looked like to me too.
> > >
> > >I'm just not sure on what the proper pattern is to fix this.
> > >
> > >I'll look closer though.
> >
> > I always forget the details (if this confuses lockdep or not), but you
> > could potentially turn it into:
> >
> > local_irq_save(flags);
> > x = this_cpu_ptr();
> > [...]
> >
> > spin_lock(&s->lock);
> > [...]
> >
> > instead.
>
> Cool, I've coded up the patch (compile tested only).
>
> Benjamin, any chance you could test this against your v4.7 kernel and
> report back?
>
I will give it a go on Monday. Thx for the quick fix.
- Benjamin
>
> diff --git a/drivers/md/dm-round-robin.c b/drivers/md/dm-round-robin.c
> index 4ace1da..ed446f8 100644
> --- a/drivers/md/dm-round-robin.c
> +++ b/drivers/md/dm-round-robin.c
> @@ -210,14 +210,17 @@ static struct dm_path *rr_select_path(struct path_selector *ps, size_t nr_bytes)
> struct path_info *pi = NULL;
> struct dm_path *current_path = NULL;
>
> + local_irq_save(flags);
> current_path = *this_cpu_ptr(s->current_path);
> if (current_path) {
> percpu_counter_dec(&s->repeat_count);
> - if (percpu_counter_read_positive(&s->repeat_count) > 0)
> + if (percpu_counter_read_positive(&s->repeat_count) > 0) {
> + local_irq_restore(flags);
> return current_path;
> + }
> }
>
> - spin_lock_irqsave(&s->lock, flags);
> + spin_lock(&s->lock);
> if (!list_empty(&s->valid_paths)) {
> pi = list_entry(s->valid_paths.next, struct path_info, list);
> list_move_tail(&pi->list, &s->valid_paths);
> @@ -225,7 +228,8 @@ static struct dm_path *rr_select_path(struct path_selector *ps, size_t nr_bytes)
> set_percpu_current_path(s, pi->path);
> current_path = pi->path;
> }
> - spin_unlock_irqrestore(&s->lock, flags);
> + spin_unlock(&s->lock);
> + local_irq_restore(flags);
>
> return current_path;
> }
>
--
Linux on z Systems Development / IBM Systems & Technology Group
IBM Deutschland Research & Development GmbH
Vorsitz. AufsR.: Martina Koederitz / Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [dm-devel] bug: using smp_processor_id() in preemptible code in rr_select_path()
2016-08-05 16:06 ` Mike Snitzer
2016-08-05 16:11 ` Jens Axboe
2016-08-07 15:49 ` Benjamin Block
@ 2016-08-08 16:32 ` Benjamin Block
2016-08-08 16:39 ` Jens Axboe
2 siblings, 1 reply; 10+ messages in thread
From: Benjamin Block @ 2016-08-08 16:32 UTC (permalink / raw)
To: Mike Snitzer; +Cc: Jens Axboe, linux-block, dm-devel, linux-kernel, hch
On 12:06 Fri 05 Aug , Mike Snitzer wrote:
> On Fri, Aug 05 2016 at 11:54am -0400,
> Jens Axboe <axboe@kernel.dk> wrote:
>
> > On 08/05/2016 09:42 AM, Mike Snitzer wrote:
> > >On Fri, Aug 05 2016 at 11:33P -0400,
> > >Jens Axboe <axboe@kernel.dk> wrote:
> > >
> > >>On 08/05/2016 09:27 AM, Mike Snitzer wrote:
> > >>>On Wed, Aug 03 2016 at 11:35am -0400,
> > >>>Benjamin Block <bblock@linux.vnet.ibm.com> wrote:
> > >>>
> > >>>>Hej Mike,
> > >>>>
> > >>>>when running a debug-kernel today with several multipath-devices using
> > >>>>the round-robin path selector I noticed that the kernel throws these
> > >>>>warnings here:
> > >>>>
> > >>>>BUG: using smp_processor_id() in preemptible [00000000] code: kdmwork-252:0/881
> > >>>>caller is rr_select_path+0x36/0x108 [dm_round_robin]
> > >>>>CPU: 1 PID: 881 Comm: kdmwork-252:0 Not tainted 4.7.0-debug #4
> > >>>> 00000000617679b8 0000000061767a48 0000000000000002 0000000000000000
> > >>>> 0000000061767ae8 0000000061767a60 0000000061767a60 00000000001145d0
> > >>>> 0000000000000000 0000000000b962ae 0000000000bb291e 000000000000000b
> > >>>> 0000000061767aa8 0000000061767a48 0000000000000000 0000000000000000
> > >>>> 0700000000b962ae 00000000001145d0 0000000061767a48 0000000061767aa8
> > >>>>Call Trace:
> > >>>>([<00000000001144a2>] show_trace+0x8a/0xe0)
> > >>>>([<0000000000114586>] show_stack+0x8e/0xf0)
> > >>>>([<00000000006c7fdc>] dump_stack+0x9c/0xe0)
> > >>>>([<00000000006fbbc0>] check_preemption_disabled+0x108/0x130)
> > >>>>([<000003ff80268646>] rr_select_path+0x36/0x108 [dm_round_robin])
> > >>>>([<000003ff80259a42>] choose_path_in_pg+0x42/0xc8 [dm_multipath])
> > >>>>([<000003ff80259b62>] choose_pgpath+0x9a/0x1a0 [dm_multipath])
> > >>>>([<000003ff8025b51a>] __multipath_map.isra.5+0x72/0x228 [dm_multipath])
> > >>>>([<000003ff8025b75e>] multipath_map+0x3e/0x50 [dm_multipath])
> > >>>>([<000003ff80225eb6>] map_request+0x66/0x458 [dm_mod])
> > >>>>([<000003ff802262ec>] map_tio_request+0x44/0x70 [dm_mod])
> > >>>>([<000000000016835a>] kthread_worker_fn+0xf2/0x1d8)
> > >>>>([<00000000001681da>] kthread+0x112/0x120)
> > >>>>([<000000000098378a>] kernel_thread_starter+0x6/0xc)
> > >>>>([<0000000000983784>] kernel_thread_starter+0x0/0xc)
> > >>>>no locks held by kdmwork-252:0/881.
> > >>>>
[:snip:]
> >
> > I always forget the details (if this confuses lockdep or not), but you
> > could potentially turn it into:
> >
> > local_irq_save(flags);
> > x = this_cpu_ptr();
> > [...]
> >
> > spin_lock(&s->lock);
> > [...]
> >
> > instead.
>
> Cool, I've coded up the patch (compile tested only).
>
> Benjamin, any chance you could test this against your v4.7 kernel and
> report back?
>
> Thanks,
> Mike
>
> diff --git a/drivers/md/dm-round-robin.c b/drivers/md/dm-round-robin.c
> index 4ace1da..ed446f8 100644
> --- a/drivers/md/dm-round-robin.c
> +++ b/drivers/md/dm-round-robin.c
> @@ -210,14 +210,17 @@ static struct dm_path *rr_select_path(struct path_selector *ps, size_t nr_bytes)
> struct path_info *pi = NULL;
> struct dm_path *current_path = NULL;
>
> + local_irq_save(flags);
> current_path = *this_cpu_ptr(s->current_path);
> if (current_path) {
> percpu_counter_dec(&s->repeat_count);
> - if (percpu_counter_read_positive(&s->repeat_count) > 0)
> + if (percpu_counter_read_positive(&s->repeat_count) > 0) {
> + local_irq_restore(flags);
> return current_path;
> + }
> }
>
> - spin_lock_irqsave(&s->lock, flags);
> + spin_lock(&s->lock);
> if (!list_empty(&s->valid_paths)) {
> pi = list_entry(s->valid_paths.next, struct path_info, list);
> list_move_tail(&pi->list, &s->valid_paths);
> @@ -225,7 +228,8 @@ static struct dm_path *rr_select_path(struct path_selector *ps, size_t nr_bytes)
> set_percpu_current_path(s, pi->path);
> current_path = pi->path;
> }
> - spin_unlock_irqrestore(&s->lock, flags);
> + spin_unlock(&s->lock);
> + local_irq_restore(flags);
>
> return current_path;
> }
>
Ok, this works as far as the warnings don't appear anymore. But while
applying the patch and thinking about it, why local_irq_save() and not
preempt_disable()? "Sounds" like this is the function you want, and I
also stumbled across this in Documentation/preempt-locking.txt:
But keep in mind that 'irqs disabled' is a fundamentally unsafe way of
disabling preemption - any spin_unlock() decreasing the preemption
count to 0 might trigger a reschedule.
The spinlock would do an other nested preempt_disable(), but those even
out.
Beste Grüße / Best regards,
- Benjamin Block
--
Linux on z Systems Development / IBM Systems & Technology Group
IBM Deutschland Research & Development GmbH
Vorsitz. AufsR.: Martina Koederitz / Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [dm-devel] bug: using smp_processor_id() in preemptible code in rr_select_path()
2016-08-08 16:32 ` [dm-devel] " Benjamin Block
@ 2016-08-08 16:39 ` Jens Axboe
2016-08-08 17:04 ` Benjamin Block
0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2016-08-08 16:39 UTC (permalink / raw)
To: Benjamin Block, Mike Snitzer; +Cc: linux-block, dm-devel, linux-kernel, hch
On 08/08/2016 10:32 AM, Benjamin Block wrote:
> On 12:06 Fri 05 Aug , Mike Snitzer wrote:
>> On Fri, Aug 05 2016 at 11:54am -0400,
>> Jens Axboe <axboe@kernel.dk> wrote:
>>
>>> On 08/05/2016 09:42 AM, Mike Snitzer wrote:
>>>> On Fri, Aug 05 2016 at 11:33P -0400,
>>>> Jens Axboe <axboe@kernel.dk> wrote:
>>>>
>>>>> On 08/05/2016 09:27 AM, Mike Snitzer wrote:
>>>>>> On Wed, Aug 03 2016 at 11:35am -0400,
>>>>>> Benjamin Block <bblock@linux.vnet.ibm.com> wrote:
>>>>>>
>>>>>>> Hej Mike,
>>>>>>>
>>>>>>> when running a debug-kernel today with several multipath-devices using
>>>>>>> the round-robin path selector I noticed that the kernel throws these
>>>>>>> warnings here:
>>>>>>>
>>>>>>> BUG: using smp_processor_id() in preemptible [00000000] code: kdmwork-252:0/881
>>>>>>> caller is rr_select_path+0x36/0x108 [dm_round_robin]
>>>>>>> CPU: 1 PID: 881 Comm: kdmwork-252:0 Not tainted 4.7.0-debug #4
>>>>>>> 00000000617679b8 0000000061767a48 0000000000000002 0000000000000000
>>>>>>> 0000000061767ae8 0000000061767a60 0000000061767a60 00000000001145d0
>>>>>>> 0000000000000000 0000000000b962ae 0000000000bb291e 000000000000000b
>>>>>>> 0000000061767aa8 0000000061767a48 0000000000000000 0000000000000000
>>>>>>> 0700000000b962ae 00000000001145d0 0000000061767a48 0000000061767aa8
>>>>>>> Call Trace:
>>>>>>> ([<00000000001144a2>] show_trace+0x8a/0xe0)
>>>>>>> ([<0000000000114586>] show_stack+0x8e/0xf0)
>>>>>>> ([<00000000006c7fdc>] dump_stack+0x9c/0xe0)
>>>>>>> ([<00000000006fbbc0>] check_preemption_disabled+0x108/0x130)
>>>>>>> ([<000003ff80268646>] rr_select_path+0x36/0x108 [dm_round_robin])
>>>>>>> ([<000003ff80259a42>] choose_path_in_pg+0x42/0xc8 [dm_multipath])
>>>>>>> ([<000003ff80259b62>] choose_pgpath+0x9a/0x1a0 [dm_multipath])
>>>>>>> ([<000003ff8025b51a>] __multipath_map.isra.5+0x72/0x228 [dm_multipath])
>>>>>>> ([<000003ff8025b75e>] multipath_map+0x3e/0x50 [dm_multipath])
>>>>>>> ([<000003ff80225eb6>] map_request+0x66/0x458 [dm_mod])
>>>>>>> ([<000003ff802262ec>] map_tio_request+0x44/0x70 [dm_mod])
>>>>>>> ([<000000000016835a>] kthread_worker_fn+0xf2/0x1d8)
>>>>>>> ([<00000000001681da>] kthread+0x112/0x120)
>>>>>>> ([<000000000098378a>] kernel_thread_starter+0x6/0xc)
>>>>>>> ([<0000000000983784>] kernel_thread_starter+0x0/0xc)
>>>>>>> no locks held by kdmwork-252:0/881.
>>>>>>>
> [:snip:]
>>>
>>> I always forget the details (if this confuses lockdep or not), but you
>>> could potentially turn it into:
>>>
>>> local_irq_save(flags);
>>> x = this_cpu_ptr();
>>> [...]
>>>
>>> spin_lock(&s->lock);
>>> [...]
>>>
>>> instead.
>>
>> Cool, I've coded up the patch (compile tested only).
>>
>> Benjamin, any chance you could test this against your v4.7 kernel and
>> report back?
>>
>> Thanks,
>> Mike
>>
>> diff --git a/drivers/md/dm-round-robin.c b/drivers/md/dm-round-robin.c
>> index 4ace1da..ed446f8 100644
>> --- a/drivers/md/dm-round-robin.c
>> +++ b/drivers/md/dm-round-robin.c
>> @@ -210,14 +210,17 @@ static struct dm_path *rr_select_path(struct path_selector *ps, size_t nr_bytes)
>> struct path_info *pi = NULL;
>> struct dm_path *current_path = NULL;
>>
>> + local_irq_save(flags);
>> current_path = *this_cpu_ptr(s->current_path);
>> if (current_path) {
>> percpu_counter_dec(&s->repeat_count);
>> - if (percpu_counter_read_positive(&s->repeat_count) > 0)
>> + if (percpu_counter_read_positive(&s->repeat_count) > 0) {
>> + local_irq_restore(flags);
>> return current_path;
>> + }
>> }
>>
>> - spin_lock_irqsave(&s->lock, flags);
>> + spin_lock(&s->lock);
>> if (!list_empty(&s->valid_paths)) {
>> pi = list_entry(s->valid_paths.next, struct path_info, list);
>> list_move_tail(&pi->list, &s->valid_paths);
>> @@ -225,7 +228,8 @@ static struct dm_path *rr_select_path(struct path_selector *ps, size_t nr_bytes)
>> set_percpu_current_path(s, pi->path);
>> current_path = pi->path;
>> }
>> - spin_unlock_irqrestore(&s->lock, flags);
>> + spin_unlock(&s->lock);
>> + local_irq_restore(flags);
>>
>> return current_path;
>> }
>>
>
> Ok, this works as far as the warnings don't appear anymore. But while
> applying the patch and thinking about it, why local_irq_save() and not
> preempt_disable()? "Sounds" like this is the function you want, and I
> also stumbled across this in Documentation/preempt-locking.txt:
>
> But keep in mind that 'irqs disabled' is a fundamentally unsafe way of
> disabling preemption - any spin_unlock() decreasing the preemption
> count to 0 might trigger a reschedule.
>
> The spinlock would do an other nested preempt_disable(), but those even
> out.
local_irq_save(), since we need to grab the lock irq safe very shortly
anyway. As long as they nest properly, the approach is fine, and it's
more efficient than first doing a preempt_disable(), then still needing
a irq safe spinlock.
--
Jens Axboe
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [dm-devel] bug: using smp_processor_id() in preemptible code in rr_select_path()
2016-08-08 16:39 ` Jens Axboe
@ 2016-08-08 17:04 ` Benjamin Block
0 siblings, 0 replies; 10+ messages in thread
From: Benjamin Block @ 2016-08-08 17:04 UTC (permalink / raw)
To: Jens Axboe; +Cc: Mike Snitzer, linux-block, dm-devel, linux-kernel, hch
On 10:39 Mon 08 Aug , Jens Axboe wrote:
> On 08/08/2016 10:32 AM, Benjamin Block wrote:
> >On 12:06 Fri 05 Aug , Mike Snitzer wrote:
> >>On Fri, Aug 05 2016 at 11:54am -0400,
> >>Jens Axboe <axboe@kernel.dk> wrote:
> >>
> >>>On 08/05/2016 09:42 AM, Mike Snitzer wrote:
> >>>>On Fri, Aug 05 2016 at 11:33P -0400,
> >>>>Jens Axboe <axboe@kernel.dk> wrote:
> >>>>
> >>>>>On 08/05/2016 09:27 AM, Mike Snitzer wrote:
> >>>>>>On Wed, Aug 03 2016 at 11:35am -0400,
> >>>>>>Benjamin Block <bblock@linux.vnet.ibm.com> wrote:
> >>>>>>
> >>>>>>>Hej Mike,
> >>>>>>>
> >>>>>>>when running a debug-kernel today with several multipath-devices using
> >>>>>>>the round-robin path selector I noticed that the kernel throws these
> >>>>>>>warnings here:
> >>>>>>>
> >>>>>>>BUG: using smp_processor_id() in preemptible [00000000] code: kdmwork-252:0/881
> >>>>>>>caller is rr_select_path+0x36/0x108 [dm_round_robin]
> >>>>>>>CPU: 1 PID: 881 Comm: kdmwork-252:0 Not tainted 4.7.0-debug #4
> >>>>>>> 00000000617679b8 0000000061767a48 0000000000000002 0000000000000000
> >>>>>>> 0000000061767ae8 0000000061767a60 0000000061767a60 00000000001145d0
> >>>>>>> 0000000000000000 0000000000b962ae 0000000000bb291e 000000000000000b
> >>>>>>> 0000000061767aa8 0000000061767a48 0000000000000000 0000000000000000
> >>>>>>> 0700000000b962ae 00000000001145d0 0000000061767a48 0000000061767aa8
> >>>>>>>Call Trace:
> >>>>>>>([<00000000001144a2>] show_trace+0x8a/0xe0)
> >>>>>>>([<0000000000114586>] show_stack+0x8e/0xf0)
> >>>>>>>([<00000000006c7fdc>] dump_stack+0x9c/0xe0)
> >>>>>>>([<00000000006fbbc0>] check_preemption_disabled+0x108/0x130)
> >>>>>>>([<000003ff80268646>] rr_select_path+0x36/0x108 [dm_round_robin])
> >>>>>>>([<000003ff80259a42>] choose_path_in_pg+0x42/0xc8 [dm_multipath])
> >>>>>>>([<000003ff80259b62>] choose_pgpath+0x9a/0x1a0 [dm_multipath])
> >>>>>>>([<000003ff8025b51a>] __multipath_map.isra.5+0x72/0x228 [dm_multipath])
> >>>>>>>([<000003ff8025b75e>] multipath_map+0x3e/0x50 [dm_multipath])
> >>>>>>>([<000003ff80225eb6>] map_request+0x66/0x458 [dm_mod])
> >>>>>>>([<000003ff802262ec>] map_tio_request+0x44/0x70 [dm_mod])
> >>>>>>>([<000000000016835a>] kthread_worker_fn+0xf2/0x1d8)
> >>>>>>>([<00000000001681da>] kthread+0x112/0x120)
> >>>>>>>([<000000000098378a>] kernel_thread_starter+0x6/0xc)
> >>>>>>>([<0000000000983784>] kernel_thread_starter+0x0/0xc)
> >>>>>>>no locks held by kdmwork-252:0/881.
> >>>>>>>
> >[:snip:]
> >>>
> >>>I always forget the details (if this confuses lockdep or not), but you
> >>>could potentially turn it into:
> >>>
> >>>local_irq_save(flags);
> >>>x = this_cpu_ptr();
> >>>[...]
> >>>
> >>>spin_lock(&s->lock);
> >>>[...]
> >>>
> >>>instead.
> >>
> >>Cool, I've coded up the patch (compile tested only).
> >>
> >>Benjamin, any chance you could test this against your v4.7 kernel and
> >>report back?
> >>
> >>Thanks,
> >>Mike
> >>
> >>diff --git a/drivers/md/dm-round-robin.c b/drivers/md/dm-round-robin.c
> >>index 4ace1da..ed446f8 100644
> >>--- a/drivers/md/dm-round-robin.c
> >>+++ b/drivers/md/dm-round-robin.c
> >>@@ -210,14 +210,17 @@ static struct dm_path *rr_select_path(struct path_selector *ps, size_t nr_bytes)
> >> struct path_info *pi = NULL;
> >> struct dm_path *current_path = NULL;
> >>
> >>+ local_irq_save(flags);
> >> current_path = *this_cpu_ptr(s->current_path);
> >> if (current_path) {
> >> percpu_counter_dec(&s->repeat_count);
> >>- if (percpu_counter_read_positive(&s->repeat_count) > 0)
> >>+ if (percpu_counter_read_positive(&s->repeat_count) > 0) {
> >>+ local_irq_restore(flags);
> >> return current_path;
> >>+ }
> >> }
> >>
> >>- spin_lock_irqsave(&s->lock, flags);
> >>+ spin_lock(&s->lock);
> >> if (!list_empty(&s->valid_paths)) {
> >> pi = list_entry(s->valid_paths.next, struct path_info, list);
> >> list_move_tail(&pi->list, &s->valid_paths);
> >>@@ -225,7 +228,8 @@ static struct dm_path *rr_select_path(struct path_selector *ps, size_t nr_bytes)
> >> set_percpu_current_path(s, pi->path);
> >> current_path = pi->path;
> >> }
> >>- spin_unlock_irqrestore(&s->lock, flags);
> >>+ spin_unlock(&s->lock);
> >>+ local_irq_restore(flags);
> >>
> >> return current_path;
> >> }
> >>
> >
> >Ok, this works as far as the warnings don't appear anymore. But while
> >applying the patch and thinking about it, why local_irq_save() and not
> >preempt_disable()? "Sounds" like this is the function you want, and I
> >also stumbled across this in Documentation/preempt-locking.txt:
> >
> > But keep in mind that 'irqs disabled' is a fundamentally unsafe way of
> > disabling preemption - any spin_unlock() decreasing the preemption
> > count to 0 might trigger a reschedule.
> >
> >The spinlock would do an other nested preempt_disable(), but those even
> >out.
>
> local_irq_save(), since we need to grab the lock irq safe very shortly
> anyway. As long as they nest properly, the approach is fine, and it's
> more efficient than first doing a preempt_disable(), then still needing
> a irq safe spinlock.
>
Fair enough, it also doesn't look like the code is doing anything that
could trigger a reschedule.
Beste Grüße / Best regards,
- Benjamin Block
--
Linux on z Systems Development / IBM Systems & Technology Group
IBM Deutschland Research & Development GmbH
Vorsitz. AufsR.: Martina Koederitz / Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2016-08-08 17:04 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20160803153545.GE20370@bblock-ThinkPad-W530>
2016-08-05 15:27 ` bug: using smp_processor_id() in preemptible code in rr_select_path() Mike Snitzer
2016-08-05 15:33 ` Jens Axboe
2016-08-05 15:42 ` Mike Snitzer
2016-08-05 15:54 ` Jens Axboe
2016-08-05 16:06 ` Mike Snitzer
2016-08-05 16:11 ` Jens Axboe
2016-08-07 15:49 ` Benjamin Block
2016-08-08 16:32 ` [dm-devel] " Benjamin Block
2016-08-08 16:39 ` Jens Axboe
2016-08-08 17:04 ` Benjamin Block
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox