* x86/thermal: AB-BA dependency between mvm->mutex and tz->lock
@ 2017-07-31 11:18 Jiri Kosina
2017-08-03 9:10 ` Jiri Kosina
0 siblings, 1 reply; 24+ messages in thread
From: Jiri Kosina @ 2017-07-31 11:18 UTC (permalink / raw)
To: Zhang Rui, Eduardo Valentin; +Cc: linux-pm, linux-kernel
Hi,
booting current Linus' tree, I'm seeing lockdep splat (see the end of this
mail).
Apparently, there is AB-BA between tz->lock and mvm->mutex through the CPU
hotplug lock.
The obivous depency is: thermal_zone_get_temp() acquires tz->lock, and
then calls iwl_mvm_tzone_get_temp() (through tz->ops->get_temp()
callback), which acquires mvm->mutex
The less obvious dependency is primarily caused by iwl_op_mode_mvm_start()
allocating workqueue (#2 stacktrace) while holding mvm->mutex (which is
broken, because that mutex is being taken also from CPU hotplug callback
path, hence the AB-BA).
======================================================
WARNING: possible circular locking dependency detected
4.13.0-rc2-00110-g0b5477d #347 Not tainted
------------------------------------------------------
modprobe/881 is trying to acquire lock:
(&mvm->mutex){+.+.+.}, at: [<ffffffffc0c504d2>] iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm]
but task is already holding lock:
(&tz->lock){+.+.+.}, at: [<ffffffff926a9351>] thermal_zone_get_temp+0x41/0x70
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #5 (&tz->lock){+.+.+.}:
lock_acquire+0xbd/0x220
__mutex_lock+0x6e/0x900
mutex_lock_nested+0x1b/0x20
thermal_zone_get_temp+0x41/0x70
thermal_zone_device_update+0x3c/0x280
thermal_zone_device_register+0x3b8/0x610
pkg_thermal_cpu_online+0x20b/0x284 [x86_pkg_temp_thermal]
cpuhp_invoke_callback+0xac/0x900
cpuhp_thread_fun+0x79/0x160
smpboot_thread_fn+0x156/0x220
kthread+0x114/0x150
ret_from_fork+0x2a/0x40
-> #4 (cpuhp_state){+.+.+.}:
lock_acquire+0xbd/0x220
cpuhp_issue_call+0xea/0x170
__cpuhp_setup_state_cpuslocked+0x12a/0x190
__cpuhp_setup_state+0x46/0xc0
page_writeback_init+0x43/0x67
pagecache_init+0x39/0x3c
start_kernel+0x45a/0x4ae
x86_64_start_reservations+0x24/0x26
x86_64_start_kernel+0x13d/0x14c
verify_cpu+0x0/0xf1
-> #3 (cpuhp_state_mutex){+.+.+.}:
lock_acquire+0xbd/0x220
__mutex_lock+0x6e/0x900
mutex_lock_nested+0x1b/0x20
__cpuhp_setup_state_cpuslocked+0x4f/0x190
__cpuhp_setup_state+0x46/0xc0
page_alloc_init+0x28/0x30
start_kernel+0x186/0x4ae
x86_64_start_reservations+0x24/0x26
x86_64_start_kernel+0x13d/0x14c
verify_cpu+0x0/0xf1
-> #2 (cpu_hotplug_lock.rw_sem){++++++}:
lock_acquire+0xbd/0x220
cpus_read_lock+0x46/0x90
apply_workqueue_attrs+0x17/0x50
__alloc_workqueue_key+0x195/0x4d0
_iwl_pcie_rx_init+0x384/0x390 [iwlwifi]
iwl_pcie_rx_init+0x1e/0x380 [iwlwifi]
iwl_trans_pcie_start_fw+0x295/0x6f0 [iwlwifi]
iwl_mvm_load_ucode_wait_alive+0xe7/0x390 [iwlmvm]
iwl_run_init_mvm_ucode+0x84/0x320 [iwlmvm]
iwl_op_mode_mvm_start+0x964/0xd30 [iwlmvm]
_iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi]
iwl_opmode_register+0xaa/0xd0 [iwlwifi]
iwl_mvm_init+0x37/0x1000 [iwlmvm]
do_one_initcall+0x51/0x1a9
do_init_module+0x60/0x20e
load_module+0x203f/0x2b50
SYSC_finit_module+0x96/0xd0
SyS_finit_module+0xe/0x10
entry_SYSCALL_64_fastpath+0x23/0xc2
-> #1 (&trans_pcie->mutex){+.+.+.}:
lock_acquire+0xbd/0x220
__mutex_lock+0x6e/0x900
mutex_lock_nested+0x1b/0x20
iwl_trans_pcie_start_fw+0x130/0x6f0 [iwlwifi]
iwl_mvm_load_ucode_wait_alive+0xe7/0x390 [iwlmvm]
iwl_run_init_mvm_ucode+0x84/0x320 [iwlmvm]
iwl_op_mode_mvm_start+0x964/0xd30 [iwlmvm]
_iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi]
iwl_opmode_register+0xaa/0xd0 [iwlwifi]
iwl_mvm_init+0x37/0x1000 [iwlmvm]
do_one_initcall+0x51/0x1a9
do_init_module+0x60/0x20e
load_module+0x203f/0x2b50
SYSC_finit_module+0x96/0xd0
SyS_finit_module+0xe/0x10
entry_SYSCALL_64_fastpath+0x23/0xc2
-> #0 (&mvm->mutex){+.+.+.}:
__lock_acquire+0x13e1/0x1400
lock_acquire+0xbd/0x220
__mutex_lock+0x6e/0x900
mutex_lock_nested+0x1b/0x20
iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm]
thermal_zone_get_temp+0x51/0x70
thermal_zone_device_update+0x3c/0x280
thermal_zone_device_register+0x3b8/0x610
iwl_mvm_thermal_initialize+0x1d1/0x3a0 [iwlmvm]
iwl_op_mode_mvm_start+0xa1d/0xd30 [iwlmvm]
_iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi]
iwl_opmode_register+0xaa/0xd0 [iwlwifi]
iwl_mvm_init+0x37/0x1000 [iwlmvm]
do_one_initcall+0x51/0x1a9
do_init_module+0x60/0x20e
load_module+0x203f/0x2b50
SYSC_finit_module+0x96/0xd0
SyS_finit_module+0xe/0x10
entry_SYSCALL_64_fastpath+0x23/0xc2
other info that might help us debug this:
Chain exists of:
&mvm->mutex --> cpuhp_state --> &tz->lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&tz->lock);
lock(cpuhp_state);
lock(&tz->lock);
lock(&mvm->mutex);
*** DEADLOCK ***
2 locks held by modprobe/881:
#0: (&iwlwifi_opmode_table_mtx){+.+.+.}, at: [<ffffffffc0aa2df4>] iwl_opmode_register+0x24/0xd0 [iwlwifi]
#1: (&tz->lock){+.+.+.}, at: [<ffffffff926a9351>] thermal_zone_get_temp+0x41/0x70
stack backtrace:
CPU: 3 PID: 881 Comm: modprobe Not tainted 4.13.0-rc2-00110-g0b5477d #347
Hardware name: LENOVO 20K5S22R00/20K5S22R00, BIOS R0IET38W (1.16 ) 05/31/2017
Call Trace:
dump_stack+0x85/0xc9
print_circular_bug+0x1f9/0x207
__lock_acquire+0x13e1/0x1400
lock_acquire+0xbd/0x220
? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm]
? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm]
__mutex_lock+0x6e/0x900
? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm]
? thermal_zone_get_temp+0x41/0x70
? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm]
? thermal_zone_get_temp+0x41/0x70
? find_held_lock+0x39/0xb0
mutex_lock_nested+0x1b/0x20
iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm]
thermal_zone_get_temp+0x51/0x70
thermal_zone_device_update+0x3c/0x280
thermal_zone_device_register+0x3b8/0x610
iwl_mvm_thermal_initialize+0x1d1/0x3a0 [iwlmvm]
iwl_op_mode_mvm_start+0xa1d/0xd30 [iwlmvm]
_iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi]
iwl_opmode_register+0xaa/0xd0 [iwlwifi]
iwl_mvm_init+0x37/0x1000 [iwlmvm]
? 0xffffffffc0c87000
do_one_initcall+0x51/0x1a9
? rcu_read_lock_sched_held+0x98/0xa0
? kmem_cache_alloc_trace+0x2a5/0x340
do_init_module+0x60/0x20e
load_module+0x203f/0x2b50
? __symbol_put+0x50/0x50
SYSC_finit_module+0x96/0xd0
SyS_finit_module+0xe/0x10
entry_SYSCALL_64_fastpath+0x23/0xc2
RIP: 0033:0x7f2ef067cc89
RSP: 002b:00007ffea2ea3d78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f2ef067cc89
RDX: 0000000000000000 RSI: 000000000041af06 RDI: 0000000000000001
RBP: 0000000000000005 R08: 0000000000000000 R09: 000000000096b230
R10: 0000000000000001 R11: 0000000000000246 R12: 00007ffea2ea2d80
R13: 00007ffea2ea2d60 R14: 0000000000000005 R15: 000000000096f3c0
thermal thermal_zone3: failed to read out thermal zone (-5)
--
Jiri Kosina
SUSE Labs
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: x86/thermal: AB-BA dependency between mvm->mutex and tz->lock 2017-07-31 11:18 x86/thermal: AB-BA dependency between mvm->mutex and tz->lock Jiri Kosina @ 2017-08-03 9:10 ` Jiri Kosina 2017-08-03 9:43 ` [linuxwifi] " Coelho, Luciano 0 siblings, 1 reply; 24+ messages in thread From: Jiri Kosina @ 2017-08-03 9:10 UTC (permalink / raw) To: Zhang Rui, Eduardo Valentin, Kalle Valo, Sara Sharon, Johannes Berg, Emmanuel Grumbach, Luca Coelho Cc: linux-pm, linux-kernel, Intel Linux Wireless On Mon, 31 Jul 2017, Jiri Kosina wrote: > Hi, > > booting current Linus' tree, I'm seeing lockdep splat (see the end of this > mail). > > Apparently, there is AB-BA between tz->lock and mvm->mutex through the CPU > hotplug lock. > > The obivous depency is: thermal_zone_get_temp() acquires tz->lock, and > then calls iwl_mvm_tzone_get_temp() (through tz->ops->get_temp() > callback), which acquires mvm->mutex > > The less obvious dependency is primarily caused by iwl_op_mode_mvm_start() > allocating workqueue (#2 stacktrace) while holding mvm->mutex (which is > broken, because that mutex is being taken also from CPU hotplug callback > path, hence the AB-BA). As the "central" part of the dependency is being added by iwlwifi driver (_iwl_pcie_rx_init() allocating workqueue while holding trans_pcie->mutex), I'm adding iwlwifi folks as well to CC. > > ====================================================== > WARNING: possible circular locking dependency detected > 4.13.0-rc2-00110-g0b5477d #347 Not tainted > ------------------------------------------------------ > modprobe/881 is trying to acquire lock: > (&mvm->mutex){+.+.+.}, at: [<ffffffffc0c504d2>] iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] > > but task is already holding lock: > (&tz->lock){+.+.+.}, at: [<ffffffff926a9351>] thermal_zone_get_temp+0x41/0x70 > > which lock already depends on the new lock. > > > the existing dependency chain (in reverse order) is: > > -> #5 (&tz->lock){+.+.+.}: > lock_acquire+0xbd/0x220 > __mutex_lock+0x6e/0x900 > mutex_lock_nested+0x1b/0x20 > thermal_zone_get_temp+0x41/0x70 > thermal_zone_device_update+0x3c/0x280 > thermal_zone_device_register+0x3b8/0x610 > pkg_thermal_cpu_online+0x20b/0x284 [x86_pkg_temp_thermal] > cpuhp_invoke_callback+0xac/0x900 > cpuhp_thread_fun+0x79/0x160 > smpboot_thread_fn+0x156/0x220 > kthread+0x114/0x150 > ret_from_fork+0x2a/0x40 > > -> #4 (cpuhp_state){+.+.+.}: > lock_acquire+0xbd/0x220 > cpuhp_issue_call+0xea/0x170 > __cpuhp_setup_state_cpuslocked+0x12a/0x190 > __cpuhp_setup_state+0x46/0xc0 > page_writeback_init+0x43/0x67 > pagecache_init+0x39/0x3c > start_kernel+0x45a/0x4ae > x86_64_start_reservations+0x24/0x26 > x86_64_start_kernel+0x13d/0x14c > verify_cpu+0x0/0xf1 > > -> #3 (cpuhp_state_mutex){+.+.+.}: > lock_acquire+0xbd/0x220 > __mutex_lock+0x6e/0x900 > mutex_lock_nested+0x1b/0x20 > __cpuhp_setup_state_cpuslocked+0x4f/0x190 > __cpuhp_setup_state+0x46/0xc0 > page_alloc_init+0x28/0x30 > start_kernel+0x186/0x4ae > x86_64_start_reservations+0x24/0x26 > x86_64_start_kernel+0x13d/0x14c > verify_cpu+0x0/0xf1 > > -> #2 (cpu_hotplug_lock.rw_sem){++++++}: > lock_acquire+0xbd/0x220 > cpus_read_lock+0x46/0x90 > apply_workqueue_attrs+0x17/0x50 > __alloc_workqueue_key+0x195/0x4d0 > _iwl_pcie_rx_init+0x384/0x390 [iwlwifi] > iwl_pcie_rx_init+0x1e/0x380 [iwlwifi] > iwl_trans_pcie_start_fw+0x295/0x6f0 [iwlwifi] > iwl_mvm_load_ucode_wait_alive+0xe7/0x390 [iwlmvm] > iwl_run_init_mvm_ucode+0x84/0x320 [iwlmvm] > iwl_op_mode_mvm_start+0x964/0xd30 [iwlmvm] > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] > iwl_opmode_register+0xaa/0xd0 [iwlwifi] > iwl_mvm_init+0x37/0x1000 [iwlmvm] > do_one_initcall+0x51/0x1a9 > do_init_module+0x60/0x20e > load_module+0x203f/0x2b50 > SYSC_finit_module+0x96/0xd0 > SyS_finit_module+0xe/0x10 > entry_SYSCALL_64_fastpath+0x23/0xc2 > > -> #1 (&trans_pcie->mutex){+.+.+.}: > lock_acquire+0xbd/0x220 > __mutex_lock+0x6e/0x900 > mutex_lock_nested+0x1b/0x20 > iwl_trans_pcie_start_fw+0x130/0x6f0 [iwlwifi] > iwl_mvm_load_ucode_wait_alive+0xe7/0x390 [iwlmvm] > iwl_run_init_mvm_ucode+0x84/0x320 [iwlmvm] > iwl_op_mode_mvm_start+0x964/0xd30 [iwlmvm] > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] > iwl_opmode_register+0xaa/0xd0 [iwlwifi] > iwl_mvm_init+0x37/0x1000 [iwlmvm] > do_one_initcall+0x51/0x1a9 > do_init_module+0x60/0x20e > load_module+0x203f/0x2b50 > SYSC_finit_module+0x96/0xd0 > SyS_finit_module+0xe/0x10 > entry_SYSCALL_64_fastpath+0x23/0xc2 > > -> #0 (&mvm->mutex){+.+.+.}: > __lock_acquire+0x13e1/0x1400 > lock_acquire+0xbd/0x220 > __mutex_lock+0x6e/0x900 > mutex_lock_nested+0x1b/0x20 > iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] > thermal_zone_get_temp+0x51/0x70 > thermal_zone_device_update+0x3c/0x280 > thermal_zone_device_register+0x3b8/0x610 > iwl_mvm_thermal_initialize+0x1d1/0x3a0 [iwlmvm] > iwl_op_mode_mvm_start+0xa1d/0xd30 [iwlmvm] > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] > iwl_opmode_register+0xaa/0xd0 [iwlwifi] > iwl_mvm_init+0x37/0x1000 [iwlmvm] > do_one_initcall+0x51/0x1a9 > do_init_module+0x60/0x20e > load_module+0x203f/0x2b50 > SYSC_finit_module+0x96/0xd0 > SyS_finit_module+0xe/0x10 > entry_SYSCALL_64_fastpath+0x23/0xc2 > > other info that might help us debug this: > > Chain exists of: > &mvm->mutex --> cpuhp_state --> &tz->lock > > Possible unsafe locking scenario: > > CPU0 CPU1 > ---- ---- > lock(&tz->lock); > lock(cpuhp_state); > lock(&tz->lock); > lock(&mvm->mutex); > > *** DEADLOCK *** > > 2 locks held by modprobe/881: > #0: (&iwlwifi_opmode_table_mtx){+.+.+.}, at: [<ffffffffc0aa2df4>] iwl_opmode_register+0x24/0xd0 [iwlwifi] > #1: (&tz->lock){+.+.+.}, at: [<ffffffff926a9351>] thermal_zone_get_temp+0x41/0x70 > > stack backtrace: > CPU: 3 PID: 881 Comm: modprobe Not tainted 4.13.0-rc2-00110-g0b5477d #347 > Hardware name: LENOVO 20K5S22R00/20K5S22R00, BIOS R0IET38W (1.16 ) 05/31/2017 > Call Trace: > dump_stack+0x85/0xc9 > print_circular_bug+0x1f9/0x207 > __lock_acquire+0x13e1/0x1400 > lock_acquire+0xbd/0x220 > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] > __mutex_lock+0x6e/0x900 > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] > ? thermal_zone_get_temp+0x41/0x70 > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] > ? thermal_zone_get_temp+0x41/0x70 > ? find_held_lock+0x39/0xb0 > mutex_lock_nested+0x1b/0x20 > iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] > thermal_zone_get_temp+0x51/0x70 > thermal_zone_device_update+0x3c/0x280 > thermal_zone_device_register+0x3b8/0x610 > iwl_mvm_thermal_initialize+0x1d1/0x3a0 [iwlmvm] > iwl_op_mode_mvm_start+0xa1d/0xd30 [iwlmvm] > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] > iwl_opmode_register+0xaa/0xd0 [iwlwifi] > iwl_mvm_init+0x37/0x1000 [iwlmvm] > ? 0xffffffffc0c87000 > do_one_initcall+0x51/0x1a9 > ? rcu_read_lock_sched_held+0x98/0xa0 > ? kmem_cache_alloc_trace+0x2a5/0x340 > do_init_module+0x60/0x20e > load_module+0x203f/0x2b50 > ? __symbol_put+0x50/0x50 > SYSC_finit_module+0x96/0xd0 > SyS_finit_module+0xe/0x10 > entry_SYSCALL_64_fastpath+0x23/0xc2 > RIP: 0033:0x7f2ef067cc89 > RSP: 002b:00007ffea2ea3d78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 > RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f2ef067cc89 > RDX: 0000000000000000 RSI: 000000000041af06 RDI: 0000000000000001 > RBP: 0000000000000005 R08: 0000000000000000 R09: 000000000096b230 > R10: 0000000000000001 R11: 0000000000000246 R12: 00007ffea2ea2d80 > R13: 00007ffea2ea2d60 R14: 0000000000000005 R15: 000000000096f3c0 > thermal thermal_zone3: failed to read out thermal zone (-5) > > > -- > Jiri Kosina > SUSE Labs > -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock 2017-08-03 9:10 ` Jiri Kosina @ 2017-08-03 9:43 ` Coelho, Luciano 2017-08-03 10:02 ` Kalle Valo 0 siblings, 1 reply; 24+ messages in thread From: Coelho, Luciano @ 2017-08-03 9:43 UTC (permalink / raw) To: jikos@kernel.org, Zhang, Rui, kvalo@codeaurora.org, edubezval@gmail.com, Sharon, Sara, Berg, Johannes, Grumbach, Emmanuel Cc: linuxwifi, linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Weinehall, David On Thu, 2017-08-03 at 11:10 +0200, Jiri Kosina wrote: > On Mon, 31 Jul 2017, Jiri Kosina wrote: > > > Hi, > > > > booting current Linus' tree, I'm seeing lockdep splat (see the end of this > > mail). > > > > Apparently, there is AB-BA between tz->lock and mvm->mutex through the CPU > > hotplug lock. > > > > The obivous depency is: thermal_zone_get_temp() acquires tz->lock, and > > then calls iwl_mvm_tzone_get_temp() (through tz->ops->get_temp() > > callback), which acquires mvm->mutex > > > > The less obvious dependency is primarily caused by iwl_op_mode_mvm_start() > > allocating workqueue (#2 stacktrace) while holding mvm->mutex (which is > > broken, because that mutex is being taken also from CPU hotplug callback > > path, hence the AB-BA). > > As the "central" part of the dependency is being added by iwlwifi driver > (_iwl_pcie_rx_init() allocating workqueue while holding > trans_pcie->mutex), I'm adding iwlwifi folks as well to CC. > > > > > ====================================================== > > WARNING: possible circular locking dependency detected > > 4.13.0-rc2-00110-g0b5477d #347 Not tainted > > ------------------------------------------------------ > > modprobe/881 is trying to acquire lock: > > (&mvm->mutex){+.+.+.}, at: [<ffffffffc0c504d2>] iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] > > > > but task is already holding lock: > > (&tz->lock){+.+.+.}, at: [<ffffffff926a9351>] thermal_zone_get_temp+0x41/0x70 > > > > which lock already depends on the new lock. > > > > > > the existing dependency chain (in reverse order) is: > > > > -> #5 (&tz->lock){+.+.+.}: > > lock_acquire+0xbd/0x220 > > __mutex_lock+0x6e/0x900 > > mutex_lock_nested+0x1b/0x20 > > thermal_zone_get_temp+0x41/0x70 > > thermal_zone_device_update+0x3c/0x280 > > thermal_zone_device_register+0x3b8/0x610 > > pkg_thermal_cpu_online+0x20b/0x284 [x86_pkg_temp_thermal] > > cpuhp_invoke_callback+0xac/0x900 > > cpuhp_thread_fun+0x79/0x160 > > smpboot_thread_fn+0x156/0x220 > > kthread+0x114/0x150 > > ret_from_fork+0x2a/0x40 > > > > -> #4 (cpuhp_state){+.+.+.}: > > lock_acquire+0xbd/0x220 > > cpuhp_issue_call+0xea/0x170 > > __cpuhp_setup_state_cpuslocked+0x12a/0x190 > > __cpuhp_setup_state+0x46/0xc0 > > page_writeback_init+0x43/0x67 > > pagecache_init+0x39/0x3c > > start_kernel+0x45a/0x4ae > > x86_64_start_reservations+0x24/0x26 > > x86_64_start_kernel+0x13d/0x14c > > verify_cpu+0x0/0xf1 > > > > -> #3 (cpuhp_state_mutex){+.+.+.}: > > lock_acquire+0xbd/0x220 > > __mutex_lock+0x6e/0x900 > > mutex_lock_nested+0x1b/0x20 > > __cpuhp_setup_state_cpuslocked+0x4f/0x190 > > __cpuhp_setup_state+0x46/0xc0 > > page_alloc_init+0x28/0x30 > > start_kernel+0x186/0x4ae > > x86_64_start_reservations+0x24/0x26 > > x86_64_start_kernel+0x13d/0x14c > > verify_cpu+0x0/0xf1 > > > > -> #2 (cpu_hotplug_lock.rw_sem){++++++}: > > lock_acquire+0xbd/0x220 > > cpus_read_lock+0x46/0x90 > > apply_workqueue_attrs+0x17/0x50 > > __alloc_workqueue_key+0x195/0x4d0 > > _iwl_pcie_rx_init+0x384/0x390 [iwlwifi] > > iwl_pcie_rx_init+0x1e/0x380 [iwlwifi] > > iwl_trans_pcie_start_fw+0x295/0x6f0 [iwlwifi] > > iwl_mvm_load_ucode_wait_alive+0xe7/0x390 [iwlmvm] > > iwl_run_init_mvm_ucode+0x84/0x320 [iwlmvm] > > iwl_op_mode_mvm_start+0x964/0xd30 [iwlmvm] > > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] > > iwl_opmode_register+0xaa/0xd0 [iwlwifi] > > iwl_mvm_init+0x37/0x1000 [iwlmvm] > > do_one_initcall+0x51/0x1a9 > > do_init_module+0x60/0x20e > > load_module+0x203f/0x2b50 > > SYSC_finit_module+0x96/0xd0 > > SyS_finit_module+0xe/0x10 > > entry_SYSCALL_64_fastpath+0x23/0xc2 > > > > -> #1 (&trans_pcie->mutex){+.+.+.}: > > lock_acquire+0xbd/0x220 > > __mutex_lock+0x6e/0x900 > > mutex_lock_nested+0x1b/0x20 > > iwl_trans_pcie_start_fw+0x130/0x6f0 [iwlwifi] > > iwl_mvm_load_ucode_wait_alive+0xe7/0x390 [iwlmvm] > > iwl_run_init_mvm_ucode+0x84/0x320 [iwlmvm] > > iwl_op_mode_mvm_start+0x964/0xd30 [iwlmvm] > > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] > > iwl_opmode_register+0xaa/0xd0 [iwlwifi] > > iwl_mvm_init+0x37/0x1000 [iwlmvm] > > do_one_initcall+0x51/0x1a9 > > do_init_module+0x60/0x20e > > load_module+0x203f/0x2b50 > > SYSC_finit_module+0x96/0xd0 > > SyS_finit_module+0xe/0x10 > > entry_SYSCALL_64_fastpath+0x23/0xc2 > > > > -> #0 (&mvm->mutex){+.+.+.}: > > __lock_acquire+0x13e1/0x1400 > > lock_acquire+0xbd/0x220 > > __mutex_lock+0x6e/0x900 > > mutex_lock_nested+0x1b/0x20 > > iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] > > thermal_zone_get_temp+0x51/0x70 > > thermal_zone_device_update+0x3c/0x280 > > thermal_zone_device_register+0x3b8/0x610 > > iwl_mvm_thermal_initialize+0x1d1/0x3a0 [iwlmvm] > > iwl_op_mode_mvm_start+0xa1d/0xd30 [iwlmvm] > > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] > > iwl_opmode_register+0xaa/0xd0 [iwlwifi] > > iwl_mvm_init+0x37/0x1000 [iwlmvm] > > do_one_initcall+0x51/0x1a9 > > do_init_module+0x60/0x20e > > load_module+0x203f/0x2b50 > > SYSC_finit_module+0x96/0xd0 > > SyS_finit_module+0xe/0x10 > > entry_SYSCALL_64_fastpath+0x23/0xc2 > > > > other info that might help us debug this: > > > > Chain exists of: > > &mvm->mutex --> cpuhp_state --> &tz->lock > > > > Possible unsafe locking scenario: > > > > CPU0 CPU1 > > ---- ---- > > lock(&tz->lock); > > lock(cpuhp_state); > > lock(&tz->lock); > > lock(&mvm->mutex); > > > > *** DEADLOCK *** > > > > 2 locks held by modprobe/881: > > #0: (&iwlwifi_opmode_table_mtx){+.+.+.}, at: [<ffffffffc0aa2df4>] iwl_opmode_register+0x24/0xd0 [iwlwifi] > > #1: (&tz->lock){+.+.+.}, at: [<ffffffff926a9351>] thermal_zone_get_temp+0x41/0x70 > > > > stack backtrace: > > CPU: 3 PID: 881 Comm: modprobe Not tainted 4.13.0-rc2-00110-g0b5477d #347 > > Hardware name: LENOVO 20K5S22R00/20K5S22R00, BIOS R0IET38W (1.16 ) 05/31/2017 > > Call Trace: > > dump_stack+0x85/0xc9 > > print_circular_bug+0x1f9/0x207 > > __lock_acquire+0x13e1/0x1400 > > lock_acquire+0xbd/0x220 > > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] > > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] > > __mutex_lock+0x6e/0x900 > > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] > > ? thermal_zone_get_temp+0x41/0x70 > > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] > > ? thermal_zone_get_temp+0x41/0x70 > > ? find_held_lock+0x39/0xb0 > > mutex_lock_nested+0x1b/0x20 > > iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] > > thermal_zone_get_temp+0x51/0x70 > > thermal_zone_device_update+0x3c/0x280 > > thermal_zone_device_register+0x3b8/0x610 > > iwl_mvm_thermal_initialize+0x1d1/0x3a0 [iwlmvm] > > iwl_op_mode_mvm_start+0xa1d/0xd30 [iwlmvm] > > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] > > iwl_opmode_register+0xaa/0xd0 [iwlwifi] > > iwl_mvm_init+0x37/0x1000 [iwlmvm] > > ? 0xffffffffc0c87000 > > do_one_initcall+0x51/0x1a9 > > ? rcu_read_lock_sched_held+0x98/0xa0 > > ? kmem_cache_alloc_trace+0x2a5/0x340 > > do_init_module+0x60/0x20e > > load_module+0x203f/0x2b50 > > ? __symbol_put+0x50/0x50 > > SYSC_finit_module+0x96/0xd0 > > SyS_finit_module+0xe/0x10 > > entry_SYSCALL_64_fastpath+0x23/0xc2 > > RIP: 0033:0x7f2ef067cc89 > > RSP: 002b:00007ffea2ea3d78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 > > RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f2ef067cc89 > > RDX: 0000000000000000 RSI: 000000000041af06 RDI: 0000000000000001 > > RBP: 0000000000000005 R08: 0000000000000000 R09: 000000000096b230 > > R10: 0000000000000001 R11: 0000000000000246 R12: 00007ffea2ea2d80 > > R13: 00007ffea2ea2d60 R14: 0000000000000005 R15: 000000000096f3c0 > > thermal thermal_zone3: failed to read out thermal zone (-5) CCing David Weinehall who also just reported this to me. We'll check this ASAP. Thanks for reporting! -- Luca. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock 2017-08-03 9:43 ` [linuxwifi] " Coelho, Luciano 2017-08-03 10:02 ` Kalle Valo @ 2017-08-03 10:02 ` Kalle Valo 0 siblings, 0 replies; 24+ messages in thread From: Kalle Valo @ 2017-08-03 10:02 UTC (permalink / raw) To: Coelho, Luciano Cc: jikos@kernel.org, Zhang, Rui, edubezval@gmail.com, Sharon, Sara, Berg, Johannes, Grumbach, Emmanuel, linuxwifi, linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Weinehall, David, linux-wireless "Coelho, Luciano" <luciano.coelho@intel.com> writes: > On Thu, 2017-08-03 at 11:10 +0200, Jiri Kosina wrote: >> On Mon, 31 Jul 2017, Jiri Kosina wrote: >> >> > Hi, >> > >> > booting current Linus' tree, I'm seeing lockdep splat (see the end of this >> > mail). >> > >> > Apparently, there is AB-BA between tz->lock and mvm->mutex through the CPU >> > hotplug lock. >> > >> > The obivous depency is: thermal_zone_get_temp() acquires tz->lock, and >> > then calls iwl_mvm_tzone_get_temp() (through tz->ops->get_temp() >> > callback), which acquires mvm->mutex >> > >> > The less obvious dependency is primarily caused by iwl_op_mode_mvm_start() >> > allocating workqueue (#2 stacktrace) while holding mvm->mutex (which is >> > broken, because that mutex is being taken also from CPU hotplug callback >> > path, hence the AB-BA). >> >> As the "central" part of the dependency is being added by iwlwifi driver >> (_iwl_pcie_rx_init() allocating workqueue while holding >> trans_pcie->mutex), I'm adding iwlwifi folks as well to CC. >> >> > >> > ====================================================== >> > WARNING: possible circular locking dependency detected >> > 4.13.0-rc2-00110-g0b5477d #347 Not tainted >> > ------------------------------------------------------ >> > modprobe/881 is trying to acquire lock: >> > (&mvm->mutex){+.+.+.}, at: [<ffffffffc0c504d2>] iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > >> > but task is already holding lock: >> > (&tz->lock){+.+.+.}, at: [<ffffffff926a9351>] thermal_zone_get_temp+0x41/0x70 >> > >> > which lock already depends on the new lock. >> > >> > >> > the existing dependency chain (in reverse order) is: >> > >> > -> #5 (&tz->lock){+.+.+.}: >> > lock_acquire+0xbd/0x220 >> > __mutex_lock+0x6e/0x900 >> > mutex_lock_nested+0x1b/0x20 >> > thermal_zone_get_temp+0x41/0x70 >> > thermal_zone_device_update+0x3c/0x280 >> > thermal_zone_device_register+0x3b8/0x610 >> > pkg_thermal_cpu_online+0x20b/0x284 [x86_pkg_temp_thermal] >> > cpuhp_invoke_callback+0xac/0x900 >> > cpuhp_thread_fun+0x79/0x160 >> > smpboot_thread_fn+0x156/0x220 >> > kthread+0x114/0x150 >> > ret_from_fork+0x2a/0x40 >> > >> > -> #4 (cpuhp_state){+.+.+.}: >> > lock_acquire+0xbd/0x220 >> > cpuhp_issue_call+0xea/0x170 >> > __cpuhp_setup_state_cpuslocked+0x12a/0x190 >> > __cpuhp_setup_state+0x46/0xc0 >> > page_writeback_init+0x43/0x67 >> > pagecache_init+0x39/0x3c >> > start_kernel+0x45a/0x4ae >> > x86_64_start_reservations+0x24/0x26 >> > x86_64_start_kernel+0x13d/0x14c >> > verify_cpu+0x0/0xf1 >> > >> > -> #3 (cpuhp_state_mutex){+.+.+.}: >> > lock_acquire+0xbd/0x220 >> > __mutex_lock+0x6e/0x900 >> > mutex_lock_nested+0x1b/0x20 >> > __cpuhp_setup_state_cpuslocked+0x4f/0x190 >> > __cpuhp_setup_state+0x46/0xc0 >> > page_alloc_init+0x28/0x30 >> > start_kernel+0x186/0x4ae >> > x86_64_start_reservations+0x24/0x26 >> > x86_64_start_kernel+0x13d/0x14c >> > verify_cpu+0x0/0xf1 >> > >> > -> #2 (cpu_hotplug_lock.rw_sem){++++++}: >> > lock_acquire+0xbd/0x220 >> > cpus_read_lock+0x46/0x90 >> > apply_workqueue_attrs+0x17/0x50 >> > __alloc_workqueue_key+0x195/0x4d0 >> > _iwl_pcie_rx_init+0x384/0x390 [iwlwifi] >> > iwl_pcie_rx_init+0x1e/0x380 [iwlwifi] >> > iwl_trans_pcie_start_fw+0x295/0x6f0 [iwlwifi] >> > iwl_mvm_load_ucode_wait_alive+0xe7/0x390 [iwlmvm] >> > iwl_run_init_mvm_ucode+0x84/0x320 [iwlmvm] >> > iwl_op_mode_mvm_start+0x964/0xd30 [iwlmvm] >> > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] >> > iwl_opmode_register+0xaa/0xd0 [iwlwifi] >> > iwl_mvm_init+0x37/0x1000 [iwlmvm] >> > do_one_initcall+0x51/0x1a9 >> > do_init_module+0x60/0x20e >> > load_module+0x203f/0x2b50 >> > SYSC_finit_module+0x96/0xd0 >> > SyS_finit_module+0xe/0x10 >> > entry_SYSCALL_64_fastpath+0x23/0xc2 >> > >> > -> #1 (&trans_pcie->mutex){+.+.+.}: >> > lock_acquire+0xbd/0x220 >> > __mutex_lock+0x6e/0x900 >> > mutex_lock_nested+0x1b/0x20 >> > iwl_trans_pcie_start_fw+0x130/0x6f0 [iwlwifi] >> > iwl_mvm_load_ucode_wait_alive+0xe7/0x390 [iwlmvm] >> > iwl_run_init_mvm_ucode+0x84/0x320 [iwlmvm] >> > iwl_op_mode_mvm_start+0x964/0xd30 [iwlmvm] >> > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] >> > iwl_opmode_register+0xaa/0xd0 [iwlwifi] >> > iwl_mvm_init+0x37/0x1000 [iwlmvm] >> > do_one_initcall+0x51/0x1a9 >> > do_init_module+0x60/0x20e >> > load_module+0x203f/0x2b50 >> > SYSC_finit_module+0x96/0xd0 >> > SyS_finit_module+0xe/0x10 >> > entry_SYSCALL_64_fastpath+0x23/0xc2 >> > >> > -> #0 (&mvm->mutex){+.+.+.}: >> > __lock_acquire+0x13e1/0x1400 >> > lock_acquire+0xbd/0x220 >> > __mutex_lock+0x6e/0x900 >> > mutex_lock_nested+0x1b/0x20 >> > iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > thermal_zone_get_temp+0x51/0x70 >> > thermal_zone_device_update+0x3c/0x280 >> > thermal_zone_device_register+0x3b8/0x610 >> > iwl_mvm_thermal_initialize+0x1d1/0x3a0 [iwlmvm] >> > iwl_op_mode_mvm_start+0xa1d/0xd30 [iwlmvm] >> > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] >> > iwl_opmode_register+0xaa/0xd0 [iwlwifi] >> > iwl_mvm_init+0x37/0x1000 [iwlmvm] >> > do_one_initcall+0x51/0x1a9 >> > do_init_module+0x60/0x20e >> > load_module+0x203f/0x2b50 >> > SYSC_finit_module+0x96/0xd0 >> > SyS_finit_module+0xe/0x10 >> > entry_SYSCALL_64_fastpath+0x23/0xc2 >> > >> > other info that might help us debug this: >> > >> > Chain exists of: >> > &mvm->mutex --> cpuhp_state --> &tz->lock >> > >> > Possible unsafe locking scenario: >> > >> > CPU0 CPU1 >> > ---- ---- >> > lock(&tz->lock); >> > lock(cpuhp_state); >> > lock(&tz->lock); >> > lock(&mvm->mutex); >> > >> > *** DEADLOCK *** >> > >> > 2 locks held by modprobe/881: >> > #0: (&iwlwifi_opmode_table_mtx){+.+.+.}, at: [<ffffffffc0aa2df4>] iwl_opmode_register+0x24/0xd0 [iwlwifi] >> > #1: (&tz->lock){+.+.+.}, at: [<ffffffff926a9351>] thermal_zone_get_temp+0x41/0x70 >> > >> > stack backtrace: >> > CPU: 3 PID: 881 Comm: modprobe Not tainted 4.13.0-rc2-00110-g0b5477d #347 >> > Hardware name: LENOVO 20K5S22R00/20K5S22R00, BIOS R0IET38W (1.16 ) 05/31/2017 >> > Call Trace: >> > dump_stack+0x85/0xc9 >> > print_circular_bug+0x1f9/0x207 >> > __lock_acquire+0x13e1/0x1400 >> > lock_acquire+0xbd/0x220 >> > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > __mutex_lock+0x6e/0x900 >> > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > ? thermal_zone_get_temp+0x41/0x70 >> > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > ? thermal_zone_get_temp+0x41/0x70 >> > ? find_held_lock+0x39/0xb0 >> > mutex_lock_nested+0x1b/0x20 >> > iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > thermal_zone_get_temp+0x51/0x70 >> > thermal_zone_device_update+0x3c/0x280 >> > thermal_zone_device_register+0x3b8/0x610 >> > iwl_mvm_thermal_initialize+0x1d1/0x3a0 [iwlmvm] >> > iwl_op_mode_mvm_start+0xa1d/0xd30 [iwlmvm] >> > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] >> > iwl_opmode_register+0xaa/0xd0 [iwlwifi] >> > iwl_mvm_init+0x37/0x1000 [iwlmvm] >> > ? 0xffffffffc0c87000 >> > do_one_initcall+0x51/0x1a9 >> > ? rcu_read_lock_sched_held+0x98/0xa0 >> > ? kmem_cache_alloc_trace+0x2a5/0x340 >> > do_init_module+0x60/0x20e >> > load_module+0x203f/0x2b50 >> > ? __symbol_put+0x50/0x50 >> > SYSC_finit_module+0x96/0xd0 >> > SyS_finit_module+0xe/0x10 >> > entry_SYSCALL_64_fastpath+0x23/0xc2 >> > RIP: 0033:0x7f2ef067cc89 >> > RSP: 002b:00007ffea2ea3d78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 >> > RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f2ef067cc89 >> > RDX: 0000000000000000 RSI: 000000000041af06 RDI: 0000000000000001 >> > RBP: 0000000000000005 R08: 0000000000000000 R09: 000000000096b230 >> > R10: 0000000000000001 R11: 0000000000000246 R12: 00007ffea2ea2d80 >> > R13: 00007ffea2ea2d60 R14: 0000000000000005 R15: 000000000096f3c0 >> > thermal thermal_zone3: failed to read out thermal zone (-5) > > CCing David Weinehall who also just reported this to me. > > We'll check this ASAP. Thanks for reporting! Adding linux-wireless also to the loop. -- Kalle Valo ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock @ 2017-08-03 10:02 ` Kalle Valo 0 siblings, 0 replies; 24+ messages in thread From: Kalle Valo @ 2017-08-03 10:02 UTC (permalink / raw) To: Coelho, Luciano Cc: jikos@kernel.org, Zhang, Rui, edubezval@gmail.com, Sharon, Sara, Berg, Johannes, Grumbach, Emmanuel, linuxwifi, linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Weinehall, David, linux-wireless "Coelho, Luciano" <luciano.coelho@intel.com> writes: > On Thu, 2017-08-03 at 11:10 +0200, Jiri Kosina wrote: >> On Mon, 31 Jul 2017, Jiri Kosina wrote: >> >> > Hi, >> > >> > booting current Linus' tree, I'm seeing lockdep splat (see the end of this >> > mail). >> > >> > Apparently, there is AB-BA between tz->lock and mvm->mutex through the CPU >> > hotplug lock. >> > >> > The obivous depency is: thermal_zone_get_temp() acquires tz->lock, and >> > then calls iwl_mvm_tzone_get_temp() (through tz->ops->get_temp() >> > callback), which acquires mvm->mutex >> > >> > The less obvious dependency is primarily caused by iwl_op_mode_mvm_start() >> > allocating workqueue (#2 stacktrace) while holding mvm->mutex (which is >> > broken, because that mutex is being taken also from CPU hotplug callback >> > path, hence the AB-BA). >> >> As the "central" part of the dependency is being added by iwlwifi driver >> (_iwl_pcie_rx_init() allocating workqueue while holding >> trans_pcie->mutex), I'm adding iwlwifi folks as well to CC. >> >> > >> > ====================================================== >> > WARNING: possible circular locking dependency detected >> > 4.13.0-rc2-00110-g0b5477d #347 Not tainted >> > ------------------------------------------------------ >> > modprobe/881 is trying to acquire lock: >> > (&mvm->mutex){+.+.+.}, at: [<ffffffffc0c504d2>] iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > >> > but task is already holding lock: >> > (&tz->lock){+.+.+.}, at: [<ffffffff926a9351>] thermal_zone_get_temp+0x41/0x70 >> > >> > which lock already depends on the new lock. >> > >> > >> > the existing dependency chain (in reverse order) is: >> > >> > -> #5 (&tz->lock){+.+.+.}: >> > lock_acquire+0xbd/0x220 >> > __mutex_lock+0x6e/0x900 >> > mutex_lock_nested+0x1b/0x20 >> > thermal_zone_get_temp+0x41/0x70 >> > thermal_zone_device_update+0x3c/0x280 >> > thermal_zone_device_register+0x3b8/0x610 >> > pkg_thermal_cpu_online+0x20b/0x284 [x86_pkg_temp_thermal] >> > cpuhp_invoke_callback+0xac/0x900 >> > cpuhp_thread_fun+0x79/0x160 >> > smpboot_thread_fn+0x156/0x220 >> > kthread+0x114/0x150 >> > ret_from_fork+0x2a/0x40 >> > >> > -> #4 (cpuhp_state){+.+.+.}: >> > lock_acquire+0xbd/0x220 >> > cpuhp_issue_call+0xea/0x170 >> > __cpuhp_setup_state_cpuslocked+0x12a/0x190 >> > __cpuhp_setup_state+0x46/0xc0 >> > page_writeback_init+0x43/0x67 >> > pagecache_init+0x39/0x3c >> > start_kernel+0x45a/0x4ae >> > x86_64_start_reservations+0x24/0x26 >> > x86_64_start_kernel+0x13d/0x14c >> > verify_cpu+0x0/0xf1 >> > >> > -> #3 (cpuhp_state_mutex){+.+.+.}: >> > lock_acquire+0xbd/0x220 >> > __mutex_lock+0x6e/0x900 >> > mutex_lock_nested+0x1b/0x20 >> > __cpuhp_setup_state_cpuslocked+0x4f/0x190 >> > __cpuhp_setup_state+0x46/0xc0 >> > page_alloc_init+0x28/0x30 >> > start_kernel+0x186/0x4ae >> > x86_64_start_reservations+0x24/0x26 >> > x86_64_start_kernel+0x13d/0x14c >> > verify_cpu+0x0/0xf1 >> > >> > -> #2 (cpu_hotplug_lock.rw_sem){++++++}: >> > lock_acquire+0xbd/0x220 >> > cpus_read_lock+0x46/0x90 >> > apply_workqueue_attrs+0x17/0x50 >> > __alloc_workqueue_key+0x195/0x4d0 >> > _iwl_pcie_rx_init+0x384/0x390 [iwlwifi] >> > iwl_pcie_rx_init+0x1e/0x380 [iwlwifi] >> > iwl_trans_pcie_start_fw+0x295/0x6f0 [iwlwifi] >> > iwl_mvm_load_ucode_wait_alive+0xe7/0x390 [iwlmvm] >> > iwl_run_init_mvm_ucode+0x84/0x320 [iwlmvm] >> > iwl_op_mode_mvm_start+0x964/0xd30 [iwlmvm] >> > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] >> > iwl_opmode_register+0xaa/0xd0 [iwlwifi] >> > iwl_mvm_init+0x37/0x1000 [iwlmvm] >> > do_one_initcall+0x51/0x1a9 >> > do_init_module+0x60/0x20e >> > load_module+0x203f/0x2b50 >> > SYSC_finit_module+0x96/0xd0 >> > SyS_finit_module+0xe/0x10 >> > entry_SYSCALL_64_fastpath+0x23/0xc2 >> > >> > -> #1 (&trans_pcie->mutex){+.+.+.}: >> > lock_acquire+0xbd/0x220 >> > __mutex_lock+0x6e/0x900 >> > mutex_lock_nested+0x1b/0x20 >> > iwl_trans_pcie_start_fw+0x130/0x6f0 [iwlwifi] >> > iwl_mvm_load_ucode_wait_alive+0xe7/0x390 [iwlmvm] >> > iwl_run_init_mvm_ucode+0x84/0x320 [iwlmvm] >> > iwl_op_mode_mvm_start+0x964/0xd30 [iwlmvm] >> > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] >> > iwl_opmode_register+0xaa/0xd0 [iwlwifi] >> > iwl_mvm_init+0x37/0x1000 [iwlmvm] >> > do_one_initcall+0x51/0x1a9 >> > do_init_module+0x60/0x20e >> > load_module+0x203f/0x2b50 >> > SYSC_finit_module+0x96/0xd0 >> > SyS_finit_module+0xe/0x10 >> > entry_SYSCALL_64_fastpath+0x23/0xc2 >> > >> > -> #0 (&mvm->mutex){+.+.+.}: >> > __lock_acquire+0x13e1/0x1400 >> > lock_acquire+0xbd/0x220 >> > __mutex_lock+0x6e/0x900 >> > mutex_lock_nested+0x1b/0x20 >> > iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > thermal_zone_get_temp+0x51/0x70 >> > thermal_zone_device_update+0x3c/0x280 >> > thermal_zone_device_register+0x3b8/0x610 >> > iwl_mvm_thermal_initialize+0x1d1/0x3a0 [iwlmvm] >> > iwl_op_mode_mvm_start+0xa1d/0xd30 [iwlmvm] >> > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] >> > iwl_opmode_register+0xaa/0xd0 [iwlwifi] >> > iwl_mvm_init+0x37/0x1000 [iwlmvm] >> > do_one_initcall+0x51/0x1a9 >> > do_init_module+0x60/0x20e >> > load_module+0x203f/0x2b50 >> > SYSC_finit_module+0x96/0xd0 >> > SyS_finit_module+0xe/0x10 >> > entry_SYSCALL_64_fastpath+0x23/0xc2 >> > >> > other info that might help us debug this: >> > >> > Chain exists of: >> > &mvm->mutex --> cpuhp_state --> &tz->lock >> > >> > Possible unsafe locking scenario: >> > >> > CPU0 CPU1 >> > ---- ---- >> > lock(&tz->lock); >> > lock(cpuhp_state); >> > lock(&tz->lock); >> > lock(&mvm->mutex); >> > >> > *** DEADLOCK *** >> > >> > 2 locks held by modprobe/881: >> > #0: (&iwlwifi_opmode_table_mtx){+.+.+.}, at: [<ffffffffc0aa2df4>] iwl_opmode_register+0x24/0xd0 [iwlwifi] >> > #1: (&tz->lock){+.+.+.}, at: [<ffffffff926a9351>] thermal_zone_get_temp+0x41/0x70 >> > >> > stack backtrace: >> > CPU: 3 PID: 881 Comm: modprobe Not tainted 4.13.0-rc2-00110-g0b5477d #347 >> > Hardware name: LENOVO 20K5S22R00/20K5S22R00, BIOS R0IET38W (1.16 ) 05/31/2017 >> > Call Trace: >> > dump_stack+0x85/0xc9 >> > print_circular_bug+0x1f9/0x207 >> > __lock_acquire+0x13e1/0x1400 >> > lock_acquire+0xbd/0x220 >> > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > __mutex_lock+0x6e/0x900 >> > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > ? thermal_zone_get_temp+0x41/0x70 >> > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > ? thermal_zone_get_temp+0x41/0x70 >> > ? find_held_lock+0x39/0xb0 >> > mutex_lock_nested+0x1b/0x20 >> > iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > thermal_zone_get_temp+0x51/0x70 >> > thermal_zone_device_update+0x3c/0x280 >> > thermal_zone_device_register+0x3b8/0x610 >> > iwl_mvm_thermal_initialize+0x1d1/0x3a0 [iwlmvm] >> > iwl_op_mode_mvm_start+0xa1d/0xd30 [iwlmvm] >> > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] >> > iwl_opmode_register+0xaa/0xd0 [iwlwifi] >> > iwl_mvm_init+0x37/0x1000 [iwlmvm] >> > ? 0xffffffffc0c87000 >> > do_one_initcall+0x51/0x1a9 >> > ? rcu_read_lock_sched_held+0x98/0xa0 >> > ? kmem_cache_alloc_trace+0x2a5/0x340 >> > do_init_module+0x60/0x20e >> > load_module+0x203f/0x2b50 >> > ? __symbol_put+0x50/0x50 >> > SYSC_finit_module+0x96/0xd0 >> > SyS_finit_module+0xe/0x10 >> > entry_SYSCALL_64_fastpath+0x23/0xc2 >> > RIP: 0033:0x7f2ef067cc89 >> > RSP: 002b:00007ffea2ea3d78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 >> > RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f2ef067cc89 >> > RDX: 0000000000000000 RSI: 000000000041af06 RDI: 0000000000000001 >> > RBP: 0000000000000005 R08: 0000000000000000 R09: 000000000096b230 >> > R10: 0000000000000001 R11: 0000000000000246 R12: 00007ffea2ea2d80 >> > R13: 00007ffea2ea2d60 R14: 0000000000000005 R15: 000000000096f3c0 >> > thermal thermal_zone3: failed to read out thermal zone (-5) > > CCing David Weinehall who also just reported this to me. > > We'll check this ASAP. Thanks for reporting! Adding linux-wireless also to the loop. -- Kalle Valo ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock @ 2017-08-03 10:02 ` Kalle Valo 0 siblings, 0 replies; 24+ messages in thread From: Kalle Valo @ 2017-08-03 10:02 UTC (permalink / raw) To: Coelho, Luciano Cc: jikos@kernel.org, Zhang, Rui, edubezval@gmail.com, Sharon, Sara, Berg, Johannes, Grumbach, Emmanuel, linuxwifi, linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Weinehall, David, linux-wireless "Coelho, Luciano" <luciano.coelho@intel.com> writes: > On Thu, 2017-08-03 at 11:10 +0200, Jiri Kosina wrote: >> On Mon, 31 Jul 2017, Jiri Kosina wrote: >>=20 >> > Hi, >> >=20 >> > booting current Linus' tree, I'm seeing lockdep splat (see the end of = this=20 >> > mail). >> >=20 >> > Apparently, there is AB-BA between tz->lock and mvm->mutex through the= CPU=20 >> > hotplug lock. >> >=20 >> > The obivous depency is: thermal_zone_get_temp() acquires tz->lock, and= =20 >> > then calls iwl_mvm_tzone_get_temp() (through tz->ops->get_temp()=20 >> > callback), which acquires mvm->mutex >> >=20 >> > The less obvious dependency is primarily caused by iwl_op_mode_mvm_sta= rt()=20 >> > allocating workqueue (#2 stacktrace) while holding mvm->mutex (which i= s=20 >> > broken, because that mutex is being taken also from CPU hotplug callba= ck=20 >> > path, hence the AB-BA). >>=20 >> As the "central" part of the dependency is being added by iwlwifi driver= =20 >> (_iwl_pcie_rx_init() allocating workqueue while holding=20 >> trans_pcie->mutex), I'm adding iwlwifi folks as well to CC. >>=20 >> >=20 >> > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D >> > WARNING: possible circular locking dependency detected >> > 4.13.0-rc2-00110-g0b5477d #347 Not tainted >> > ------------------------------------------------------ >> > modprobe/881 is trying to acquire lock: >> > (&mvm->mutex){+.+.+.}, at: [<ffffffffc0c504d2>] iwl_mvm_tzone_get_te= mp+0x32/0x80 [iwlmvm] >> >=20=20 >> > but task is already holding lock: >> > (&tz->lock){+.+.+.}, at: [<ffffffff926a9351>] thermal_zone_get_temp+= 0x41/0x70 >> >=20=20 >> > which lock already depends on the new lock. >> >=20 >> >=20=20 >> > the existing dependency chain (in reverse order) is: >> >=20=20 >> > -> #5 (&tz->lock){+.+.+.}: >> > lock_acquire+0xbd/0x220 >> > __mutex_lock+0x6e/0x900 >> > mutex_lock_nested+0x1b/0x20 >> > thermal_zone_get_temp+0x41/0x70 >> > thermal_zone_device_update+0x3c/0x280 >> > thermal_zone_device_register+0x3b8/0x610 >> > pkg_thermal_cpu_online+0x20b/0x284 [x86_pkg_temp_thermal] >> > cpuhp_invoke_callback+0xac/0x900 >> > cpuhp_thread_fun+0x79/0x160 >> > smpboot_thread_fn+0x156/0x220 >> > kthread+0x114/0x150 >> > ret_from_fork+0x2a/0x40 >> >=20=20 >> > -> #4 (cpuhp_state){+.+.+.}: >> > lock_acquire+0xbd/0x220 >> > cpuhp_issue_call+0xea/0x170 >> > __cpuhp_setup_state_cpuslocked+0x12a/0x190 >> > __cpuhp_setup_state+0x46/0xc0 >> > page_writeback_init+0x43/0x67 >> > pagecache_init+0x39/0x3c >> > start_kernel+0x45a/0x4ae >> > x86_64_start_reservations+0x24/0x26 >> > x86_64_start_kernel+0x13d/0x14c >> > verify_cpu+0x0/0xf1 >> >=20=20 >> > -> #3 (cpuhp_state_mutex){+.+.+.}: >> > lock_acquire+0xbd/0x220 >> > __mutex_lock+0x6e/0x900 >> > mutex_lock_nested+0x1b/0x20 >> > __cpuhp_setup_state_cpuslocked+0x4f/0x190 >> > __cpuhp_setup_state+0x46/0xc0 >> > page_alloc_init+0x28/0x30 >> > start_kernel+0x186/0x4ae >> > x86_64_start_reservations+0x24/0x26 >> > x86_64_start_kernel+0x13d/0x14c >> > verify_cpu+0x0/0xf1 >> >=20=20 >> > -> #2 (cpu_hotplug_lock.rw_sem){++++++}: >> > lock_acquire+0xbd/0x220 >> > cpus_read_lock+0x46/0x90 >> > apply_workqueue_attrs+0x17/0x50 >> > __alloc_workqueue_key+0x195/0x4d0 >> > _iwl_pcie_rx_init+0x384/0x390 [iwlwifi] >> > iwl_pcie_rx_init+0x1e/0x380 [iwlwifi] >> > iwl_trans_pcie_start_fw+0x295/0x6f0 [iwlwifi] >> > iwl_mvm_load_ucode_wait_alive+0xe7/0x390 [iwlmvm] >> > iwl_run_init_mvm_ucode+0x84/0x320 [iwlmvm] >> > iwl_op_mode_mvm_start+0x964/0xd30 [iwlmvm] >> > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] >> > iwl_opmode_register+0xaa/0xd0 [iwlwifi] >> > iwl_mvm_init+0x37/0x1000 [iwlmvm] >> > do_one_initcall+0x51/0x1a9 >> > do_init_module+0x60/0x20e >> > load_module+0x203f/0x2b50 >> > SYSC_finit_module+0x96/0xd0 >> > SyS_finit_module+0xe/0x10 >> > entry_SYSCALL_64_fastpath+0x23/0xc2 >> >=20=20 >> > -> #1 (&trans_pcie->mutex){+.+.+.}: >> > lock_acquire+0xbd/0x220 >> > __mutex_lock+0x6e/0x900 >> > mutex_lock_nested+0x1b/0x20 >> > iwl_trans_pcie_start_fw+0x130/0x6f0 [iwlwifi] >> > iwl_mvm_load_ucode_wait_alive+0xe7/0x390 [iwlmvm] >> > iwl_run_init_mvm_ucode+0x84/0x320 [iwlmvm] >> > iwl_op_mode_mvm_start+0x964/0xd30 [iwlmvm] >> > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] >> > iwl_opmode_register+0xaa/0xd0 [iwlwifi] >> > iwl_mvm_init+0x37/0x1000 [iwlmvm] >> > do_one_initcall+0x51/0x1a9 >> > do_init_module+0x60/0x20e >> > load_module+0x203f/0x2b50 >> > SYSC_finit_module+0x96/0xd0 >> > SyS_finit_module+0xe/0x10 >> > entry_SYSCALL_64_fastpath+0x23/0xc2 >> >=20=20 >> > -> #0 (&mvm->mutex){+.+.+.}: >> > __lock_acquire+0x13e1/0x1400 >> > lock_acquire+0xbd/0x220 >> > __mutex_lock+0x6e/0x900 >> > mutex_lock_nested+0x1b/0x20 >> > iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > thermal_zone_get_temp+0x51/0x70 >> > thermal_zone_device_update+0x3c/0x280 >> > thermal_zone_device_register+0x3b8/0x610 >> > iwl_mvm_thermal_initialize+0x1d1/0x3a0 [iwlmvm] >> > iwl_op_mode_mvm_start+0xa1d/0xd30 [iwlmvm] >> > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] >> > iwl_opmode_register+0xaa/0xd0 [iwlwifi] >> > iwl_mvm_init+0x37/0x1000 [iwlmvm] >> > do_one_initcall+0x51/0x1a9 >> > do_init_module+0x60/0x20e >> > load_module+0x203f/0x2b50 >> > SYSC_finit_module+0x96/0xd0 >> > SyS_finit_module+0xe/0x10 >> > entry_SYSCALL_64_fastpath+0x23/0xc2 >> >=20=20 >> > other info that might help us debug this: >> >=20 >> > Chain exists of: >> > &mvm->mutex --> cpuhp_state --> &tz->lock >> >=20 >> > Possible unsafe locking scenario: >> >=20 >> > =C2=A0 CPU0 =C2=A0 CPU1 >> > ---- ---- >> > lock(&tz->lock); >> > lock(cpuhp_state); >> > lock(&tz->lock); >> > lock(&mvm->mutex); >> >=20=20 >> > *** DEADLOCK *** >> >=20 >> > 2 locks held by modprobe/881: >> > #0: (&iwlwifi_opmode_table_mtx){+.+.+.}, at: [<ffffffffc0aa2df4>] i= wl_opmode_register+0x24/0xd0 [iwlwifi] >> > #1: (&tz->lock){+.+.+.}, at: [<ffffffff926a9351>] thermal_zone_get_= temp+0x41/0x70 >> >=20=20 >> > stack backtrace: >> > CPU: 3 PID: 881 Comm: modprobe Not tainted 4.13.0-rc2-00110-g0b5477d = #347 >> > Hardware name: LENOVO 20K5S22R00/20K5S22R00, BIOS R0IET38W (1.16 ) 05= /31/2017 >> > Call Trace: >> > dump_stack+0x85/0xc9 >> > print_circular_bug+0x1f9/0x207 >> > __lock_acquire+0x13e1/0x1400 >> > lock_acquire+0xbd/0x220 >> > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > __mutex_lock+0x6e/0x900 >> > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > ? thermal_zone_get_temp+0x41/0x70 >> > ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > ? thermal_zone_get_temp+0x41/0x70 >> > ? find_held_lock+0x39/0xb0 >> > mutex_lock_nested+0x1b/0x20 >> > iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm] >> > thermal_zone_get_temp+0x51/0x70 >> > thermal_zone_device_update+0x3c/0x280 >> > thermal_zone_device_register+0x3b8/0x610 >> > iwl_mvm_thermal_initialize+0x1d1/0x3a0 [iwlmvm] >> > iwl_op_mode_mvm_start+0xa1d/0xd30 [iwlmvm] >> > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] >> > iwl_opmode_register+0xaa/0xd0 [iwlwifi] >> > iwl_mvm_init+0x37/0x1000 [iwlmvm] >> > ? 0xffffffffc0c87000 >> > do_one_initcall+0x51/0x1a9 >> > ? rcu_read_lock_sched_held+0x98/0xa0 >> > ? kmem_cache_alloc_trace+0x2a5/0x340 >> > do_init_module+0x60/0x20e >> > load_module+0x203f/0x2b50 >> > ? __symbol_put+0x50/0x50 >> > SYSC_finit_module+0x96/0xd0 >> > SyS_finit_module+0xe/0x10 >> > entry_SYSCALL_64_fastpath+0x23/0xc2 >> > RIP: 0033:0x7f2ef067cc89 >> > RSP: 002b:00007ffea2ea3d78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 >> > RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f2ef067cc89 >> > RDX: 0000000000000000 RSI: 000000000041af06 RDI: 0000000000000001 >> > RBP: 0000000000000005 R08: 0000000000000000 R09: 000000000096b230 >> > R10: 0000000000000001 R11: 0000000000000246 R12: 00007ffea2ea2d80 >> > R13: 00007ffea2ea2d60 R14: 0000000000000005 R15: 000000000096f3c0 >> > thermal thermal_zone3: failed to read out thermal zone (-5) > > CCing David Weinehall who also just reported this to me. > > We'll check this ASAP. Thanks for reporting! Adding linux-wireless also to the loop. --=20 Kalle Valo ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock 2017-08-03 10:02 ` Kalle Valo @ 2017-08-03 11:28 ` Coelho, Luciano -1 siblings, 0 replies; 24+ messages in thread From: Coelho, Luciano @ 2017-08-03 11:28 UTC (permalink / raw) To: jikos@kernel.org Cc: linux-kernel@vger.kernel.org, linuxwifi, Zhang, Rui, edubezval@gmail.com, linux-pm@vger.kernel.org, Weinehall, David, Berg, Johannes, kvalo@codeaurora.org, Sharon, Sara, linux-wireless@vger.kernel.org, Grumbach, Emmanuel On Thu, 2017-08-03 at 13:02 +0300, Kalle Valo wrote: > "Coelho, Luciano" <luciano.coelho@intel.com> writes: > > > On Thu, 2017-08-03 at 11:10 +0200, Jiri Kosina wrote: > > > On Mon, 31 Jul 2017, Jiri Kosina wrote: > > > > > > > Hi, > > > > > > > > booting current Linus' tree, I'm seeing lockdep splat (see the end of this > > > > mail). > > > > > > > > Apparently, there is AB-BA between tz->lock and mvm->mutex through the CPU > > > > hotplug lock. > > > > > > > > The obivous depency is: thermal_zone_get_temp() acquires tz->lock, and > > > > then calls iwl_mvm_tzone_get_temp() (through tz->ops->get_temp() > > > > callback), which acquires mvm->mutex > > > > > > > > The less obvious dependency is primarily caused by iwl_op_mode_mvm_start() > > > > allocating workqueue (#2 stacktrace) while holding mvm->mutex (which is > > > > broken, because that mutex is being taken also from CPU hotplug callback > > > > path, hence the AB-BA). > > > > > > As the "central" part of the dependency is being added by iwlwifi driver > > > (_iwl_pcie_rx_init() allocating workqueue while holding > > > trans_pcie->mutex), I'm adding iwlwifi folks as well to CC. [...] > > > > -> #2 (cpu_hotplug_lock.rw_sem){++++++}: > > > > lock_acquire+0xbd/0x220 > > > > cpus_read_lock+0x46/0x90 > > > > apply_workqueue_attrs+0x17/0x50 > > > > __alloc_workqueue_key+0x195/0x4d0 > > > > _iwl_pcie_rx_init+0x384/0x390 [iwlwifi] > > > > iwl_pcie_rx_init+0x1e/0x380 [iwlwifi] > > > > iwl_trans_pcie_start_fw+0x295/0x6f0 [iwlwifi] > > > > iwl_mvm_load_ucode_wait_alive+0xe7/0x390 [iwlmvm] > > > > iwl_run_init_mvm_ucode+0x84/0x320 [iwlmvm] > > > > iwl_op_mode_mvm_start+0x964/0xd30 [iwlmvm] > > > > _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi] > > > > iwl_opmode_register+0xaa/0xd0 [iwlwifi] > > > > iwl_mvm_init+0x37/0x1000 [iwlmvm] > > > > do_one_initcall+0x51/0x1a9 > > > > do_init_module+0x60/0x20e > > > > load_module+0x203f/0x2b50 > > > > SYSC_finit_module+0x96/0xd0 > > > > SyS_finit_module+0xe/0x10 > > > > entry_SYSCALL_64_fastpath+0x23/0xc2 Okay, so as I understand it the problem has been there for a long time, but the splat is only coming up now because of Thomas' patch that adds the lockdep map[1], right? I see the workqueue allocation you mentioned. I'll try to move this allocation out of the mutex and see how it goes. [1] http://lkml.kernel.org/r/20170524081549.709375845@linutronix.de -- Cheers, Luca. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock @ 2017-08-03 11:28 ` Coelho, Luciano 0 siblings, 0 replies; 24+ messages in thread From: Coelho, Luciano @ 2017-08-03 11:28 UTC (permalink / raw) To: jikos@kernel.org Cc: linux-kernel@vger.kernel.org, linuxwifi, Zhang, Rui, edubezval@gmail.com, linux-pm@vger.kernel.org, Weinehall, David, Berg, Johannes, kvalo@codeaurora.org, Sharon, Sara, linux-wireless@vger.kernel.org, Grumbach, Emmanuel T24gVGh1LCAyMDE3LTA4LTAzIGF0IDEzOjAyICswMzAwLCBLYWxsZSBWYWxvIHdyb3RlOg0KPiAi Q29lbGhvLCBMdWNpYW5vIiA8bHVjaWFuby5jb2VsaG9AaW50ZWwuY29tPiB3cml0ZXM6DQo+IA0K PiA+IE9uIFRodSwgMjAxNy0wOC0wMyBhdCAxMToxMCArMDIwMCwgSmlyaSBLb3NpbmEgd3JvdGU6 DQo+ID4gPiBPbiBNb24sIDMxIEp1bCAyMDE3LCBKaXJpIEtvc2luYSB3cm90ZToNCj4gPiA+IA0K PiA+ID4gPiBIaSwNCj4gPiA+ID4gDQo+ID4gPiA+IGJvb3RpbmcgY3VycmVudCBMaW51cycgdHJl ZSwgSSdtIHNlZWluZyBsb2NrZGVwIHNwbGF0IChzZWUgdGhlIGVuZCBvZiB0aGlzIA0KPiA+ID4g PiBtYWlsKS4NCj4gPiA+ID4gDQo+ID4gPiA+IEFwcGFyZW50bHksIHRoZXJlIGlzIEFCLUJBIGJl dHdlZW4gdHotPmxvY2sgYW5kIG12bS0+bXV0ZXggdGhyb3VnaCB0aGUgQ1BVIA0KPiA+ID4gPiBo b3RwbHVnIGxvY2suDQo+ID4gPiA+IA0KPiA+ID4gPiBUaGUgb2Jpdm91cyBkZXBlbmN5IGlzOiB0 aGVybWFsX3pvbmVfZ2V0X3RlbXAoKSBhY3F1aXJlcyB0ei0+bG9jaywgYW5kIA0KPiA+ID4gPiB0 aGVuIGNhbGxzIGl3bF9tdm1fdHpvbmVfZ2V0X3RlbXAoKSAodGhyb3VnaCB0ei0+b3BzLT5nZXRf dGVtcCgpIA0KPiA+ID4gPiBjYWxsYmFjayksIHdoaWNoIGFjcXVpcmVzIG12bS0+bXV0ZXgNCj4g PiA+ID4gDQo+ID4gPiA+IFRoZSBsZXNzIG9idmlvdXMgZGVwZW5kZW5jeSBpcyBwcmltYXJpbHkg Y2F1c2VkIGJ5IGl3bF9vcF9tb2RlX212bV9zdGFydCgpIA0KPiA+ID4gPiBhbGxvY2F0aW5nIHdv cmtxdWV1ZSAoIzIgc3RhY2t0cmFjZSkgd2hpbGUgaG9sZGluZyBtdm0tPm11dGV4ICh3aGljaCBp cyANCj4gPiA+ID4gYnJva2VuLCBiZWNhdXNlIHRoYXQgbXV0ZXggaXMgYmVpbmcgdGFrZW4gYWxz byBmcm9tIENQVSBob3RwbHVnIGNhbGxiYWNrIA0KPiA+ID4gPiBwYXRoLCBoZW5jZSB0aGUgQUIt QkEpLg0KPiA+ID4gDQo+ID4gPiBBcyB0aGUgImNlbnRyYWwiIHBhcnQgb2YgdGhlIGRlcGVuZGVu Y3kgaXMgYmVpbmcgYWRkZWQgYnkgaXdsd2lmaSBkcml2ZXIgDQo+ID4gPiAoX2l3bF9wY2llX3J4 X2luaXQoKSBhbGxvY2F0aW5nIHdvcmtxdWV1ZSB3aGlsZSBob2xkaW5nIA0KPiA+ID4gdHJhbnNf cGNpZS0+bXV0ZXgpLCBJJ20gYWRkaW5nIGl3bHdpZmkgZm9sa3MgYXMgd2VsbCB0byBDQy4NCg0K Wy4uLl0NCg0KPiA+ID4gPiAgLT4gIzIgKGNwdV9ob3RwbHVnX2xvY2sucndfc2VtKXsrKysrKyt9 Og0KPiA+ID4gPiAgICAgICAgIGxvY2tfYWNxdWlyZSsweGJkLzB4MjIwDQo+ID4gPiA+ICAgICAg ICAgY3B1c19yZWFkX2xvY2srMHg0Ni8weDkwDQo+ID4gPiA+ICAgICAgICAgYXBwbHlfd29ya3F1 ZXVlX2F0dHJzKzB4MTcvMHg1MA0KPiA+ID4gPiAgICAgICAgIF9fYWxsb2Nfd29ya3F1ZXVlX2tl eSsweDE5NS8weDRkMA0KPiA+ID4gPiAgICAgICAgIF9pd2xfcGNpZV9yeF9pbml0KzB4Mzg0LzB4 MzkwIFtpd2x3aWZpXQ0KPiA+ID4gPiAgICAgICAgIGl3bF9wY2llX3J4X2luaXQrMHgxZS8weDM4 MCBbaXdsd2lmaV0NCj4gPiA+ID4gICAgICAgICBpd2xfdHJhbnNfcGNpZV9zdGFydF9mdysweDI5 NS8weDZmMCBbaXdsd2lmaV0NCj4gPiA+ID4gICAgICAgICBpd2xfbXZtX2xvYWRfdWNvZGVfd2Fp dF9hbGl2ZSsweGU3LzB4MzkwIFtpd2xtdm1dDQo+ID4gPiA+ICAgICAgICAgaXdsX3J1bl9pbml0 X212bV91Y29kZSsweDg0LzB4MzIwIFtpd2xtdm1dDQo+ID4gPiA+ICAgICAgICAgaXdsX29wX21v ZGVfbXZtX3N0YXJ0KzB4OTY0LzB4ZDMwIFtpd2xtdm1dDQo+ID4gPiA+ICAgICAgICAgX2l3bF9v cF9tb2RlX3N0YXJ0LmlzcmEuOSsweDQ3LzB4YTAgW2l3bHdpZmldDQo+ID4gPiA+ICAgICAgICAg aXdsX29wbW9kZV9yZWdpc3RlcisweGFhLzB4ZDAgW2l3bHdpZmldDQo+ID4gPiA+ICAgICAgICAg aXdsX212bV9pbml0KzB4MzcvMHgxMDAwIFtpd2xtdm1dDQo+ID4gPiA+ICAgICAgICAgZG9fb25l X2luaXRjYWxsKzB4NTEvMHgxYTkNCj4gPiA+ID4gICAgICAgICBkb19pbml0X21vZHVsZSsweDYw LzB4MjBlDQo+ID4gPiA+ICAgICAgICAgbG9hZF9tb2R1bGUrMHgyMDNmLzB4MmI1MA0KPiA+ID4g PiAgICAgICAgIFNZU0NfZmluaXRfbW9kdWxlKzB4OTYvMHhkMA0KPiA+ID4gPiAgICAgICAgIFN5 U19maW5pdF9tb2R1bGUrMHhlLzB4MTANCj4gPiA+ID4gICAgICAgICBlbnRyeV9TWVNDQUxMXzY0 X2Zhc3RwYXRoKzB4MjMvMHhjMg0KDQpPa2F5LCBzbyBhcyBJIHVuZGVyc3RhbmQgaXQgdGhlIHBy b2JsZW0gaGFzIGJlZW4gdGhlcmUgZm9yIGEgbG9uZyB0aW1lLA0KYnV0IHRoZSBzcGxhdCBpcyBv bmx5IGNvbWluZyB1cCBub3cgYmVjYXVzZSBvZiBUaG9tYXMnIHBhdGNoIHRoYXQgYWRkcw0KdGhl IGxvY2tkZXAgbWFwWzFdLCByaWdodD8NCg0KSSBzZWUgdGhlIHdvcmtxdWV1ZSBhbGxvY2F0aW9u IHlvdSBtZW50aW9uZWQuICBJJ2xsIHRyeSB0byBtb3ZlIHRoaXMNCmFsbG9jYXRpb24gb3V0IG9m IHRoZSBtdXRleCBhbmQgc2VlIGhvdyBpdCBnb2VzLg0KDQpbMV0gaHR0cDovL2xrbWwua2VybmVs Lm9yZy9yLzIwMTcwNTI0MDgxNTQ5LjcwOTM3NTg0NUBsaW51dHJvbml4LmRlDQoNCi0tDQpDaGVl cnMsDQpMdWNhLg== ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock 2017-08-03 11:28 ` Coelho, Luciano (?) @ 2017-08-03 11:34 ` Jiri Kosina 2017-08-03 13:09 ` Jiri Kosina -1 siblings, 1 reply; 24+ messages in thread From: Jiri Kosina @ 2017-08-03 11:34 UTC (permalink / raw) To: Coelho, Luciano Cc: linux-kernel@vger.kernel.org, linuxwifi, Zhang, Rui, edubezval@gmail.com, linux-pm@vger.kernel.org, Weinehall, David, Berg, Johannes, kvalo@codeaurora.org, Sharon, Sara, linux-wireless@vger.kernel.org, Grumbach, Emmanuel On Thu, 3 Aug 2017, Coelho, Luciano wrote: > Okay, so as I understand it the problem has been there for a long time, > but the splat is only coming up now because of Thomas' patch that adds > the lockdep map[1], right? Yeah, sorry, forgot to mention that pre-49dfe2a67797 kernels wouldn't produce this, as there would not be aware of the fact that cpus_read_lock() is actually semantically a lock. > I see the workqueue allocation you mentioned. I'll try to move this > allocation out of the mutex and see how it goes. I have been briefly looking into this as well -- it'll basically have to be moved out of the trans_pcie->mutex context, but (a) I'm not sure whether that's actually safe (b) iwl_pcie_rx_reuse_rbd() (which is where corresponding work is being queued) is not a proper context either (it's atomic context) Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock 2017-08-03 11:34 ` Jiri Kosina @ 2017-08-03 13:09 ` Jiri Kosina 2017-08-17 13:38 ` Jiri Kosina 0 siblings, 1 reply; 24+ messages in thread From: Jiri Kosina @ 2017-08-03 13:09 UTC (permalink / raw) To: Coelho, Luciano Cc: linux-kernel@vger.kernel.org, linuxwifi, Zhang, Rui, edubezval@gmail.com, linux-pm@vger.kernel.org, Weinehall, David, Berg, Johannes, kvalo@codeaurora.org, Sharon, Sara, linux-wireless@vger.kernel.org, Grumbach, Emmanuel On Thu, 3 Aug 2017, Jiri Kosina wrote: > > I see the workqueue allocation you mentioned. I'll try to move this > > allocation out of the mutex and see how it goes. > > I have been briefly looking into this as well -- it'll basically have to > be moved out of the trans_pcie->mutex context, but > > (a) I'm not sure whether that's actually safe > (b) iwl_pcie_rx_reuse_rbd() (which is where corresponding work is being > queued) is not a proper context either (it's atomic context) Actually moving it out of trans_pcie->mutex is likely not to be enough, the dependency would still be there, just the graph will have one vertex less, with the dependency going directly from mvm mutex to cpu_hotplug_lock. -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock 2017-08-03 13:09 ` Jiri Kosina @ 2017-08-17 13:38 ` Jiri Kosina 2017-08-17 14:02 ` Coelho, Luciano ` (2 more replies) 0 siblings, 3 replies; 24+ messages in thread From: Jiri Kosina @ 2017-08-17 13:38 UTC (permalink / raw) To: Coelho, Luciano Cc: linux-kernel@vger.kernel.org, linuxwifi, Zhang, Rui, edubezval@gmail.com, linux-pm@vger.kernel.org, Weinehall, David, Berg, Johannes, kvalo@codeaurora.org, Sharon, Sara, linux-wireless@vger.kernel.org, Grumbach, Emmanuel Hi, anything new on this front please? The splat (and therefore deadlock potential) is still there with current Linus' tree. Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock 2017-08-17 13:38 ` Jiri Kosina @ 2017-08-17 14:02 ` Coelho, Luciano 2017-08-22 7:32 ` [PATCH] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc() Luca Coelho 2017-08-22 7:37 ` [PATCH v2] " Luca Coelho 2 siblings, 0 replies; 24+ messages in thread From: Coelho, Luciano @ 2017-08-17 14:02 UTC (permalink / raw) To: jikos@kernel.org Cc: linux-kernel@vger.kernel.org, linuxwifi, Zhang, Rui, Weinehall, David, linux-pm@vger.kernel.org, edubezval@gmail.com, Berg, Johannes, kvalo@codeaurora.org, Sharon, Sara, linux-wireless@vger.kernel.org, Grumbach, Emmanuel On Thu, 2017-08-17 at 15:38 +0200, Jiri Kosina wrote: > Hi, > > anything new on this front please? > > The splat (and therefore deadlock potential) is still there with current > Linus' tree. Sorry, haven't had more time to spend on it. I'll do it this evening. But, just to clarify, the deadlock potential has been there for a while, right? The only difference is that now we get the splat. Not saying we shouldn't fix it though. ;) -- Luca. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock @ 2017-08-17 14:02 ` Coelho, Luciano 0 siblings, 0 replies; 24+ messages in thread From: Coelho, Luciano @ 2017-08-17 14:02 UTC (permalink / raw) To: jikos@kernel.org Cc: linux-kernel@vger.kernel.org, linuxwifi, Zhang, Rui, Weinehall, David, linux-pm@vger.kernel.org, edubezval@gmail.com, Berg, Johannes, kvalo@codeaurora.org, Sharon, Sara, linux-wireless@vger.kernel.org, Grumbach, Emmanuel T24gVGh1LCAyMDE3LTA4LTE3IGF0IDE1OjM4ICswMjAwLCBKaXJpIEtvc2luYSB3cm90ZToNCj4g SGksDQo+IA0KPiBhbnl0aGluZyBuZXcgb24gdGhpcyBmcm9udCBwbGVhc2U/DQo+IA0KPiBUaGUg c3BsYXQgKGFuZCB0aGVyZWZvcmUgZGVhZGxvY2sgcG90ZW50aWFsKSBpcyBzdGlsbCB0aGVyZSB3 aXRoIGN1cnJlbnQgDQo+IExpbnVzJyB0cmVlLg0KDQpTb3JyeSwgaGF2ZW4ndCBoYWQgbW9yZSB0 aW1lIHRvIHNwZW5kIG9uIGl0LiAgSSdsbCBkbyBpdCB0aGlzIGV2ZW5pbmcuDQoNCkJ1dCwganVz dCB0byBjbGFyaWZ5LCB0aGUgZGVhZGxvY2sgcG90ZW50aWFsIGhhcyBiZWVuIHRoZXJlIGZvciBh IHdoaWxlLA0KcmlnaHQ/IFRoZSBvbmx5IGRpZmZlcmVuY2UgaXMgdGhhdCBub3cgd2UgZ2V0IHRo ZSBzcGxhdC4NCg0KTm90IHNheWluZyB3ZSBzaG91bGRuJ3QgZml4IGl0IHRob3VnaC4gOykNCg0K LS0NCkx1Y2Eu ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc() 2017-08-17 13:38 ` Jiri Kosina 2017-08-17 14:02 ` Coelho, Luciano @ 2017-08-22 7:32 ` Luca Coelho 2017-08-22 7:37 ` [PATCH v2] " Luca Coelho 2 siblings, 0 replies; 24+ messages in thread From: Luca Coelho @ 2017-08-22 7:32 UTC (permalink / raw) To: jikos, linux-wireless Cc: luciano.coelho, linux-kernel, linuxwifi, rui.zhang, edubezval, linux-pm, david.weinehall, johannes.berg, kvalo, sara.sharon, emmanuel.grumbach From: Luca Coelho <luciano.coelho@intel.com> Work queues cannot be allocated in when a mutex is held because the mutex may be in use and that would make it sleep. Doing so generates the following splat with 4.13+: [ 19.513298] ====================================================== [ 19.513429] WARNING: possible circular locking dependency detected [ 19.513557] 4.13.0-rc5+ #6 Not tainted [ 19.513638] ------------------------------------------------------ [ 19.513767] cpuhp/0/12 is trying to acquire lock: [ 19.513867] (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>] thermal_zone_get_temp+0x5b/0xb0 [ 19.514047] [ 19.514047] but task is already holding lock: [ 19.514166] (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>] cpuhp_thread_fun+0x3a/0x210 [ 19.514338] [ 19.514338] which lock already depends on the new lock. This lock dependency already existed with previous kernel versions, but it was not detected until commit ... was introduced. Reported-by: David Weinehall <david.weinehall@intel.com> Reported-by: Jiri Kosina <jikos@kernel.org> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> --- drivers/net/wireless/intel/iwlwifi/pcie/internal.h | 2 ++ drivers/net/wireless/intel/iwlwifi/pcie/rx.c | 10 +--------- drivers/net/wireless/intel/iwlwifi/pcie/trans.c | 9 +++++++++ 3 files changed, 12 insertions(+), 9 deletions(-) diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h index fa315d84e98e..a1ea9ef97ed9 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h +++ b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h @@ -787,6 +787,8 @@ int iwl_pci_fw_enter_d0i3(struct iwl_trans *trans); void iwl_pcie_enable_rx_wake(struct iwl_trans *trans, bool enable); +void iwl_pcie_rx_allocator_work(struct work_struct *data); + /* common functions that are used by gen2 transport */ void iwl_pcie_apm_config(struct iwl_trans *trans); int iwl_pcie_prepare_card_hw(struct iwl_trans *trans); diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c index 351c4423125a..942736d3fa75 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c @@ -597,7 +597,7 @@ static void iwl_pcie_rx_allocator_get(struct iwl_trans *trans, rxq->free_count += RX_CLAIM_REQ_ALLOC; } -static void iwl_pcie_rx_allocator_work(struct work_struct *data) +void iwl_pcie_rx_allocator_work(struct work_struct *data) { struct iwl_rb_allocator *rba_p = container_of(data, struct iwl_rb_allocator, rx_alloc); @@ -900,10 +900,6 @@ static int _iwl_pcie_rx_init(struct iwl_trans *trans) return err; } def_rxq = trans_pcie->rxq; - if (!rba->alloc_wq) - rba->alloc_wq = alloc_workqueue("rb_allocator", - WQ_HIGHPRI | WQ_UNBOUND, 1); - INIT_WORK(&rba->rx_alloc, iwl_pcie_rx_allocator_work); spin_lock(&rba->lock); atomic_set(&rba->req_pending, 0); @@ -1017,10 +1013,6 @@ void iwl_pcie_rx_free(struct iwl_trans *trans) } cancel_work_sync(&rba->rx_alloc); - if (rba->alloc_wq) { - destroy_workqueue(rba->alloc_wq); - rba->alloc_wq = NULL; - } iwl_pcie_free_rbs_pool(trans); diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c index f95eec52508e..3927bbf04f72 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c @@ -1786,6 +1786,11 @@ void iwl_trans_pcie_free(struct iwl_trans *trans) iwl_pcie_tx_free(trans); iwl_pcie_rx_free(trans); + if (trans_pcie->rba.alloc_wq) { + destroy_workqueue(trans_pcie->rba.alloc_wq); + trans_pcie->rba.alloc_wq = NULL; + } + if (trans_pcie->msix_enabled) { for (i = 0; i < trans_pcie->alloc_vecs; i++) { irq_set_affinity_hint( @@ -3169,6 +3174,10 @@ struct iwl_trans *iwl_trans_pcie_alloc(struct pci_dev *pdev, trans_pcie->inta_mask = CSR_INI_SET_MASK; } + trans_pcie->rba.alloc_wq = alloc_workqueue("rb_allocator", + WQ_HIGHPRI | WQ_UNBOUND, 1); + INIT_WORK(&trans_pcie->rba.rx_alloc, iwl_pcie_rx_allocator_work); + #ifdef CONFIG_IWLWIFI_PCIE_RTPM trans->runtime_pm_mode = IWL_PLAT_PM_MODE_D0I3; #else -- 2.14.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc() 2017-08-17 13:38 ` Jiri Kosina 2017-08-17 14:02 ` Coelho, Luciano 2017-08-22 7:32 ` [PATCH] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc() Luca Coelho @ 2017-08-22 7:37 ` Luca Coelho 2017-08-24 6:00 ` Luca Coelho ` (2 more replies) 2 siblings, 3 replies; 24+ messages in thread From: Luca Coelho @ 2017-08-22 7:37 UTC (permalink / raw) To: jikos, linux-wireless Cc: luciano.coelho, linux-kernel, linuxwifi, rui.zhang, edubezval, linux-pm, david.weinehall, johannes.berg, kvalo, sara.sharon, emmanuel.grumbach From: Luca Coelho <luciano.coelho@intel.com> Work queues cannot be allocated when a mutex is held because the mutex may be in use and that would make it sleep. Doing so generates the following splat with 4.13+: [ 19.513298] ====================================================== [ 19.513429] WARNING: possible circular locking dependency detected [ 19.513557] 4.13.0-rc5+ #6 Not tainted [ 19.513638] ------------------------------------------------------ [ 19.513767] cpuhp/0/12 is trying to acquire lock: [ 19.513867] (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>] thermal_zone_get_temp+0x5b/0xb0 [ 19.514047] [ 19.514047] but task is already holding lock: [ 19.514166] (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>] cpuhp_thread_fun+0x3a/0x210 [ 19.514338] [ 19.514338] which lock already depends on the new lock. This lock dependency already existed with previous kernel versions, but it was not detected until commit 49dfe2a67797 ("cpuhotplug: Link lock stacks for hotplug callbacks") was introduced. Reported-by: David Weinehall <david.weinehall@intel.com> Reported-by: Jiri Kosina <jikos@kernel.org> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> --- In v2: - updated the commit message to a new version, with a grammar fix and the actual commit that exposed the problem; drivers/net/wireless/intel/iwlwifi/pcie/internal.h | 2 ++ drivers/net/wireless/intel/iwlwifi/pcie/rx.c | 10 +--------- drivers/net/wireless/intel/iwlwifi/pcie/trans.c | 9 +++++++++ 3 files changed, 12 insertions(+), 9 deletions(-) diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h index fa315d84e98e..a1ea9ef97ed9 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h +++ b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h @@ -787,6 +787,8 @@ int iwl_pci_fw_enter_d0i3(struct iwl_trans *trans); void iwl_pcie_enable_rx_wake(struct iwl_trans *trans, bool enable); +void iwl_pcie_rx_allocator_work(struct work_struct *data); + /* common functions that are used by gen2 transport */ void iwl_pcie_apm_config(struct iwl_trans *trans); int iwl_pcie_prepare_card_hw(struct iwl_trans *trans); diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c index 351c4423125a..942736d3fa75 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c @@ -597,7 +597,7 @@ static void iwl_pcie_rx_allocator_get(struct iwl_trans *trans, rxq->free_count += RX_CLAIM_REQ_ALLOC; } -static void iwl_pcie_rx_allocator_work(struct work_struct *data) +void iwl_pcie_rx_allocator_work(struct work_struct *data) { struct iwl_rb_allocator *rba_p = container_of(data, struct iwl_rb_allocator, rx_alloc); @@ -900,10 +900,6 @@ static int _iwl_pcie_rx_init(struct iwl_trans *trans) return err; } def_rxq = trans_pcie->rxq; - if (!rba->alloc_wq) - rba->alloc_wq = alloc_workqueue("rb_allocator", - WQ_HIGHPRI | WQ_UNBOUND, 1); - INIT_WORK(&rba->rx_alloc, iwl_pcie_rx_allocator_work); spin_lock(&rba->lock); atomic_set(&rba->req_pending, 0); @@ -1017,10 +1013,6 @@ void iwl_pcie_rx_free(struct iwl_trans *trans) } cancel_work_sync(&rba->rx_alloc); - if (rba->alloc_wq) { - destroy_workqueue(rba->alloc_wq); - rba->alloc_wq = NULL; - } iwl_pcie_free_rbs_pool(trans); diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c index f95eec52508e..3927bbf04f72 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c @@ -1786,6 +1786,11 @@ void iwl_trans_pcie_free(struct iwl_trans *trans) iwl_pcie_tx_free(trans); iwl_pcie_rx_free(trans); + if (trans_pcie->rba.alloc_wq) { + destroy_workqueue(trans_pcie->rba.alloc_wq); + trans_pcie->rba.alloc_wq = NULL; + } + if (trans_pcie->msix_enabled) { for (i = 0; i < trans_pcie->alloc_vecs; i++) { irq_set_affinity_hint( @@ -3169,6 +3174,10 @@ struct iwl_trans *iwl_trans_pcie_alloc(struct pci_dev *pdev, trans_pcie->inta_mask = CSR_INI_SET_MASK; } + trans_pcie->rba.alloc_wq = alloc_workqueue("rb_allocator", + WQ_HIGHPRI | WQ_UNBOUND, 1); + INIT_WORK(&trans_pcie->rba.rx_alloc, iwl_pcie_rx_allocator_work); + #ifdef CONFIG_IWLWIFI_PCIE_RTPM trans->runtime_pm_mode = IWL_PLAT_PM_MODE_D0I3; #else -- 2.14.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc() 2017-08-22 7:37 ` [PATCH v2] " Luca Coelho @ 2017-08-24 6:00 ` Luca Coelho 2017-08-24 19:56 ` Jiri Kosina 2017-08-24 13:50 ` [v2] " Kalle Valo 2017-08-30 14:57 ` [PATCH v2] " David Weinehall 2 siblings, 1 reply; 24+ messages in thread From: Luca Coelho @ 2017-08-24 6:00 UTC (permalink / raw) To: jikos, linux-wireless Cc: linux-kernel, linuxwifi, rui.zhang, edubezval, linux-pm, david.weinehall, johannes.berg, kvalo, sara.sharon, emmanuel.grumbach On Tue, 2017-08-22 at 10:37 +0300, Luca Coelho wrote: > From: Luca Coelho <luciano.coelho@intel.com> > > Work queues cannot be allocated when a mutex is held because the mutex > may be in use and that would make it sleep. Doing so generates the > following splat with 4.13+: > > [ 19.513298] ====================================================== > [ 19.513429] WARNING: possible circular locking dependency detected > [ 19.513557] 4.13.0-rc5+ #6 Not tainted > [ 19.513638] ------------------------------------------------------ > [ 19.513767] cpuhp/0/12 is trying to acquire lock: > [ 19.513867] (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>] thermal_zone_get_temp+0x5b/0xb0 > [ 19.514047] > [ 19.514047] but task is already holding lock: > [ 19.514166] (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>] cpuhp_thread_fun+0x3a/0x210 > [ 19.514338] > [ 19.514338] which lock already depends on the new lock. > > This lock dependency already existed with previous kernel versions, > but it was not detected until commit 49dfe2a67797 ("cpuhotplug: Link > lock stacks for hotplug callbacks") was introduced. > > Reported-by: David Weinehall <david.weinehall@intel.com> > Reported-by: Jiri Kosina <jikos@kernel.org> > Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Jiri, did you have a chance to try this out? I'm about to ask Kalle to merge this so it gets in in time for 4.13-rc7. -- Cheers, Luca. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc() 2017-08-24 6:00 ` Luca Coelho @ 2017-08-24 19:56 ` Jiri Kosina 2017-08-24 20:00 ` Luca Coelho 0 siblings, 1 reply; 24+ messages in thread From: Jiri Kosina @ 2017-08-24 19:56 UTC (permalink / raw) To: Luca Coelho Cc: linux-wireless, linux-kernel, linuxwifi, rui.zhang, edubezval, linux-pm, david.weinehall, johannes.berg, kvalo, sara.sharon, emmanuel.grumbach On Thu, 24 Aug 2017, Luca Coelho wrote: > > Work queues cannot be allocated when a mutex is held because the mutex > > may be in use and that would make it sleep. Doing so generates the > > following splat with 4.13+: > > > > [ 19.513298] ====================================================== > > [ 19.513429] WARNING: possible circular locking dependency detected > > [ 19.513557] 4.13.0-rc5+ #6 Not tainted > > [ 19.513638] ------------------------------------------------------ > > [ 19.513767] cpuhp/0/12 is trying to acquire lock: > > [ 19.513867] (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>] thermal_zone_get_temp+0x5b/0xb0 > > [ 19.514047] > > [ 19.514047] but task is already holding lock: > > [ 19.514166] (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>] cpuhp_thread_fun+0x3a/0x210 > > [ 19.514338] > > [ 19.514338] which lock already depends on the new lock. > > > > This lock dependency already existed with previous kernel versions, > > but it was not detected until commit 49dfe2a67797 ("cpuhotplug: Link > > lock stacks for hotplug callbacks") was introduced. > > > > Reported-by: David Weinehall <david.weinehall@intel.com> > > Reported-by: Jiri Kosina <jikos@kernel.org> > > Signed-off-by: Luca Coelho <luciano.coelho@intel.com> > > Jiri, did you have a chance to try this out? I'm about to ask Kalle to > merge this so it gets in in time for 4.13-rc7. Sorry, I am almost completely offline for one more week (vacation), and will not have access to the affected system before that. But this indeed looks like a correct fix to me, so feel free to add Acked-by: Jiri Kosina <jkosina@suse.cz> I'll be able to provide my Tested-by: eventually only in ~10 days. Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc() 2017-08-24 19:56 ` Jiri Kosina @ 2017-08-24 20:00 ` Luca Coelho 2017-08-29 8:19 ` Kalle Valo 2017-09-04 11:43 ` Jiri Kosina 0 siblings, 2 replies; 24+ messages in thread From: Luca Coelho @ 2017-08-24 20:00 UTC (permalink / raw) To: Jiri Kosina Cc: linux-wireless, linux-kernel, linuxwifi, rui.zhang, edubezval, linux-pm, david.weinehall, johannes.berg, kvalo, sara.sharon, emmanuel.grumbach On Thu, 2017-08-24 at 21:56 +0200, Jiri Kosina wrote: > On Thu, 24 Aug 2017, Luca Coelho wrote: > > > > Work queues cannot be allocated when a mutex is held because the mutex > > > may be in use and that would make it sleep. Doing so generates the > > > following splat with 4.13+: > > > > > > [ 19.513298] ====================================================== > > > [ 19.513429] WARNING: possible circular locking dependency detected > > > [ 19.513557] 4.13.0-rc5+ #6 Not tainted > > > [ 19.513638] ------------------------------------------------------ > > > [ 19.513767] cpuhp/0/12 is trying to acquire lock: > > > [ 19.513867] (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>] thermal_zone_get_temp+0x5b/0xb0 > > > [ 19.514047] > > > [ 19.514047] but task is already holding lock: > > > [ 19.514166] (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>] cpuhp_thread_fun+0x3a/0x210 > > > [ 19.514338] > > > [ 19.514338] which lock already depends on the new lock. > > > > > > This lock dependency already existed with previous kernel versions, > > > but it was not detected until commit 49dfe2a67797 ("cpuhotplug: Link > > > lock stacks for hotplug callbacks") was introduced. > > > > > > Reported-by: David Weinehall <david.weinehall@intel.com> > > > Reported-by: Jiri Kosina <jikos@kernel.org> > > > Signed-off-by: Luca Coelho <luciano.coelho@intel.com> > > > > Jiri, did you have a chance to try this out? I'm about to ask Kalle to > > merge this so it gets in in time for 4.13-rc7. > > Sorry, I am almost completely offline for one more week (vacation), and > will not have access to the affected system before that. Sounds good! Enjoy! ;) > But this indeed > looks like a correct fix to me, so feel free to add > > Acked-by: Jiri Kosina <jkosina@suse.cz> > > I'll be able to provide my Tested-by: eventually only in ~10 days. Kalle already picked it up in wireless-drivers and this should make it's way to 4.13-rc7 (we hope). In any case, thanks for reporting and the help debugging it. -- Cheers, Luca. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc() 2017-08-24 20:00 ` Luca Coelho @ 2017-08-29 8:19 ` Kalle Valo 2017-09-04 11:43 ` Jiri Kosina 1 sibling, 0 replies; 24+ messages in thread From: Kalle Valo @ 2017-08-29 8:19 UTC (permalink / raw) To: Luca Coelho Cc: Jiri Kosina, linux-wireless, linux-kernel, linuxwifi, rui.zhang, edubezval, linux-pm, david.weinehall, johannes.berg, sara.sharon, emmanuel.grumbach Luca Coelho <luca@coelho.fi> writes: > On Thu, 2017-08-24 at 21:56 +0200, Jiri Kosina wrote: >> On Thu, 24 Aug 2017, Luca Coelho wrote: >> >> > > Work queues cannot be allocated when a mutex is held because the mutex >> > > may be in use and that would make it sleep. Doing so generates the >> > > following splat with 4.13+: >> > > >> > > [ 19.513298] ====================================================== >> > > [ 19.513429] WARNING: possible circular locking dependency detected >> > > [ 19.513557] 4.13.0-rc5+ #6 Not tainted >> > > [ 19.513638] ------------------------------------------------------ >> > > [ 19.513767] cpuhp/0/12 is trying to acquire lock: >> > > [ 19.513867] (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>] >> > > thermal_zone_get_temp+0x5b/0xb0 >> > > [ 19.514047] >> > > [ 19.514047] but task is already holding lock: >> > > [ 19.514166] (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>] >> > > cpuhp_thread_fun+0x3a/0x210 >> > > [ 19.514338] >> > > [ 19.514338] which lock already depends on the new lock. >> > > >> > > This lock dependency already existed with previous kernel versions, >> > > but it was not detected until commit 49dfe2a67797 ("cpuhotplug: Link >> > > lock stacks for hotplug callbacks") was introduced. >> > > >> > > Reported-by: David Weinehall <david.weinehall@intel.com> >> > > Reported-by: Jiri Kosina <jikos@kernel.org> >> > > Signed-off-by: Luca Coelho <luciano.coelho@intel.com> >> > >> > Jiri, did you have a chance to try this out? I'm about to ask Kalle to >> > merge this so it gets in in time for 4.13-rc7. >> >> Sorry, I am almost completely offline for one more week (vacation), and >> will not have access to the affected system before that. > > Sounds good! Enjoy! ;) > > >> But this indeed >> looks like a correct fix to me, so feel free to add >> >> Acked-by: Jiri Kosina <jkosina@suse.cz> >> >> I'll be able to provide my Tested-by: eventually only in ~10 days. > > > Kalle already picked it up in wireless-drivers and this should make it's > way to 4.13-rc7 (we hope). It (10a54d8196d1) didn't make it to -rc7 but it's in net tree now and should make it to the next release on Sunday (either -rc8 or the final). -- Kalle Valo ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc() 2017-08-24 20:00 ` Luca Coelho 2017-08-29 8:19 ` Kalle Valo @ 2017-09-04 11:43 ` Jiri Kosina 2017-09-04 11:44 ` Luca Coelho 1 sibling, 1 reply; 24+ messages in thread From: Jiri Kosina @ 2017-09-04 11:43 UTC (permalink / raw) To: Luca Coelho Cc: linux-wireless, linux-kernel, linuxwifi, rui.zhang, edubezval, linux-pm, david.weinehall, johannes.berg, kvalo, sara.sharon, emmanuel.grumbach On Thu, 24 Aug 2017, Luca Coelho wrote: > > looks like a correct fix to me, so feel free to add > > > > Acked-by: Jiri Kosina <jkosina@suse.cz> > > > > I'll be able to provide my Tested-by: eventually only in ~10 days. > > > Kalle already picked it up in wireless-drivers and this should make it's > way to 4.13-rc7 (we hope). > > In any case, thanks for reporting and the help debugging it. I know it's pretty late for this to be added to the commit, but I don't want this to be left in the void, so for the sake of completness: Tested-by: Jiri Kosina <jkosina@suse.cz> Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc() 2017-09-04 11:43 ` Jiri Kosina @ 2017-09-04 11:44 ` Luca Coelho 0 siblings, 0 replies; 24+ messages in thread From: Luca Coelho @ 2017-09-04 11:44 UTC (permalink / raw) To: Jiri Kosina Cc: linux-wireless, linux-kernel, linuxwifi, rui.zhang, edubezval, linux-pm, david.weinehall, johannes.berg, kvalo, sara.sharon, emmanuel.grumbach On Mon, 2017-09-04 at 13:43 +0200, Jiri Kosina wrote: > On Thu, 24 Aug 2017, Luca Coelho wrote: > > > > looks like a correct fix to me, so feel free to add > > > > > > Acked-by: Jiri Kosina <jkosina@suse.cz> > > > > > > I'll be able to provide my Tested-by: eventually only in ~10 days. > > > > > > Kalle already picked it up in wireless-drivers and this should make it's > > way to 4.13-rc7 (we hope). > > > > In any case, thanks for reporting and the help debugging it. > > I know it's pretty late for this to be added to the commit, but I don't > want this to be left in the void, so for the sake of completness: > > Tested-by: Jiri Kosina <jkosina@suse.cz> Thanks, Jiri, for reporting, debugging and testing! -- Cheers, Luca. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc() 2017-08-22 7:37 ` [PATCH v2] " Luca Coelho 2017-08-24 6:00 ` Luca Coelho @ 2017-08-24 13:50 ` Kalle Valo 2017-08-30 14:57 ` [PATCH v2] " David Weinehall 2 siblings, 0 replies; 24+ messages in thread From: Kalle Valo @ 2017-08-24 13:50 UTC (permalink / raw) To: Luciano Coelho Cc: jikos, linux-wireless, luciano.coelho, linux-kernel, linuxwifi, rui.zhang, edubezval, linux-pm, david.weinehall, johannes.berg, sara.sharon, emmanuel.grumbach Luciano Coelho <luca@coelho.fi> wrote: > From: Luca Coelho <luciano.coelho@intel.com> > > Work queues cannot be allocated when a mutex is held because the mutex > may be in use and that would make it sleep. Doing so generates the > following splat with 4.13+: > > [ 19.513298] ====================================================== > [ 19.513429] WARNING: possible circular locking dependency detected > [ 19.513557] 4.13.0-rc5+ #6 Not tainted > [ 19.513638] ------------------------------------------------------ > [ 19.513767] cpuhp/0/12 is trying to acquire lock: > [ 19.513867] (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>] thermal_zone_get_temp+0x5b/0xb0 > [ 19.514047] > [ 19.514047] but task is already holding lock: > [ 19.514166] (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>] cpuhp_thread_fun+0x3a/0x210 > [ 19.514338] > [ 19.514338] which lock already depends on the new lock. > > This lock dependency already existed with previous kernel versions, > but it was not detected until commit 49dfe2a67797 ("cpuhotplug: Link > lock stacks for hotplug callbacks") was introduced. > > Reported-by: David Weinehall <david.weinehall@intel.com> > Reported-by: Jiri Kosina <jikos@kernel.org> > Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Patch applied to wireless-drivers.git, thanks. 10a54d8196d1 iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc() -- https://patchwork.kernel.org/patch/9914349/ https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc() 2017-08-22 7:37 ` [PATCH v2] " Luca Coelho 2017-08-24 6:00 ` Luca Coelho 2017-08-24 13:50 ` [v2] " Kalle Valo @ 2017-08-30 14:57 ` David Weinehall 2017-08-31 6:11 ` Luca Coelho 2 siblings, 1 reply; 24+ messages in thread From: David Weinehall @ 2017-08-30 14:57 UTC (permalink / raw) To: Luca Coelho Cc: jikos, linux-wireless, luciano.coelho, linux-kernel, linuxwifi, rui.zhang, edubezval, linux-pm, david.weinehall, johannes.berg, kvalo, sara.sharon, emmanuel.grumbach On Tue, Aug 22, 2017 at 10:37:29AM +0300, Luca Coelho wrote: > From: Luca Coelho <luciano.coelho@intel.com> > > Work queues cannot be allocated when a mutex is held because the mutex > may be in use and that would make it sleep. Doing so generates the > following splat with 4.13+: > > [ 19.513298] ====================================================== > [ 19.513429] WARNING: possible circular locking dependency detected > [ 19.513557] 4.13.0-rc5+ #6 Not tainted > [ 19.513638] ------------------------------------------------------ > [ 19.513767] cpuhp/0/12 is trying to acquire lock: > [ 19.513867] (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>] thermal_zone_get_temp+0x5b/0xb0 > [ 19.514047] > [ 19.514047] but task is already holding lock: > [ 19.514166] (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>] cpuhp_thread_fun+0x3a/0x210 > [ 19.514338] > [ 19.514338] which lock already depends on the new lock. > > This lock dependency already existed with previous kernel versions, > but it was not detected until commit 49dfe2a67797 ("cpuhotplug: Link > lock stacks for hotplug callbacks") was introduced. > > Reported-by: David Weinehall <david.weinehall@intel.com> > Reported-by: Jiri Kosina <jikos@kernel.org> > Signed-off-by: Luca Coelho <luciano.coelho@intel.com> With this patch I no longer get the lockdep warning, and the driver seems to work just as well as before. Thanks! Tested-by: David Weinehall <david.weinehall@linux.intel.com> > --- > In v2: > - updated the commit message to a new version, with a grammar fix > and the actual commit that exposed the problem; > > drivers/net/wireless/intel/iwlwifi/pcie/internal.h | 2 ++ > drivers/net/wireless/intel/iwlwifi/pcie/rx.c | 10 +--------- > drivers/net/wireless/intel/iwlwifi/pcie/trans.c | 9 +++++++++ > 3 files changed, 12 insertions(+), 9 deletions(-) > > diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h > index fa315d84e98e..a1ea9ef97ed9 100644 > --- a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h > +++ b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h > @@ -787,6 +787,8 @@ int iwl_pci_fw_enter_d0i3(struct iwl_trans *trans); > > void iwl_pcie_enable_rx_wake(struct iwl_trans *trans, bool enable); > > +void iwl_pcie_rx_allocator_work(struct work_struct *data); > + > /* common functions that are used by gen2 transport */ > void iwl_pcie_apm_config(struct iwl_trans *trans); > int iwl_pcie_prepare_card_hw(struct iwl_trans *trans); > diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c > index 351c4423125a..942736d3fa75 100644 > --- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c > +++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c > @@ -597,7 +597,7 @@ static void iwl_pcie_rx_allocator_get(struct iwl_trans *trans, > rxq->free_count += RX_CLAIM_REQ_ALLOC; > } > > -static void iwl_pcie_rx_allocator_work(struct work_struct *data) > +void iwl_pcie_rx_allocator_work(struct work_struct *data) > { > struct iwl_rb_allocator *rba_p = > container_of(data, struct iwl_rb_allocator, rx_alloc); > @@ -900,10 +900,6 @@ static int _iwl_pcie_rx_init(struct iwl_trans *trans) > return err; > } > def_rxq = trans_pcie->rxq; > - if (!rba->alloc_wq) > - rba->alloc_wq = alloc_workqueue("rb_allocator", > - WQ_HIGHPRI | WQ_UNBOUND, 1); > - INIT_WORK(&rba->rx_alloc, iwl_pcie_rx_allocator_work); > > spin_lock(&rba->lock); > atomic_set(&rba->req_pending, 0); > @@ -1017,10 +1013,6 @@ void iwl_pcie_rx_free(struct iwl_trans *trans) > } > > cancel_work_sync(&rba->rx_alloc); > - if (rba->alloc_wq) { > - destroy_workqueue(rba->alloc_wq); > - rba->alloc_wq = NULL; > - } > > iwl_pcie_free_rbs_pool(trans); > > diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c > index f95eec52508e..3927bbf04f72 100644 > --- a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c > +++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c > @@ -1786,6 +1786,11 @@ void iwl_trans_pcie_free(struct iwl_trans *trans) > iwl_pcie_tx_free(trans); > iwl_pcie_rx_free(trans); > > + if (trans_pcie->rba.alloc_wq) { > + destroy_workqueue(trans_pcie->rba.alloc_wq); > + trans_pcie->rba.alloc_wq = NULL; > + } > + > if (trans_pcie->msix_enabled) { > for (i = 0; i < trans_pcie->alloc_vecs; i++) { > irq_set_affinity_hint( > @@ -3169,6 +3174,10 @@ struct iwl_trans *iwl_trans_pcie_alloc(struct pci_dev *pdev, > trans_pcie->inta_mask = CSR_INI_SET_MASK; > } > > + trans_pcie->rba.alloc_wq = alloc_workqueue("rb_allocator", > + WQ_HIGHPRI | WQ_UNBOUND, 1); > + INIT_WORK(&trans_pcie->rba.rx_alloc, iwl_pcie_rx_allocator_work); > + > #ifdef CONFIG_IWLWIFI_PCIE_RTPM > trans->runtime_pm_mode = IWL_PLAT_PM_MODE_D0I3; > #else > -- > 2.14.1 > ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc() 2017-08-30 14:57 ` [PATCH v2] " David Weinehall @ 2017-08-31 6:11 ` Luca Coelho 0 siblings, 0 replies; 24+ messages in thread From: Luca Coelho @ 2017-08-31 6:11 UTC (permalink / raw) To: David Weinehall Cc: jikos, linux-wireless, linux-kernel, linuxwifi, rui.zhang, edubezval, linux-pm, david.weinehall, johannes.berg, kvalo, sara.sharon, emmanuel.grumbach On Wed, 2017-08-30 at 17:57 +0300, David Weinehall wrote: > On Tue, Aug 22, 2017 at 10:37:29AM +0300, Luca Coelho wrote: > > From: Luca Coelho <luciano.coelho@intel.com> > > > > Work queues cannot be allocated when a mutex is held because the mutex > > may be in use and that would make it sleep. Doing so generates the > > following splat with 4.13+: > > > > [ 19.513298] ====================================================== > > [ 19.513429] WARNING: possible circular locking dependency detected > > [ 19.513557] 4.13.0-rc5+ #6 Not tainted > > [ 19.513638] ------------------------------------------------------ > > [ 19.513767] cpuhp/0/12 is trying to acquire lock: > > [ 19.513867] (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>] thermal_zone_get_temp+0x5b/0xb0 > > [ 19.514047] > > [ 19.514047] but task is already holding lock: > > [ 19.514166] (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>] cpuhp_thread_fun+0x3a/0x210 > > [ 19.514338] > > [ 19.514338] which lock already depends on the new lock. > > > > This lock dependency already existed with previous kernel versions, > > but it was not detected until commit 49dfe2a67797 ("cpuhotplug: Link > > lock stacks for hotplug callbacks") was introduced. > > > > Reported-by: David Weinehall <david.weinehall@intel.com> > > Reported-by: Jiri Kosina <jikos@kernel.org> > > Signed-off-by: Luca Coelho <luciano.coelho@intel.com> > > With this patch I no longer get the lockdep warning, > and the driver seems to work just as well as before. Great! Thanks for reporting and testing, David! > Thanks! > > Tested-by: David Weinehall <david.weinehall@linux.intel.com> Thanks for this too, but unfortunately it's too late to add it, since the patch is already in net tree. -- Cheers, Luca. ^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2017-09-04 11:44 UTC | newest] Thread overview: 24+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-07-31 11:18 x86/thermal: AB-BA dependency between mvm->mutex and tz->lock Jiri Kosina 2017-08-03 9:10 ` Jiri Kosina 2017-08-03 9:43 ` [linuxwifi] " Coelho, Luciano 2017-08-03 10:02 ` Kalle Valo 2017-08-03 10:02 ` Kalle Valo 2017-08-03 10:02 ` Kalle Valo 2017-08-03 11:28 ` Coelho, Luciano 2017-08-03 11:28 ` Coelho, Luciano 2017-08-03 11:34 ` Jiri Kosina 2017-08-03 13:09 ` Jiri Kosina 2017-08-17 13:38 ` Jiri Kosina 2017-08-17 14:02 ` Coelho, Luciano 2017-08-17 14:02 ` Coelho, Luciano 2017-08-22 7:32 ` [PATCH] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc() Luca Coelho 2017-08-22 7:37 ` [PATCH v2] " Luca Coelho 2017-08-24 6:00 ` Luca Coelho 2017-08-24 19:56 ` Jiri Kosina 2017-08-24 20:00 ` Luca Coelho 2017-08-29 8:19 ` Kalle Valo 2017-09-04 11:43 ` Jiri Kosina 2017-09-04 11:44 ` Luca Coelho 2017-08-24 13:50 ` [v2] " Kalle Valo 2017-08-30 14:57 ` [PATCH v2] " David Weinehall 2017-08-31 6:11 ` Luca Coelho
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.