* [PATCH ath-next 0/1] thermal read locking problem
@ 2026-04-13 8:38 Nicolas Escande
From: Nicolas Escande @ 2026-04-13 8:38 UTC
To: ath12k; +Cc: linux-wireless, maharaja.kennadyrajan
I hit a deadlock between ath12k_pci_remove() and reading the hwmon
temperature of the same device. See the stack trace below:
[ 369.804971] INFO: task sh:7638 blocked for more than 122 seconds.
[ 369.804991] Not tainted 6.19.0git+ #15
[ 369.805000] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 369.805005] task:sh state:D stack:0 pid:7638 tgid:7638 ppid:7637 task_flags:0x480100 flags:0x00080800
[ 369.805022] Call Trace:
[ 369.805028] <TASK>
[ 369.805038] __schedule+0x45b/0x1780
[ 369.805057] ? rwsem_mark_wake+0x1a9/0x2c0
[ 369.805076] ? wake_up_q+0x37/0x90
[ 369.805090] schedule+0x27/0xd0
[ 369.805102] kernfs_drain+0xec/0x170
[ 369.805116] ? __pfx_autoremove_wake_function+0x10/0x10
[ 369.805130] __kernfs_remove.part.0+0x85/0x220
[ 369.805144] kernfs_remove+0x61/0x70
[ 369.805157] __kobject_del+0x2e/0xa0
[ 369.805167] kobject_del+0x13/0x30
[ 369.805175] device_del+0x283/0x3d0
[ 369.805185] ? rtnl_is_locked+0x15/0x20
[ 369.805200] wiphy_unregister+0x10a/0x3f0 [cfg80211 6168fbe3683cd298138328882cdf9008f30e4673]
[ 369.805417] ieee80211_unregister_hw+0x10c/0x130 [mac80211 a4ced2a7a5c741afca2c73dab2ee07ceec04d385]
[ 369.805625] ath12k_mac_hw_unregister+0x71/0x100 [ath12k 6ccd74c4d10a7837e98e0ec39774107bc2e6daf1]
[ 369.805746] ath12k_mac_unregister+0x2e/0x50 [ath12k 6ccd74c4d10a7837e98e0ec39774107bc2e6daf1]
[ 369.805895] ath12k_core_hw_group_stop+0x18/0xa0 [ath12k 6ccd74c4d10a7837e98e0ec39774107bc2e6daf1]
[ 369.805972] ath12k_core_hw_group_cleanup+0x37/0x90 [ath12k 6ccd74c4d10a7837e98e0ec39774107bc2e6daf1]
[ 369.806051] ath12k_pci_remove+0x60/0x110 [ath12k 6ccd74c4d10a7837e98e0ec39774107bc2e6daf1]
[ 369.806137] pci_device_remove+0x4a/0xc0
[ 369.806151] device_release_driver_internal+0x19e/0x200
[ 369.806165] unbind_store+0xa4/0xb0
[ 369.806179] kernfs_fop_write_iter+0x14d/0x200
[ 369.806194] vfs_write+0x25d/0x480
[ 369.806214] ksys_write+0x73/0xf0
[ 369.806228] do_syscall_64+0x11c/0x1610
[ 369.806242] ? __folio_mod_stat+0x2d/0x90
[ 369.806252] ? set_ptes.isra.0+0x36/0x80
[ 369.806266] ? do_anonymous_page+0xfb/0x830
[ 369.806276] ? __pte_offset_map+0x1b/0x100
[ 369.806291] ? __handle_mm_fault+0xb46/0xf60
[ 369.806310] ? count_memcg_events+0xd7/0x190
[ 369.806323] ? handle_mm_fault+0x1d7/0x2d0
[ 369.806356] ? do_user_addr_fault+0x21a/0x680
[ 369.806372] ? exc_page_fault+0x82/0x1d0
[ 369.806385] ? __irq_exit_rcu+0x4c/0xf0
[ 369.806397] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 369.806408] RIP: 0033:0x7f2d2d5163be
[ 369.806443] RSP: 002b:00007fff2b63ea90 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 369.806455] RAX: ffffffffffffffda RBX: 00007f2d2d66b580 RCX: 00007f2d2d5163be
[ 369.806462] RDX: 000000000000000d RSI: 0000561010670560 RDI: 0000000000000001
[ 369.806469] RBP: 00007fff2b63eaa0 R08: 0000000000000000 R09: 0000000000000000
[ 369.806476] R10: 0000000000000000 R11: 0000000000000202 R12: 000000000000000d
[ 369.806481] R13: 000000000000000d R14: 0000561010670560 R15: 0000561010670080
[ 369.806496] </TASK>
[ 369.806502] INFO: task cat:7652 blocked for more than 122 seconds.
[ 369.806512] Not tainted 6.19.0git+ #15
[ 369.806519] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 369.806524] task:cat state:D stack:0 pid:7652 tgid:7652 ppid:611 task_flags:0x400000 flags:0x00080000
[ 369.806538] Call Trace:
[ 369.806542] <TASK>
[ 369.806549] __schedule+0x45b/0x1780
[ 369.806565] ? page_counter_try_charge+0x90/0x150
[ 369.806580] ? mod_memcg_lruvec_state+0xc5/0x1f0
[ 369.806594] schedule+0x27/0xd0
[ 369.806621] schedule_preempt_disabled+0x15/0x30
[ 369.806634] __mutex_lock.constprop.0+0x545/0xae0
[ 369.806644] ? obj_cgroup_charge_account+0x23e/0x420
[ 369.806670] ath12k_thermal_temp_show+0x2b/0x130 [ath12k 6ccd74c4d10a7837e98e0ec39774107bc2e6daf1]
[ 369.806854] ? __kvmalloc_node_noprof+0x696/0x720
[ 369.806869] dev_attr_show+0x1f/0x50
[ 369.806909] sysfs_kf_seq_show+0xcc/0x120
[ 369.806921] seq_read_iter+0x128/0x490
[ 369.806947] ? rw_verify_area+0x56/0x180
[ 369.806959] vfs_read+0x268/0x390
[ 369.806970] ? __folio_mod_stat+0x2d/0x90
[ 369.806985] ksys_read+0x73/0xf0
[ 369.807010] do_syscall_64+0x11c/0x1610
[ 369.807028] ? count_memcg_events+0xd7/0x190
[ 369.807041] ? handle_mm_fault+0x1d7/0x2d0
[ 369.807060] ? do_user_addr_fault+0x21a/0x680
[ 369.807075] ? exc_page_fault+0x82/0x1d0
[ 369.807090] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 369.807099] RIP: 0033:0x7f2e4bd6d3be
[ 369.807139] RSP: 002b:00007ffcd412d410 EFLAGS: 00000202 ORIG_RAX: 0000000000000000
[ 369.807149] RAX: ffffffffffffffda RBX: 0000000000040000 RCX: 00007f2e4bd6d3be
[ 369.807156] RDX: 0000000000040000 RSI: 00007f2e4bc96000 RDI: 0000000000000003
[ 369.807162] RBP: 00007ffcd412d420 R08: 0000000000000000 R09: 0000000000000000
[ 369.807167] R10: 0000000000000000 R11: 0000000000000202 R12: 00007ffcd412d7a8
[ 369.807172] R13: 00007f2e4bc96000 R14: 0000000000040000 R15: 0000000000000000
[ 369.807204] </TASK>
This can be replicated at will on my side by force unbinding the driver
while reading in a loop:
- while [ true ]; do cat /sys/class/ieee80211/phy0/hwmon2/temp1_input; done
- echo 0000:01:00.0 > /sys/bus/pci/drivers/ath12k_wifi7_pci/unbind
I'm not sure this is the best way to fix it, but it avoids the deadlock
in my tests.
Also, since this is only in ath-next for now, I'm not sure whether I should
add a proper Fixes tag.
Nicolas Escande (1):
wifi: ath12k: avoid deadlock in thermal read
drivers/net/wireless/ath/ath12k/thermal.c | 4 ++++
1 file changed, 4 insertions(+)
--
2.53.0
* [PATCH ath-next 1/1] wifi: ath12k: avoid deadlock in thermal read
From: Nicolas Escande @ 2026-04-13 8:38 UTC
To: ath12k; +Cc: linux-wireless, maharaja.kennadyrajan
When removing the PCI device, we can deadlock if the thermal sensor is
read at the same time.
When the wiphy gets unregistered (so while the wiphy lock is held), we
wait for all sysfs operations to complete. But if a read of the thermal
device has been started in the meantime, that read needs to acquire the
wiphy lock, which leads to a deadlock.
As we already have a flag in the hw group indicating that we are currently
unregistering the device, let's check it first (before locking) so we can
bail out early and avoid the deadlock.
Signed-off-by: Nicolas Escande <nico.escande@gmail.com>
---
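For readers who want to see the idea outside the driver, here is a minimal
userspace sketch of the "check the flag before taking the lock" pattern.
It is only a model, not kernel code: struct model_group and the model_*()
helpers are hypothetical names, a pthread mutex stands in for the wiphy
lock, and an atomic bool stands in for ATH12K_GROUP_FLAG_REGISTERED. It
assumes the remove path clears the flag before it starts waiting for sysfs
readers.

```c
/* Userspace model of the early bail-out; all names are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_group {
	atomic_bool registered;  /* models ATH12K_GROUP_FLAG_REGISTERED */
	pthread_mutex_t lock;    /* models the wiphy lock */
	int temperature;         /* value a successful read reports */
};

static void model_group_init(struct model_group *g, int temperature)
{
	atomic_init(&g->registered, true);
	pthread_mutex_init(&g->lock, NULL);
	g->temperature = temperature;
}

/* Models the start of the remove path: clear the flag first. */
static void model_begin_unregister(struct model_group *g)
{
	atomic_store(&g->registered, false);
}

/* Models the show() handler: test the flag before touching the lock. */
static int model_temp_show(struct model_group *g, int *out)
{
	if (!atomic_load(&g->registered))
		return -ESHUTDOWN;  /* bail out without ever blocking */

	pthread_mutex_lock(&g->lock);
	*out = g->temperature;
	pthread_mutex_unlock(&g->lock);
	return 0;
}
```

Once model_begin_unregister() has run, model_temp_show() returns
-ESHUTDOWN even while another thread holds the lock, which is the property
the patch relies on. The sketch only models the ordering of the check and
the lock; it makes no claim about a reader that passes the check just
before the flag is cleared.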
drivers/net/wireless/ath/ath12k/thermal.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/wireless/ath/ath12k/thermal.c b/drivers/net/wireless/ath/ath12k/thermal.c
index a764d2112a3c..700e7458ddff 100644
--- a/drivers/net/wireless/ath/ath12k/thermal.c
+++ b/drivers/net/wireless/ath/ath12k/thermal.c
@@ -17,9 +17,13 @@ static ssize_t ath12k_thermal_temp_show(struct device *dev,
 					char *buf)
 {
 	struct ath12k *ar = dev_get_drvdata(dev);
+	struct ath12k_hw_group *ag = ath12k_ab_to_ag(ar->ab);
 	unsigned long time_left;
 	int ret, temperature;
 
+	if (!test_bit(ATH12K_GROUP_FLAG_REGISTERED, &ag->flags))
+		return -ESHUTDOWN;
+
 	guard(wiphy)(ath12k_ar_to_hw(ar)->wiphy);
 
 	if (ar->ah->state != ATH12K_HW_STATE_ON)
--
2.53.0