public inbox for netdev@vger.kernel.org
* [PATCH] net/nfc: Fix A-B/B-A deadlock between nfc_unregister_device and rfkill_fop_write
@ 2025-08-14 17:31 Yunseong Kim
  2025-08-15  5:55 ` Krzysztof Kozlowski
  0 siblings, 1 reply; 4+ messages in thread
From: Yunseong Kim @ 2025-08-14 17:31 UTC (permalink / raw)
  To: Krzysztof Kozlowski, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Taehee Yoo, Byungchul Park, max.byungchul.park,
	yeoreum.yun, ppbuk5246, netdev, linux-kernel, Yunseong Kim

A potential A-B/B-A deadlock exists between the NFC core and the RFKill
subsystem, involving the NFC device lock and rfkill_global_mutex.

This issue is particularly visible on PREEMPT_RT kernels, which can
report the following warning:

| rtmutex deadlock detected
| WARNING: CPU: 0 PID: 22729 at kernel/locking/rtmutex.c:1674 rt_mutex_handle_deadlock+0x68/0xec kernel/locking/rtmutex.c:-1
| Modules linked in:
| CPU: 0 UID: 0 PID: 22729 Comm: syz.7.2187 Kdump: loaded Not tainted 6.17.0-rc1-00001-g1149a5db27c8-dirty #55 PREEMPT_RT
| Hardware name: QEMU KVM Virtual Machine, BIOS 2025.02-8ubuntu1 06/11/2025
| pstate: 63400005 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
| pc : rt_mutex_handle_deadlock+0x68/0xec kernel/locking/rtmutex.c:-1
| lr : rt_mutex_handle_deadlock+0x40/0xec kernel/locking/rtmutex.c:1674
| sp : ffff8000967c7720
| x29: ffff8000967c7720 x28: 1fffe0001946d182 x27: dfff800000000000
| x26: 0000000000000001 x25: 0000000000000003 x24: 1fffe0001946d00b
| x23: 1fffe0001946d182 x22: ffff80008aec8940 x21: dfff800000000000
| x20: ffff0000ca368058 x19: ffff0000ca368c10 x18: ffff80008af6b6e0
| x17: 1fffe000590b8088 x16: ffff80008046cc08 x15: 0000000000000001
| x14: 1fffe000590ba990 x13: 0000000000000000 x12: 0000000000000000
| x11: ffff6000590ba991 x10: 0000000000000002 x9 : 0fe446e029bcfe00
| x8 : 0000000000000000 x7 : 0000000000000000 x6 : 000000000000003f
| x5 : 0000000000000001 x4 : 0000000000001000 x3 : ffff800080503efc
| x2 : 0000000000000001 x1 : 0000000000000001 x0 : 0000000000000001
| Call trace:
|  rt_mutex_handle_deadlock+0x68/0xec kernel/locking/rtmutex.c:-1 (P)
|  __rt_mutex_slowlock+0x1cc/0x480 kernel/locking/rtmutex.c:1734
|  __rt_mutex_slowlock_locked kernel/locking/rtmutex.c:1760 [inline]
|  rt_mutex_slowlock+0x140/0x21c kernel/locking/rtmutex.c:1800
|  __rt_mutex_lock kernel/locking/rtmutex.c:1815 [inline]
|  __mutex_lock_common kernel/locking/rtmutex_api.c:536 [inline]
|  mutex_lock+0xf0/0x10c kernel/locking/rtmutex_api.c:603
|  device_lock include/linux/device.h:911 [inline]
|  nfc_dev_down net/nfc/core.c:143 [inline]
|  nfc_rfkill_set_block+0x48/0x2a4 net/nfc/core.c:179
|  rfkill_set_block+0x184/0x364 net/rfkill/core.c:346
|  rfkill_fop_write+0x4dc/0x624 net/rfkill/core.c:1301
|  vfs_write+0x2b8/0xa30 fs/read_write.c:684
|  ksys_write+0x120/0x210 fs/read_write.c:738
|  __do_sys_write fs/read_write.c:749 [inline]
|  __se_sys_write fs/read_write.c:746 [inline]
|  __arm64_sys_write+0x7c/0x90 fs/read_write.c:746
|  __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
|  invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
|  el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
|  do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
|  el0_svc+0x40/0x140 arch/arm64/kernel/entry-common.c:879
|  el0t_64_sync_handler+0x84/0x12c arch/arm64/kernel/entry-common.c:898
|  el0t_64_sync+0x1ac/0x1b0 arch/arm64/kernel/entry.S:596

The scenario is as follows:

Task A (rfkill_fop_write):
  1. Acquires rfkill_global_mutex.
  2. Iterates devices and calls rfkill_set_block()
     -> nfc_rfkill_set_block()
     -> nfc_dev_down().
  3. Tries to acquire NFC device_lock.

Task B (nfc_unregister_device):
  1. Acquires NFC device_lock.
  2. Calls rfkill_unregister().
  3. Tries to acquire rfkill_global_mutex.

Task A waits for the device_lock held by Task B, while Task B waits for
the rfkill_global_mutex held by Task A.
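
The ordering conflict above can be sketched in plain userspace C. This is
only an illustrative model, not kernel code: `struct lock_graph`,
`record_nested_lock()` and the two scenario functions are hypothetical
names, and the model merely records which lock is requested while another
is held, loosely in the spirit of a lock-order checker.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical identifiers standing in for the two kernel locks. */
enum lock_id { RFKILL_GLOBAL_MUTEX, NFC_DEVICE_LOCK, NR_LOCKS };

/* taken_after[a][b] is true once some path requested b while holding a. */
struct lock_graph {
	bool taken_after[NR_LOCKS][NR_LOCKS];
};

/*
 * Record "held, then wanted". Returns true if the opposite order was
 * already recorded, i.e. the two paths form an A-B/B-A pair.
 */
static bool record_nested_lock(struct lock_graph *g,
			       enum lock_id held, enum lock_id wanted)
{
	assert(held != wanted);
	g->taken_after[held][wanted] = true;
	return g->taken_after[wanted][held];
}

/* Lock order of the two tasks as described in the scenario. */
static bool unpatched_paths_invert(void)
{
	struct lock_graph g = { 0 };
	bool inverted = false;

	/* Task A: rfkill_fop_write() holds the global mutex, wants device_lock. */
	inverted |= record_nested_lock(&g, RFKILL_GLOBAL_MUTEX, NFC_DEVICE_LOCK);
	/* Task B: nfc_unregister_device() holds device_lock, wants the global mutex. */
	inverted |= record_nested_lock(&g, NFC_DEVICE_LOCK, RFKILL_GLOBAL_MUTEX);
	return inverted;
}

/* Assumed order once Task B drops device_lock before unregistering rfkill. */
static bool patched_paths_invert(void)
{
	struct lock_graph g = { 0 };

	/* Only Task A still nests the two locks, so no reverse edge exists. */
	return record_nested_lock(&g, RFKILL_GLOBAL_MUTEX, NFC_DEVICE_LOCK);
}
```

In this model `unpatched_paths_invert()` reports the inversion, while
`patched_paths_invert()` does not, because only one nesting order remains.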

To fix this, move the calls to rfkill_unregister() and rfkill_destroy()
outside the device_lock critical section in nfc_unregister_device().

We ensure this is safe by first acquiring the device_lock, setting the
shutting_down flag (which prevents races with nfc_dev_down()),
stashing the rfkill pointer in a local variable, nullifying the pointer
in the nfc_dev structure, and then releasing the device_lock before
calling the rfkill unregister functions. This breaks the lock inversion.
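
The "detach under the lock, tear down outside it" pattern described above
can be sketched in userspace C as follows. Everything here is a
hypothetical stand-in: `struct toy_lock` only tracks whether it is held so
the ordering can be asserted, `struct resource` plays the role of the
rfkill object, and `resource_teardown()` plays the role of
rfkill_unregister() plus rfkill_destroy().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock: not a real mutex, it only records whether it is held. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Stand-in for struct rfkill. */
struct resource {
	bool registered;
};

/* Stand-in for struct nfc_dev. */
struct device {
	struct toy_lock lock;	/* plays the role of device_lock */
	bool shutting_down;
	struct resource *res;	/* plays the role of dev->rfkill */
};

/* Teardown may take other locks, so it must run with dev->lock dropped. */
static void resource_teardown(struct device *dev, struct resource *res)
{
	assert(!dev->lock.held);
	res->registered = false;
}

static void device_unregister(struct device *dev)
{
	struct resource *res;

	toy_lock_acquire(&dev->lock);
	dev->shutting_down = true;	/* set first, while still locked */
	res = dev->res;			/* stash the pointer locally */
	dev->res = NULL;		/* other paths now see no resource */
	toy_lock_release(&dev->lock);

	if (res)
		resource_teardown(dev, res);	/* runs outside the lock */
}
```

The assertion in `resource_teardown()` encodes the property the patch
relies on: by the time the teardown runs, the device lock is no longer
held, so taking a second lock there cannot complete an A-B/B-A cycle.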

Signed-off-by: Yunseong Kim <ysk@kzalloc.com>
---
 net/nfc/core.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/net/nfc/core.c b/net/nfc/core.c
index ae1c842f9c64..c8dc6414514b 100644
--- a/net/nfc/core.c
+++ b/net/nfc/core.c
@@ -1154,6 +1154,7 @@ EXPORT_SYMBOL(nfc_register_device);
 void nfc_unregister_device(struct nfc_dev *dev)
 {
 	int rc;
+	struct rfkill *rfk = NULL;
 
 	pr_debug("dev_name=%s\n", dev_name(&dev->dev));
 
@@ -1163,14 +1164,18 @@ void nfc_unregister_device(struct nfc_dev *dev)
 			 "was removed\n", dev_name(&dev->dev));
 
 	device_lock(&dev->dev);
+	dev->shutting_down = true;
 	if (dev->rfkill) {
-		rfkill_unregister(dev->rfkill);
-		rfkill_destroy(dev->rfkill);
+		rfk = dev->rfkill;
 		dev->rfkill = NULL;
 	}
-	dev->shutting_down = true;
 	device_unlock(&dev->dev);
 
+	if (rfk) {
+		rfkill_unregister(rfk);
+		rfkill_destroy(rfk);
+	}
+
 	if (dev->ops->check_presence) {
 		timer_delete_sync(&dev->check_pres_timer);
 		cancel_work_sync(&dev->check_pres_work);
-- 
2.50.0



* Re: [PATCH] net/nfc: Fix A-B/B-A deadlock between nfc_unregister_device and rfkill_fop_write
  2025-08-14 17:31 [PATCH] net/nfc: Fix A-B/B-A deadlock between nfc_unregister_device and rfkill_fop_write Yunseong Kim
@ 2025-08-15  5:55 ` Krzysztof Kozlowski
  2025-08-15  8:23   ` Yunseong Kim
  0 siblings, 1 reply; 4+ messages in thread
From: Krzysztof Kozlowski @ 2025-08-15  5:55 UTC (permalink / raw)
  To: Yunseong Kim, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Taehee Yoo, Byungchul Park, max.byungchul.park,
	yeoreum.yun, ppbuk5246, netdev, linux-kernel

On 14/08/2025 19:31, Yunseong Kim wrote:
> A potential A-B/B-A deadlock exists between the NFC core and the RFKill
> subsystem, involving the NFC device lock and rfkill_global_mutex.
> 
> This issue is particularly visible on PREEMPT_RT kernels, which can
> report the following warning:

Why are you not crediting syzbot and its report?

There is a clear INSTRUCTION in that email from syzbot.

> 
> | rtmutex deadlock detected
> | WARNING: CPU: 0 PID: 22729 at kernel/locking/rtmutex.c:1674 rt_mutex_handle_deadlock+0x68/0xec kernel/locking/rtmutex.c:-1
> | Modules linked in:
> | CPU: 0 UID: 0 PID: 22729 Comm: syz.7.2187 Kdump: loaded Not tainted 6.17.0-rc1-00001-g1149a5db27c8-dirty #55 PREEMPT_RT
> | Hardware name: QEMU KVM Virtual Machine, BIOS 2025.02-8ubuntu1 06/11/2025
> | pstate: 63400005 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
> | pc : rt_mutex_handle_deadlock+0x68/0xec kernel/locking/rtmutex.c:-1
> | lr : rt_mutex_handle_deadlock+0x40/0xec kernel/locking/rtmutex.c:1674
> | sp : ffff8000967c7720
> | x29: ffff8000967c7720 x28: 1fffe0001946d182 x27: dfff800000000000
> | x26: 0000000000000001 x25: 0000000000000003 x24: 1fffe0001946d00b
> | x23: 1fffe0001946d182 x22: ffff80008aec8940 x21: dfff800000000000
> | x20: ffff0000ca368058 x19: ffff0000ca368c10 x18: ffff80008af6b6e0
> | x17: 1fffe000590b8088 x16: ffff80008046cc08 x15: 0000000000000001
> | x14: 1fffe000590ba990 x13: 0000000000000000 x12: 0000000000000000
> | x11: ffff6000590ba991 x10: 0000000000000002 x9 : 0fe446e029bcfe00
> | x8 : 0000000000000000 x7 : 0000000000000000 x6 : 000000000000003f
> | x5 : 0000000000000001 x4 : 0000000000001000 x3 : ffff800080503efc
> | x2 : 0000000000000001 x1 : 0000000000000001 x0 : 0000000000000001

All of this is irrelevant, really. Trim the log.

> | Call trace:
> |  rt_mutex_handle_deadlock+0x68/0xec kernel/locking/rtmutex.c:-1 (P)
> |  __rt_mutex_slowlock+0x1cc/0x480 kernel/locking/rtmutex.c:1734
> |  __rt_mutex_slowlock_locked kernel/locking/rtmutex.c:1760 [inline]
> |  rt_mutex_slowlock+0x140/0x21c kernel/locking/rtmutex.c:1800
Best regards,
Krzysztof


* Re: [PATCH] net/nfc: Fix A-B/B-A deadlock between nfc_unregister_device and rfkill_fop_write
  2025-08-15  5:55 ` Krzysztof Kozlowski
@ 2025-08-15  8:23   ` Yunseong Kim
  2025-08-17  5:45     ` Krzysztof Kozlowski
  0 siblings, 1 reply; 4+ messages in thread
From: Yunseong Kim @ 2025-08-15  8:23 UTC (permalink / raw)
  To: Krzysztof Kozlowski, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Taehee Yoo, Byungchul Park, max.byungchul.park,
	yeoreum.yun, ppbuk5246, netdev, linux-kernel, syzkaller

Hi Krzysztof,

Thank you for your review.

On 8/15/25 2:55 PM, Krzysztof Kozlowski wrote:
> On 14/08/2025 19:31, Yunseong Kim wrote:
>> A potential A-B/B-A deadlock exists between the NFC core and the RFKill
>> subsystem, involving the NFC device lock and rfkill_global_mutex.
>>
>> This issue is particularly visible on PREEMPT_RT kernels, which can
>> report the following warning:
> 
> Why are you not crediting syzbot and its report?
> 
> There is a clear INSTRUCTION in that email from syzbot.

I wanted to clarify that this report did not originate from syzbot.

I found this issue by building and running syzkaller locally on my own
Arm64 RADXA Orion6 board.

These are the reproductions recorded by my local syzkaller instance:

WARNING in __rt_mutex_slowlock

#	Log	Report	Time	Tag
7	log	report	2025/08/14 20:01	
6	log	report	2025/08/14 05:55	
5	log	report	2025/08/14 02:31	
4	log	report	2025/08/12 09:38	
3	log	report	2025/07/30 07:09	
2	log	report	2025/07/27 23:29	
1	log	report	2025/07/26 04:18	
0	log	report	2025/07/26 04:17

syzbot has only recently started reporting this because I worked with
Sebastian, the RT maintainer, to make KCOV PREEMPT_RT-aware. That fix was
merged recently:
Link: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-linus&id=9528d32873b38281ae105f2f5799e79ae9d086c2

So syzbot now reports it as well:
https://syzkaller.appspot.com/bug?extid=535bbe83dfc3ae8d4be3

>> | rtmutex deadlock detected
>> | WARNING: CPU: 0 PID: 22729 at kernel/locking/rtmutex.c:1674 rt_mutex_handle_deadlock+0x68/0xec kernel/locking/rtmutex.c:-1
>> | Modules linked in:
>> | CPU: 0 UID: 0 PID: 22729 Comm: syz.7.2187 Kdump: loaded Not tainted 6.17.0-rc1-00001-g1149a5db27c8-dirty #55 PREEMPT_RT
>> | Hardware name: QEMU KVM Virtual Machine, BIOS 2025.02-8ubuntu1 06/11/2025

As you might notice from the logs (e.g., "BIOS 2025.02-8ubuntu1"),
the environment is an Ubuntu image on my machine, which syzbot does not use.

>> | pstate: 63400005 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
>> | pc : rt_mutex_handle_deadlock+0x68/0xec kernel/locking/rtmutex.c:-1
>> | lr : rt_mutex_handle_deadlock+0x40/0xec kernel/locking/rtmutex.c:1674
>> | sp : ffff8000967c7720
>> | x29: ffff8000967c7720 x28: 1fffe0001946d182 x27: dfff800000000000
>> | x26: 0000000000000001 x25: 0000000000000003 x24: 1fffe0001946d00b
>> | x23: 1fffe0001946d182 x22: ffff80008aec8940 x21: dfff800000000000
>> | x20: ffff0000ca368058 x19: ffff0000ca368c10 x18: ffff80008af6b6e0
>> | x17: 1fffe000590b8088 x16: ffff80008046cc08 x15: 0000000000000001
>> | x14: 1fffe000590ba990 x13: 0000000000000000 x12: 0000000000000000
>> | x11: ffff6000590ba991 x10: 0000000000000002 x9 : 0fe446e029bcfe00
>> | x8 : 0000000000000000 x7 : 0000000000000000 x6 : 000000000000003f
>> | x5 : 0000000000000001 x4 : 0000000000001000 x3 : ffff800080503efc
>> | x2 : 0000000000000001 x1 : 0000000000000001 x0 : 0000000000000001
> 
> All of this is irrelevant, really. Trim the log.

My apologies if the formatting and the log length were not appropriate. I
will trim the log significantly and review the subsystem's git history to
ensure the commit message format aligns with the expected style for
the next version.

>> | Call trace:
>> |  rt_mutex_handle_deadlock+0x68/0xec kernel/locking/rtmutex.c:-1 (P)
>> |  __rt_mutex_slowlock+0x1cc/0x480 kernel/locking/rtmutex.c:1734
>> |  __rt_mutex_slowlock_locked kernel/locking/rtmutex.c:1760 [inline]
>> |  rt_mutex_slowlock+0x140/0x21c kernel/locking/rtmutex.c:1800
> Best regards,
> Krzysztof

I’ve added the syzkaller mailing list to Cc.

Thank you!

Yunseong Kim


* Re: [PATCH] net/nfc: Fix A-B/B-A deadlock between nfc_unregister_device and rfkill_fop_write
  2025-08-15  8:23   ` Yunseong Kim
@ 2025-08-17  5:45     ` Krzysztof Kozlowski
  0 siblings, 0 replies; 4+ messages in thread
From: Krzysztof Kozlowski @ 2025-08-17  5:45 UTC (permalink / raw)
  To: Yunseong Kim, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Taehee Yoo, Byungchul Park, max.byungchul.park,
	yeoreum.yun, ppbuk5246, netdev, linux-kernel, syzkaller

On 15/08/2025 10:23, Yunseong Kim wrote:
> Hi Krzysztof,
> 
> Thank you for your review.
> 
> On 8/15/25 2:55 PM, Krzysztof Kozlowski wrote:
>> On 14/08/2025 19:31, Yunseong Kim wrote:
>>> A potential A-B/B-A deadlock exists between the NFC core and the RFKill
>>> subsystem, involving the NFC device lock and rfkill_global_mutex.
>>>
>>> This issue is particularly visible on PREEMPT_RT kernels, which can
>>> report the following warning:
>>
>> Why are you not crediting syzbot and its report?
>>
>> There is a clear INSTRUCTION in that email from syzbot.
> 
> I wanted to clarify that this report did not originate from syzbot.
> 
> I found this issue by building and running syzkaller locally on my own
> Arm64 RADXA Orion6 board.
> 
> These are the reproductions recorded by my local syzkaller instance:
> 
> WARNING in __rt_mutex_slowlock
> 
> #	Log	Report	Time	Tag
> 7	log	report	2025/08/14 20:01	
> 6	log	report	2025/08/14 05:55	
> 5	log	report	2025/08/14 02:31	
> 4	log	report	2025/08/12 09:38	
> 3	log	report	2025/07/30 07:09	
> 2	log	report	2025/07/27 23:29	
> 1	log	report	2025/07/26 04:18	
> 0	log	report	2025/07/26 04:17
> 
> syzbot has only recently started reporting this because I worked with
> Sebastian, the RT maintainer, to make KCOV PREEMPT_RT-aware. That fix was
> merged recently:
> Link: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-linus&id=9528d32873b38281ae105f2f5799e79ae9d086c2
> 
> So syzbot now reports it as well:
> https://syzkaller.appspot.com/bug?extid=535bbe83dfc3ae8d4be3


syzbot reported it before you posted the patch, so it should also receive
Reported-by credit, even if you discovered the issue separately.

> 
>>> | rtmutex deadlock detected
>>> | WARNING: CPU: 0 PID: 22729 at kernel/locking/rtmutex.c:1674 rt_mutex_handle_deadlock+0x68/0xec kernel/locking/rtmutex.c:-1
>>> | Modules linked in:
>>> | CPU: 0 UID: 0 PID: 22729 Comm: syz.7.2187 Kdump: loaded Not tainted 6.17.0-rc1-00001-g1149a5db27c8-dirty #55 PREEMPT_RT
>>> | Hardware name: QEMU KVM Virtual Machine, BIOS 2025.02-8ubuntu1 06/11/2025
> 



Best regards,
Krzysztof
