linux-usb.vger.kernel.org archive mirror
* [PATCH v3 0/4] kcov, usb: Fix invalid context sleep in softirq path on PREEMPT_RT
@ 2025-08-03  7:20 Yunseong Kim
  2025-08-03  7:20 ` [PATCH 1/4] kcov: Use raw_spinlock_t for kcov->lock and kcov_remote_lock Yunseong Kim
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Yunseong Kim @ 2025-08-03  7:20 UTC (permalink / raw)
  To: Dmitry Vyukov, Andrey Konovalov, Greg Kroah-Hartman
  Cc: Thomas Gleixner, Sebastian Andrzej Siewior, Tetsuo Handa,
	Byungchul Park, max.byungchul.park, Yeoreum Yun, ppbuk5246,
	linux-usb, linux-rt-devel, syzkaller, linux-kernel, stable,
	Yunseong Kim

This patch series resolves a "sleeping function called from invalid context"
bug that occurs when fuzzing USB with syzkaller on a PREEMPT_RT kernel.

The regression was introduced by the interaction of two separate patches:
one that made kcov's internal locks sleep on PREEMPT_RT for better latency
(d5d2c51f1e5f), and another that wrapped a kcov call in the USB softirq
path with local_irq_save() to prevent re-entrancy (f85d39dd7ed8).
This combination resulted in an attempt to acquire a sleeping lock from
within an atomic context, causing a kernel BUG.

To resolve this, this series makes the kcov remote path fully compatible
with atomic contexts by converting all its internal locking primitives to
non-sleeping variants. This approach is more robust than conditional
compilation as it creates a single, unified codebase that works correctly
on both RT and non-RT kernels.

The series is structured as follows:

Patch 1 converts the kcov locks (kcov->lock and kcov_remote_lock)
to the non-sleeping raw_spinlock_t.

Patch 2 replaces the PREEMPT_RT-specific per-CPU local_lock_t with the
original local_irq_save/restore primitives, making the per-CPU protection
non-sleeping as well.

Patches 3 and 4 are preparatory refactoring. They move the memory
allocation for remote handles out of the locked sections in the
KCOV_REMOTE_ENABLE ioctl path, which is a prerequisite for safely
using raw_spinlock_t, since sleeping functions such as kmalloc are
forbidden within its critical section.

With these changes, I have been able to run syzkaller fuzzing on a
PREEMPT_RT kernel for a full day with no issues reported.

Reproduction details are available here:
Link: https://lore.kernel.org/all/20250725201400.1078395-2-ysk@kzalloc.com/t/#u

Signed-off-by: Yunseong Kim <ysk@kzalloc.com>
---

Changes from v2:

	1. Updated kcov_remote_reset() to use raw_spin_lock_irqsave() /
	   raw_spin_unlock_irqrestore() instead of raw_spin_lock() /
	   raw_spin_unlock(), following the interrupt-disabling pattern
	   used in the original code that guards kcov_remote_lock.

Changes from v1:

	1. Dropped the #ifdef-based PREEMPT_RT branching.

	2. Converted kcov->lock and kcov_remote_lock from spinlock_t to
	   raw_spinlock_t. This ensures they remain true, non-sleeping
	   spinlocks even on PREEMPT_RT kernels.

	3. Removed the local_lock_t protection for kcov_percpu_data in
	   kcov_remote_start/stop(). Since local_lock_t can also sleep under
	   RT, and the required protection is against local interrupts when
	   accessing per-CPU data, it is replaced with explicit
	   local_irq_save/restore().

	4. Refactored the KCOV_REMOTE_ENABLE path to move memory allocations
	   out of the critical section.

	5. Modified the ioctl handling logic to utilize these pre-allocated
	   structures within the critical section. kcov_remote_add() is
	   modified to accept a pre-allocated structure instead of allocating
	   one internally. All necessary struct kcov_remote structures are now
	   pre-allocated individually in kcov_ioctl() using GFP_KERNEL
	   (allowing sleep) before acquiring the raw spinlocks.

Changes from v0:

	1. On PREEMPT_RT, separated the handling of
	   kcov_remote_start_usb_softirq() and kcov_remote_stop_usb_softirq()
	   to allow sleeping when entering kcov_remote_start_usb() /
	   kcov_remote_stop().

Yunseong Kim (4):
  kcov: Use raw_spinlock_t for kcov->lock and kcov_remote_lock
  kcov: Replace per-CPU local_lock with local_irq_save/restore
  kcov: Separate KCOV_REMOTE_ENABLE ioctl helper function
  kcov: move remote handle allocation outside raw spinlock

 kernel/kcov.c | 248 +++++++++++++++++++++++++++-----------------------
 1 file changed, 134 insertions(+), 114 deletions(-)

base-commit: 186f3edfdd41f2ae87fc40a9ccba52a3bf930994

-- 
2.50.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/4] kcov: Use raw_spinlock_t for kcov->lock and kcov_remote_lock
  2025-08-03  7:20 [PATCH v3 0/4] kcov, usb: Fix invalid context sleep in softirq path on PREEMPT_RT Yunseong Kim
@ 2025-08-03  7:20 ` Yunseong Kim
  2025-08-04 16:27   ` Steven Rostedt
  2025-08-03  7:20 ` [PATCH 2/4] kcov: Replace per-CPU local_lock with local_irq_save/restore Yunseong Kim
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Yunseong Kim @ 2025-08-03  7:20 UTC (permalink / raw)
  To: Dmitry Vyukov, Andrey Konovalov, Greg Kroah-Hartman
  Cc: Thomas Gleixner, Sebastian Andrzej Siewior, Tetsuo Handa,
	Byungchul Park, max.byungchul.park, Yeoreum Yun, ppbuk5246,
	linux-usb, linux-rt-devel, syzkaller, linux-kernel, stable,
	Yunseong Kim

The locks kcov->lock and kcov_remote_lock can be acquired from
atomic contexts, such as instrumentation hooks invoked from interrupt
handlers.

On PREEMPT_RT-enabled kernels, spinlock_t is implemented as a
sleeping lock (based on rt_mutex). Acquiring such a lock in atomic
context, where sleeping is not allowed, can lead to system hangs
or crashes.

To avoid this, convert both locks to raw_spinlock_t, which always
provides non-sleeping spinlock semantics regardless of preemption model.

Signed-off-by: Yunseong Kim <ysk@kzalloc.com>
---
 kernel/kcov.c | 58 +++++++++++++++++++++++++--------------------------
 1 file changed, 29 insertions(+), 29 deletions(-)

diff --git a/kernel/kcov.c b/kernel/kcov.c
index 187ba1b80bda..7d9b53385d81 100644
--- a/kernel/kcov.c
+++ b/kernel/kcov.c
@@ -54,7 +54,7 @@ struct kcov {
 	 */
 	refcount_t		refcount;
 	/* The lock protects mode, size, area and t. */
-	spinlock_t		lock;
+	raw_spinlock_t		lock;
 	enum kcov_mode		mode;
 	/* Size of arena (in long's). */
 	unsigned int		size;
@@ -84,7 +84,7 @@ struct kcov_remote {
 	struct hlist_node	hnode;
 };
 
-static DEFINE_SPINLOCK(kcov_remote_lock);
+static DEFINE_RAW_SPINLOCK(kcov_remote_lock);
 static DEFINE_HASHTABLE(kcov_remote_map, 4);
 static struct list_head kcov_remote_areas = LIST_HEAD_INIT(kcov_remote_areas);
 
@@ -406,7 +406,7 @@ static void kcov_remote_reset(struct kcov *kcov)
 	struct hlist_node *tmp;
 	unsigned long flags;
 
-	spin_lock_irqsave(&kcov_remote_lock, flags);
+	raw_spin_lock_irqsave(&kcov_remote_lock, flags);
 	hash_for_each_safe(kcov_remote_map, bkt, tmp, remote, hnode) {
 		if (remote->kcov != kcov)
 			continue;
@@ -415,7 +415,7 @@ static void kcov_remote_reset(struct kcov *kcov)
 	}
 	/* Do reset before unlock to prevent races with kcov_remote_start(). */
 	kcov_reset(kcov);
-	spin_unlock_irqrestore(&kcov_remote_lock, flags);
+	raw_spin_unlock_irqrestore(&kcov_remote_lock, flags);
 }
 
 static void kcov_disable(struct task_struct *t, struct kcov *kcov)
@@ -450,7 +450,7 @@ void kcov_task_exit(struct task_struct *t)
 	if (kcov == NULL)
 		return;
 
-	spin_lock_irqsave(&kcov->lock, flags);
+	raw_spin_lock_irqsave(&kcov->lock, flags);
 	kcov_debug("t = %px, kcov->t = %px\n", t, kcov->t);
 	/*
 	 * For KCOV_ENABLE devices we want to make sure that t->kcov->t == t,
@@ -475,12 +475,12 @@ void kcov_task_exit(struct task_struct *t)
 	 * By combining all three checks into one we get:
 	 */
 	if (WARN_ON(kcov->t != t)) {
-		spin_unlock_irqrestore(&kcov->lock, flags);
+		raw_spin_unlock_irqrestore(&kcov->lock, flags);
 		return;
 	}
 	/* Just to not leave dangling references behind. */
 	kcov_disable(t, kcov);
-	spin_unlock_irqrestore(&kcov->lock, flags);
+	raw_spin_unlock_irqrestore(&kcov->lock, flags);
 	kcov_put(kcov);
 }
 
@@ -492,14 +492,14 @@ static int kcov_mmap(struct file *filep, struct vm_area_struct *vma)
 	struct page *page;
 	unsigned long flags;
 
-	spin_lock_irqsave(&kcov->lock, flags);
+	raw_spin_lock_irqsave(&kcov->lock, flags);
 	size = kcov->size * sizeof(unsigned long);
 	if (kcov->area == NULL || vma->vm_pgoff != 0 ||
 	    vma->vm_end - vma->vm_start != size) {
 		res = -EINVAL;
 		goto exit;
 	}
-	spin_unlock_irqrestore(&kcov->lock, flags);
+	raw_spin_unlock_irqrestore(&kcov->lock, flags);
 	vm_flags_set(vma, VM_DONTEXPAND);
 	for (off = 0; off < size; off += PAGE_SIZE) {
 		page = vmalloc_to_page(kcov->area + off);
@@ -511,7 +511,7 @@ static int kcov_mmap(struct file *filep, struct vm_area_struct *vma)
 	}
 	return 0;
 exit:
-	spin_unlock_irqrestore(&kcov->lock, flags);
+	raw_spin_unlock_irqrestore(&kcov->lock, flags);
 	return res;
 }
 
@@ -525,7 +525,7 @@ static int kcov_open(struct inode *inode, struct file *filep)
 	kcov->mode = KCOV_MODE_DISABLED;
 	kcov->sequence = 1;
 	refcount_set(&kcov->refcount, 1);
-	spin_lock_init(&kcov->lock);
+	raw_spin_lock_init(&kcov->lock);
 	filep->private_data = kcov;
 	return nonseekable_open(inode, filep);
 }
@@ -646,18 +646,18 @@ static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
 		kcov->t = t;
 		kcov->remote = true;
 		kcov->remote_size = remote_arg->area_size;
-		spin_lock_irqsave(&kcov_remote_lock, flags);
+		raw_spin_lock_irqsave(&kcov_remote_lock, flags);
 		for (i = 0; i < remote_arg->num_handles; i++) {
 			if (!kcov_check_handle(remote_arg->handles[i],
 						false, true, false)) {
-				spin_unlock_irqrestore(&kcov_remote_lock,
+				raw_spin_unlock_irqrestore(&kcov_remote_lock,
 							flags);
 				kcov_disable(t, kcov);
 				return -EINVAL;
 			}
 			remote = kcov_remote_add(kcov, remote_arg->handles[i]);
 			if (IS_ERR(remote)) {
-				spin_unlock_irqrestore(&kcov_remote_lock,
+				raw_spin_unlock_irqrestore(&kcov_remote_lock,
 							flags);
 				kcov_disable(t, kcov);
 				return PTR_ERR(remote);
@@ -666,7 +666,7 @@ static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
 		if (remote_arg->common_handle) {
 			if (!kcov_check_handle(remote_arg->common_handle,
 						true, false, false)) {
-				spin_unlock_irqrestore(&kcov_remote_lock,
+				raw_spin_unlock_irqrestore(&kcov_remote_lock,
 							flags);
 				kcov_disable(t, kcov);
 				return -EINVAL;
@@ -674,14 +674,14 @@ static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
 			remote = kcov_remote_add(kcov,
 					remote_arg->common_handle);
 			if (IS_ERR(remote)) {
-				spin_unlock_irqrestore(&kcov_remote_lock,
+				raw_spin_unlock_irqrestore(&kcov_remote_lock,
 							flags);
 				kcov_disable(t, kcov);
 				return PTR_ERR(remote);
 			}
 			t->kcov_handle = remote_arg->common_handle;
 		}
-		spin_unlock_irqrestore(&kcov_remote_lock, flags);
+		raw_spin_unlock_irqrestore(&kcov_remote_lock, flags);
 		/* Put either in kcov_task_exit() or in KCOV_DISABLE. */
 		kcov_get(kcov);
 		return 0;
@@ -716,16 +716,16 @@ static long kcov_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 		area = vmalloc_user(size * sizeof(unsigned long));
 		if (area == NULL)
 			return -ENOMEM;
-		spin_lock_irqsave(&kcov->lock, flags);
+		raw_spin_lock_irqsave(&kcov->lock, flags);
 		if (kcov->mode != KCOV_MODE_DISABLED) {
-			spin_unlock_irqrestore(&kcov->lock, flags);
+			raw_spin_unlock_irqrestore(&kcov->lock, flags);
 			vfree(area);
 			return -EBUSY;
 		}
 		kcov->area = area;
 		kcov->size = size;
 		kcov->mode = KCOV_MODE_INIT;
-		spin_unlock_irqrestore(&kcov->lock, flags);
+		raw_spin_unlock_irqrestore(&kcov->lock, flags);
 		return 0;
 	case KCOV_REMOTE_ENABLE:
 		if (get_user(remote_num_handles, (unsigned __user *)(arg +
@@ -749,9 +749,9 @@ static long kcov_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 		 * All other commands can be normally executed under a spin lock, so we
 		 * obtain and release it here in order to simplify kcov_ioctl_locked().
 		 */
-		spin_lock_irqsave(&kcov->lock, flags);
+		raw_spin_lock_irqsave(&kcov->lock, flags);
 		res = kcov_ioctl_locked(kcov, cmd, arg);
-		spin_unlock_irqrestore(&kcov->lock, flags);
+		raw_spin_unlock_irqrestore(&kcov->lock, flags);
 		kfree(remote_arg);
 		return res;
 	}
@@ -883,10 +883,10 @@ void kcov_remote_start(u64 handle)
 		return;
 	}
 
-	spin_lock(&kcov_remote_lock);
+	raw_spin_lock(&kcov_remote_lock);
 	remote = kcov_remote_find(handle);
 	if (!remote) {
-		spin_unlock(&kcov_remote_lock);
+		raw_spin_unlock(&kcov_remote_lock);
 		local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
 		return;
 	}
@@ -908,7 +908,7 @@ void kcov_remote_start(u64 handle)
 		size = CONFIG_KCOV_IRQ_AREA_SIZE;
 		area = this_cpu_ptr(&kcov_percpu_data)->irq_area;
 	}
-	spin_unlock(&kcov_remote_lock);
+	raw_spin_unlock(&kcov_remote_lock);
 
 	/* Can only happen when in_task(). */
 	if (!area) {
@@ -1037,19 +1037,19 @@ void kcov_remote_stop(void)
 		kcov_remote_softirq_stop(t);
 	}
 
-	spin_lock(&kcov->lock);
+	raw_spin_lock(&kcov->lock);
 	/*
 	 * KCOV_DISABLE could have been called between kcov_remote_start()
 	 * and kcov_remote_stop(), hence the sequence check.
 	 */
 	if (sequence == kcov->sequence && kcov->remote)
 		kcov_move_area(kcov->mode, kcov->area, kcov->size, area);
-	spin_unlock(&kcov->lock);
+	raw_spin_unlock(&kcov->lock);
 
 	if (in_task()) {
-		spin_lock(&kcov_remote_lock);
+		raw_spin_lock(&kcov_remote_lock);
 		kcov_remote_area_put(area, size);
-		spin_unlock(&kcov_remote_lock);
+		raw_spin_unlock(&kcov_remote_lock);
 	}
 
 	local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/4] kcov: Replace per-CPU local_lock with local_irq_save/restore
  2025-08-03  7:20 [PATCH v3 0/4] kcov, usb: Fix invalid context sleep in softirq path on PREEMPT_RT Yunseong Kim
  2025-08-03  7:20 ` [PATCH 1/4] kcov: Use raw_spinlock_t for kcov->lock and kcov_remote_lock Yunseong Kim
@ 2025-08-03  7:20 ` Yunseong Kim
  2025-08-04 16:37   ` Steven Rostedt
  2025-08-03  7:20 ` [PATCH 3/4] kcov: Separate KCOV_REMOTE_ENABLE ioctl helper function Yunseong Kim
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Yunseong Kim @ 2025-08-03  7:20 UTC (permalink / raw)
  To: Dmitry Vyukov, Andrey Konovalov, Greg Kroah-Hartman
  Cc: Thomas Gleixner, Sebastian Andrzej Siewior, Tetsuo Handa,
	Byungchul Park, max.byungchul.park, Yeoreum Yun, ppbuk5246,
	linux-usb, linux-rt-devel, syzkaller, linux-kernel, stable,
	Yunseong Kim

Commit f85d39dd7ed8 ("kcov, usb: disable interrupts in
kcov_remote_start_usb_softirq") introduced a local_irq_save() in the
kcov_remote_start_usb_softirq() wrapper, placing kcov_remote_start() in
atomic context.

The previous patch addressed this by converting the global
kcov_remote_lock to a non-sleeping raw_spinlock_t. However, per-CPU
data in kcov_remote_start() and kcov_remote_stop() remains protected
by kcov_percpu_data.lock, which is a local_lock_t.

On PREEMPT_RT kernels, local_lock_t is implemented as a sleeping lock.
Acquiring it from atomic context triggers warnings or crashes due to
invalid sleeping behavior.

The original use of local_lock_t assumed that kcov_remote_start() would
never be called in atomic context. Now that this assumption no longer
holds, replace it with local_irq_save() and local_irq_restore(), which are
safe in all contexts and compatible with the use of raw_spinlock_t.

With this change, both global and per-CPU synchronization primitives are
guaranteed to be non-sleeping, making kcov_remote_start() safe for
use in atomic contexts.

Signed-off-by: Yunseong Kim <ysk@kzalloc.com>
---
 kernel/kcov.c | 29 +++++++++++++----------------
 1 file changed, 13 insertions(+), 16 deletions(-)

diff --git a/kernel/kcov.c b/kernel/kcov.c
index 7d9b53385d81..faad3b288ca7 100644
--- a/kernel/kcov.c
+++ b/kernel/kcov.c
@@ -90,7 +90,6 @@ static struct list_head kcov_remote_areas = LIST_HEAD_INIT(kcov_remote_areas);
 
 struct kcov_percpu_data {
 	void			*irq_area;
-	local_lock_t		lock;
 
 	unsigned int		saved_mode;
 	unsigned int		saved_size;
@@ -99,9 +98,7 @@ struct kcov_percpu_data {
 	int			saved_sequence;
 };
 
-static DEFINE_PER_CPU(struct kcov_percpu_data, kcov_percpu_data) = {
-	.lock = INIT_LOCAL_LOCK(lock),
-};
+static DEFINE_PER_CPU(struct kcov_percpu_data, kcov_percpu_data);
 
 /* Must be called with kcov_remote_lock locked. */
 static struct kcov_remote *kcov_remote_find(u64 handle)
@@ -862,7 +859,7 @@ void kcov_remote_start(u64 handle)
 	if (!in_task() && !in_softirq_really())
 		return;
 
-	local_lock_irqsave(&kcov_percpu_data.lock, flags);
+	local_irq_save(flags);
 
 	/*
 	 * Check that kcov_remote_start() is not called twice in background
@@ -870,7 +867,7 @@ void kcov_remote_start(u64 handle)
 	 */
 	mode = READ_ONCE(t->kcov_mode);
 	if (WARN_ON(in_task() && kcov_mode_enabled(mode))) {
-		local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+		local_irq_restore(flags);
 		return;
 	}
 	/*
@@ -879,7 +876,7 @@ void kcov_remote_start(u64 handle)
 	 * happened while collecting coverage from a background thread.
 	 */
 	if (WARN_ON(in_serving_softirq() && t->kcov_softirq)) {
-		local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+		local_irq_restore(flags);
 		return;
 	}
 
@@ -887,7 +884,7 @@ void kcov_remote_start(u64 handle)
 	remote = kcov_remote_find(handle);
 	if (!remote) {
 		raw_spin_unlock(&kcov_remote_lock);
-		local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+		local_irq_restore(flags);
 		return;
 	}
 	kcov_debug("handle = %llx, context: %s\n", handle,
@@ -912,13 +909,13 @@ void kcov_remote_start(u64 handle)
 
 	/* Can only happen when in_task(). */
 	if (!area) {
-		local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+		local_irq_restore(flags);
 		area = vmalloc(size * sizeof(unsigned long));
 		if (!area) {
 			kcov_put(kcov);
 			return;
 		}
-		local_lock_irqsave(&kcov_percpu_data.lock, flags);
+		local_irq_save(flags);
 	}
 
 	/* Reset coverage size. */
@@ -930,7 +927,7 @@ void kcov_remote_start(u64 handle)
 	}
 	kcov_start(t, kcov, size, area, mode, sequence);
 
-	local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+	local_irq_restore(flags);
 
 }
 EXPORT_SYMBOL(kcov_remote_start);
@@ -1004,12 +1001,12 @@ void kcov_remote_stop(void)
 	if (!in_task() && !in_softirq_really())
 		return;
 
-	local_lock_irqsave(&kcov_percpu_data.lock, flags);
+	local_irq_save(flags);
 
 	mode = READ_ONCE(t->kcov_mode);
 	barrier();
 	if (!kcov_mode_enabled(mode)) {
-		local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+		local_irq_restore(flags);
 		return;
 	}
 	/*
@@ -1017,12 +1014,12 @@ void kcov_remote_stop(void)
 	 * actually found the remote handle and started collecting coverage.
 	 */
 	if (in_serving_softirq() && !t->kcov_softirq) {
-		local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+		local_irq_restore(flags);
 		return;
 	}
 	/* Make sure that kcov_softirq is only set when in softirq. */
 	if (WARN_ON(!in_serving_softirq() && t->kcov_softirq)) {
-		local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+		local_irq_restore(flags);
 		return;
 	}
 
@@ -1052,7 +1049,7 @@ void kcov_remote_stop(void)
 		raw_spin_unlock(&kcov_remote_lock);
 	}
 
-	local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+	local_irq_restore(flags);
 
 	/* Get in kcov_remote_start(). */
 	kcov_put(kcov);
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/4] kcov: Separate KCOV_REMOTE_ENABLE ioctl helper function
  2025-08-03  7:20 [PATCH v3 0/4] kcov, usb: Fix invalid context sleep in softirq path on PREEMPT_RT Yunseong Kim
  2025-08-03  7:20 ` [PATCH 1/4] kcov: Use raw_spinlock_t for kcov->lock and kcov_remote_lock Yunseong Kim
  2025-08-03  7:20 ` [PATCH 2/4] kcov: Replace per-CPU local_lock with local_irq_save/restore Yunseong Kim
@ 2025-08-03  7:20 ` Yunseong Kim
  2025-08-03  7:20 ` [PATCH 4/4] kcov: move remote handle allocation outside raw spinlock Yunseong Kim
  2025-08-04 16:24 ` [PATCH v3 0/4] kcov, usb: Fix invalid context sleep in softirq path on PREEMPT_RT Steven Rostedt
  4 siblings, 0 replies; 11+ messages in thread
From: Yunseong Kim @ 2025-08-03  7:20 UTC (permalink / raw)
  To: Dmitry Vyukov, Andrey Konovalov, Greg Kroah-Hartman
  Cc: Thomas Gleixner, Sebastian Andrzej Siewior, Tetsuo Handa,
	Byungchul Park, max.byungchul.park, Yeoreum Yun, ppbuk5246,
	linux-usb, linux-rt-devel, syzkaller, linux-kernel, stable,
	Yunseong Kim

The KCOV_REMOTE_ENABLE handling is split out of kcov_ioctl_locked()
into a dedicated helper. The kcov_ioctl() entry point is updated to
dispatch commands to the appropriate helper, calling
kcov_ioctl_locked_remote_enabled() for the remote enable case and the
now-simplified kcov_ioctl_locked() for the KCOV_ENABLE and KCOV_DISABLE
commands.

Signed-off-by: Yunseong Kim <ysk@kzalloc.com>
---
 kernel/kcov.c | 142 +++++++++++++++++++++++++++-----------------------
 1 file changed, 77 insertions(+), 65 deletions(-)

diff --git a/kernel/kcov.c b/kernel/kcov.c
index faad3b288ca7..1e7f08ddf0e8 100644
--- a/kernel/kcov.c
+++ b/kernel/kcov.c
@@ -579,15 +579,81 @@ static inline bool kcov_check_handle(u64 handle, bool common_valid,
 	return false;
 }
 
-static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
-			     unsigned long arg)
+static int kcov_ioctl_locked_remote_enabled(struct kcov *kcov,
+			     unsigned int cmd, unsigned long arg)
 {
 	struct task_struct *t;
-	unsigned long flags, unused;
+	unsigned long flags;
 	int mode, i;
 	struct kcov_remote_arg *remote_arg;
 	struct kcov_remote *remote;
 
+	if (kcov->mode != KCOV_MODE_INIT || !kcov->area)
+		return -EINVAL;
+	t = current;
+	if (kcov->t != NULL || t->kcov != NULL)
+		return -EBUSY;
+	remote_arg = (struct kcov_remote_arg *)arg;
+	mode = kcov_get_mode(remote_arg->trace_mode);
+	if (mode < 0)
+		return mode;
+	if ((unsigned long)remote_arg->area_size >
+		LONG_MAX / sizeof(unsigned long))
+		return -EINVAL;
+	kcov->mode = mode;
+	t->kcov = kcov;
+	t->kcov_mode = KCOV_MODE_REMOTE;
+	kcov->t = t;
+	kcov->remote = true;
+	kcov->remote_size = remote_arg->area_size;
+	raw_spin_lock_irqsave(&kcov_remote_lock, flags);
+	for (i = 0; i < remote_arg->num_handles; i++) {
+		if (!kcov_check_handle(remote_arg->handles[i],
+					false, true, false)) {
+			raw_spin_unlock_irqrestore(&kcov_remote_lock,
+						flags);
+			kcov_disable(t, kcov);
+			return -EINVAL;
+		}
+		remote = kcov_remote_add(kcov, remote_arg->handles[i]);
+		if (IS_ERR(remote)) {
+			raw_spin_unlock_irqrestore(&kcov_remote_lock,
+						flags);
+			kcov_disable(t, kcov);
+			return PTR_ERR(remote);
+		}
+	}
+	if (remote_arg->common_handle) {
+		if (!kcov_check_handle(remote_arg->common_handle,
+					true, false, false)) {
+			raw_spin_unlock_irqrestore(&kcov_remote_lock,
+						flags);
+			kcov_disable(t, kcov);
+			return -EINVAL;
+		}
+		remote = kcov_remote_add(kcov,
+				remote_arg->common_handle);
+		if (IS_ERR(remote)) {
+			raw_spin_unlock_irqrestore(&kcov_remote_lock,
+						flags);
+			kcov_disable(t, kcov);
+			return PTR_ERR(remote);
+		}
+		t->kcov_handle = remote_arg->common_handle;
+	}
+	raw_spin_unlock_irqrestore(&kcov_remote_lock, flags);
+	/* Put either in kcov_task_exit() or in KCOV_DISABLE. */
+	kcov_get(kcov);
+	return 0;
+}
+
+static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
+			     unsigned long arg)
+{
+	struct task_struct *t;
+	unsigned long unused;
+	int mode;
+
 	switch (cmd) {
 	case KCOV_ENABLE:
 		/*
@@ -624,64 +690,6 @@ static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
 		kcov_disable(t, kcov);
 		kcov_put(kcov);
 		return 0;
-	case KCOV_REMOTE_ENABLE:
-		if (kcov->mode != KCOV_MODE_INIT || !kcov->area)
-			return -EINVAL;
-		t = current;
-		if (kcov->t != NULL || t->kcov != NULL)
-			return -EBUSY;
-		remote_arg = (struct kcov_remote_arg *)arg;
-		mode = kcov_get_mode(remote_arg->trace_mode);
-		if (mode < 0)
-			return mode;
-		if ((unsigned long)remote_arg->area_size >
-		    LONG_MAX / sizeof(unsigned long))
-			return -EINVAL;
-		kcov->mode = mode;
-		t->kcov = kcov;
-	        t->kcov_mode = KCOV_MODE_REMOTE;
-		kcov->t = t;
-		kcov->remote = true;
-		kcov->remote_size = remote_arg->area_size;
-		raw_spin_lock_irqsave(&kcov_remote_lock, flags);
-		for (i = 0; i < remote_arg->num_handles; i++) {
-			if (!kcov_check_handle(remote_arg->handles[i],
-						false, true, false)) {
-				raw_spin_unlock_irqrestore(&kcov_remote_lock,
-							flags);
-				kcov_disable(t, kcov);
-				return -EINVAL;
-			}
-			remote = kcov_remote_add(kcov, remote_arg->handles[i]);
-			if (IS_ERR(remote)) {
-				raw_spin_unlock_irqrestore(&kcov_remote_lock,
-							flags);
-				kcov_disable(t, kcov);
-				return PTR_ERR(remote);
-			}
-		}
-		if (remote_arg->common_handle) {
-			if (!kcov_check_handle(remote_arg->common_handle,
-						true, false, false)) {
-				raw_spin_unlock_irqrestore(&kcov_remote_lock,
-							flags);
-				kcov_disable(t, kcov);
-				return -EINVAL;
-			}
-			remote = kcov_remote_add(kcov,
-					remote_arg->common_handle);
-			if (IS_ERR(remote)) {
-				raw_spin_unlock_irqrestore(&kcov_remote_lock,
-							flags);
-				kcov_disable(t, kcov);
-				return PTR_ERR(remote);
-			}
-			t->kcov_handle = remote_arg->common_handle;
-		}
-		raw_spin_unlock_irqrestore(&kcov_remote_lock, flags);
-		/* Put either in kcov_task_exit() or in KCOV_DISABLE. */
-		kcov_get(kcov);
-		return 0;
 	default:
 		return -ENOTTY;
 	}
@@ -740,16 +748,20 @@ static long kcov_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 			return -EINVAL;
 		}
 		arg = (unsigned long)remote_arg;
-		fallthrough;
+		raw_spin_lock_irqsave(&kcov->lock, flags);
+		res = kcov_ioctl_locked_remote_enabled(kcov, cmd, arg);
+		raw_spin_unlock_irqrestore(&kcov->lock, flags);
+		kfree(remote_arg);
+		return res;
 	default:
 		/*
-		 * All other commands can be normally executed under a spin lock, so we
-		 * obtain and release it here in order to simplify kcov_ioctl_locked().
+		 * KCOV_ENABLE and KCOV_DISABLE commands can be normally executed under
+		 * a raw spin lock, so we obtain and release it here in order to
+		 * simplify kcov_ioctl_locked().
 		 */
 		raw_spin_lock_irqsave(&kcov->lock, flags);
 		res = kcov_ioctl_locked(kcov, cmd, arg);
 		raw_spin_unlock_irqrestore(&kcov->lock, flags);
-		kfree(remote_arg);
 		return res;
 	}
 }
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/4] kcov: move remote handle allocation outside raw spinlock
  2025-08-03  7:20 [PATCH v3 0/4] kcov, usb: Fix invalid context sleep in softirq path on PREEMPT_RT Yunseong Kim
                   ` (2 preceding siblings ...)
  2025-08-03  7:20 ` [PATCH 3/4] kcov: Separate KCOV_REMOTE_ENABLE ioctl helper function Yunseong Kim
@ 2025-08-03  7:20 ` Yunseong Kim
  2025-08-04 16:24 ` [PATCH v3 0/4] kcov, usb: Fix invalid context sleep in softirq path on PREEMPT_RT Steven Rostedt
  4 siblings, 0 replies; 11+ messages in thread
From: Yunseong Kim @ 2025-08-03  7:20 UTC (permalink / raw)
  To: Dmitry Vyukov, Andrey Konovalov, Greg Kroah-Hartman
  Cc: Thomas Gleixner, Sebastian Andrzej Siewior, Tetsuo Handa,
	Byungchul Park, max.byungchul.park, Yeoreum Yun, ppbuk5246,
	linux-usb, linux-rt-devel, syzkaller, linux-kernel, stable,
	Yunseong Kim

To comply with raw spinlock constraints, move allocation of kcov_remote
structs out of the critical section in the KCOV_REMOTE_ENABLE path.

Memory is now pre-allocated in kcov_ioctl() before taking any locks,
and passed down to the locked section for insertion into the hash table.
Error handling is updated to release the memory on failure.

This aligns with the non-sleeping requirement of raw spinlocks
introduced earlier in the series.

Signed-off-by: Yunseong Kim <ysk@kzalloc.com>
---
 kernel/kcov.c | 81 +++++++++++++++++++++++++++++----------------------
 1 file changed, 46 insertions(+), 35 deletions(-)

diff --git a/kernel/kcov.c b/kernel/kcov.c
index 1e7f08ddf0e8..46d36e0146cc 100644
--- a/kernel/kcov.c
+++ b/kernel/kcov.c
@@ -113,15 +113,9 @@ static struct kcov_remote *kcov_remote_find(u64 handle)
 }
 
 /* Must be called with kcov_remote_lock locked. */
-static struct kcov_remote *kcov_remote_add(struct kcov *kcov, u64 handle)
+static struct kcov_remote *kcov_remote_add(struct kcov *kcov, u64 handle,
+					   struct kcov_remote *remote)
 {
-	struct kcov_remote *remote;
-
-	if (kcov_remote_find(handle))
-		return ERR_PTR(-EEXIST);
-	remote = kmalloc(sizeof(*remote), GFP_ATOMIC);
-	if (!remote)
-		return ERR_PTR(-ENOMEM);
 	remote->handle = handle;
 	remote->kcov = kcov;
 	hash_add(kcov_remote_map, &remote->hnode, handle);
@@ -580,13 +574,14 @@ static inline bool kcov_check_handle(u64 handle, bool common_valid,
 }
 
 static int kcov_ioctl_locked_remote_enabled(struct kcov *kcov,
-			     unsigned int cmd, unsigned long arg)
+			     unsigned int cmd, unsigned long arg,
+			     struct kcov_remote *remote_handles,
+			     struct kcov_remote *remote_common_handle)
 {
 	struct task_struct *t;
 	unsigned long flags;
-	int mode, i;
+	int mode, i, ret;
 	struct kcov_remote_arg *remote_arg;
-	struct kcov_remote *remote;
 
 	if (kcov->mode != KCOV_MODE_INIT || !kcov->area)
 		return -EINVAL;
@@ -610,41 +605,43 @@ static int kcov_ioctl_locked_remote_enabled(struct kcov *kcov,
 	for (i = 0; i < remote_arg->num_handles; i++) {
 		if (!kcov_check_handle(remote_arg->handles[i],
 					false, true, false)) {
-			raw_spin_unlock_irqrestore(&kcov_remote_lock,
-						flags);
-			kcov_disable(t, kcov);
-			return -EINVAL;
+			ret = -EINVAL;
+			goto err;
 		}
-		remote = kcov_remote_add(kcov, remote_arg->handles[i]);
-		if (IS_ERR(remote)) {
-			raw_spin_unlock_irqrestore(&kcov_remote_lock,
-						flags);
-			kcov_disable(t, kcov);
-			return PTR_ERR(remote);
+		if (kcov_remote_find(remote_arg->handles[i])) {
+			ret = -EEXIST;
+			goto err;
 		}
+		kcov_remote_add(kcov, remote_arg->handles[i],
+			&remote_handles[i]);
 	}
 	if (remote_arg->common_handle) {
 		if (!kcov_check_handle(remote_arg->common_handle,
 					true, false, false)) {
-			raw_spin_unlock_irqrestore(&kcov_remote_lock,
-						flags);
-			kcov_disable(t, kcov);
-			return -EINVAL;
+			ret = -EINVAL;
+			goto err;
 		}
-		remote = kcov_remote_add(kcov,
-				remote_arg->common_handle);
-		if (IS_ERR(remote)) {
-			raw_spin_unlock_irqrestore(&kcov_remote_lock,
-						flags);
-			kcov_disable(t, kcov);
-			return PTR_ERR(remote);
+		if (kcov_remote_find(remote_arg->common_handle)) {
+			ret = -EEXIST;
+			goto err;
 		}
+		kcov_remote_add(kcov,
+			remote_arg->common_handle, remote_common_handle);
 		t->kcov_handle = remote_arg->common_handle;
 	}
 	raw_spin_unlock_irqrestore(&kcov_remote_lock, flags);
+
 	/* Put either in kcov_task_exit() or in KCOV_DISABLE. */
 	kcov_get(kcov);
 	return 0;
+
+err:
+	raw_spin_unlock_irqrestore(&kcov_remote_lock, flags);
+	kcov_disable(t, kcov);
+	kfree(remote_common_handle);
+	kfree(remote_handles);
+
+	return ret;
 }
 
 static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
@@ -702,6 +699,7 @@ static long kcov_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 	struct kcov_remote_arg *remote_arg = NULL;
 	unsigned int remote_num_handles;
 	unsigned long remote_arg_size;
+	struct kcov_remote *remote_handles, *remote_common_handle;
 	unsigned long size, flags;
 	void *area;
 
@@ -748,11 +746,22 @@ static long kcov_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 			return -EINVAL;
 		}
 		arg = (unsigned long)remote_arg;
+		remote_handles = kmalloc_array(remote_arg->num_handles,
+					sizeof(struct kcov_remote), GFP_KERNEL);
+		if (!remote_handles)
+			return -ENOMEM;
+		remote_common_handle = kmalloc(sizeof(struct kcov_remote), GFP_KERNEL);
+		if (!remote_common_handle) {
+			kfree(remote_handles);
+			return -ENOMEM;
+		}
+
 		raw_spin_lock_irqsave(&kcov->lock, flags);
-		res = kcov_ioctl_locked_remote_enabled(kcov, cmd, arg);
+		res = kcov_ioctl_locked_remote_enabled(kcov, cmd, arg,
+				remote_handles, remote_common_handle);
 		raw_spin_unlock_irqrestore(&kcov->lock, flags);
 		kfree(remote_arg);
-		return res;
+		break;
 	default:
 		/*
 		 * KCOV_ENABLE and KCOV_DISABLE commands can be normally executed under
@@ -762,8 +771,10 @@ static long kcov_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 		raw_spin_lock_irqsave(&kcov->lock, flags);
 		res = kcov_ioctl_locked(kcov, cmd, arg);
 		raw_spin_unlock_irqrestore(&kcov->lock, flags);
-		return res;
+		break;
 	}
+
+	return res;
 }
 
 static const struct file_operations kcov_fops = {
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 0/4] kcov, usb: Fix invalid context sleep in softirq path on PREEMPT_RT
  2025-08-03  7:20 [PATCH v3 0/4] kcov, usb: Fix invalid context sleep in softirq path on PREEMPT_RT Yunseong Kim
                   ` (3 preceding siblings ...)
  2025-08-03  7:20 ` [PATCH 4/4] kcov: move remote handle allocation outside raw spinlock Yunseong Kim
@ 2025-08-04 16:24 ` Steven Rostedt
  2025-08-05 15:27   ` Yunseong Kim
  4 siblings, 1 reply; 11+ messages in thread
From: Steven Rostedt @ 2025-08-04 16:24 UTC (permalink / raw)
  To: Yunseong Kim
  Cc: Dmitry Vyukov, Andrey Konovalov, Greg Kroah-Hartman,
	Thomas Gleixner, Sebastian Andrzej Siewior, Tetsuo Handa,
	Byungchul Park, max.byungchul.park, Yeoreum Yun, ppbuk5246,
	linux-usb, linux-rt-devel, syzkaller, linux-kernel, stable

On Sun,  3 Aug 2025 07:20:41 +0000
Yunseong Kim <ysk@kzalloc.com> wrote:

> This patch series resolves a sleeping function called from invalid context
> bug that occurs when fuzzing USB with syzkaller on a PREEMPT_RT kernel.
> 
> The regression was introduced by the interaction of two separate patches:
> one that made kcov's internal locks sleep on PREEMPT_RT for better latency

Just so I fully understand this change. It is basically reverting the
"better latency" changes? That is, with KCOV anyone running with PREEMPT_RT
can expect non-deterministic latency behavior?

This should be fully documented. I assume this will not be a problem as
kcov is more for debugging and should not be enabled in production.

-- Steve




* Re: [PATCH 1/4] kcov: Use raw_spinlock_t for kcov->lock and kcov_remote_lock
  2025-08-03  7:20 ` [PATCH 1/4] kcov: Use raw_spinlock_t for kcov->lock and kcov_remote_lock Yunseong Kim
@ 2025-08-04 16:27   ` Steven Rostedt
  2025-08-05 15:33     ` Yunseong Kim
  0 siblings, 1 reply; 11+ messages in thread
From: Steven Rostedt @ 2025-08-04 16:27 UTC (permalink / raw)
  To: Yunseong Kim
  Cc: Dmitry Vyukov, Andrey Konovalov, Greg Kroah-Hartman,
	Thomas Gleixner, Sebastian Andrzej Siewior, Tetsuo Handa,
	Byungchul Park, max.byungchul.park, Yeoreum Yun, ppbuk5246,
	linux-usb, linux-rt-devel, syzkaller, linux-kernel, stable

On Sun,  3 Aug 2025 07:20:43 +0000
Yunseong Kim <ysk@kzalloc.com> wrote:

> The locks kcov->lock and kcov_remote_lock can be acquired from
> atomic contexts, such as instrumentation hooks invoked from interrupt
> handlers.
> 
> On PREEMPT_RT-enabled kernels, spinlock_t is typically implemented

On PREEMPT_RT is implemented as a sleeping lock. You don't need to say
"typically".

> as a sleeping lock (e.g., mapped to an rt_mutex). Acquiring such a
> lock in atomic context, where sleeping is not allowed, can lead to
> system hangs or crashes.
> 
> To avoid this, convert both locks to raw_spinlock_t, which always
> provides non-sleeping spinlock semantics regardless of preemption model.
> 
> Signed-off-by: Yunseong Kim <ysk@kzalloc.com>
> ---
>  kernel/kcov.c | 58 +++++++++++++++++++++++++--------------------------
>  1 file changed, 29 insertions(+), 29 deletions(-)
> 
> diff --git a/kernel/kcov.c b/kernel/kcov.c
> index 187ba1b80bda..7d9b53385d81 100644
> --- a/kernel/kcov.c
> +++ b/kernel/kcov.c
> @@ -54,7 +54,7 @@ struct kcov {
>  	 */
>  	refcount_t		refcount;
>  	/* The lock protects mode, size, area and t. */
> -	spinlock_t		lock;
> +	raw_spinlock_t		lock;
>  	enum kcov_mode		mode;
>  	/* Size of arena (in long's). */
>  	unsigned int		size;
> @@ -84,7 +84,7 @@ struct kcov_remote {
>  	struct hlist_node	hnode;
>  };
>  
> -static DEFINE_SPINLOCK(kcov_remote_lock);
> +static DEFINE_RAW_SPINLOCK(kcov_remote_lock);
>  static DEFINE_HASHTABLE(kcov_remote_map, 4);
>  static struct list_head kcov_remote_areas = LIST_HEAD_INIT(kcov_remote_areas);
>  
> @@ -406,7 +406,7 @@ static void kcov_remote_reset(struct kcov *kcov)
>  	struct hlist_node *tmp;
>  	unsigned long flags;
>  
> -	spin_lock_irqsave(&kcov_remote_lock, flags);
> +	raw_spin_lock_irqsave(&kcov_remote_lock, flags);

Not related to these patches, but have you thought about converting some of
these locks over to the "guard()" infrastructure provided by cleanup.h?

>  	hash_for_each_safe(kcov_remote_map, bkt, tmp, remote, hnode) {
>  		if (remote->kcov != kcov)
>  			continue;

Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>

-- Steve


* Re: [PATCH 2/4] kcov: Replace per-CPU local_lock with local_irq_save/restore
  2025-08-03  7:20 ` [PATCH 2/4] kcov: Replace per-CPU local_lock with local_irq_save/restore Yunseong Kim
@ 2025-08-04 16:37   ` Steven Rostedt
  2025-08-05 15:41     ` Yunseong Kim
  0 siblings, 1 reply; 11+ messages in thread
From: Steven Rostedt @ 2025-08-04 16:37 UTC (permalink / raw)
  To: Yunseong Kim
  Cc: Dmitry Vyukov, Andrey Konovalov, Greg Kroah-Hartman,
	Thomas Gleixner, Sebastian Andrzej Siewior, Tetsuo Handa,
	Byungchul Park, max.byungchul.park, Yeoreum Yun, ppbuk5246,
	linux-usb, linux-rt-devel, syzkaller, linux-kernel, stable

On Sun,  3 Aug 2025 07:20:45 +0000
Yunseong Kim <ysk@kzalloc.com> wrote:

> Commit f85d39dd7ed8 ("kcov, usb: disable interrupts in
> kcov_remote_start_usb_softirq") introduced a local_irq_save() in the
> kcov_remote_start_usb_softirq() wrapper, placing kcov_remote_start() in
> atomic context.
> 
> The previous patch addressed this by converting the global

Don't ever use the phrase "The previous patch" in a change log. These get
added to git and it's very hard to find any order of one patch to another.
When doing a git blame 5 years from now, "The previous patch" will be
meaningless.

> kcov_remote_lock to a non-sleeping raw_spinlock_t. However, per-CPU
> data in kcov_remote_start() and kcov_remote_stop() remains protected
> by kcov_percpu_data.lock, which is a local_lock_t.

Instead, you should say something like:

  As kcov_remote_start() is now in atomic context, the kcov_remote lock was
  converted to a non-sleeping raw_spinlock. However, per-cpu ...


> 
> On PREEMPT_RT kernels, local_lock_t is implemented as a sleeping lock.
> Acquiring it from atomic context triggers warnings or crashes due to
> invalid sleeping behavior.
> 
> The original use of local_lock_t assumed that kcov_remote_start() would
> never be called in atomic context. Now that this assumption no longer
> holds, replace it with local_irq_save() and local_irq_restore(), which are
> safe in all contexts and compatible with the use of raw_spinlock_t.

Hmm, if the local_lock_t() is called inside of the taking of the
raw_spinlock_t, then this patch should probably be first. Why introduce a
different bug when fixing another one?

Then the change log of this and the previous patch can both just mention
being called from atomic context.

This change log would probably then say, "in order to convert the kcov locks
to raw_spinlocks, the local_lock_irqsave()s need to be converted over to
local_irq_save()".

-- Steve

> 
> With this change, both global and per-CPU synchronization primitives are
> guaranteed to be non-sleeping, making kcov_remote_start() safe for
> use in atomic contexts.
> 
> Signed-off-by: Yunseong Kim <ysk@kzalloc.com>
> ---


* Re: [PATCH v3 0/4] kcov, usb: Fix invalid context sleep in softirq path on PREEMPT_RT
  2025-08-04 16:24 ` [PATCH v3 0/4] kcov, usb: Fix invalid context sleep in softirq path on PREEMPT_RT Steven Rostedt
@ 2025-08-05 15:27   ` Yunseong Kim
  0 siblings, 0 replies; 11+ messages in thread
From: Yunseong Kim @ 2025-08-05 15:27 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Dmitry Vyukov, Andrey Konovalov, Greg Kroah-Hartman,
	Thomas Gleixner, Sebastian Andrzej Siewior, Tetsuo Handa,
	Byungchul Park, max.byungchul.park, Yeoreum Yun, ppbuk5246,
	linux-usb, linux-rt-devel, syzkaller, linux-kernel, stable

Hi Steve,

You're absolutely right to ask for clarification, and I now realize that
I didn’t explain the background clearly enough in my cover letter.

On 8/5/25 1:24 AM, Steven Rostedt wrote:
> On Sun,  3 Aug 2025 07:20:41 +0000
> Yunseong Kim <ysk@kzalloc.com> wrote:
> 
>> This patch series resolves a sleeping function called from invalid context
>> bug that occurs when fuzzing USB with syzkaller on a PREEMPT_RT kernel.
>>
>> The regression was introduced by the interaction of two separate patches:
>> one that made kcov's internal locks sleep on PREEMPT_RT for better latency
> 
> Just so I fully understand this change. It is basically reverting the
> "better latency" changes? That is, with KCOV anyone running with PREEMPT_RT
> can expect non-deterministic latency behavior?

The regression results from the interaction of two changes — and in my original
description, I inaccurately characterized one of them as being 
"for better latency." That was misleading.

The first change, commit d5d2c51, replaced spin_lock_irqsave() with
local_lock_irqsave() in KCOV to ensure compatibility with PREEMPT_RT. This
avoided taking a potentially sleeping lock with interrupts disabled.
At the time, as Sebastian noted:

 "There is no compelling reason to change the lock type to raw_spin_lock_t...
  Changing it would require to move memory allocation and deallocation outside
  of the locked section."

However, the situation changed after another patch, commit 8fea0c8, converted
the USB HCD tasklet to a BH workqueue. As a result, usb_giveback_urb_bh() began
running with interrupts enabled, and the KCOV remote coverage collection
section in this path became re-entrant. To prevent nested coverage sections —
which KCOV doesn’t support — kcov_remote_start_usb_softirq() was updated to
explicitly disable interrupts during coverage collection (commit f85d39d).

This combination — using a local_lock (which can sleep on RT) alongside
local_irq_save() — inadvertently created a scenario where a sleeping lock was
acquired in atomic context, triggering a kernel BUG on PREEMPT_RT.

So while the original KCOV locking change didn't require raw spinlocks at
the time, it became effectively incompatible with the USB softirq use case once
that path began relying on interrupt disabling for correctness. In this sense,
the "no compelling reason" eventually turned into a "necessary compromise."

To clarify: this patch series doesn't revert the previous change entirely.
It keeps the local_lock behavior for task context (where it's safe and
appropriate), but ensures atomic safety in interrupt/softirq contexts by
using raw spinlocks selectively where needed.

> This should be fully documented. I assume this will not be a problem as
> kcov is more for debugging and should not be enabled in production.
> 
> -- Steve
> 

Thanks again for raising this — I’ll make sure the changelog documents this
interaction more clearly.

Best regards,
Yunseong Kim


* Re: [PATCH 1/4] kcov: Use raw_spinlock_t for kcov->lock and kcov_remote_lock
  2025-08-04 16:27   ` Steven Rostedt
@ 2025-08-05 15:33     ` Yunseong Kim
  0 siblings, 0 replies; 11+ messages in thread
From: Yunseong Kim @ 2025-08-05 15:33 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Dmitry Vyukov, Andrey Konovalov, Greg Kroah-Hartman,
	Thomas Gleixner, Sebastian Andrzej Siewior, Tetsuo Handa,
	Byungchul Park, max.byungchul.park, Yeoreum Yun, ppbuk5246,
	linux-usb, linux-rt-devel, syzkaller, linux-kernel, stable

Hi Steve,

Thanks for the review and the suggestion.

On 8/5/25 1:27 AM, Steven Rostedt wrote:
> On Sun,  3 Aug 2025 07:20:43 +0000
> Yunseong Kim <ysk@kzalloc.com> wrote:
> 
>> The locks kcov->lock and kcov_remote_lock can be acquired from
>> atomic contexts, such as instrumentation hooks invoked from interrupt
>> handlers.
>>
>> On PREEMPT_RT-enabled kernels, spinlock_t is typically implemented
> 
> On PREEMPT_RT is implemented as a sleeping lock. You don't need to say
> "typically".

You're right — the phrase "typically implemented as a sleeping lock" was
inaccurate. On PREEMPT_RT, spinlock_t is implemented as a sleeping lock, and
I'll make sure to correct that wording in the next version.

>> as a sleeping lock (e.g., mapped to an rt_mutex). Acquiring such a
>> lock in atomic context, where sleeping is not allowed, can lead to
>> system hangs or crashes.
>>
>> To avoid this, convert both locks to raw_spinlock_t, which always
>> provides non-sleeping spinlock semantics regardless of preemption model.
>>
>> Signed-off-by: Yunseong Kim <ysk@kzalloc.com>
>> ---
>>  kernel/kcov.c | 58 +++++++++++++++++++++++++--------------------------
>>  1 file changed, 29 insertions(+), 29 deletions(-)
>>
>> diff --git a/kernel/kcov.c b/kernel/kcov.c
>> index 187ba1b80bda..7d9b53385d81 100644
>> --- a/kernel/kcov.c
>> +++ b/kernel/kcov.c
>> @@ -54,7 +54,7 @@ struct kcov {
>>  	 */
>>  	refcount_t		refcount;
>>  	/* The lock protects mode, size, area and t. */
>> -	spinlock_t		lock;
>> +	raw_spinlock_t		lock;
>>  	enum kcov_mode		mode;
>>  	/* Size of arena (in long's). */
>>  	unsigned int		size;
>> @@ -84,7 +84,7 @@ struct kcov_remote {
>>  	struct hlist_node	hnode;
>>  };
>>  
>> -static DEFINE_SPINLOCK(kcov_remote_lock);
>> +static DEFINE_RAW_SPINLOCK(kcov_remote_lock);
>>  static DEFINE_HASHTABLE(kcov_remote_map, 4);
>>  static struct list_head kcov_remote_areas = LIST_HEAD_INIT(kcov_remote_areas);
>>  
>> @@ -406,7 +406,7 @@ static void kcov_remote_reset(struct kcov *kcov)
>>  	struct hlist_node *tmp;
>>  	unsigned long flags;
>>  
>> -	spin_lock_irqsave(&kcov_remote_lock, flags);
>> +	raw_spin_lock_irqsave(&kcov_remote_lock, flags);
> 
> Not related to these patches, but have you thought about converting some of
> these locks over to the "guard()" infrastructure provided by cleanup.h?

Also, I appreciate your note about the guard() infrastructure from cleanup.h.
I'll look into whether it's applicable in this context, and plan to adopt it
where appropriate in the next iteration of the series.

>>  	hash_for_each_safe(kcov_remote_map, bkt, tmp, remote, hnode) {
>>  		if (remote->kcov != kcov)
>>  			continue;
> 
> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> 
> -- Steve

Thanks again for the feedback and for the Reviewed-by tag!

Best regards,
Yunseong Kim



* Re: [PATCH 2/4] kcov: Replace per-CPU local_lock with local_irq_save/restore
  2025-08-04 16:37   ` Steven Rostedt
@ 2025-08-05 15:41     ` Yunseong Kim
  0 siblings, 0 replies; 11+ messages in thread
From: Yunseong Kim @ 2025-08-05 15:41 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Dmitry Vyukov, Andrey Konovalov, Greg Kroah-Hartman,
	Thomas Gleixner, Sebastian Andrzej Siewior, Tetsuo Handa,
	Byungchul Park, max.byungchul.park, Yeoreum Yun, ppbuk5246,
	linux-usb, linux-rt-devel, syzkaller, linux-kernel, stable

Hi Steve,

Thanks for the detailed feedback and suggestions.

On 8/5/25 1:37 AM, Steven Rostedt wrote:
> On Sun,  3 Aug 2025 07:20:45 +0000
> Yunseong Kim <ysk@kzalloc.com> wrote:
> 
>> Commit f85d39dd7ed8 ("kcov, usb: disable interrupts in
>> kcov_remote_start_usb_softirq") introduced a local_irq_save() in the
>> kcov_remote_start_usb_softirq() wrapper, placing kcov_remote_start() in
>> atomic context.
>>
>> The previous patch addressed this by converting the global
> 
> Don't ever use the phrase "The previous patch" in a change log. These get
> added to git and it's very hard to find any order of one patch to another.
> When doing a git blame 5 years from now, "The previous patch" will be
> meaningless.

I agree that using phrases like "The previous patch" in changelogs is not a
good practice, especially considering future maintenance and git blame
scenarios.

>> kcov_remote_lock to a non-sleeping raw_spinlock_t. However, per-CPU
>> data in kcov_remote_start() and kcov_remote_stop() remains protected
>> by kcov_percpu_data.lock, which is a local_lock_t.
> 
> Instead, you should say something like:
> 
>   As kcov_remote_start() is now in atomic context, the kcov_remote lock was
>   converted to a non-sleeping raw_spinlock. However, per-cpu ...

I’ll revise the commit messages in the next iteration to explicitly
describe the context.

>> On PREEMPT_RT kernels, local_lock_t is implemented as a sleeping lock.
>> Acquiring it from atomic context triggers warnings or crashes due to
>> invalid sleeping behavior.
>>
>> The original use of local_lock_t assumed that kcov_remote_start() would
>> never be called in atomic context. Now that this assumption no longer
>> holds, replace it with local_irq_save() and local_irq_restore(), which are
>> safe in all contexts and compatible with the use of raw_spinlock_t.
> 
> Hmm, if the local_lock_t() is called inside of the taking of the
> raw_spinlock_t, then this patch should probably be first. Why introduce a
> different bug when fixing another one?

Regarding the patch ordering and the potential for introducing new bugs if the
local_lock_t conversions come after the raw_spinlock conversion, that’s a very
good point. I’ll review the patch sequence carefully to ensure the fixes apply
cleanly without regressions.

> Then the change log of this and the previous patch can both just mention
> being called from atomic context.
> 
> This change log would probably then say, "in order to convert the kcov locks
> to raw_spinlocks, the local_lock_irqsave()s need to be converted over to
> local_irq_save()".
> 
> -- Steve

Also, I will update the changelog to state this clearly.

Thanks again for your thorough review and guidance!

Best regards,
Yunseong Kim


end of thread, other threads:[~2025-08-05 15:41 UTC | newest]

Thread overview: 11+ messages
2025-08-03  7:20 [PATCH v3 0/4] kcov, usb: Fix invalid context sleep in softirq path on PREEMPT_RT Yunseong Kim
2025-08-03  7:20 ` [PATCH 1/4] kcov: Use raw_spinlock_t for kcov->lock and kcov_remote_lock Yunseong Kim
2025-08-04 16:27   ` Steven Rostedt
2025-08-05 15:33     ` Yunseong Kim
2025-08-03  7:20 ` [PATCH 2/4] kcov: Replace per-CPU local_lock with local_irq_save/restore Yunseong Kim
2025-08-04 16:37   ` Steven Rostedt
2025-08-05 15:41     ` Yunseong Kim
2025-08-03  7:20 ` [PATCH 3/4] kcov: Separate KCOV_REMOTE_ENABLE ioctl helper function Yunseong Kim
2025-08-03  7:20 ` [PATCH 4/4] kcov: move remote handle allocation outside raw spinlock Yunseong Kim
2025-08-04 16:24 ` [PATCH v3 0/4] kcov, usb: Fix invalid context sleep in softirq path on PREEMPT_RT Steven Rostedt
2025-08-05 15:27   ` Yunseong Kim
