All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yunseong Kim <ysk@kzalloc.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Dmitry Vyukov <dvyukov@google.com>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	Byungchul Park <byungchul@sk.com>,
	max.byungchul.park@gmail.com, Yeoreum Yun <yeoreum.yun@arm.com>,
	Michelle Jin <shjy180909@gmail.com>,
	linux-kernel@vger.kernel.org,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Alan Stern <stern@rowland.harvard.edu>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	stable@vger.kernel.org, kasan-dev@googlegroups.com,
	syzkaller@googlegroups.com, linux-usb@vger.kernel.org,
	linux-rt-devel@lists.linux.dev,
	Austin Kim <austindh.kim@gmail.com>
Subject: Re: [PATCH] kcov, usb: Fix invalid context sleep in softirq path on PREEMPT_RT
Date: Sat, 9 Aug 2025 02:35:48 +0900	[thread overview]
Message-ID: <ee26e7b2-80dd-49b1-bca2-61e460f73c2d@kzalloc.com> (raw)
In-Reply-To: <20250808163345.PPfA_T3F@linutronix.de>

Hi Sebastian,

I was waiting for your review — thanks!

On 8/9/25 1:33 오전, Sebastian Andrzej Siewior wrote:
> On 2025-07-25 20:14:01 [+0000], Yunseong Kim wrote:
>> When fuzzing USB with syzkaller on a PREEMPT_RT enabled kernel, following
>> bug is triggered in the ksoftirqd context.
>>
> …
>> This issue was introduced by commit
>> f85d39dd7ed8 ("kcov, usb: disable interrupts in kcov_remote_start_usb_softirq").
>>
>> However, this creates a conflict on PREEMPT_RT kernels. The local_irq_save()
>> call establishes an atomic context where sleeping is forbidden. Inside this
>> context, kcov_remote_start() is called, which on PREEMPT_RT uses sleeping
>> locks (spinlock_t and local_lock_t are mapped to rt_mutex). This results in
>> a sleeping function called from invalid context.
>>
>> On PREEMPT_RT, interrupt handlers are threaded, so the re-entrancy scenario
>> is already safely handled by the existing local_lock_t and the global
>> kcov_remote_lock within kcov_remote_start(). Therefore, the outer
>> local_irq_save() is not necessary.
>>
>> This preserves the intended re-entrancy protection for non-RT kernels while
>> resolving the locking violation on PREEMPT_RT kernels.
>>
>> After making this modification and testing it, syzkaller fuzzing the
>> PREEMPT_RT kernel is now running without stopping on latest announced
>> Real-time Linux.
> 
> This looks oddly familiar because I removed the irq-disable bits while
> adding local-locks.
> 
> Commit f85d39dd7ed8 looks wrong not that it shouldn't disable
> interrupts. The statement in the added comment
> 
> | + * 2. Disables interrupts for the duration of the coverage collection section.
> | + *    This allows avoiding nested remote coverage collection sections in the
> | + *    softirq context (a softirq might occur during the execution of a work in
> | + *    the BH workqueue, which runs with in_serving_softirq() > 0).
> 
> is wrong. Softirqs are never nesting. While the BH workqueue is
> running another softirq does not occur. The softirq is raised (again)
> and will be handled _after_ BH workqueue is done. So this is already
> serialised.
> 
> The issue is __usb_hcd_giveback_urb() always invokes
> kcov_remote_start_usb_softirq(). __usb_hcd_giveback_urb() itself is
> invoked from BH context (for the majority of HCDs) and from hardirq
> context for the root-HUB. This gets us to the scenario that that we are
> in the give-back path in softirq context and then invoke the function
> once again in hardirq context.
> 
> I have no idea how kcov works but reverting the original commit and
> avoiding the false nesting due to hardirq context should do the trick,
> an untested patch follows.
> 
> This isn't any different than the tasklet handling that was used before
> so I am not sure why it is now a problem.

Thank you for the detailed analysis and the patch. Your explanation about
the real re-entrancy issue being "softirq vs. hardirq" and the faulty
premise in the original commit makes perfect sense.

> Could someone maybe test this?

As you requested, I have tested your patch on my setup.

I can check that your patch resolves the issue. I have been running
the syzkaller for several hours, and the "sleeping function called
from invalid context" bug is no longer triggered.

> --- a/drivers/usb/core/hcd.c
> +++ b/drivers/usb/core/hcd.c
> @@ -1636,7 +1636,6 @@ static void __usb_hcd_giveback_urb(struct urb *urb)
>  	struct usb_hcd *hcd = bus_to_hcd(urb->dev->bus);
>  	struct usb_anchor *anchor = urb->anchor;
>  	int status = urb->unlinked;
> -	unsigned long flags;
>  
>  	urb->hcpriv = NULL;
>  	if (unlikely((urb->transfer_flags & URB_SHORT_NOT_OK) &&
> @@ -1654,14 +1653,13 @@ static void __usb_hcd_giveback_urb(struct urb *urb)
>  	/* pass ownership to the completion handler */
>  	urb->status = status;
>  	/*
> -	 * Only collect coverage in the softirq context and disable interrupts
> -	 * to avoid scenarios with nested remote coverage collection sections
> -	 * that KCOV does not support.
> -	 * See the comment next to kcov_remote_start_usb_softirq() for details.
> +	 * This function can be called in task context inside another remote
> +	 * coverage collection section, but kcov doesn't support that kind of
> +	 * recursion yet. Only collect coverage in softirq context for now.
>  	 */
> -	flags = kcov_remote_start_usb_softirq((u64)urb->dev->bus->busnum);
> +	kcov_remote_start_usb_softirq((u64)urb->dev->bus->busnum);
>  	urb->complete(urb);
> -	kcov_remote_stop_softirq(flags);
> +	kcov_remote_stop_softirq();
>  
>  	usb_anchor_resume_wakeups(anchor);
>  	atomic_dec(&urb->use_count);
> diff --git a/include/linux/kcov.h b/include/linux/kcov.h
> index 75a2fb8b16c32..0143358874b07 100644
> --- a/include/linux/kcov.h
> +++ b/include/linux/kcov.h
> @@ -57,47 +57,21 @@ static inline void kcov_remote_start_usb(u64 id)
>  
>  /*
>   * The softirq flavor of kcov_remote_*() functions is introduced as a temporary
> - * workaround for KCOV's lack of nested remote coverage sections support.
> - *
> - * Adding support is tracked in https://bugzilla.kernel.org/show_bug.cgi?id=210337.
> - *
> - * kcov_remote_start_usb_softirq():
> - *
> - * 1. Only collects coverage when called in the softirq context. This allows
> - *    avoiding nested remote coverage collection sections in the task context.
> - *    For example, USB/IP calls usb_hcd_giveback_urb() in the task context
> - *    within an existing remote coverage collection section. Thus, KCOV should
> - *    not attempt to start collecting coverage within the coverage collection
> - *    section in __usb_hcd_giveback_urb() in this case.
> - *
> - * 2. Disables interrupts for the duration of the coverage collection section.
> - *    This allows avoiding nested remote coverage collection sections in the
> - *    softirq context (a softirq might occur during the execution of a work in
> - *    the BH workqueue, which runs with in_serving_softirq() > 0).
> - *    For example, usb_giveback_urb_bh() runs in the BH workqueue with
> - *    interrupts enabled, so __usb_hcd_giveback_urb() might be interrupted in
> - *    the middle of its remote coverage collection section, and the interrupt
> - *    handler might invoke __usb_hcd_giveback_urb() again.
> + * work around for kcov's lack of nested remote coverage sections support in
> + * task context. Adding support for nested sections is tracked in:
> + * https://bugzilla.kernel.org/show_bug.cgi?id=210337
>   */
>  
> -static inline unsigned long kcov_remote_start_usb_softirq(u64 id)
> +static inline void kcov_remote_start_usb_softirq(u64 id)
>  {
> -	unsigned long flags = 0;
> -
> -	if (in_serving_softirq()) {
> -		local_irq_save(flags);
> +	if (in_serving_softirq() && !in_hardirq())
>  		kcov_remote_start_usb(id);
> -	}
> -
> -	return flags;
>  }
>  
> -static inline void kcov_remote_stop_softirq(unsigned long flags)
> +static inline void kcov_remote_stop_softirq(void)
>  {
> -	if (in_serving_softirq()) {
> +	if (in_serving_softirq() && !in_hardirq())
>  		kcov_remote_stop();
> -		local_irq_restore(flags);
> -	}
>  }
>  
>  #ifdef CONFIG_64BIT
> @@ -131,11 +105,8 @@ static inline u64 kcov_common_handle(void)
>  }
>  static inline void kcov_remote_start_common(u64 id) {}
>  static inline void kcov_remote_start_usb(u64 id) {}
> -static inline unsigned long kcov_remote_start_usb_softirq(u64 id)
> -{
> -	return 0;
> -}
> -static inline void kcov_remote_stop_softirq(unsigned long flags) {}
> +static inline void kcov_remote_start_usb_softirq(u64 id) {}
> +static inline void kcov_remote_stop_softirq(void) {}
>  
>  #endif /* CONFIG_KCOV */
>  #endif /* _LINUX_KCOV_H */


I really impressed your "How to Not Break PREEMPT_RT" talk at LPC 22.


Tested-by: Yunseong Kim <ysk@kzalloc.com>


Thanks,

Yunseong Kim

  reply	other threads:[~2025-08-08 17:35 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-25 20:14 [PATCH] kcov, usb: Fix invalid context sleep in softirq path on PREEMPT_RT Yunseong Kim
2025-07-26  6:36 ` Greg Kroah-Hartman
2025-07-26  7:44   ` Tetsuo Handa
2025-07-26  7:59     ` Greg Kroah-Hartman
2025-07-26 11:59       ` Thomas Gleixner
2025-08-01 22:06         ` Yunseong Kim
2025-08-08 16:33 ` Sebastian Andrzej Siewior
2025-08-08 17:35   ` Yunseong Kim [this message]
2025-08-11  8:31     ` Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ee26e7b2-80dd-49b1-bca2-61e460f73c2d@kzalloc.com \
    --to=ysk@kzalloc.com \
    --cc=andreyknvl@gmail.com \
    --cc=austindh.kim@gmail.com \
    --cc=bigeasy@linutronix.de \
    --cc=byungchul@sk.com \
    --cc=dvyukov@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=kasan-dev@googlegroups.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rt-devel@lists.linux.dev \
    --cc=linux-usb@vger.kernel.org \
    --cc=max.byungchul.park@gmail.com \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=shjy180909@gmail.com \
    --cc=stable@vger.kernel.org \
    --cc=stern@rowland.harvard.edu \
    --cc=syzkaller@googlegroups.com \
    --cc=tglx@linutronix.de \
    --cc=yeoreum.yun@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.