public inbox for linux-usb@vger.kernel.org
 help / color / mirror / Atom feed
From: Yunseong Kim <ysk@kzalloc.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Dmitry Vyukov <dvyukov@google.com>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	Byungchul Park <byungchul@sk.com>,
	max.byungchul.park@gmail.com, Yeoreum Yun <yeoreum.yun@arm.com>,
	Michelle Jin <shjy180909@gmail.com>,
	linux-kernel@vger.kernel.org,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Alan Stern <stern@rowland.harvard.edu>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	stable@vger.kernel.org, kasan-dev@googlegroups.com,
	syzkaller@googlegroups.com, linux-usb@vger.kernel.org,
	linux-rt-devel@lists.linux.dev,
	Austin Kim <austindh.kim@gmail.com>
Subject: Re: [PATCH] kcov, usb: Fix invalid context sleep in softirq path on PREEMPT_RT
Date: Sat, 9 Aug 2025 02:35:48 +0900	[thread overview]
Message-ID: <ee26e7b2-80dd-49b1-bca2-61e460f73c2d@kzalloc.com> (raw)
In-Reply-To: <20250808163345.PPfA_T3F@linutronix.de>

Hi Sebastian,

I was waiting for your review — thanks!

On 8/9/25 1:33 오전, Sebastian Andrzej Siewior wrote:
> On 2025-07-25 20:14:01 [+0000], Yunseong Kim wrote:
>> When fuzzing USB with syzkaller on a PREEMPT_RT enabled kernel, following
>> bug is triggered in the ksoftirqd context.
>>
> …
>> This issue was introduced by commit
>> f85d39dd7ed8 ("kcov, usb: disable interrupts in kcov_remote_start_usb_softirq").
>>
>> However, this creates a conflict on PREEMPT_RT kernels. The local_irq_save()
>> call establishes an atomic context where sleeping is forbidden. Inside this
>> context, kcov_remote_start() is called, which on PREEMPT_RT uses sleeping
>> locks (spinlock_t and local_lock_t are mapped to rt_mutex). This results in
>> a sleeping function called from invalid context.
>>
>> On PREEMPT_RT, interrupt handlers are threaded, so the re-entrancy scenario
>> is already safely handled by the existing local_lock_t and the global
>> kcov_remote_lock within kcov_remote_start(). Therefore, the outer
>> local_irq_save() is not necessary.
>>
>> This preserves the intended re-entrancy protection for non-RT kernels while
>> resolving the locking violation on PREEMPT_RT kernels.
>>
>> After making this modification and testing it, syzkaller fuzzing the
>> PREEMPT_RT kernel is now running without stopping on latest announced
>> Real-time Linux.
> 
> This looks oddly familiar because I removed the irq-disable bits while
> adding local-locks.
> 
> Commit f85d39dd7ed8 looks wrong not that it shouldn't disable
> interrupts. The statement in the added comment
> 
> | + * 2. Disables interrupts for the duration of the coverage collection section.
> | + *    This allows avoiding nested remote coverage collection sections in the
> | + *    softirq context (a softirq might occur during the execution of a work in
> | + *    the BH workqueue, which runs with in_serving_softirq() > 0).
> 
> is wrong. Softirqs are never nesting. While the BH workqueue is
> running another softirq does not occur. The softirq is raised (again)
> and will be handled _after_ BH workqueue is done. So this is already
> serialised.
> 
> The issue is __usb_hcd_giveback_urb() always invokes
> kcov_remote_start_usb_softirq(). __usb_hcd_giveback_urb() itself is
> invoked from BH context (for the majority of HCDs) and from hardirq
> context for the root-HUB. This gets us to the scenario that that we are
> in the give-back path in softirq context and then invoke the function
> once again in hardirq context.
> 
> I have no idea how kcov works but reverting the original commit and
> avoiding the false nesting due to hardirq context should do the trick,
> an untested patch follows.
> 
> This isn't any different than the tasklet handling that was used before
> so I am not sure why it is now a problem.

Thank you for the detailed analysis and the patch. Your explanation about
the real re-entrancy issue being "softirq vs. hardirq" and the faulty
premise in the original commit makes perfect sense.

> Could someone maybe test this?

As you requested, I have tested your patch on my setup.

I can check that your patch resolves the issue. I have been running
the syzkaller for several hours, and the "sleeping function called
from invalid context" bug is no longer triggered.

> --- a/drivers/usb/core/hcd.c
> +++ b/drivers/usb/core/hcd.c
> @@ -1636,7 +1636,6 @@ static void __usb_hcd_giveback_urb(struct urb *urb)
>  	struct usb_hcd *hcd = bus_to_hcd(urb->dev->bus);
>  	struct usb_anchor *anchor = urb->anchor;
>  	int status = urb->unlinked;
> -	unsigned long flags;
>  
>  	urb->hcpriv = NULL;
>  	if (unlikely((urb->transfer_flags & URB_SHORT_NOT_OK) &&
> @@ -1654,14 +1653,13 @@ static void __usb_hcd_giveback_urb(struct urb *urb)
>  	/* pass ownership to the completion handler */
>  	urb->status = status;
>  	/*
> -	 * Only collect coverage in the softirq context and disable interrupts
> -	 * to avoid scenarios with nested remote coverage collection sections
> -	 * that KCOV does not support.
> -	 * See the comment next to kcov_remote_start_usb_softirq() for details.
> +	 * This function can be called in task context inside another remote
> +	 * coverage collection section, but kcov doesn't support that kind of
> +	 * recursion yet. Only collect coverage in softirq context for now.
>  	 */
> -	flags = kcov_remote_start_usb_softirq((u64)urb->dev->bus->busnum);
> +	kcov_remote_start_usb_softirq((u64)urb->dev->bus->busnum);
>  	urb->complete(urb);
> -	kcov_remote_stop_softirq(flags);
> +	kcov_remote_stop_softirq();
>  
>  	usb_anchor_resume_wakeups(anchor);
>  	atomic_dec(&urb->use_count);
> diff --git a/include/linux/kcov.h b/include/linux/kcov.h
> index 75a2fb8b16c32..0143358874b07 100644
> --- a/include/linux/kcov.h
> +++ b/include/linux/kcov.h
> @@ -57,47 +57,21 @@ static inline void kcov_remote_start_usb(u64 id)
>  
>  /*
>   * The softirq flavor of kcov_remote_*() functions is introduced as a temporary
> - * workaround for KCOV's lack of nested remote coverage sections support.
> - *
> - * Adding support is tracked in https://bugzilla.kernel.org/show_bug.cgi?id=210337.
> - *
> - * kcov_remote_start_usb_softirq():
> - *
> - * 1. Only collects coverage when called in the softirq context. This allows
> - *    avoiding nested remote coverage collection sections in the task context.
> - *    For example, USB/IP calls usb_hcd_giveback_urb() in the task context
> - *    within an existing remote coverage collection section. Thus, KCOV should
> - *    not attempt to start collecting coverage within the coverage collection
> - *    section in __usb_hcd_giveback_urb() in this case.
> - *
> - * 2. Disables interrupts for the duration of the coverage collection section.
> - *    This allows avoiding nested remote coverage collection sections in the
> - *    softirq context (a softirq might occur during the execution of a work in
> - *    the BH workqueue, which runs with in_serving_softirq() > 0).
> - *    For example, usb_giveback_urb_bh() runs in the BH workqueue with
> - *    interrupts enabled, so __usb_hcd_giveback_urb() might be interrupted in
> - *    the middle of its remote coverage collection section, and the interrupt
> - *    handler might invoke __usb_hcd_giveback_urb() again.
> + * work around for kcov's lack of nested remote coverage sections support in
> + * task context. Adding support for nested sections is tracked in:
> + * https://bugzilla.kernel.org/show_bug.cgi?id=210337
>   */
>  
> -static inline unsigned long kcov_remote_start_usb_softirq(u64 id)
> +static inline void kcov_remote_start_usb_softirq(u64 id)
>  {
> -	unsigned long flags = 0;
> -
> -	if (in_serving_softirq()) {
> -		local_irq_save(flags);
> +	if (in_serving_softirq() && !in_hardirq())
>  		kcov_remote_start_usb(id);
> -	}
> -
> -	return flags;
>  }
>  
> -static inline void kcov_remote_stop_softirq(unsigned long flags)
> +static inline void kcov_remote_stop_softirq(void)
>  {
> -	if (in_serving_softirq()) {
> +	if (in_serving_softirq() && !in_hardirq())
>  		kcov_remote_stop();
> -		local_irq_restore(flags);
> -	}
>  }
>  
>  #ifdef CONFIG_64BIT
> @@ -131,11 +105,8 @@ static inline u64 kcov_common_handle(void)
>  }
>  static inline void kcov_remote_start_common(u64 id) {}
>  static inline void kcov_remote_start_usb(u64 id) {}
> -static inline unsigned long kcov_remote_start_usb_softirq(u64 id)
> -{
> -	return 0;
> -}
> -static inline void kcov_remote_stop_softirq(unsigned long flags) {}
> +static inline void kcov_remote_start_usb_softirq(u64 id) {}
> +static inline void kcov_remote_stop_softirq(void) {}
>  
>  #endif /* CONFIG_KCOV */
>  #endif /* _LINUX_KCOV_H */


I really impressed your "How to Not Break PREEMPT_RT" talk at LPC 22.


Tested-by: Yunseong Kim <ysk@kzalloc.com>


Thanks,

Yunseong Kim

  reply	other threads:[~2025-08-08 17:35 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-25 20:14 [PATCH] kcov, usb: Fix invalid context sleep in softirq path on PREEMPT_RT Yunseong Kim
2025-07-26  6:36 ` Greg Kroah-Hartman
2025-07-26  7:44   ` Tetsuo Handa
2025-07-26  7:59     ` Greg Kroah-Hartman
2025-07-26 11:59       ` Thomas Gleixner
2025-08-01 22:06         ` Yunseong Kim
2025-08-08 16:33 ` Sebastian Andrzej Siewior
2025-08-08 17:35   ` Yunseong Kim [this message]
2025-08-11  8:31     ` Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ee26e7b2-80dd-49b1-bca2-61e460f73c2d@kzalloc.com \
    --to=ysk@kzalloc.com \
    --cc=andreyknvl@gmail.com \
    --cc=austindh.kim@gmail.com \
    --cc=bigeasy@linutronix.de \
    --cc=byungchul@sk.com \
    --cc=dvyukov@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=kasan-dev@googlegroups.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rt-devel@lists.linux.dev \
    --cc=linux-usb@vger.kernel.org \
    --cc=max.byungchul.park@gmail.com \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=shjy180909@gmail.com \
    --cc=stable@vger.kernel.org \
    --cc=stern@rowland.harvard.edu \
    --cc=syzkaller@googlegroups.com \
    --cc=tglx@linutronix.de \
    --cc=yeoreum.yun@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox