From: Yunseong Kim <ysk@kzalloc.com>
To: Gabriele Monaco <gmonaco@redhat.com>, Nam Cao <nam.cao@linaro.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Tomas Glozar <tglozar@redhat.com>,
	Shung-Hsi Yu <shung-hsi.yu@suse.com>,
	Byungchul Park <byungchul@sk.com>,
	syzkaller@googlegroups.com, linux-rt-devel@lists.linux.dev,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [Question] Detecting Sleep-in-Atomic Context in PREEMPT_RT via RV (Runtime Verification) monitor rtapp:sleep
Date: Wed, 29 Oct 2025 07:53:20 +0900
Message-ID: <eed6ff19-a944-4e4c-96e4-0f44e888c71d@kzalloc.com>
In-Reply-To: <d23d133b52ef574d669f1656789b78d07c91c9f5.camel@redhat.com>

Hi Gabriele,

On 10/27/25 9:20 PM, Gabriele Monaco wrote:
> On Mon, 2025-10-27 at 15:54 +0900, Yunseong Kim wrote:
>> Hi Nam,
>>
>> I've been very interested in RV (Runtime Verification) to proactively detect
>> "sleep in atomic" scenarios on PREEMPT_RT kernels. Specifically, I'm looking
>> for ways to find cases where sleeping spinlocks or memory allocations are used
>> within preemption-disabled or irq-disabled contexts. While searching for
>> solutions, I discovered the RV subsystem.
>>
> 
> Hi Yunseong,
> 
> I'm sure Nam can be more specific on this, but let me add my 2 cents here.

Thank you so much for your detailed response! It cleared up many of the
questions I had.

> The sleep monitor doesn't really do what you want, its violations are real time
> tasks (typically userspace tasks with RR/FIFO policies) sleeping in a way that
> might incur latencies. For instance using non PI locks or imprecise sleep.

So that’s the role of rtapp:sleep you mentioned. Thank you again for
clarifying it.

> What you need here is to validate kernel code, RV was actually designed for
> that, but there's currently no monitor that does what you want.

That sounds like a good opportunity to contribute to RV!

> The closest thing I can think of is monitors like scpd and snep in the sched
> collection [1]. Those however won't catch what you need because they focus on
> the preemption tracepoints and schedule, which works fine also in your scenario.
> 
> We could add similar monitors to catch what you want though:
> 
>                      |
>                      |
>                      v
>                    +-----------------+
>                    |   cant_sleep    | <+
>                    +-----------------+  |
>                      |                  |
>                      | preempt_enable   | preempt_disable
>                      v                  |
>     kmalloc                             |
>     lock_acquire                        |
>   +---------------      can_sleep       |
>   |                                     |
>   +-------------->                     -+
> 
> which would become slightly more complicated if considering irq enable/disable
> too. This is a deterministic automaton representation (see [1] for examples),
> you could use an LTL like sleep as well, I assume (needs a per-CPU monitor which
> is not merged yet for LTL).
> 
> This is simplified but you can of course put conditions on what kind of
> allocations and locks you're interested in.

If the goal is to flag this state before __might_resched() reports it under
CONFIG_DEBUG_ATOMIC_SLEEP (i.e., before an actual context switch occurs),
I need to decide whether a Deterministic Automaton (.dot/DA) or Linear
Temporal Logic (.ltl/LTL) is the better way to model the check. I'm also
wondering whether I would need a comprehensive table of all sleepable
functions on a PREEMPT_RT kernel for this.

If such a check is worthwhile, I plan to try the following rule:

RULE = always ((IN_ATOMIC or IRQS_DISABLED) imply not CALLS_RT_SLEEPER)
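
To make the intended semantics of the rule concrete, here is a minimal
user-space sketch in Python that evaluates it over a recorded trace. It is
not RV code and does not use the kernel's LTL tooling; the names Step,
rule_holds and first_violation are hypothetical, and the three booleans
simply stand in for the atomic propositions named in the rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One observation of the three atomic propositions (hypothetical model)."""
    in_atomic: bool
    irqs_disabled: bool
    calls_rt_sleeper: bool


def rule_holds(step: Step) -> bool:
    # (IN_ATOMIC or IRQS_DISABLED) imply not CALLS_RT_SLEEPER
    # "A imply B" is equivalent to "(not A) or B".
    return not (step.in_atomic or step.irqs_disabled) or not step.calls_rt_sleeper


def first_violation(trace):
    """Return the index of the first step that breaks 'always RULE', or None."""
    for index, step in enumerate(trace):
        if not rule_holds(step):
            return index
    return None


trace = [
    Step(False, False, True),   # sleeping call in preemptible context: allowed
    Step(True, False, False),   # atomic, but no sleeping call: allowed
    Step(False, True, True),    # IRQs disabled and a sleeping call: violation
]
print(first_violation(trace))   # prints 2
```

Because the rule has no temporal operator other than the outer "always",
checking it reduces to testing every step independently, which is what the
loop does.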

I also plan to have the RV monitor kernel module track functions that can
sleep on PREEMPT_RT, including sleeping spinlocks and memory allocations, so
that calls made with preemption or IRQs disabled are caught.

The initial list of functions I’m considering:

 // Mutex & Semaphore (or Lockdep's 'lock_acquire' for lock cases)
 "mutex_lock",
 "mutex_lock_interruptible",
 "mutex_lock_killable",
 "down_interruptible",
 "down_killable",
 "rwsem_down_read_failed",
 "rwsem_down_write_failed",
 "ww_mutex_lock",
 "rt_spin_lock",
 "rt_read_lock",
 "rt_write_lock",
 // or just "lock_acquire" for LOCKDEP enabled kernel.

 // sleep & schedule
 "msleep",
 "ssleep",
 "usleep_range",
 "wait_for_completion",
 "schedule",
 "cond_resched",

 // User-space memory access
 "copy_from_user",
 "copy_to_user",
 "__get_user_asm",
 "__put_user_asm",

 // memory allocation
 "__vmalloc",
 "__kmalloc"
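
As an illustration of how such a table could feed the two-state automaton
from Gabriele's diagram, here is a small Python simulation. It is only a
sketch under assumptions: SLEEPERS is a subset of the list above, the names
TRANSITIONS, classify and find_violations are hypothetical, and IRQ state is
left out, as in the simplified diagram. Following the usual DA convention,
any event with no transition from the current state counts as a violation.

```python
# Subset of the candidate function list, used only for this illustration.
SLEEPERS = frozenset({"mutex_lock", "rt_spin_lock", "msleep", "schedule", "__kmalloc"})

# Transitions of the simplified automaton: (state, event) -> next state.
TRANSITIONS = {
    ("cant_sleep", "preempt_enable"): "can_sleep",
    ("can_sleep", "preempt_disable"): "cant_sleep",
    ("can_sleep", "sleeper"): "can_sleep",
}


def classify(name):
    """Map a traced name to an automaton event, or None if it is irrelevant."""
    if name in ("preempt_disable", "preempt_enable"):
        return name
    if name in SLEEPERS:
        return "sleeper"
    return None


def find_violations(events, state="cant_sleep"):
    """Run the automaton and collect (index, name, state) for each violation."""
    violations = []
    for index, name in enumerate(events):
        event = classify(name)
        if event is None:
            continue  # not modelled, ignore
        next_state = TRANSITIONS.get((state, event))
        if next_state is None:
            violations.append((index, name, state))  # no transition: violation
        else:
            state = next_state
    return violations


events = ["preempt_enable", "__kmalloc", "preempt_disable", "rt_spin_lock", "preempt_enable"]
print(find_violations(events))  # prints [(3, 'rt_spin_lock', 'cant_sleep')]
```

The initial state is cant_sleep because that is where the diagram's entry
arrow points; a real monitor would also have to decide how to handle nested
preempt_disable calls and IRQ state.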

> Now this specific case would require lockdep for the definition of lock_acquire
> tracepoints. So I'm not sure how useful this monitor would be since lockdep is
> going to complain too. You could use contention tracepoints to catch exactly
> when sleep is going to occur and not /potential/ failures.

I’ll look into this lockdep-related part further as well.

> I only gave a quick thought on this, there may be better models/event fitting
> your usecase, but I hope you get the idea.
> 
> [1] - https://docs.kernel.org/trace/rv/monitor_sched.html#monitor-scpd

Thank you for providing a diagram and references that make it easier to
understand!

>> Here are my questions:
>>
>> 1. Does the rtapp:sleep monitor proactively detect scenarios that
>>    could lead to sleeping in atomic context, perhaps before
>>    CONFIG_DEBUG_ATOMIC_SLEEP (enabled) would trigger at the actual point of
>>    sleeping?
> 
> I guess I answered this already, but TL;DR no, you'd need a dedicated monitor.
> 
>> 2. Is there a way to enable this monitor (e.g., rtapp:sleep)
>>    immediately as soon as the RV subsystem is loaded during boot time?
>>    (How to make this "default turn on"?)
> 
> Currently not, but you could probably use any sort of startup script to turn it
> on soon enough.
> 
>> 3. When a "violation detected" message occurs at runtime, is it
>>    possible to get a call stack of the location that triggered the
>>    violation? The panic reactor provides a full stack, but I'm
>>    wondering if this is also possible with the printk reactor.
> 
> You can use ftrace and rely on error tracepoints instead of reactors. Each RV
> violation triggers a tracepoint (e.g. error_sleep) and you can print a call
> stack there. E.g.:
> 
>   echo stacktrace > /sys/kernel/tracing/events/rv/error_sleep/trigger
> 
> Here I use sleep as an example, but all monitors have their own error events
> (e.g. error_wwnr, error_snep, etc.).
> 
> Does this all look useful in your scenario?

Thank you once again for your thorough explanation. Many of the questions
I initially had have now been resolved!
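
For reference, the stacktrace trigger described above could also be set from
a script. The following Python sketch performs the same write as the echo
command; the helper name enable_stacktrace and its tracefs parameter are
hypothetical, and it assumes tracefs is mounted at /sys/kernel/tracing and
that the caller has permission to write there.

```python
from pathlib import Path


def enable_stacktrace(monitor, tracefs="/sys/kernel/tracing"):
    """Write 'stacktrace' to events/rv/error_<monitor>/trigger under tracefs.

    Equivalent to:
        echo stacktrace > <tracefs>/events/rv/error_<monitor>/trigger
    Returns the path that was written.
    """
    trigger = Path(tracefs) / "events" / "rv" / f"error_{monitor}" / "trigger"
    trigger.write_text("stacktrace\n")
    return trigger
```

Calling enable_stacktrace("sleep") targets the error_sleep event; other
monitors would use their own name, for example "wwnr" or "snep".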

> Gabriele

Best regards,
Yunseong Kim

Thread overview (14+ messages):
2025-10-27  6:54 [Question] Detecting Sleep-in-Atomic Context in PREEMPT_RT via RV (Runtime Verification) monitor rtapp:sleep Yunseong Kim
2025-10-27 12:20 ` Gabriele Monaco
2025-10-28 22:53   ` Yunseong Kim [this message]
2025-10-29  9:24     ` Gabriele Monaco
2025-11-05  9:10 ` Nam Cao
2025-12-02 11:14   ` Nam Cao
2025-12-02 11:26     ` Sebastian Andrzej Siewior
2025-12-11  4:30       ` Yunseong Kim
2025-12-11  5:42         ` Nam Cao
2025-12-11  7:58           ` Yunseong Kim
2025-12-22  7:40             ` Gabriele Monaco
2025-12-23 14:31               ` Yunseong Kim
2025-12-23 15:21                 ` Gabriele Monaco
2025-12-12  7:02       ` Dan Carpenter
