Linux kernel -stable discussions
From: Yunseong Kim <ysk@kzalloc.com>
To: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>,
	Naresh Kamboju <naresh.kamboju@linaro.org>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Austin Kim <austindh.kim@gmail.com>,
	Yeoreum Yun <yeoreum.yun@arm.com>,
	linux-rt-devel@lists.linux.dev, syzkaller@googlegroups.com,
	stable@vger.kernel.org
Subject: [BUG] arm64: Sleeping function called from invalid context in do_debug_exception on PREEMPT_RT
Date: Wed, 13 Aug 2025 14:01:47 +0900
Message-ID: <c36e8dca-d466-40ad-ad51-2b75e769ff47@kzalloc.com>

Hi,

On a PREEMPT_RT kernel based on v6.16-rc1, I hit the following splat:

| BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
| in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 20466, name: syz.0.1689
| preempt_count: 1, expected: 0
| RCU nest depth: 0, expected: 0
| Preemption disabled at:
| [<ffff800080241600>] debug_exception_enter arch/arm64/mm/fault.c:978 [inline]
| [<ffff800080241600>] do_debug_exception+0x68/0x2fc arch/arm64/mm/fault.c:997
| CPU: 0 UID: 0 PID: 20466 Comm: syz.0.1689 Not tainted 6.16.0-rc1-rt1-dirty #12 PREEMPT_RT
| Hardware name: QEMU KVM Virtual Machine, BIOS 2025.02-8 05/13/2025
| Call trace:
|  show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:466 (C)
|  __dump_stack+0x30/0x40 lib/dump_stack.c:94
|  dump_stack_lvl+0x148/0x1d8 lib/dump_stack.c:120
|  dump_stack+0x1c/0x3c lib/dump_stack.c:129
|  __might_resched+0x2e4/0x52c kernel/sched/core.c:8800
|  __rt_spin_lock kernel/locking/spinlock_rt.c:48 [inline]
|  rt_spin_lock+0xa8/0x1bc kernel/locking/spinlock_rt.c:57
|  spin_lock include/linux/spinlock_rt.h:44 [inline]
|  force_sig_info_to_task+0x6c/0x4a8 kernel/signal.c:1302
|  force_sig_fault_to_task kernel/signal.c:1699 [inline]
|  force_sig_fault+0xc4/0x110 kernel/signal.c:1704
|  arm64_force_sig_fault+0x6c/0x80 arch/arm64/kernel/traps.c:265
|  send_user_sigtrap arch/arm64/kernel/debug-monitors.c:237 [inline]
|  single_step_handler+0x1f4/0x36c arch/arm64/kernel/debug-monitors.c:257
|  do_debug_exception+0x154/0x2fc arch/arm64/mm/fault.c:1002
|  el0_dbg+0x44/0x120 arch/arm64/kernel/entry-common.c:756
|  el0t_64_sync_handler+0x3c/0x108 arch/arm64/kernel/entry-common.c:832
|  el0t_64_sync+0x1ac/0x1b0 arch/arm64/kernel/entry.S:600


It seems that commit eaff68b32861 ("arm64: entry: Add entry and exit
functions for debug exception"), which is in 6.17-rc1 and is also present
in mainline as 6fb44438a5e1, removed code that previously avoided
sleeping-in-atomic-context problems when handling debug exceptions:
Link: https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/commit/arch/arm64/mm/fault.c?id=eaff68b3286116d499a3d4e513a36d772faba587

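The splat appears to be triggered when force_sig_fault() is called from
debug exception context. Per the trace, debug_exception_enter() has
disabled preemption, and the spin_lock() taken in force_sig_info_to_task()
is a sleeping lock under PREEMPT_RT, so it must not be taken there.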

I understand that this path is primarily for debugging, but I would like
to discuss whether the patch needs some adjustment for PREEMPT_RT.

I also found that whether the issue reproduces depends on the changes
introduced by the following commit:
Link: https://github.com/torvalds/linux/commit/d8bb6718c4d

  arm64: Make debug exception handlers visible from RCU
  Make debug exceptions visible from RCU so that synchronize_rcu()
  correctly tracks the debug exception handler.

  This also introduces sanity checks for user-mode exceptions as same
  as x86's ist_enter()/ist_exit().

  The debug exception can interrupt in idle task. For example, it warns
  if we put a kprobe on a function called from idle task as below.
  The warning message showed that the rcu_read_lock() caused this
  problem. But actually, this means the RCU lost the context which
  was already in NMI/IRQ.

    /sys/kernel/debug/tracing # echo p default_idle_call >> kprobe_events
    /sys/kernel/debug/tracing # echo 1 > events/kprobes/enable
    ...

For reference:
- v5.2.10: https://elixir.bootlin.com/linux/v5.2.10/source/arch/arm64/mm/fault.c#L810
- v5.3-rc3: https://elixir.bootlin.com/linux/v5.3-rc3/source/arch/arm64/mm/fault.c#L787


Do we need to restore some form of non-sleeping signal delivery in debug
exception context for PREEMPT_RT, or is there another preferred fix?

Thanks,
Yunseong
