From: Joel Fernandes <joelagnelf@nvidia.com>
To: Yao Kai <yaokai34@huawei.com>
Cc: rcu@vger.kernel.org, liuyongqiang13@huawei.com,
paulmck@kernel.org, frederic@kernel.org,
neeraj.upadhyay@kernel.org, josh@joshtriplett.org,
boqun.feng@gmail.com, urezki@gmail.com, rostedt@goodmis.org,
mathieu.desnoyers@efficios.com, jiangshanlai@gmail.com,
qiang.zhang@linux.dev, linux-kernel@vger.kernel.org,
yujiacheng3@huawei.com
Subject: Re: [PATCH] rcu: Fix rcu_read_unlock() deadloop due to softirq
Date: Mon, 22 Dec 2025 17:26:29 -0500 [thread overview]
Message-ID: <fe07319e-01ec-4700-bc89-e548f4dc7271@nvidia.com> (raw)
In-Reply-To: <20251222080600.201867-1-yaokai34@huawei.com>
On 12/22/2025 3:06 AM, Yao Kai wrote:
> Commit 5f5fa7ea89dc ("rcu: Don't use negative nesting depth in
> __rcu_read_unlock()") removes the recursion-protection code from
> __rcu_read_unlock(). Therefore, we could invoke the deadloop in
> raise_softirq_irqoff() with ftrace enabled as follows:
>
> WARNING: CPU: 0 PID: 0 at kernel/trace/trace.c:3021 __ftrace_trace_stack.constprop.0+0x172/0x180
> Modules linked in: my_irq_work(O)
> CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G O 6.18.0-rc7-dirty #23 PREEMPT(full)
> Tainted: [O]=OOT_MODULE
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> RIP: 0010:__ftrace_trace_stack.constprop.0+0x172/0x180
> RSP: 0018:ffffc900000034a8 EFLAGS: 00010002
> RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000000000
> RDX: 0000000000000003 RSI: ffffffff826d7b87 RDI: ffffffff826e9329
> RBP: 0000000000090009 R08: 0000000000000005 R09: ffffffff82afbc4c
> R10: 0000000000000008 R11: 0000000000011d7a R12: 0000000000000000
> R13: ffff888003874100 R14: 0000000000000003 R15: ffff8880038c1054
> FS: 0000000000000000(0000) GS:ffff8880fa8ea000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000055b31fa7f540 CR3: 00000000078f4005 CR4: 0000000000770ef0
> PKRU: 55555554
> Call Trace:
> <IRQ>
> trace_buffer_unlock_commit_regs+0x6d/0x220
> trace_event_buffer_commit+0x5c/0x260
> trace_event_raw_event_softirq+0x47/0x80
> raise_softirq_irqoff+0x6e/0xa0
> rcu_read_unlock_special+0xb1/0x160
> unwind_next_frame+0x203/0x9b0
> __unwind_start+0x15d/0x1c0
> arch_stack_walk+0x62/0xf0
> stack_trace_save+0x48/0x70
> __ftrace_trace_stack.constprop.0+0x144/0x180
> trace_buffer_unlock_commit_regs+0x6d/0x220
> trace_event_buffer_commit+0x5c/0x260
> trace_event_raw_event_softirq+0x47/0x80
> raise_softirq_irqoff+0x6e/0xa0
> rcu_read_unlock_special+0xb1/0x160
> unwind_next_frame+0x203/0x9b0
> __unwind_start+0x15d/0x1c0
> arch_stack_walk+0x62/0xf0
> stack_trace_save+0x48/0x70
> __ftrace_trace_stack.constprop.0+0x144/0x180
> trace_buffer_unlock_commit_regs+0x6d/0x220
> trace_event_buffer_commit+0x5c/0x260
> trace_event_raw_event_softirq+0x47/0x80
> raise_softirq_irqoff+0x6e/0xa0
> rcu_read_unlock_special+0xb1/0x160
> unwind_next_frame+0x203/0x9b0
> __unwind_start+0x15d/0x1c0
> arch_stack_walk+0x62/0xf0
> stack_trace_save+0x48/0x70
> __ftrace_trace_stack.constprop.0+0x144/0x180
> trace_buffer_unlock_commit_regs+0x6d/0x220
> trace_event_buffer_commit+0x5c/0x260
> trace_event_raw_event_softirq+0x47/0x80
> raise_softirq_irqoff+0x6e/0xa0
> rcu_read_unlock_special+0xb1/0x160
> __is_insn_slot_addr+0x54/0x70
> kernel_text_address+0x48/0xc0
> __kernel_text_address+0xd/0x40
> unwind_get_return_address+0x1e/0x40
> arch_stack_walk+0x9c/0xf0
> stack_trace_save+0x48/0x70
> __ftrace_trace_stack.constprop.0+0x144/0x180
> trace_buffer_unlock_commit_regs+0x6d/0x220
> trace_event_buffer_commit+0x5c/0x260
> trace_event_raw_event_softirq+0x47/0x80
> __raise_softirq_irqoff+0x61/0x80
> __flush_smp_call_function_queue+0x115/0x420
> __sysvec_call_function_single+0x17/0xb0
> sysvec_call_function_single+0x8c/0xc0
> </IRQ>
>
> Commit b41642c87716 ("rcu: Fix rcu_read_unlock() deadloop due to IRQ work")
> fixed the infinite loop in rcu_read_unlock_special() for IRQ work by
> setting a flag before calling irq_work_queue_on(). We fix this issue by
> setting the same flag before calling raise_softirq_irqoff() and rename the
> flag to defer_qs_pending for more common.
>
> Fixes: 5f5fa7ea89dc ("rcu: Don't use negative nesting depth in __rcu_read_unlock()")
> Reported-by: Tengda Wu <wutengda2@huawei.com>
> Signed-off-by: Yao Kai <yaokai34@huawei.com>
> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Thank you. I will apply this patch, I am preparing a few other RCU patches from
me and others. Will send it in a series for review/testing for 7.0.
- Joel
prev parent reply other threads:[~2025-12-22 22:26 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-18 7:49 [PATCH] rcu: Fix rcu_read_unlock() deadloop due to softirq Yao Kai
2025-12-19 15:14 ` Joel Fernandes
2025-12-19 15:38 ` Joel Fernandes
2025-12-19 15:44 ` Joel Fernandes
2025-12-22 7:11 ` Yao Kai
2025-12-22 7:55 ` Yao Kai
2025-12-22 8:06 ` Yao Kai
2025-12-22 22:26 ` Joel Fernandes [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=fe07319e-01ec-4700-bc89-e548f4dc7271@nvidia.com \
--to=joelagnelf@nvidia.com \
--cc=boqun.feng@gmail.com \
--cc=frederic@kernel.org \
--cc=jiangshanlai@gmail.com \
--cc=josh@joshtriplett.org \
--cc=linux-kernel@vger.kernel.org \
--cc=liuyongqiang13@huawei.com \
--cc=mathieu.desnoyers@efficios.com \
--cc=neeraj.upadhyay@kernel.org \
--cc=paulmck@kernel.org \
--cc=qiang.zhang@linux.dev \
--cc=rcu@vger.kernel.org \
--cc=rostedt@goodmis.org \
--cc=urezki@gmail.com \
--cc=yaokai34@huawei.com \
--cc=yujiacheng3@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.