From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Wupeng Ma <mawupeng1@huawei.com>,
	mathieu.desnoyers@efficios.com,
	"Masami Hiramatsu (Google)" <mhiramat@kernel.org>,
	"Steven Rostedt (Google)" <rostedt@goodmis.org>,
	Sasha Levin <sashal@kernel.org>,
	linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 6.18-5.10] ring-buffer: Avoid softlockup in ring_buffer_resize() during memory free
Date: Mon, 12 Jan 2026 09:58:17 -0500
Message-ID: <20260112145840.724774-16-sashal@kernel.org>
In-Reply-To: <20260112145840.724774-1-sashal@kernel.org>
From: Wupeng Ma <mawupeng1@huawei.com>
[ Upstream commit 6435ffd6c7fcba330dfa91c58dc30aed2df3d0bf ]
When a user resizes all trace ring buffers through the 'buffer_size_kb'
file, ring_buffer_resize() allocates buffer pages for each cpu in a
loop.
If the kernel preemption model is PREEMPT_NONE, there are many cpus,
and there are many buffer pages to be freed, the task may not give up
the cpu for a long time and eventually cause a softlockup.
To avoid this, call cond_resched() after each cpu buffer free, as commit
f6bd2c92488c ("ring-buffer: Avoid softlockup in ring_buffer_resize()")
does.
The detailed call trace is as follows:
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 24-....: (14837 ticks this GP) idle=521c/1/0x4000000000000000 softirq=230597/230597 fqs=5329
rcu: (t=15004 jiffies g=26003221 q=211022 ncpus=96)
CPU: 24 UID: 0 PID: 11253 Comm: bash Kdump: loaded Tainted: G EL 6.18.2+ #278 NONE
pc : arch_local_irq_restore+0x8/0x20
arch_local_irq_restore+0x8/0x20 (P)
free_frozen_page_commit+0x28c/0x3b0
__free_frozen_pages+0x1c0/0x678
___free_pages+0xc0/0xe0
free_pages+0x3c/0x50
ring_buffer_resize.part.0+0x6a8/0x880
ring_buffer_resize+0x3c/0x58
__tracing_resize_ring_buffer.part.0+0x34/0xd8
tracing_resize_ring_buffer+0x8c/0xd0
tracing_entries_write+0x74/0xd8
vfs_write+0xcc/0x288
ksys_write+0x74/0x118
__arm64_sys_write+0x24/0x38
Cc: <mathieu.desnoyers@efficios.com>
Link: https://patch.msgid.link/20251228065008.2396573-1-mawupeng1@huawei.com
Signed-off-by: Wupeng Ma <mawupeng1@huawei.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
## Analysis of Commit: ring-buffer: Avoid softlockup in ring_buffer_resize() during memory free
### 1. COMMIT MESSAGE ANALYSIS
The commit message clearly describes:
- **Problem**: When resizing trace ring buffers on systems with many
CPUs and PREEMPT_NONE, the kernel can hold the CPU for too long while
freeing buffer pages, causing a softlockup (RCU stall)
- **Evidence**: Includes a complete stack trace showing the issue on a
96-CPU system with kernel 6.18.2+
- **Solution**: Add `cond_resched()` after each buffer page free,
matching what was done in commit f6bd2c92488c for a different code
path
- **Validation**: Acked by Masami Hiramatsu and signed off by Steven
Rostedt (the ring buffer subsystem maintainer)
Keywords present: "softlockup", "self-detected stall", "rcu_sched" - all
indicate a real, user-visible bug.
### 2. CODE CHANGE ANALYSIS
The change is minimal and surgical:
```c
 list_for_each_entry_safe(bpage, tmp, &cpu_buffer->new_pages, list) {
 	list_del_init(&bpage->list);
 	free_buffer_page(bpage);
+
+	cond_resched();
 }
```
This is in the `out_err:` error handling path of `ring_buffer_resize()`.
The loop iterates over all buffer pages to free them on error cleanup.
On systems with many CPUs and many pages, this loop can run for a very
long time without yielding.
**Technical mechanism**:
- `cond_resched()` checks whether a reschedule is pending for the
current task and, if so, calls into the scheduler
- On PREEMPT_NONE kernels, kernel code that does not block yields the
CPU only at explicit points like this one
- This is a standard, well-established kernel pattern for long-running
loops
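
As a rough userspace illustration of the mechanism above (not kernel code), the sketch below frees a linked list and offers a yield point after every node. The names `need_resched_flag`, `cond_resched_sim()`, `alloc_pages_sim()` and `free_pages_sim()` are hypothetical stand-ins invented for this example, and `sched_yield()` only approximates what the scheduler does when `cond_resched()` finds a reschedule pending.

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's "reschedule pending" state. */
bool need_resched_flag;
/* Counts how often the simulated yield point actually yielded. */
unsigned long yield_count;

/* Userspace analogue of cond_resched(): yield only if one is pending. */
int cond_resched_sim(void)
{
	if (!need_resched_flag)
		return 0;		/* nothing pending: effectively a no-op */
	need_resched_flag = false;
	yield_count++;
	sched_yield();			/* give other runnable tasks a turn */
	return 1;
}

/* Hypothetical stand-in for one buffer page on a per-CPU list. */
struct page_node {
	struct page_node *next;
};

/* Build a singly linked list of up to n nodes (shorter if malloc fails). */
struct page_node *alloc_pages_sim(size_t n)
{
	struct page_node *head = NULL;

	while (n--) {
		struct page_node *p = malloc(sizeof(*p));

		if (!p)
			break;
		p->next = head;
		head = p;
	}
	return head;
}

/* Free every node, offering a yield point after each one. */
size_t free_pages_sim(struct page_node *head)
{
	size_t freed = 0;

	while (head) {
		struct page_node *next = head->next;

		free(head);
		freed++;
		cond_resched_sim();	/* same placement as in the patch */
		head = next;
	}
	return freed;
}
```

With the flag clear the loop never yields, which mirrors the no-op case. Setting the flag once before the loop causes a single yield at the first node, after which the remaining frees run uninterrupted.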
### 3. CLASSIFICATION
- **Bug type**: Softlockup fix - prevents RCU stalls and potential
watchdog timeouts
- **Not a feature**: Does not add new functionality, just prevents a
hang
- **Not an exception category**: Standard bug fix, not device
ID/quirk/DT
### 4. SCOPE AND RISK ASSESSMENT
| Metric | Assessment |
|--------|------------|
| Lines changed | 2 (1 blank + 1 cond_resched) |
| Files touched | 1 |
| Complexity | Trivial |
| Risk | Very Low |
**Risk is minimal because**:
- `cond_resched()` is safe to call from any sleepable (non-atomic)
context, such as this one
- It's a no-op when no rescheduling is needed
- The same pattern already exists in this function (per the referenced
commit)
- Only affects the error path during cleanup
### 5. USER IMPACT
- **Affected users**: Systems with many CPUs, PREEMPT_NONE
configuration, using tracing
- **Severity**: High - softlockups can trigger watchdog resets and
system hangs
- **Trigger**: User-initiated action (writing to buffer_size_kb file)
- **Real-world**: The stack trace shows this was hit on a 96-CPU arm64
system
This is particularly relevant for enterprise servers and embedded
systems which commonly use PREEMPT_NONE for performance.
### 6. STABILITY INDICATORS
- **Reviewed by maintainer**: Steven Rostedt (ring buffer maintainer)
signed off
- **Acked by**: Masami Hiramatsu (Google) - experienced kernel tracing
developer
- **Follows established pattern**: References commit f6bd2c92488c which
applied the same fix elsewhere
### 7. DEPENDENCY CHECK
- The `ring_buffer_resize()` function and its error path have existed
for years
- `cond_resched()` is a fundamental kernel API available in all stable
trees
- No dependencies on new code or recent changes
- The fix is self-contained and independent
### CONCLUSION
This commit is an **ideal stable backport candidate**:
1. **Obviously correct**: Adding `cond_resched()` to prevent softlockups
is a textbook kernel pattern
2. **Fixes a real bug**: Softlockups are serious user-visible issues
that can hang systems
3. **Small and contained**: 2-line change, single file, trivial
complexity
4. **No new features**: Pure bug fix with no behavioral changes
5. **No API changes**: No userspace-visible modifications
6. **Very low risk**: `cond_resched()` is designed to be safe; worst
case is slightly increased latency
7. **Maintainer approved**: Signed off by the subsystem maintainer
8. **Affects all stable trees**: The vulnerable code path exists in all
supported stable kernels
The fix follows the exact pattern already applied in the same function
by a previous commit, demonstrating it's a proven solution. The
risk-to-benefit ratio is excellent: near-zero risk for fixing a
potentially system-hanging bug.
**YES**
kernel/trace/ring_buffer.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index afcd3747264d2..3ba08fc1b7d05 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3121,6 +3121,8 @@ int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
 					list) {
 			list_del_init(&bpage->list);
 			free_buffer_page(bpage);
+
+			cond_resched();
 		}
 	}
 out_err_unlock:
--
2.51.0