From: Steven Rostedt <rostedt@kernel.org>
To: linux-kernel@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Andrew Morton <akpm@linux-foundation.org>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Clark Williams <clrkwllms@kernel.org>,
Frederic Weisbecker <frederic@kernel.org>,
Petr Tesarik <ptesarik@suse.com>
Subject: [for-next][PATCH 08/15] ring-buffer: Use a housekeeping CPU to wake up waiters
Date: Tue, 27 Jan 2026 10:06:03 -0500 [thread overview]
Message-ID: <20260127150612.623913531@kernel.org> (raw)
In-Reply-To: 20260127150555.840066272@kernel.org
From: Petr Tesarik <ptesarik@suse.com>
Avoid running the wakeup irq_work on an isolated CPU. Since the wakeup can
run on any CPU, let's pick a housekeeping CPU to do the job.
This change reduces additional noise when tracing isolated CPUs. For
example, the following ipi_send_cpu stack trace was captured with
nohz_full=2 on the isolated CPU:
<idle>-0 [002] d.h4. 1255.379293: ipi_send_cpu: cpu=2 callsite=irq_work_queue+0x2d/0x50 callback=rb_wake_up_waiters+0x0/0x80
<idle>-0 [002] d.h4. 1255.379329: <stack trace>
=> trace_event_raw_event_ipi_send_cpu
=> __irq_work_queue_local
=> irq_work_queue
=> ring_buffer_unlock_commit
=> trace_buffer_unlock_commit_regs
=> trace_event_buffer_commit
=> trace_event_raw_event_x86_irq_vector
=> __sysvec_apic_timer_interrupt
=> sysvec_apic_timer_interrupt
=> asm_sysvec_apic_timer_interrupt
=> pv_native_safe_halt
=> default_idle
=> default_idle_call
=> do_idle
=> cpu_startup_entry
=> start_secondary
=> common_startup_64
The IRQ work interrupt alone adds considerable noise, but the impact can
get even worse with PREEMPT_RT, because the IRQ work interrupt is then
handled by a separate kernel thread. This requires a task switch and makes
tracing useless for analyzing latency on an isolated CPU.
After applying the patch, the trace is similar, but ipi_send_cpu always
targets a non-isolated CPU.
Unfortunately, irq_work_queue_on() is not NMI-safe. When running in NMI
context, fall back to queuing the irq work on the local CPU.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Clark Williams <clrkwllms@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Link: https://patch.msgid.link/20260108132132.2473515-1-ptesarik@suse.com
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
kernel/trace/ring_buffer.c | 24 +++++++++++++++++++++---
1 file changed, 21 insertions(+), 3 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 630221b00838..d33103408955 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -4,6 +4,7 @@
*
* Copyright (C) 2008 Steven Rostedt <srostedt@redhat.com>
*/
+#include <linux/sched/isolation.h>
#include <linux/trace_recursion.h>
#include <linux/trace_events.h>
#include <linux/ring_buffer.h>
@@ -4013,19 +4014,36 @@ static void rb_commit(struct ring_buffer_per_cpu *cpu_buffer)
rb_end_commit(cpu_buffer);
}
+static bool
+rb_irq_work_queue(struct rb_irq_work *irq_work)
+{
+ int cpu;
+
+ /* irq_work_queue_on() is not NMI-safe */
+ if (unlikely(in_nmi()))
+ return irq_work_queue(&irq_work->work);
+
+ /*
+ * If CPU isolation is not active, cpu is always the current
+ * CPU, and the following is equivallent to irq_work_queue().
+ */
+ cpu = housekeeping_any_cpu(HK_TYPE_KERNEL_NOISE);
+ return irq_work_queue_on(&irq_work->work, cpu);
+}
+
static __always_inline void
rb_wakeups(struct trace_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer)
{
if (buffer->irq_work.waiters_pending) {
buffer->irq_work.waiters_pending = false;
/* irq_work_queue() supplies it's own memory barriers */
- irq_work_queue(&buffer->irq_work.work);
+ rb_irq_work_queue(&buffer->irq_work);
}
if (cpu_buffer->irq_work.waiters_pending) {
cpu_buffer->irq_work.waiters_pending = false;
/* irq_work_queue() supplies it's own memory barriers */
- irq_work_queue(&cpu_buffer->irq_work.work);
+ rb_irq_work_queue(&cpu_buffer->irq_work);
}
if (cpu_buffer->last_pages_touch == local_read(&cpu_buffer->pages_touched))
@@ -4045,7 +4063,7 @@ rb_wakeups(struct trace_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer)
cpu_buffer->irq_work.wakeup_full = true;
cpu_buffer->irq_work.full_waiters_pending = false;
/* irq_work_queue() supplies it's own memory barriers */
- irq_work_queue(&cpu_buffer->irq_work.work);
+ rb_irq_work_queue(&cpu_buffer->irq_work);
}
#ifdef CONFIG_RING_BUFFER_RECORD_RECURSION
--
2.51.0
next prev parent reply other threads:[~2026-01-27 15:06 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-27 15:05 [for-next][PATCH 00/15] tracing: Updates for 6.20 Steven Rostedt
2026-01-27 15:05 ` [for-next][PATCH 01/15] tracing: Properly process error handling in event_hist_trigger_parse() Steven Rostedt
2026-01-27 15:05 ` [for-next][PATCH 02/15] tracing: Remove redundant call to event_trigger_reset_filter() " Steven Rostedt
2026-01-27 15:05 ` [for-next][PATCH 03/15] tracing: Add bitmask-list option for human-readable bitmask display Steven Rostedt
2026-01-27 15:05 ` [for-next][PATCH 04/15] tracing: Replace use of system_wq with system_dfl_wq Steven Rostedt
2026-01-27 15:06 ` [for-next][PATCH 05/15] tracing: Add show_event_filters to expose active event filters Steven Rostedt
2026-01-27 15:06 ` [for-next][PATCH 06/15] tracing: Add show_event_triggers to expose active event triggers Steven Rostedt
2026-01-27 15:06 ` [for-next][PATCH 07/15] tracing: Check the return value of tracing_update_buffers() Steven Rostedt
2026-01-27 15:06 ` Steven Rostedt [this message]
2026-01-27 15:06 ` [for-next][PATCH 09/15] tracing: Have show_event_trigger/filter format a bit more in columns Steven Rostedt
2026-01-27 15:06 ` [for-next][PATCH 10/15] ftrace: Introduce and use ENTRIES_PER_PAGE_GROUP macro Steven Rostedt
2026-01-27 15:06 ` [for-next][PATCH 11/15] tracing: Disable trace_printk buffer on warning too Steven Rostedt
2026-01-27 15:06 ` [for-next][PATCH 12/15] tracing: Have hist_debug show what function a field uses Steven Rostedt
2026-01-27 15:06 ` [for-next][PATCH 13/15] tracing: Remove notrace from trace_event_raw_event_synth() Steven Rostedt
2026-01-27 15:06 ` [for-next][PATCH 14/15] tracing: Up the hist stacktrace size from 16 to 31 Steven Rostedt
2026-01-27 15:06 ` [for-next][PATCH 15/15] tracing: Remove duplicate ENABLE_EVENT_STR and DISABLE_EVENT_STR macros Steven Rostedt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260127150612.623913531@kernel.org \
--to=rostedt@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=bigeasy@linutronix.de \
--cc=clrkwllms@kernel.org \
--cc=frederic@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=mathieu.desnoyers@efficios.com \
--cc=mhiramat@kernel.org \
--cc=ptesarik@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.