public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Steven Rostedt <rostedt@kernel.org>
To: linux-kernel@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Clark Williams <clrkwllms@kernel.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	Petr Tesarik <ptesarik@suse.com>
Subject: [for-next][PATCH 08/15] ring-buffer: Use a housekeeping CPU to wake up waiters
Date: Tue, 27 Jan 2026 10:06:03 -0500	[thread overview]
Message-ID: <20260127150612.623913531@kernel.org> (raw)
In-Reply-To: 20260127150555.840066272@kernel.org

From: Petr Tesarik <ptesarik@suse.com>

Avoid running the wakeup irq_work on an isolated CPU. Since the wakeup can
run on any CPU, let's pick a housekeeping CPU to do the job.

This change reduces additional noise when tracing isolated CPUs. For
example, the following ipi_send_cpu stack trace was captured with
nohz_full=2 on the isolated CPU:

          <idle>-0       [002] d.h4.  1255.379293: ipi_send_cpu: cpu=2 callsite=irq_work_queue+0x2d/0x50 callback=rb_wake_up_waiters+0x0/0x80
          <idle>-0       [002] d.h4.  1255.379329: <stack trace>
 => trace_event_raw_event_ipi_send_cpu
 => __irq_work_queue_local
 => irq_work_queue
 => ring_buffer_unlock_commit
 => trace_buffer_unlock_commit_regs
 => trace_event_buffer_commit
 => trace_event_raw_event_x86_irq_vector
 => __sysvec_apic_timer_interrupt
 => sysvec_apic_timer_interrupt
 => asm_sysvec_apic_timer_interrupt
 => pv_native_safe_halt
 => default_idle
 => default_idle_call
 => do_idle
 => cpu_startup_entry
 => start_secondary
 => common_startup_64

The IRQ work interrupt alone adds considerable noise, but the impact can
get even worse with PREEMPT_RT, because the IRQ work interrupt is then
handled by a separate kernel thread. This requires a task switch and makes
tracing useless for analyzing latency on an isolated CPU.

After applying the patch, the trace is similar, but ipi_send_cpu always
targets a non-isolated CPU.

Unfortunately, irq_work_queue_on() is not NMI-safe. When running in NMI
context, fall back to queuing the irq work on the local CPU.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Clark Williams <clrkwllms@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Link: https://patch.msgid.link/20260108132132.2473515-1-ptesarik@suse.com
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/ring_buffer.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 630221b00838..d33103408955 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -4,6 +4,7 @@
  *
  * Copyright (C) 2008 Steven Rostedt <srostedt@redhat.com>
  */
+#include <linux/sched/isolation.h>
 #include <linux/trace_recursion.h>
 #include <linux/trace_events.h>
 #include <linux/ring_buffer.h>
@@ -4013,19 +4014,36 @@ static void rb_commit(struct ring_buffer_per_cpu *cpu_buffer)
 	rb_end_commit(cpu_buffer);
 }
 
+static bool
+rb_irq_work_queue(struct rb_irq_work *irq_work)
+{
+	int cpu;
+
+	/* irq_work_queue_on() is not NMI-safe */
+	if (unlikely(in_nmi()))
+		return irq_work_queue(&irq_work->work);
+
+	/*
+	 * If CPU isolation is not active, cpu is always the current
+	 * CPU, and the following is equivallent to irq_work_queue().
+	 */
+	cpu = housekeeping_any_cpu(HK_TYPE_KERNEL_NOISE);
+	return irq_work_queue_on(&irq_work->work, cpu);
+}
+
 static __always_inline void
 rb_wakeups(struct trace_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer)
 {
 	if (buffer->irq_work.waiters_pending) {
 		buffer->irq_work.waiters_pending = false;
 		/* irq_work_queue() supplies it's own memory barriers */
-		irq_work_queue(&buffer->irq_work.work);
+		rb_irq_work_queue(&buffer->irq_work);
 	}
 
 	if (cpu_buffer->irq_work.waiters_pending) {
 		cpu_buffer->irq_work.waiters_pending = false;
 		/* irq_work_queue() supplies it's own memory barriers */
-		irq_work_queue(&cpu_buffer->irq_work.work);
+		rb_irq_work_queue(&cpu_buffer->irq_work);
 	}
 
 	if (cpu_buffer->last_pages_touch == local_read(&cpu_buffer->pages_touched))
@@ -4045,7 +4063,7 @@ rb_wakeups(struct trace_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer)
 	cpu_buffer->irq_work.wakeup_full = true;
 	cpu_buffer->irq_work.full_waiters_pending = false;
 	/* irq_work_queue() supplies it's own memory barriers */
-	irq_work_queue(&cpu_buffer->irq_work.work);
+	rb_irq_work_queue(&cpu_buffer->irq_work);
 }
 
 #ifdef CONFIG_RING_BUFFER_RECORD_RECURSION
-- 
2.51.0



  parent reply	other threads:[~2026-01-27 15:06 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-27 15:05 [for-next][PATCH 00/15] tracing: Updates for 6.20 Steven Rostedt
2026-01-27 15:05 ` [for-next][PATCH 01/15] tracing: Properly process error handling in event_hist_trigger_parse() Steven Rostedt
2026-01-27 15:05 ` [for-next][PATCH 02/15] tracing: Remove redundant call to event_trigger_reset_filter() " Steven Rostedt
2026-01-27 15:05 ` [for-next][PATCH 03/15] tracing: Add bitmask-list option for human-readable bitmask display Steven Rostedt
2026-01-27 15:05 ` [for-next][PATCH 04/15] tracing: Replace use of system_wq with system_dfl_wq Steven Rostedt
2026-01-27 15:06 ` [for-next][PATCH 05/15] tracing: Add show_event_filters to expose active event filters Steven Rostedt
2026-01-27 15:06 ` [for-next][PATCH 06/15] tracing: Add show_event_triggers to expose active event triggers Steven Rostedt
2026-01-27 15:06 ` [for-next][PATCH 07/15] tracing: Check the return value of tracing_update_buffers() Steven Rostedt
2026-01-27 15:06 ` Steven Rostedt [this message]
2026-01-27 15:06 ` [for-next][PATCH 09/15] tracing: Have show_event_trigger/filter format a bit more in columns Steven Rostedt
2026-01-27 15:06 ` [for-next][PATCH 10/15] ftrace: Introduce and use ENTRIES_PER_PAGE_GROUP macro Steven Rostedt
2026-01-27 15:06 ` [for-next][PATCH 11/15] tracing: Disable trace_printk buffer on warning too Steven Rostedt
2026-01-27 15:06 ` [for-next][PATCH 12/15] tracing: Have hist_debug show what function a field uses Steven Rostedt
2026-01-27 15:06 ` [for-next][PATCH 13/15] tracing: Remove notrace from trace_event_raw_event_synth() Steven Rostedt
2026-01-27 15:06 ` [for-next][PATCH 14/15] tracing: Up the hist stacktrace size from 16 to 31 Steven Rostedt
2026-01-27 15:06 ` [for-next][PATCH 15/15] tracing: Remove duplicate ENABLE_EVENT_STR and DISABLE_EVENT_STR macros Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260127150612.623913531@kernel.org \
    --to=rostedt@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=bigeasy@linutronix.de \
    --cc=clrkwllms@kernel.org \
    --cc=frederic@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mhiramat@kernel.org \
    --cc=ptesarik@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox