* [PATCH v3 0/2] tracing/preemptirq: Optimize disabled tracepoint overhead
@ 2025-07-04 17:07 Wander Lairson Costa
  2025-07-04 17:07 ` [PATCH v3 1/2] trace/preemptirq: reduce overhead of irq_enable/disable tracepoints Wander Lairson Costa
  2025-07-04 17:07 ` [PATCH v3 2/2] tracing/preemptirq: Optimize preempt_disable/enable() tracepoint overhead Wander Lairson Costa
  0 siblings, 2 replies; 14+ messages in thread
From: Wander Lairson Costa @ 2025-07-04 17:07 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Masami Hiramatsu, Mathieu Desnoyers,
	Wander Lairson Costa, David Woodhouse, Boqun Feng,
	Thomas Gleixner, open list, open list:TRACING
  Cc: Arnaldo Carvalho de Melo, Clark Williams, Gabriele Monaco

This series addresses unnecessary overhead introduced by the
preempt/irq tracepoints when they are compiled into the kernel
but are not actively enabled (i.e., when tracing is disabled).

These optimizations ensure that, when tracing is inactive, the kernel
largely bypasses the operations that would otherwise impose a constant
performance penalty, making the impact of the disabled IRQ and preempt
tracepoints negligible in performance-sensitive environments.
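
The general shape of such a guard is sketched below. This is a minimal,
illustrative example, not the code from these patches; trace_irq_enable_hook()
is a hypothetical out-of-line helper. The idea is to hide the tracing work
behind the tracepoint's static-key check, so the disabled case reduces to a
patched-out branch:

/*
 * Illustrative sketch only; not taken from the patches.
 * tracepoint_enabled() expands to a static-branch test, so when the
 * irq_enable tracepoint is disabled the call site reduces to a no-op jump.
 */
#include <linux/irqflags.h>
#include <linux/tracepoint-defs.h>

DECLARE_TRACEPOINT(irq_enable);

void trace_irq_enable_hook(void);	/* hypothetical out-of-line helper */

static __always_inline void local_irq_enable_sketch(void)
{
	raw_local_irq_enable();
	if (tracepoint_enabled(irq_enable))	/* static key: ~free when off */
		trace_irq_enable_hook();
}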

---
Performance Measurements

Measurements were taken using a specialized kernel module [1] to benchmark
`local_irq_disable/enable()` and `preempt_disable/enable()` call pairs.
The kernel used for benchmarking was version 6.16.0-rc2. "Max Average"
is the average of the 1000 highest samples; it is reported instead of a
plain maximum to reduce the noise that individual outlier samples introduce.
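
For illustration, one sample of the IRQ call pair could be taken in such
a module roughly as follows (a minimal sketch, not the actual code from [1];
the clock source and function name are assumptions):

/*
 * Rough sketch of taking one sample; not the benchmark module from [1].
 * local_clock() is assumed here purely for illustration.
 */
#include <linux/irqflags.h>
#include <linux/sched/clock.h>
#include <linux/types.h>

static u64 sample_irq_pair_ns(void)
{
	u64 t0 = local_clock();

	local_irq_disable();
	local_irq_enable();

	return local_clock() - t0;	/* one sample of the disable/enable pair */
}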

Each benchmark run collected 10^7 samples in parallel from each CPU
for each call pair (used for average, max_avg, and median calculations).
The 99th percentile was measured in a separate benchmark run, focused
on a single CPU.
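
For clarity on how the "Max Average" figures in the tables below are derived
from those samples, here is a userspace sketch (function and variable names
are placeholders, not taken from the benchmark tooling):

/*
 * Userspace sketch of the "Max Average" statistic described above:
 * the mean of the 1000 largest samples.  Names are placeholders.
 */
#include <stdlib.h>

static int cmp_desc(const void *a, const void *b)
{
	unsigned long long x = *(const unsigned long long *)a;
	unsigned long long y = *(const unsigned long long *)b;

	return (x < y) - (x > y);	/* sort in descending order */
}

static double max_avg(unsigned long long *samples, size_t n)
{
	size_t top = n < 1000 ? n : 1000;
	double sum = 0.0;

	if (n == 0)
		return 0.0;

	qsort(samples, n, sizeof(*samples), cmp_desc);
	for (size_t i = 0; i < top; i++)
		sum += samples[i];

	return sum / (double)top;
}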

The results show that compiling with tracers (Kernel Build:
`-master-trace`) introduced significant overhead compared to a base
kernel without tracers (Kernel Build: `-master`). After applying these
patches (Kernel Build: `-patched-trace`), the overhead is
substantially reduced, approaching the baseline.

x86-64 Metrics

Tests were run on a system equipped with an Intel(R) Xeon(R) Silver 4310 CPU.

IRQ Metrics

Combined Metric            average  max_avg  median  percentile
Kernel Build
6.16.0-rc2-master               28     5587      29          23
6.16.0-rc2-master-trace         46     7895      48          32
6.16.0-rc2-patched-trace        30     6030      31          27

Preempt Metrics

Combined Metric            average  max_avg  median  percentile
Kernel Build
6.16.0-rc2-master               26     5748      27          20
6.16.0-rc2-master-trace         45     7526      48          26
6.16.0-rc2-patched-trace        27     5479      27          21

AArch64 Metrics

Tests were also conducted on an AArch64 platform.

IRQ Metrics

Combined Metric             average  max_avg  median  percentile
Kernel Build
aarch64-6.16.0-rc2-master        28     3298      32          64
aarch64-6.16.0-rc2-master-trace 105     5769      96         128
aarch64-6.16.0-rc2-patched-trace 29     3192      32          64

Preempt Metrics

Combined Metric             average  max_avg  median  percentile
Kernel Build
aarch64-6.16.0-rc2-master        27     3371      32          32
aarch64-6.16.0-rc2-master-trace  32     3000      32          64
aarch64-6.16.0-rc2-patched-trace 28     3132      32          64

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>

---
References:
[1] https://github.com/walac/tracer-benchmark

--
Changes:
v1: Initial version of the patch.
v2: Enabled IRQ tracing automatically when CONFIG_PROVE_LOCKING is active.
v3: Resolved a build failure on the 32-bit ARM architecture.

Wander Lairson Costa (2):
  trace/preemptirq: reduce overhead of irq_enable/disable tracepoints
  tracing/preemptirq: Optimize preempt_disable/enable() tracepoint
    overhead

 include/linux/irqflags.h        | 30 +++++++++++++++++++---------
 include/linux/preempt.h         | 35 ++++++++++++++++++++++++++++++---
 include/linux/tracepoint-defs.h |  1 -
 include/linux/tracepoint.h      |  1 +
 kernel/sched/core.c             | 12 +----------
 kernel/trace/trace_preemptirq.c | 22 +++++++++++++++++++++
 6 files changed, 77 insertions(+), 24 deletions(-)

-- 
2.50.0



End of thread (newest message: 2025-08-25 11:56 UTC).

Thread overview: 14+ messages
2025-07-04 17:07 [PATCH v3 0/2] tracing/preemptirq: Optimize disabled tracepoint overhead Wander Lairson Costa
2025-07-04 17:07 ` [PATCH v3 1/2] trace/preemptirq: reduce overhead of irq_enable/disable tracepoints Wander Lairson Costa
2025-07-06  4:29   ` kernel test robot
2025-07-04 17:07 ` [PATCH v3 2/2] tracing/preemptirq: Optimize preempt_disable/enable() tracepoint overhead Wander Lairson Costa
2025-07-07 10:29   ` kernel test robot
2025-07-07 11:20   ` Peter Zijlstra
2025-07-08 12:54     ` Wander Lairson Costa
2025-07-08 18:54       ` Peter Zijlstra
2025-08-01 13:30         ` Wander Lairson Costa
2025-08-25 11:56           ` Wander Lairson Costa
2025-07-07 11:26   ` Peter Zijlstra
2025-07-08 13:09     ` Wander Lairson Costa
2025-07-08 18:46       ` Peter Zijlstra
2025-08-01 13:05         ` Wander Lairson Costa
