From: Wander Lairson Costa <wander@redhat.com>
To: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org (open list:SCHEDULER),
	linux-trace-kernel@vger.kernel.org (open list:TRACING)
Cc: acme@kernel.org, williams@redhat.com, gmonaco@redhat.com,
	Wander Lairson Costa <wander@redhat.com>
Subject: [PATCH v3 1/4] tracing/preemptirq: Optimize preempt_disable/enable() tracepoint overhead
Date: Wed, 11 Mar 2026 09:50:15 -0300	[thread overview]
Message-ID: <20260311125021.197638-2-wander@redhat.com> (raw)
In-Reply-To: <20260311125021.197638-1-wander@redhat.com>

When CONFIG_TRACE_PREEMPT_TOGGLE is enabled, preempt_count_add() and
preempt_count_sub() become external function calls (defined in
kernel/sched/core.c) rather than inlined operations. These functions
also perform preempt_count() checks and call trace_preempt_on/off()
unconditionally, even when no tracing consumer is active.

Reduce this overhead by splitting the #if logic in preempt.h into
three cases. When CONFIG_DEBUG_PREEMPT or CONFIG_PREEMPT_TRACER is
set, keep external function calls because DEBUG_PREEMPT needs runtime
validation checks, and PREEMPT_TRACER needs the preemptoff latency
tracer hooks (tracer_preempt_on/off, called via trace_preempt_on/off).
When CONFIG_TRACE_PREEMPT_TOGGLE alone is set, provide new inline
versions of preempt_count_add/sub() that check the tracepoint static
key via the __preempt_trace_enabled() macro before calling into the
tracing path. The macro evaluates to true when the preempt_enable or
preempt_disable tracepoint has subscribers AND the preempt count
equals val (indicating the first preempt disable or last preempt
enable), preserving the original preempt_latency_start/stop semantics.
When none of the above are set, use pure inline macros with no tracing
overhead.

The preempt_count_dec_and_test() macro is refactored out of the
three-way #if into a separate block shared by the first two cases,
since both need it to call the (potentially inline)
preempt_count_sub() before checking should_resched().

The inline path calls thin __trace_preempt_on/off() wrappers (added
in trace_preemptirq.c) that invoke trace_preempt_on/off(), keeping
the full tracepoint machinery out of the inline code.

The #include <linux/tracepoint-defs.h> is placed inside the
CONFIG_TRACE_PREEMPT_TOGGLE block rather than at the top of the file
to avoid a circular include dependency on architectures where
asm/irqflags.h includes linux/preempt.h (e.g. m68k):

  preempt.h -> tracepoint-defs.h -> static_key.h -> jump_label.h ->
  atomic.h -> irqflags.h -> asm/irqflags.h -> preempt.h (guarded)

If the include were at the top, this chain would be traversed before
hardirq_count() is defined (at line 110), causing a build failure on
m68k. Placing it inside the #elif block ensures it is only pulled in
when CONFIG_TRACE_PREEMPT_TOGGLE is enabled and avoids the cycle for
configurations that do not select it.

In core.c, narrow the compilation guard for the external
preempt_count_add/sub() from CONFIG_DEBUG_PREEMPT ||
CONFIG_TRACE_PREEMPT_TOGGLE to CONFIG_DEBUG_PREEMPT ||
CONFIG_PREEMPT_TRACER, since CONFIG_TRACE_PREEMPT_TOGGLE is now
handled inline.

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/preempt.h         | 49 +++++++++++++++++++++++++++++++--
 kernel/sched/core.c             |  2 +-
 kernel/trace/trace_preemptirq.c | 19 +++++++++++++
 3 files changed, 66 insertions(+), 4 deletions(-)

diff --git a/include/linux/preempt.h b/include/linux/preempt.h
index d964f965c8ffc..f59a92f930d81 100644
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -189,17 +189,60 @@ static __always_inline unsigned char interrupt_context_level(void)
  */
 #define in_atomic_preempt_off() (preempt_count() != PREEMPT_DISABLE_OFFSET)
 
-#if defined(CONFIG_DEBUG_PREEMPT) || defined(CONFIG_TRACE_PREEMPT_TOGGLE)
+#if defined(CONFIG_DEBUG_PREEMPT) || defined(CONFIG_PREEMPT_TRACER)
 extern void preempt_count_add(int val);
 extern void preempt_count_sub(int val);
-#define preempt_count_dec_and_test() \
-	({ preempt_count_sub(1); should_resched(0); })
+#elif defined(CONFIG_TRACE_PREEMPT_TOGGLE)
+/*
+ * Avoid the circular dependency on architectures where asm/irqflags.h
+ * includes linux/preempt.h (e.g. m68k):
+ *
+ * preempt.h <--------------------+
+ *  tracepoint-defs.h             |
+ *   static_key.h                 |
+ *    jump_label.h                |
+ *     atomic.h                   |
+ *      irqflags.h                |
+ *       asm/irqflags.h           |
+ *        preempt.h --------------+
+ */
+#include <linux/tracepoint-defs.h>
+
+extern void __trace_preempt_on(void);
+extern void __trace_preempt_off(void);
+
+DECLARE_TRACEPOINT(preempt_enable);
+DECLARE_TRACEPOINT(preempt_disable);
+
+#define __preempt_trace_enabled(type, val) \
+	(tracepoint_enabled(preempt_##type) && preempt_count() == (val))
+
+static __always_inline void preempt_count_add(int val)
+{
+	__preempt_count_add(val);
+
+	if (__preempt_trace_enabled(disable, val))
+		__trace_preempt_off();
+}
+
+static __always_inline void preempt_count_sub(int val)
+{
+	if (__preempt_trace_enabled(enable, val))
+		__trace_preempt_on();
+
+	__preempt_count_sub(val);
+}
 #else
 #define preempt_count_add(val)	__preempt_count_add(val)
 #define preempt_count_sub(val)	__preempt_count_sub(val)
 #define preempt_count_dec_and_test() __preempt_count_dec_and_test()
 #endif
 
+#if defined(CONFIG_DEBUG_PREEMPT) || defined(CONFIG_TRACE_PREEMPT_TOGGLE)
+#define preempt_count_dec_and_test() \
+	({ preempt_count_sub(1); should_resched(0); })
+#endif
+
 #define __preempt_count_inc() __preempt_count_add(1)
 #define __preempt_count_dec() __preempt_count_sub(1)
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b7f77c165a6e0..125e5d71d1bd3 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5733,7 +5733,7 @@ static inline void sched_tick_stop(int cpu) { }
 #endif /* !CONFIG_NO_HZ_FULL */
 
 #if defined(CONFIG_PREEMPTION) && (defined(CONFIG_DEBUG_PREEMPT) || \
-				defined(CONFIG_TRACE_PREEMPT_TOGGLE))
+				   defined(CONFIG_PREEMPT_TRACER))
 /*
  * If the value passed in is equal to the current preempt count
  * then we just disabled preemption. Start timing the latency.
diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
index 0c42b15c38004..9f098fcb28012 100644
--- a/kernel/trace/trace_preemptirq.c
+++ b/kernel/trace/trace_preemptirq.c
@@ -115,6 +115,25 @@ NOKPROBE_SYMBOL(trace_hardirqs_off);
 
 #ifdef CONFIG_TRACE_PREEMPT_TOGGLE
 
+#if !defined(CONFIG_DEBUG_PREEMPT) && !defined(CONFIG_PREEMPT_TRACER)
+EXPORT_TRACEPOINT_SYMBOL(preempt_disable);
+EXPORT_TRACEPOINT_SYMBOL(preempt_enable);
+
+void __trace_preempt_on(void)
+{
+	trace_preempt_on(CALLER_ADDR0, get_lock_parent_ip());
+}
+EXPORT_SYMBOL(__trace_preempt_on);
+NOKPROBE_SYMBOL(__trace_preempt_on);
+
+void __trace_preempt_off(void)
+{
+	trace_preempt_off(CALLER_ADDR0, get_lock_parent_ip());
+}
+EXPORT_SYMBOL(__trace_preempt_off);
+NOKPROBE_SYMBOL(__trace_preempt_off);
+#endif /* !CONFIG_DEBUG_PREEMPT && !CONFIG_PREEMPT_TRACER */
+
 void trace_preempt_on(unsigned long a0, unsigned long a1)
 {
 	trace(preempt_enable, TP_ARGS(a0, a1));
-- 
2.53.0



Thread overview: 20+ messages
2026-03-11 12:50 [PATCH v3 0/4] tracing/preemptirq: Optimize disabled tracepoint overhead Wander Lairson Costa
2026-03-11 12:50 ` Wander Lairson Costa [this message]
2026-03-11 19:35   ` [PATCH v3 1/4] tracing/preemptirq: Optimize preempt_disable/enable() " Peter Zijlstra
2026-03-12 17:19     ` Wander Lairson Costa
2026-03-13  9:04       ` Peter Zijlstra
2026-03-13 15:36         ` Wander Lairson Costa
2026-04-28 18:51           ` Steven Rostedt
2026-04-28 19:05             ` Wander Lairson Costa
2026-03-11 12:50 ` [PATCH v3 2/4] trace/preemptirq: make TRACE_PREEMPT_TOGGLE user-selectable Wander Lairson Costa
2026-03-11 12:50 ` [PATCH v3 3/4] trace/preemptirq: add TRACE_IRQFLAGS_TOGGLE Wander Lairson Costa
2026-03-11 12:50 ` [PATCH v3 4/4] trace/preemptirq: Implement trace_irqflags hooks Wander Lairson Costa
2026-03-11 19:43   ` Peter Zijlstra
2026-03-11 19:48     ` Steven Rostedt
2026-03-11 19:53       ` Peter Zijlstra
2026-03-11 20:07         ` Steven Rostedt
2026-03-11 20:46           ` Peter Zijlstra
2026-03-11 23:16             ` Steven Rostedt
2026-03-12 17:09     ` Wander Lairson Costa
2026-04-28 18:54     ` Steven Rostedt
2026-04-28 19:04       ` Wander Lairson Costa
