Linux Trace Kernel


* Re: [PATCH 1/7] uprobes/x86: Move optimized uprobe from nop5 to nop10
From: Jiri Olsa @ 2026-05-15 12:31 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: Oleg Nesterov, Peter Zijlstra, Ingo Molnar, Masami Hiramatsu,
	Andrii Nakryiko, bpf, linux-trace-kernel
In-Reply-To: <2qkbqj7c2bi7li4crheoarasvokrtxbb7ikofdv5zvsvgww5lx@bjd73tm2prfj>

On Thu, May 14, 2026 at 06:54:37PM +0200, Jakub Sitnicki wrote:
> On Thu, May 14, 2026 at 03:53:36PM +0200, Jiri Olsa wrote:
> > Andrii reported an issue with optimized uprobes [1] that can clobber
> > redzone area with call instruction storing return address on stack
> > where user code may keep temporary data without adjusting rsp.
> > 
> > Fixing this by moving the optimized uprobes on top of 10-bytes nop
> > instruction, so we can squeeze another instruction to escape the
> > redzone area before doing the call, like:
> > 
> >   lea -0x80(%rsp), %rsp
> >   call tramp
> > 
> > Note the lea instruction is used to adjust the rsp register without
> > changing the flags.
> > 
> > The optimized uprobe performance stays the same:
> > 
> >         uprobe-nop     :    3.129 ± 0.013M/s
> >         uprobe-push    :    3.045 ± 0.006M/s
> >         uprobe-ret     :    1.095 ± 0.004M/s
> >   -->   uprobe-nop10   :    7.170 ± 0.020M/s
> >         uretprobe-nop  :    2.143 ± 0.021M/s
> >         uretprobe-push :    2.090 ± 0.000M/s
> >         uretprobe-ret  :    0.942 ± 0.000M/s
> >   -->   uretprobe-nop10:    3.381 ± 0.003M/s
> >         usdt-nop       :    3.245 ± 0.004M/s
> >   -->   usdt-nop10     :    7.256 ± 0.023M/s
> > 
> > [1] https://lore.kernel.org/bpf/20260509003146.976844-1-andrii@kernel.org/
> > Reported-by: Andrii Nakryiko <andrii@kernel.org>
> > Closes: https://lore.kernel.org/bpf/20260509003146.976844-1-andrii@kernel.org/
> > Fixes: ba2bfc97b462 ("uprobes/x86: Add support to optimize uprobes")
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  arch/x86/kernel/uprobes.c | 121 +++++++++++++++++++++++++++-----------
> >  1 file changed, 86 insertions(+), 35 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
> > index ebb1baf1eb1d..f7c4101a4039 100644
> > --- a/arch/x86/kernel/uprobes.c
> > +++ b/arch/x86/kernel/uprobes.c
> > @@ -636,9 +636,21 @@ struct uprobe_trampoline {
> >  	unsigned long		vaddr;
> >  };
> >  
> > +#define LEA_INSN_SIZE		5
> > +#define OPT_INSN_SIZE		(LEA_INSN_SIZE + CALL_INSN_SIZE)
> > +#define OPT_JMP8_OFFSET		(OPT_INSN_SIZE - JMP8_INSN_SIZE)
> > +#define REDZONE_SIZE		0x80
> > +
> > +static const u8 lea_rsp[] = { 0x48, 0x8d, 0x64, 0x24, 0x80 };
> > +
> > +static bool is_lea_insn(const uprobe_opcode_t *insn)
> > +{
> > +	return !memcmp(insn, lea_rsp, LEA_INSN_SIZE);
> > +}
> > +
> 
> Just a thought. See if below maybe reads better when plugged in.
> is_call_insn can then be removed, I think.
> 
> static bool is_call_past_redzone_insns(const uprobe_opcode_t *insn)
> {
> 	static const u8 lea_rsp_call[] = {
> 		0x48, 0x8d, 0x64, 0x24, REDZONE_SIZE, /* lea -0x80(%rsp), %rsp */
> 		CALL_INSN_OPCODE
> 	};
> 
> 	return !memcmp(insn, lea_rsp_call, ARRAY_SIZE(lea_rsp_call));
> }

yep, might be easier to unify that, thanks

jirka
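
A minimal userspace sketch of the unified check suggested above, for reference only. It assumes uprobe_opcode_t is a plain byte (uint8_t here) and that CALL_INSN_OPCODE is 0xe8 (call rel32); the function name is taken from the suggestion, while the standalone build around it is an illustration, not the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define REDZONE_SIZE		0x80	/* 128-byte red zone below %rsp */
#define CALL_INSN_OPCODE	0xe8	/* assumed: call rel32 opcode */

/* lea -0x80(%rsp), %rsp followed by the opcode byte of the call */
static const uint8_t lea_rsp_call[] = {
	0x48, 0x8d, 0x64, 0x24, REDZONE_SIZE,
	CALL_INSN_OPCODE,
};

/* True when insn starts with the 5-byte lea and the call opcode. */
static bool is_call_past_redzone_insns(const uint8_t *insn)
{
	return !memcmp(insn, lea_rsp_call, sizeof(lea_rsp_call));
}
```

The lea is 5 bytes and the call is 5 bytes, which is why the pair fits exactly over a 10-byte nop; only the first 6 bytes are compared because the remaining 4 are the call's relative offset.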


* Re: [PATCH 2/7] libbpf: Change has_nop_combo to work on top of nop10
From: Jiri Olsa @ 2026-05-15 12:32 UTC (permalink / raw)
  To: bot+bpf-ci
  Cc: oleg, peterz, mingo, mhiramat, andrii, bpf, linux-trace-kernel,
	ast, daniel, martin.lau, eddyz87, yonghong.song, clm,
	ihor.solodrai
In-Reply-To: <430564f1eb4e1fb284b40510811502671e4aa8ef59afadddd546e4286402f760@mail.kernel.org>

On Thu, May 14, 2026 at 02:55:25PM +0000, bot+bpf-ci@kernel.org wrote:
> >     libbpf: Change has_nop_combo to work on top of nop10
> >
> >     We now expect nop combo with 10 bytes nop instead of 5 bytes nop,
> >     fixing has_nop_combo to reflect that.
> >
> >     Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> 
> This looks like a bug fix for kernel/libbpf integration. Should this
> include:
> 
>   Fixes: 9734c3ac2f23 ("uprobes/x86: Move optimized uprobe from nop5 to nop10")

I think it should be this one:

  41a5c7df4466 libbpf: Add support to detect nop,nop5 instructions combo for usdt probe

jirka


* Re: [RFC PATCH v2 08/10] rv/tlob: add tlob hybrid automaton monitor
From: Gabriele Monaco @ 2026-05-15 13:08 UTC (permalink / raw)
  To: wen.yang, Steven Rostedt; +Cc: linux-trace-kernel, linux-kernel
In-Reply-To: <fe5ed6a9a0a911e6ec74dc06c453786a2c4fb6d1.1778522945.git.wen.yang@linux.dev>

On Tue, 2026-05-12 at 02:24 +0800, wen.yang@linux.dev wrote:
> From: Wen Yang <wen.yang@linux.dev>
> 
> diff --git a/Documentation/trace/rv/monitor_tlob.rst b/Documentation/trace/rv/monitor_tlob.rst
> new file mode 100644
> index 000000000000..91b592630b3f
> --- /dev/null
> +++ b/Documentation/trace/rv/monitor_tlob.rst
> +Usage
> +-----
> +
> +tracefs interface (uprobe-based external monitoring)
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The ``monitor`` tracefs file instruments an unmodified binary via uprobes.
> +The format follows the ftrace ``uprobe_events`` convention (``PATH:OFFSET``
> +for the probe location, ``key=value`` for configuration parameters)::
> +
> +  p PATH:OFFSET_START OFFSET_STOP threshold=US
> +
> +The uprobe at ``OFFSET_START`` fires ``tlob_start_task()``; the uprobe at
> +``OFFSET_STOP`` fires ``tlob_stop_task()``.  Both offsets are ELF file
> +offsets of entry points in ``PATH``.  ``PATH`` may contain ``:``; the last
> +``:`` in the ``PATH:OFFSET_START`` token is the separator.
> +
> +To remove a binding, use ``-PATH:OFFSET_START``::
> +
> +  echo 1 > /sys/kernel/tracing/rv/monitors/tlob/enable
> +
> +  echo "p /usr/bin/myapp:0x12a0 0x12f0 threshold=5000" \
> +      > /sys/kernel/tracing/rv/monitors/tlob/monitor
> +
> +  # Remove a binding
> +  echo "-/usr/bin/myapp:0x12a0" > /sys/kernel/tracing/rv/monitors/tlob/monitor
> +
> +  # List registered bindings
> +  cat /sys/kernel/tracing/rv/monitors/tlob/monitor
> +
> +  # Read violations from the trace buffer
> +  cat /sys/kernel/tracing/trace
> +
> +ioctl self-instrumentation (/dev/rv)
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I'm not particularly fond of ioctls: they aren't that flexible, and used
this way I don't really see the added value.

In short, you're adding this so a program can instrument itself using
ioctls instead of uprobes. Can't the same thing be achieved with uprobes
alone, e.g. by registering a function address or the current instruction
pointer?

If you really cannot do it with uprobes alone, wouldn't a sysfs/tracefs file
achieve a similar purpose without much of the boilerplate code?
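
As a rough sketch of the tracefs route: a program that already knows its own offsets only needs to produce the binding line quoted earlier (``p PATH:OFFSET_START OFFSET_STOP threshold=US``) and write it to the ``monitor`` file. The helper name tlob_format_binding is hypothetical; only the line format comes from the proposed documentation, and how the program obtains its offsets is left out.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format one tlob binding line into buf.
 * Returns the line length, or -1 if buf is too small.
 */
static int tlob_format_binding(char *buf, size_t len, const char *path,
			       unsigned long start, unsigned long stop,
			       unsigned long threshold_us)
{
	int n = snprintf(buf, len, "p %s:0x%lx 0x%lx threshold=%lu",
			 path, start, stop, threshold_us);

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

The resulting string would then be written to /sys/kernel/tracing/rv/monitors/tlob/monitor with a plain write(), which is the boilerplate saving compared to a dedicated character device.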

> +
> +``/dev/rv`` is a shared RV character device.  Before using any monitor-specific
> +ioctl, the fd must be bound to a monitor via ``RV_IOCTL_BIND_MONITOR``.  Each
> +open fd has independent per-fd monitoring state::
> +
> +  int fd = open("/dev/rv", O_RDWR);
> +
> +  /* Bind this fd to the tlob monitor. */
> +  struct rv_bind_args bind = { .monitor_name = "tlob" };
> +  ioctl(fd, RV_IOCTL_BIND_MONITOR, &bind);
> +
> +  struct tlob_start_args args = {
> +      .threshold_us = 50000,   /* 50 ms in microseconds */
> +  };
> +  ioctl(fd, TLOB_IOCTL_TRACE_START, &args);
> +
> +  /* ... code path under observation ... */
> +
> +  int ret = ioctl(fd, TLOB_IOCTL_TRACE_STOP, NULL);
> +  /* ret == 0:          within budget  */
> +  /* ret == -EOVERFLOW: budget exceeded */
> +
> +  close(fd);
> +
> +``TRACE_STOP`` returns ``-EOVERFLOW`` whenever the budget was exceeded.
> +The HA timer calls ``da_monitor_reset()`` (storage remains); the
> +synchronous ``ha_cancel_timer_sync()`` in ``tlob_stop_task()`` ensures the
> +callback has completed before checking ``da_monitoring()``.
> +
> +Violation events
> +~~~~~~~~~~~~~~~~

Since you are not documenting the detail_env_tlob tracepoint, is it
really required?
It deviates from the original RV purpose (run a model and spot
violations) by adding further accounting. I'm fine with that if there is a
documented need.
In that case I would at the very least document its usage (though I'd
really like to be rid of it and let the curious user implement the
accounting themselves).
> +
> +Budget violations are always reported via the ``error_env_tlob`` RV
> +tracepoint (HA clock-invariant violation), regardless of which interface
> +triggered them::
> +
> +  cat /sys/kernel/tracing/trace
> +
> +To capture violations in a file::
> +
> +  trace-cmd record -e error_env_tlob &
> +  # ... run workload ...
> +  trace-cmd report
> +

This is standard tracepoints usage, there's nothing about tlob we should
document here. If you feel the existing RV documentation should expand
this subject, feel free to contribute there.

> +tracefs files
> +-------------
> +
> +The following files are created under
> +``/sys/kernel/tracing/rv/monitors/tlob/``:
> +
> +``enable`` (rw)
> +  Write ``1`` to enable the monitor; write ``0`` to disable it.
> +
> +``desc`` (ro)
> +  Human-readable description of the monitor.

Same here, standard RV.

> +
> +``monitor`` (rw)
> +  Write ``p PATH:OFFSET_START OFFSET_STOP threshold=US``
> +  to bind two entry uprobes.  Write ``-PATH:OFFSET_START`` to remove a
> +  binding.  Read to list registered bindings in the same format.

And this duplicates what was mentioned above about uprobes, doesn't it?
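
The one rule in that format worth spelling out is that the last ``:`` separates PATH from OFFSET_START, so paths containing ``:`` still work. A minimal userspace sketch of that split follows; split_path_offset is a hypothetical name and this is not the parser from the patch, only an illustration of the documented rule.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split "PATH:OFFSET" at the last ':' (modifies token in place).
 * Returns 0 on success, -EINVAL on an empty path or a bad offset.
 */
static int split_path_offset(char *token, char **path, unsigned long *offset)
{
	char *colon = strrchr(token, ':');
	char *end;

	/* Need a non-empty path and an offset that starts with a digit. */
	if (!colon || colon == token || !isdigit((unsigned char)colon[1]))
		return -EINVAL;

	errno = 0;
	*offset = strtoul(colon + 1, &end, 0);	/* base 0: decimal or 0x hex */
	if (errno || *end != '\0')
		return -EINVAL;

	*colon = '\0';
	*path = token;
	return 0;
}
```

Requiring a leading digit rejects inputs such as ``-1`` that strtoul() would otherwise accept and wrap around.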

> +
> +Kernel API
> +----------
> +
> +.. kernel-doc:: kernel/trace/rv/monitors/tlob/tlob.c
> +   :functions: tlob_start_task tlob_stop_task
> +
> +``tlob_start_task(task, threshold_us)``
> +  Begin monitoring *task* with a total latency budget of *threshold_us*
> +  microseconds.  Allocates per-task state, sets initial DA state to
> +  ``running``, resets ``clk_elapsed``, and arms the HA budget timer.
> +  Returns 0, -ENODEV (monitor disabled), -ERANGE (zero threshold),
> +  -EALREADY (already monitoring), -ENOSPC (at capacity), or -ENOMEM.
> +
> +``tlob_stop_task(task)``
> +  Stop monitoring *task*.  Synchronously cancels the HA timer via
> +  ``ha_cancel_timer_sync()``, checks ``da_monitoring()`` to determine outcome.
> +  Returns 0 (clean stop, within budget), -EOVERFLOW (budget was exceeded),
> +  -ESRCH (not monitored), or -EAGAIN (concurrent stop racing).
> +

Is kernel code going to use this API? RV monitors are meant to be
enabled by userspace. What's the use-case here?

> +Design notes
> +------------
> +
> +State transitions are driven by two tracepoints:
> +
> +- ``sched_switch``: ``prev_state == 0`` (``TASK_RUNNING``, preempted,
> +  stays on runqueue) → running→waiting; ``prev_state != 0`` (voluntarily
> +  blocked, leaves runqueue) → running→sleeping; ``next`` pointer →
> +  waiting→running.
> +- ``sched_wakeup``: task moves back onto the runqueue → sleeping→waiting.
> +
> +No ``waiting → sleeping`` edge exists because a task can only block
> +itself while executing on CPU.  ``try_to_wake_up()`` is also a no-op
> +when ``__state == TASK_RUNNING``, so ``sched_wakeup`` never fires while
> +the task is in ``waiting`` state.

That's probably a bit too detailed for this page.
If you really want this information somewhere, couldn't it stay in the
code?
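
For illustration, the transitions described in the quoted design notes fit in a few lines of code, which is one reason a comment next to the model may be enough. This is a standalone sketch, not the generated automaton from the patch: the enum names are hypothetical, and only the states (running, waiting, sleeping) and the edges come from the text above.

```c
/* States named in the design notes; TLOB_NO_EDGE marks a missing edge. */
enum tlob_state { TLOB_RUNNING, TLOB_WAITING, TLOB_SLEEPING, TLOB_NO_EDGE };

/* Hypothetical event names for the sched_switch/sched_wakeup cases. */
enum tlob_event {
	EV_PREEMPT,	/* sched_switch, prev_state == 0 */
	EV_BLOCK,	/* sched_switch, prev_state != 0 */
	EV_WAKEUP,	/* sched_wakeup */
	EV_SWITCH_IN,	/* sched_switch, task is next */
};

static enum tlob_state tlob_next(enum tlob_state s, enum tlob_event e)
{
	switch (s) {
	case TLOB_RUNNING:
		if (e == EV_PREEMPT)
			return TLOB_WAITING;
		if (e == EV_BLOCK)
			return TLOB_SLEEPING;
		break;
	case TLOB_WAITING:
		if (e == EV_SWITCH_IN)
			return TLOB_RUNNING;
		break;
	case TLOB_SLEEPING:
		if (e == EV_WAKEUP)
			return TLOB_WAITING;
		break;
	default:
		break;
	}
	return TLOB_NO_EDGE;	/* e.g. no waiting -> sleeping edge */
}
```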

Thanks,
Gabriele



* Re: [RFC PATCH v2 09/10] rv/tlob: add KUnit tests for the tlob monitor
From: Gabriele Monaco @ 2026-05-15 13:13 UTC (permalink / raw)
  To: wen.yang; +Cc: linux-trace-kernel, linux-kernel, Steven Rostedt
In-Reply-To: <a12d14297b33b9b8d425bc1b813a8aecbd54bcc6.1778522945.git.wen.yang@linux.dev>

On Tue, 2026-05-12 at 02:24 +0800, wen.yang@linux.dev wrote:
> From: Wen Yang <wen.yang@linux.dev>
> 
> Add five KUnit test suites gated behind CONFIG_TLOB_KUNIT_TEST
> (depends on RV_MON_TLOB && KUNIT; default KUNIT_ALL_TESTS) with a
> .kunitconfig fragment for the kunit.py runner.
> 
> tlob_task_api tests the start/stop API, error returns (-EEXIST,
> -ESRCH, -EOVERFLOW, -ENOSPC, -ERANGE).
> tlob_sched_integration covers context-switch accounting and monitoring
> a kthread.  tlob_parse_uprobe exercises the uprobe line parser.
> tlob_trace_output checks sched_switch and error_env_tlob field layout.
> tlob_violation_react verifies error_env_tlob fires once on budget
> expiry and zero times when the budget is not exceeded.
> 
> Suggested-by: Gabriele Monaco <gmonaco@redhat.com> 
> Signed-off-by: Wen Yang <wen.yang@linux.dev>

That's quite extensive, but what caught my eye are the tests registering
tracepoint handlers. Once you go there you're no longer doing unit testing;
what's the advantage of testing the entire monitor here over doing that in
selftests?

Thanks,
Gabriele

> ---
>  kernel/trace/rv/monitors/tlob/.kunitconfig |   5 +
>  kernel/trace/rv/monitors/tlob/tlob.c       |  26 +
>  kernel/trace/rv/monitors/tlob/tlob_kunit.c | 881 +++++++++++++++++++++
>  3 files changed, 912 insertions(+)
>  create mode 100644 kernel/trace/rv/monitors/tlob/.kunitconfig
>  create mode 100644 kernel/trace/rv/monitors/tlob/tlob_kunit.c
> 
> diff --git a/kernel/trace/rv/monitors/tlob/.kunitconfig b/kernel/trace/rv/monitors/tlob/.kunitconfig
> new file mode 100644
> index 000000000000..977c58601ab7
> --- /dev/null
> +++ b/kernel/trace/rv/monitors/tlob/.kunitconfig
> @@ -0,0 +1,5 @@
> +CONFIG_FTRACE=y
> +CONFIG_KUNIT=y
> +CONFIG_RV=y
> +CONFIG_RV_MON_TLOB=y
> +CONFIG_TLOB_KUNIT_TEST=y
> diff --git a/kernel/trace/rv/monitors/tlob/tlob.c b/kernel/trace/rv/monitors/tlob/tlob.c
> index 475e972ae9aa..90e7035a0b55 100644
> --- a/kernel/trace/rv/monitors/tlob/tlob.c
> +++ b/kernel/trace/rv/monitors/tlob/tlob.c
> @@ -1024,6 +1024,7 @@ EXPORT_SYMBOL_IF_KUNIT(tlob_num_monitored_read);
>  /* Tracepoint probes for KUnit; rv_trace.h is only included here. */
>  static struct tlob_captured_event     tlob_kunit_last_event;
>  static struct tlob_captured_error_env tlob_kunit_last_error_env;
> +static struct tlob_captured_detail    tlob_kunit_last_detail;
>  static atomic_t tlob_kunit_event_cnt    = ATOMIC_INIT(0);
>  static atomic_t tlob_kunit_error_env_cnt = ATOMIC_INIT(0);
>  
> @@ -1054,6 +1055,17 @@ static void tlob_kunit_error_env_probe(void *data, int id, char *state,
>  	atomic_inc(&tlob_kunit_error_env_cnt);
>  }
>  
> +static void tlob_kunit_detail_probe(void *data, int pid, u64 threshold_us,
> +				    u64 running_ns, u64 waiting_ns,
> +				    u64 sleeping_ns)
> +{
> +	tlob_kunit_last_detail.pid		= pid;
> +	tlob_kunit_last_detail.threshold_us	= threshold_us;
> +	tlob_kunit_last_detail.running_ns	= running_ns;
> +	tlob_kunit_last_detail.waiting_ns	= waiting_ns;
> +	tlob_kunit_last_detail.sleeping_ns	= sleeping_ns;
> +}
> +
>  int tlob_register_kunit_probes(void)
>  {
>  	int ret;
> @@ -1069,6 +1081,12 @@ int tlob_register_kunit_probes(void)
>  		unregister_trace_event_tlob(tlob_kunit_event_probe, NULL);
>  		return ret;
>  	}
> +	ret = register_trace_detail_env_tlob(tlob_kunit_detail_probe, NULL);
> +	if (ret) {
> +		unregister_trace_error_env_tlob(tlob_kunit_error_env_probe, NULL);
> +		unregister_trace_event_tlob(tlob_kunit_event_probe, NULL);
> +		return ret;
> +	}
>  	return 0;
>  }
>  EXPORT_SYMBOL_IF_KUNIT(tlob_register_kunit_probes);
> @@ -1077,6 +1095,7 @@ void tlob_unregister_kunit_probes(void)
>  {
>  	unregister_trace_event_tlob(tlob_kunit_event_probe, NULL);
>  	unregister_trace_error_env_tlob(tlob_kunit_error_env_probe, NULL);
> +	unregister_trace_detail_env_tlob(tlob_kunit_detail_probe, NULL);
>  	tracepoint_synchronize_unregister();
>  }
>  EXPORT_SYMBOL_IF_KUNIT(tlob_unregister_kunit_probes);
> @@ -1105,6 +1124,7 @@ void tlob_error_env_count_reset(void)
>  }
>  EXPORT_SYMBOL_IF_KUNIT(tlob_error_env_count_reset);
>  
> +
>  const struct tlob_captured_event *tlob_last_event_read(void)
>  {
>  	return &tlob_kunit_last_event;
> @@ -1117,6 +1137,12 @@ const struct tlob_captured_error_env *tlob_last_error_env_read(void)
>  }
>  EXPORT_SYMBOL_IF_KUNIT(tlob_last_error_env_read);
>  
> +const struct tlob_captured_detail *tlob_last_detail_read(void)
> +{
> +	return &tlob_kunit_last_detail;
> +}
> +EXPORT_SYMBOL_IF_KUNIT(tlob_last_detail_read);
> +
>  #endif /* CONFIG_KUNIT */
>  
>  VISIBLE_IF_KUNIT int tlob_enable_hooks(void)
> diff --git a/kernel/trace/rv/monitors/tlob/tlob_kunit.c b/kernel/trace/rv/monitors/tlob/tlob_kunit.c
> new file mode 100644
> index 000000000000..ed2e7c7abaf8
> --- /dev/null
> +++ b/kernel/trace/rv/monitors/tlob/tlob_kunit.c
> @@ -0,0 +1,881 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * KUnit tests for the tlob RV monitor.
> + *
> + * tlob_task_api:          start/stop lifecycle, error paths, violations.
> + * tlob_sched_integration: per-state accounting across real context switches.
> + * tlob_uprobe_format:     uprobe binding format; add/remove acceptance and rejection.
> + * tlob_trace_output:      trace event format for event_tlob, error_env_tlob.
> + * tlob_violation_react:   error count per budget expiry; per-state breakdown.
> + *
> + * tlob_add_uprobe() duplicate-(binary, offset_start) constraint is not covered
> + * here: kern_path() requires a real filesystem; see selftests instead.
> + */
> +#include <kunit/test.h>
> +#include <linux/atomic.h>
> +#include <linux/completion.h>
> +#include <linux/delay.h>
> +#include <linux/kthread.h>
> +#include <linux/ktime.h>
> +#include <linux/mutex.h>
> +#include <linux/sched.h>
> +#include <linux/sched/rt.h>
> +#include <linux/sched/task.h>
> +
> +#include "tlob.h"
> +
> +MODULE_IMPORT_NS("EXPORTED_FOR_KUNIT_TESTING");
> +
> +/*
> + * Kthread cleanup guard: registers a kunit action that stops a kthread on
> + * test exit, even when a KUNIT_ASSERT fires before normal teardown code runs.
> + *
> + * Caller must call get_task_struct() before registering the guard.
> + * Set guard->task = NULL before normal-path teardown to prevent double-stop.
> + * Pass the completion to unblock on early exit, or NULL if not needed.
> + */
> +struct tlob_kthread_guard {
> +	struct task_struct	*task;
> +	struct completion	*unblock;
> +};
> +
> +static void kthread_guard_fn(void *arg)
> +{
> +	struct tlob_kthread_guard *g = arg;
> +
> +	if (!g->task)
> +		return;
> +	if (g->unblock)
> +		complete(g->unblock);
> +	kthread_stop(g->task);
> +	put_task_struct(g->task);
> +}
> +
> +static struct tlob_kthread_guard *
> +tlob_guard_kthread(struct kunit *test, struct task_struct *task,
> +		   struct completion *unblock)
> +{
> +	struct tlob_kthread_guard *g;
> +
> +	g = kunit_kzalloc(test, sizeof(*g), GFP_KERNEL);
> +	if (!g)
> +		return NULL;
> +	g->task = task;
> +	g->unblock = unblock;
> +	if (kunit_add_action_or_reset(test, kthread_guard_fn, g))
> +		return NULL;
> +	return g;
> +}
> +
> +/* Suite 1: task API - lifecycle, error paths, violations. */
> +
> +/* Basic start/stop cycle */
> +static void tlob_start_stop_ok(struct kunit *test)
> +{
> +	int ret;
> +
> +	ret = tlob_start_task(current, 10000000ULL);
> +	KUNIT_ASSERT_EQ(test, ret, 0);
> +	KUNIT_EXPECT_EQ(test, tlob_stop_task(current), 0);
> +	KUNIT_EXPECT_EQ(test, tlob_num_monitored_read(), 0);
> +}
> +
> +/* Double start must return -EALREADY; double stop must return -ESRCH. */
> +static void tlob_double_start(struct kunit *test)
> +{
> +	KUNIT_ASSERT_EQ(test, tlob_start_task(current, 10000000ULL), 0);
> +	KUNIT_EXPECT_EQ(test, tlob_start_task(current, 10000000ULL), -EALREADY);
> +	KUNIT_EXPECT_EQ(test, tlob_stop_task(current), 0);
> +	KUNIT_EXPECT_EQ(test, tlob_stop_task(current), -ESRCH);
> +	KUNIT_EXPECT_EQ(test, tlob_num_monitored_read(), 0);
> +}
> +
> +/* Stop without start must return -ESRCH. */
> +static void tlob_stop_without_start(struct kunit *test)
> +{
> +	tlob_stop_task(current);
> +	KUNIT_EXPECT_EQ(test, tlob_stop_task(current), -ESRCH);
> +	KUNIT_EXPECT_EQ(test, tlob_num_monitored_read(), 0);
> +}
> +
> +/* threshold_us == 0 is invalid and must return -ERANGE. */
> +static void tlob_zero_threshold(struct kunit *test)
> +{
> +	KUNIT_EXPECT_EQ(test, tlob_start_task(current, 0), -ERANGE);
> +}
> +
> +/* 1 ns budget: timer almost certainly fires before tlob_stop_task(). */
> +static void tlob_immediate_deadline(struct kunit *test)
> +{
> +	int ret = tlob_start_task(current, 1);
> +
> +	KUNIT_ASSERT_EQ(test, ret, 0);
> +	udelay(100);
> +	/* timer fired -> -EOVERFLOW; if we won the race, 0 is also valid */
> +	ret = tlob_stop_task(current);
> +	KUNIT_EXPECT_TRUE(test, ret == 0 || ret == -EOVERFLOW);
> +	KUNIT_EXPECT_EQ(test, tlob_num_monitored_read(), 0);
> +}
> +
> +/*
> + * kthreads provide distinct task_structs; fill to TLOB_MAX_MONITORED,
> + * then verify -ENOSPC.
> + */
> +struct tlob_waiter_ctx {
> +	struct completion start;
> +	struct completion done;
> +};
> +
> +static int tlob_waiter_fn(void *arg)
> +{
> +	struct tlob_waiter_ctx *ctx = arg;
> +
> +	wait_for_completion(&ctx->start);
> +	complete(&ctx->done);
> +	return 0;
> +}
> +
> +static void tlob_enospc(struct kunit *test)
> +{
> +	struct tlob_waiter_ctx *ctxs;
> +	struct task_struct **threads;
> +	int i, ret;
> +
> +	ctxs = kunit_kcalloc(test, TLOB_MAX_MONITORED,
> +			     sizeof(*ctxs), GFP_KERNEL);
> +	KUNIT_ASSERT_NOT_NULL(test, ctxs);
> +
> +	threads = kunit_kcalloc(test, TLOB_MAX_MONITORED,
> +				sizeof(*threads), GFP_KERNEL);
> +	KUNIT_ASSERT_NOT_NULL(test, threads);
> +
> +	KUNIT_ASSERT_EQ(test, tlob_num_monitored_read(), 0);
> +
> +	for (i = 0; i < TLOB_MAX_MONITORED; i++) {
> +		init_completion(&ctxs[i].start);
> +		init_completion(&ctxs[i].done);
> +
> +		threads[i] = kthread_run(tlob_waiter_fn, &ctxs[i],
> +					 "tlob_waiter_%d", i);
> +		if (IS_ERR(threads[i])) {
> +			KUNIT_FAIL(test, "kthread_run failed at i=%d", i);
> +			threads[i] = NULL;
> +			goto cleanup;
> +		}
> +		get_task_struct(threads[i]);
> +
> +		ret = tlob_start_task(threads[i], 10000000ULL);
> +		if (ret != 0) {
> +			KUNIT_FAIL(test, "tlob_start_task failed at i=%d: %d",
> +				   i, ret);
> +			put_task_struct(threads[i]);
> +			complete(&ctxs[i].start);
> +			threads[i] = NULL;
> +			goto cleanup;
> +		}
> +	}
> +
> +	ret = tlob_start_task(current, 10000000ULL);
> +	KUNIT_EXPECT_EQ(test, ret, -ENOSPC);
> +
> +cleanup:
> +	/* cancel monitoring and unblock first, then wait for full exit */
> +	for (i = 0; i < TLOB_MAX_MONITORED; i++) {
> +		if (!threads[i])
> +			break;
> +		tlob_stop_task(threads[i]);
> +		complete(&ctxs[i].start);
> +	}
> +	for (i = 0; i < TLOB_MAX_MONITORED; i++) {
> +		if (!threads[i])
> +			break;
> +		kthread_stop(threads[i]);
> +		put_task_struct(threads[i]);
> +	}
> +}
> +
> +/*
> + * Holder kthread holds a mutex for 80 ms; arm a 10 ms budget, burn ~1 ms
> + * on-CPU, then block on the mutex; timer fires while sleeping -> -EOVERFLOW.
> + */
> +struct tlob_holder_ctx {
> +	struct mutex		lock;
> +	struct completion	ready;
> +	unsigned int		hold_ms;
> +};
> +
> +static int tlob_holder_fn(void *arg)
> +{
> +	struct tlob_holder_ctx *ctx = arg;
> +
> +	mutex_lock(&ctx->lock);
> +	complete(&ctx->ready);
> +	msleep(ctx->hold_ms);
> +	mutex_unlock(&ctx->lock);
> +	return 0;
> +}
> +
> +static void tlob_deadline_fires_sleeping(struct kunit *test)
> +{
> +	struct tlob_holder_ctx *ctx;
> +	struct tlob_kthread_guard *guard;
> +	struct task_struct *holder;
> +	ktime_t t0;
> +	int ret;
> +
> +	ctx = kunit_kzalloc(test, sizeof(*ctx), GFP_KERNEL);
> +	KUNIT_ASSERT_NOT_NULL(test, ctx);
> +	ctx->hold_ms = 80;
> +	mutex_init(&ctx->lock);
> +	init_completion(&ctx->ready);
> +
> +	holder = kthread_run(tlob_holder_fn, ctx, "tlob_holder_kunit");
> +	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, holder);
> +	get_task_struct(holder);
> +
> +	guard = tlob_guard_kthread(test, holder, NULL);
> +	KUNIT_ASSERT_NOT_NULL(test, guard);
> +
> +	wait_for_completion(&ctx->ready);
> +
> +	ret = tlob_start_task(current, 10000);
> +	KUNIT_ASSERT_EQ(test, ret, 0);
> +
> +	t0 = ktime_get();
> +	while (ktime_us_delta(ktime_get(), t0) < 1000)
> +		cpu_relax();
> +
> +	/* block on mutex: running->sleeping; timer fires while sleeping */
> +	mutex_lock(&ctx->lock);
> +	mutex_unlock(&ctx->lock);
> +
> +	KUNIT_EXPECT_EQ(test, tlob_stop_task(current), -EOVERFLOW);
> +
> +	guard->task = NULL;
> +	kthread_stop(holder);
> +	put_task_struct(holder);
> +}
> +
> +/*
> + * yield() triggers a preempt sched_switch (prev_state==0): running->waiting.
> + * Busy-spin 50 ms so the 2 ms budget fires regardless of scheduler timing.
> + */
> +static void tlob_deadline_fires_waiting(struct kunit *test)
> +{
> +	ktime_t t0;
> +	int ret;
> +
> +	ret = tlob_start_task(current, 2000);
> +	KUNIT_ASSERT_EQ(test, ret, 0);
> +
> +	yield();
> +
> +	t0 = ktime_get();
> +	while (ktime_us_delta(ktime_get(), t0) < 50000)
> +		cpu_relax();
> +
> +	KUNIT_EXPECT_EQ(test, tlob_stop_task(current), -EOVERFLOW);
> +}
> +
> +/* Arm a 1 ms budget and busy-spin for 50 ms; timer fires in running state. */
> +static void tlob_deadline_fires_running(struct kunit *test)
> +{
> +	ktime_t t0;
> +	int ret;
> +
> +	ret = tlob_start_task(current, 1000);
> +	KUNIT_ASSERT_EQ(test, ret, 0);
> +
> +	t0 = ktime_get();
> +	while (ktime_us_delta(ktime_get(), t0) < 50000)
> +		cpu_relax();
> +
> +	KUNIT_EXPECT_EQ(test, tlob_stop_task(current), -EOVERFLOW);
> +}
> +
> +/* Start three tasks, reinit monitor, verify all entries are gone. */
> +static int tlob_dummy_fn(void *arg)
> +{
> +	wait_for_completion((struct completion *)arg);
> +	return 0;
> +}
> +
> +static void tlob_reinit_clears_all(struct kunit *test)
> +{
> +	struct completion *done1, *done2;
> +	struct tlob_kthread_guard *guard1, *guard2;
> +	struct task_struct *t1, *t2;
> +	int ret;
> +
> +	done1 = kunit_kzalloc(test, sizeof(*done1), GFP_KERNEL);
> +	KUNIT_ASSERT_NOT_NULL(test, done1);
> +	done2 = kunit_kzalloc(test, sizeof(*done2), GFP_KERNEL);
> +	KUNIT_ASSERT_NOT_NULL(test, done2);
> +
> +	init_completion(done1);
> +	init_completion(done2);
> +
> +	t1 = kthread_run(tlob_dummy_fn, done1, "tlob_dummy1");
> +	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, t1);
> +	get_task_struct(t1);
> +	guard1 = tlob_guard_kthread(test, t1, done1);
> +	KUNIT_ASSERT_NOT_NULL(test, guard1);
> +
> +	t2 = kthread_run(tlob_dummy_fn, done2, "tlob_dummy2");
> +	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, t2);
> +	get_task_struct(t2);
> +	guard2 = tlob_guard_kthread(test, t2, done2);
> +	KUNIT_ASSERT_NOT_NULL(test, guard2);
> +
> +	KUNIT_ASSERT_EQ(test, tlob_start_task(current, 10000000ULL), 0);
> +	KUNIT_ASSERT_EQ(test, tlob_start_task(t1, 10000000ULL), 0);
> +	KUNIT_ASSERT_EQ(test, tlob_start_task(t2, 10000000ULL), 0);
> +
> +	tlob_destroy_monitor();
> +	ret = tlob_init_monitor();
> +	KUNIT_ASSERT_EQ(test, ret, 0);
> +
> +	KUNIT_EXPECT_EQ(test, tlob_stop_task(current), -ESRCH);
> +	KUNIT_EXPECT_EQ(test, tlob_stop_task(t1), -ESRCH);
> +	KUNIT_EXPECT_EQ(test, tlob_stop_task(t2), -ESRCH);
> +
> +	/* null guards before teardown to prevent double-stop */
> +	guard1->task = NULL;
> +	guard2->task = NULL;
> +	complete(done1);
> +	complete(done2);
> +	kthread_stop(t1);
> +	kthread_stop(t2);
> +	put_task_struct(t1);
> +	put_task_struct(t2);
> +}
> +
> +static int tlob_task_api_suite_init(struct kunit_suite *suite)
> +{
> +	rv_kunit_monitoring_on();
> +	return tlob_init_monitor();
> +}
> +
> +static void tlob_task_api_suite_exit(struct kunit_suite *suite)
> +{
> +	tlob_destroy_monitor();
> +	rv_kunit_monitoring_off();
> +}
> +
> +static void tlob_task_api_exit(struct kunit *test)
> +{
> +	/*
> +	 * tlob_stop_task() returns pool slots via call_rcu (da_pool_return_cb).
> +	 * Wait for all pending callbacks so each test starts with a full pool.
> +	 */
> +	rcu_barrier();
> +}
> +
> +static struct kunit_case tlob_task_api_cases[] = {
> +	KUNIT_CASE(tlob_start_stop_ok),
> +	KUNIT_CASE(tlob_double_start),
> +	KUNIT_CASE(tlob_stop_without_start),
> +	KUNIT_CASE(tlob_zero_threshold),
> +	KUNIT_CASE(tlob_immediate_deadline),
> +	KUNIT_CASE(tlob_enospc),
> +	KUNIT_CASE(tlob_deadline_fires_sleeping),
> +	KUNIT_CASE(tlob_deadline_fires_waiting),
> +	KUNIT_CASE(tlob_deadline_fires_running),
> +	KUNIT_CASE(tlob_reinit_clears_all),
> +	{}
> +};
> +
> +static struct kunit_suite tlob_task_api_suite = {
> +	.name       = "tlob_task_api",
> +	.suite_init = tlob_task_api_suite_init,
> +	.suite_exit = tlob_task_api_suite_exit,
> +	.exit       = tlob_task_api_exit,
> +	.test_cases = tlob_task_api_cases,
> +};
> +
> +/* Suite 2: sched integration - per-state ns accounting. */
> +
> +struct tlob_ping_ctx {
> +	struct completion ping;
> +	struct completion pong;
> +};
> +
> +static int tlob_ping_fn(void *arg)
> +{
> +	struct tlob_ping_ctx *ctx = arg;
> +
> +	wait_for_completion(&ctx->ping);
> +	complete(&ctx->pong);
> +	return 0;
> +}
> +
> +/* Force two context switches and verify stop returns 0 (within budget). */
> +static void tlob_sched_switch_accounting(struct kunit *test)
> +{
> +	struct tlob_ping_ctx *ctx;
> +	struct tlob_kthread_guard *guard;
> +	struct task_struct *peer;
> +	int ret;
> +
> +	ctx = kunit_kzalloc(test, sizeof(*ctx), GFP_KERNEL);
> +	KUNIT_ASSERT_NOT_NULL(test, ctx);
> +	init_completion(&ctx->ping);
> +	init_completion(&ctx->pong);
> +
> +	peer = kthread_run(tlob_ping_fn, ctx, "tlob_ping_kunit");
> +	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, peer);
> +	get_task_struct(peer);
> +
> +	guard = tlob_guard_kthread(test, peer, &ctx->ping);
> +	KUNIT_ASSERT_NOT_NULL(test, guard);
> +
> +	ret = tlob_start_task(current, 5000000ULL);
> +	KUNIT_ASSERT_EQ(test, ret, 0);
> +
> +	/* complete(ping) -> peer runs, forcing a context switch out and back */
> +	complete(&ctx->ping);
> +	wait_for_completion(&ctx->pong);
> +
> +	ret = tlob_stop_task(current);
> +	KUNIT_EXPECT_EQ(test, ret, 0);
> +
> +	guard->task = NULL;
> +	kthread_stop(peer);
> +	put_task_struct(peer);
> +}
> +
> +/* start/stop monitoring a kthread other than current */
> +static int tlob_block_fn(void *arg)
> +{
> +	struct completion *done = arg;
> +
> +	msleep(20);
> +	complete(done);
> +	return 0;
> +}
> +
> +static void tlob_monitor_other_task(struct kunit *test)
> +{
> +	struct completion *done;
> +	struct tlob_kthread_guard *guard;
> +	struct task_struct *target;
> +	int ret;
> +
> +	done = kunit_kzalloc(test, sizeof(*done), GFP_KERNEL);
> +	KUNIT_ASSERT_NOT_NULL(test, done);
> +	init_completion(done);
> +
> +	target = kthread_run(tlob_block_fn, done, "tlob_target_kunit");
> +	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, target);
> +	get_task_struct(target);
> +
> +	guard = tlob_guard_kthread(test, target, NULL);
> +	KUNIT_ASSERT_NOT_NULL(test, guard);
> +
> +	ret = tlob_start_task(target, 5000000ULL);
> +	KUNIT_ASSERT_EQ(test, ret, 0);
> +
> +	wait_for_completion(done);
> +
> +	/* 5 s budget won't fire in 20 ms; 0 or -EOVERFLOW are both valid */
> +	ret = tlob_stop_task(target);
> +	KUNIT_EXPECT_TRUE(test, ret == 0 || ret == -EOVERFLOW);
> +
> +	guard->task = NULL;
> +	kthread_stop(target);
> +	put_task_struct(target);
> +}
> +
> +static int tlob_sched_suite_init(struct kunit_suite *suite)
> +{
> +	rv_kunit_monitoring_on();
> +	return tlob_init_monitor();
> +}
> +
> +static void tlob_sched_suite_exit(struct kunit_suite *suite)
> +{
> +	tlob_destroy_monitor();
> +	rv_kunit_monitoring_off();
> +}
> +
> +static struct kunit_case tlob_sched_integration_cases[] = {
> +	KUNIT_CASE(tlob_sched_switch_accounting),
> +	KUNIT_CASE(tlob_monitor_other_task),
> +	{}
> +};
> +
> +static struct kunit_suite tlob_sched_integration_suite = {
> +	.name       = "tlob_sched_integration",
> +	.suite_init = tlob_sched_suite_init,
> +	.suite_exit = tlob_sched_suite_exit,
> +	.test_cases = tlob_sched_integration_cases,
> +};
> +
> +/* Suite 3: uprobe binding format - add/remove acceptance and rejection. */
> +
> +static const char * const tlob_format_valid[] = {
> +	"p /usr/bin/myapp:4768 4848 threshold=5000",
> +	"p /usr/bin/myapp:0x12a0 0x12f0 threshold=10000",
> +	"p /opt/my:app/bin:0x100 0x200 threshold=1000",
> +};
> +
> +static const char * const tlob_format_invalid[] = {
> +	/* add: malformed */
> +	"p /usr/bin/myapp:0x100 0x200 threshold=0",
> +	"p :0x100 0x200 threshold=5000",
> +	"p /usr/bin/myapp:0x100 threshold=5000",
> +	"p /usr/bin/myapp:-1 0x200 threshold=5000",
> +	"p /usr/bin/myapp:0x100 0x200",
> +	"p /usr/bin/myapp:0x100 0x100 threshold=5000",
> +	/* remove: malformed */
> +	"-usr/bin/myapp:0x100",
> +	"-/usr/bin/myapp",
> +	"-/:0x100",
> +	"-/usr/bin/myapp:abc",
> +};
> +
> +/*
> + * Valid add lines return -ENOENT (path does not exist in the test environment)
> + * rather than 0; a non-(-EINVAL) return confirms the format was accepted.
> + */
> +static void tlob_format_accepted(struct kunit *test)
> +{
> +	char buf[128];
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(tlob_format_valid); i++) {
> +		strscpy(buf, tlob_format_valid[i], sizeof(buf));
> +		KUNIT_EXPECT_NE(test, tlob_create_or_delete_uprobe(buf), -EINVAL);
> +	}
> +}
> +
> +static void tlob_format_rejected(struct kunit *test)
> +{
> +	char buf[128];
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(tlob_format_invalid); i++) {
> +		strscpy(buf, tlob_format_invalid[i], sizeof(buf));
> +		KUNIT_EXPECT_EQ(test, tlob_create_or_delete_uprobe(buf), -EINVAL);
> +	}
> +}
> +
> +static struct kunit_case tlob_uprobe_format_cases[] = {
> +	KUNIT_CASE(tlob_format_accepted),
> +	KUNIT_CASE(tlob_format_rejected),
> +	{}
> +};
> +
> +static struct kunit_suite tlob_uprobe_format_suite = {
> +	.name       = "tlob_uprobe_format",
> +	.test_cases = tlob_uprobe_format_cases,
> +};
> +
> +/* Suite 4: trace output - verify event_tlob and error_env_tlob field values. */
> +
> +static void tlob_trace_event_format(struct kunit *test)
> +{
> +	const struct tlob_captured_event *ev;
> +	int pid = current->pid;
> +	int ret;
> +
> +	tlob_event_count_reset();
> +	ret = tlob_start_task(current, 5000000ULL);
> +	KUNIT_ASSERT_EQ(test, ret, 0);
> +
> +	/* sleep/wakeup/switch_in: running->sleeping->waiting->running */
> +	msleep(20);
> +
> +	KUNIT_EXPECT_EQ(test, tlob_stop_task(current), 0);
> +
> +	KUNIT_EXPECT_GE(test, tlob_event_count_read(), 3);
> +
> +	ev = tlob_last_event_read();
> +	KUNIT_EXPECT_EQ(test,    ev->id,          pid);
> +	KUNIT_EXPECT_STREQ(test, ev->state,       "waiting");
> +	KUNIT_EXPECT_STREQ(test, ev->event,       "switch_in");
> +	KUNIT_EXPECT_STREQ(test, ev->next_state,  "running");
> +	KUNIT_EXPECT_TRUE(test,  ev->final_state);
> +}
> +
> +static void tlob_trace_error_env_format(struct kunit *test)
> +{
> +	const struct tlob_captured_error_env *err;
> +	ktime_t t0;
> +	int pid = current->pid;
> +	int ret;
> +
> +	tlob_error_env_count_reset();
> +	ret = tlob_start_task(current, 1000);
> +	KUNIT_ASSERT_EQ(test, ret, 0);
> +
> +	t0 = ktime_get();
> +	while (ktime_us_delta(ktime_get(), t0) < 50000)
> +		cpu_relax();
> +
> +	tlob_stop_task(current);
> +
> +	KUNIT_ASSERT_GE(test, tlob_error_env_count_read(), 1);
> +
> +	err = tlob_last_error_env_read();
> +	KUNIT_EXPECT_EQ(test,    err->id,    pid);
> +	KUNIT_EXPECT_STREQ(test, err->state, "running");
> +	KUNIT_EXPECT_STREQ(test, err->event, "budget_exceeded");
> +	KUNIT_EXPECT_TRUE(test, strncmp(err->env, "clk_elapsed=", 12) == 0);
> +}
> +
> +static int tlob_trace_suite_init(struct kunit_suite *suite)
> +{
> +	int ret;
> +
> +	rv_kunit_monitoring_on();
> +	ret = tlob_init_monitor();
> +	if (ret)
> +		goto err_mon_off;
> +	ret = tlob_register_kunit_probes();
> +	if (ret)
> +		goto err_destroy;
> +	ret = tlob_enable_hooks();
> +	if (ret)
> +		goto err_probes;
> +	return 0;
> +
> +err_probes:
> +	tlob_unregister_kunit_probes();
> +err_destroy:
> +	tlob_destroy_monitor();
> +err_mon_off:
> +	rv_kunit_monitoring_off();
> +	return ret;
> +}
> +
> +static void tlob_trace_suite_exit(struct kunit_suite *suite)
> +{
> +	tlob_disable_hooks();
> +	tlob_unregister_kunit_probes();
> +	tlob_destroy_monitor();
> +	rv_kunit_monitoring_off();
> +}
> +
> +static struct kunit_case tlob_trace_output_cases[] = {
> +	KUNIT_CASE(tlob_trace_event_format),
> +	KUNIT_CASE(tlob_trace_error_env_format),
> +	{}
> +};
> +
> +static struct kunit_suite tlob_trace_output_suite = {
> +	.name       = "tlob_trace_output",
> +	.suite_init = tlob_trace_suite_init,
> +	.suite_exit = tlob_trace_suite_exit,
> +	.test_cases = tlob_trace_output_cases,
> +};
> +
> +/*
> + * Suite 5: violation reaction - complement to Suite 4.
> + * Suite 4 checks trace field values; Suite 5 checks semantics:
> + * error count per budget expiry and per-state ns breakdown.
> + */
> +
> +/* generous budget; usleep forces state transitions; no error must fire */
> +static void tlob_no_error_within_budget(struct kunit *test)
> +{
> +	tlob_error_env_count_reset();
> +	tlob_event_count_reset();
> +
> +	KUNIT_ASSERT_EQ(test, tlob_start_task(current, 10000000ULL), 0);
> +	usleep_range(5000, 10000);
> +	KUNIT_EXPECT_EQ(test, tlob_stop_task(current), 0);
> +	KUNIT_EXPECT_EQ(test, tlob_error_env_count_read(), 0);
> +	KUNIT_EXPECT_GE(test, tlob_event_count_read(), 2);
> +}
> +
> +/* busy-spin 50 ms >> 1 ms budget; running_ns must dominate */
> +static void tlob_detail_running_dominates(struct kunit *test)
> +{
> +	const struct tlob_captured_detail *d;
> +	u64 total_ns;
> +	ktime_t t0;
> +	int ret;
> +
> +	tlob_error_env_count_reset();
> +
> +	ret = tlob_start_task(current, 1000);
> +	KUNIT_ASSERT_EQ(test, ret, 0);
> +
> +	t0 = ktime_get();
> +	while (ktime_us_delta(ktime_get(), t0) < 50000)
> +		cpu_relax();
> +
> +	tlob_stop_task(current);
> +
> +	KUNIT_EXPECT_EQ(test, tlob_error_env_count_read(), 1);
> +	d = tlob_last_detail_read();
> +	KUNIT_EXPECT_EQ(test, d->pid, current->pid);
> +	KUNIT_EXPECT_EQ(test, d->threshold_us, 1000ULL);
> +	total_ns = d->running_ns + d->waiting_ns + d->sleeping_ns;
> +	KUNIT_EXPECT_GE(test, total_ns, 1000ULL * 1000);
> +	KUNIT_EXPECT_GT(test, d->running_ns, d->sleeping_ns + d->waiting_ns);
> +}
> +
> +struct tlob_hog_ctx {
> +	int spin_ms;
> +};
> +
> +static int tlob_hog_fn(void *arg)
> +{
> +	struct tlob_hog_ctx *ctx = arg;
> +	ktime_t t0 = ktime_get();
> +
> +	while (!kthread_should_stop() &&
> +	       ktime_ms_delta(ktime_get(), t0) < ctx->spin_ms)
> +		cpu_relax();
> +	return 0;
> +}
> +
> +/*
> + * SCHED_FIFO kthread bound to the same CPU preempts the monitored task
> + * (sched_switch prev_state == 0: running->waiting) and holds the CPU for
> + * 80 ms >> 10 ms budget, guaranteeing the timer fires in waiting state.
> + */
> +static void tlob_detail_waiting_dominates(struct kunit *test)
> +{
> +	struct tlob_hog_ctx *ctx;
> +	struct task_struct *hog;
> +	struct tlob_kthread_guard *guard;
> +	const struct tlob_captured_detail *d;
> +	struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 };
> +	int ret;
> +
> +	tlob_error_env_count_reset();
> +
> +	ctx = kunit_kzalloc(test, sizeof(*ctx), GFP_KERNEL);
> +	KUNIT_ASSERT_NOT_NULL(test, ctx);
> +	ctx->spin_ms = 80;
> +
> +	hog = kthread_create(tlob_hog_fn, ctx, "tlob_s5_hog");
> +	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, hog);
> +	get_task_struct(hog);
> +
> +	kthread_bind(hog, smp_processor_id());
> +	sched_setscheduler_nocheck(hog, SCHED_FIFO, &param);
> +
> +	guard = tlob_guard_kthread(test, hog, NULL);
> +	KUNIT_ASSERT_NOT_NULL(test, guard);
> +
> +	ret = tlob_start_task(current, 10000); /* 10 ms budget */
> +	KUNIT_ASSERT_EQ(test, ret, 0);
> +
> +	wake_up_process(hog);
> +	yield(); /* sched_switch prev_state == 0: running->waiting */
> +
> +	tlob_stop_task(current);
> +
> +	KUNIT_EXPECT_EQ(test, tlob_error_env_count_read(), 1);
> +	d = tlob_last_detail_read();
> +	KUNIT_EXPECT_EQ(test, d->sleeping_ns, 0ULL);
> +	KUNIT_EXPECT_GT(test, d->waiting_ns, d->running_ns + d->sleeping_ns);
> +
> +	guard->task = NULL;
> +	kthread_stop(hog);
> +	put_task_struct(hog);
> +}
> +
> +/* block on mutex for 80 ms >> 10 ms budget; sleeping_ns must dominate */
> +static void tlob_detail_sleeping_dominates(struct kunit *test)
> +{
> +	struct tlob_holder_ctx *ctx;
> +	struct tlob_kthread_guard *guard;
> +	struct task_struct *holder;
> +	const struct tlob_captured_detail *d;
> +	int ret;
> +
> +	tlob_error_env_count_reset();
> +
> +	ctx = kunit_kzalloc(test, sizeof(*ctx), GFP_KERNEL);
> +	KUNIT_ASSERT_NOT_NULL(test, ctx);
> +	ctx->hold_ms = 80;
> +	mutex_init(&ctx->lock);
> +	init_completion(&ctx->ready);
> +
> +	holder = kthread_run(tlob_holder_fn, ctx, "tlob_s5_detail");
> +	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, holder);
> +	get_task_struct(holder);
> +
> +	guard = tlob_guard_kthread(test, holder, NULL);
> +	KUNIT_ASSERT_NOT_NULL(test, guard);
> +
> +	wait_for_completion(&ctx->ready);
> +
> +	ret = tlob_start_task(current, 10000);
> +	KUNIT_ASSERT_EQ(test, ret, 0);
> +
> +	mutex_lock(&ctx->lock);
> +	mutex_unlock(&ctx->lock);
> +
> +	tlob_stop_task(current);
> +
> +	KUNIT_EXPECT_EQ(test, tlob_error_env_count_read(), 1);
> +	d = tlob_last_detail_read();
> +	KUNIT_EXPECT_GT(test, d->sleeping_ns, d->running_ns + d->waiting_ns);
> +
> +	guard->task = NULL;
> +	kthread_stop(holder);
> +	put_task_struct(holder);
> +}
> +
> +static int tlob_violation_suite_init(struct kunit_suite *suite)
> +{
> +	int ret;
> +
> +	rv_kunit_monitoring_on();
> +	ret = tlob_init_monitor();
> +	if (ret)
> +		goto err_mon_off;
> +	ret = tlob_register_kunit_probes();
> +	if (ret)
> +		goto err_destroy;
> +	ret = tlob_enable_hooks();
> +	if (ret)
> +		goto err_probes;
> +	return 0;
> +
> +err_probes:
> +	tlob_unregister_kunit_probes();
> +err_destroy:
> +	tlob_destroy_monitor();
> +err_mon_off:
> +	rv_kunit_monitoring_off();
> +	return ret;
> +}
> +
> +static void tlob_violation_suite_exit(struct kunit_suite *suite)
> +{
> +	tlob_disable_hooks();
> +	tlob_unregister_kunit_probes();
> +	tlob_destroy_monitor();
> +	rv_kunit_monitoring_off();
> +}
> +
> +static struct kunit_case tlob_violation_react_cases[] = {
> +	KUNIT_CASE(tlob_no_error_within_budget),
> +	KUNIT_CASE(tlob_detail_running_dominates),
> +	KUNIT_CASE(tlob_detail_sleeping_dominates),
> +	KUNIT_CASE(tlob_detail_waiting_dominates),
> +	{}
> +};
> +
> +static struct kunit_suite tlob_violation_react_suite = {
> +	.name       = "tlob_violation_react",
> +	.suite_init = tlob_violation_suite_init,
> +	.suite_exit = tlob_violation_suite_exit,
> +	.test_cases = tlob_violation_react_cases,
> +};
> +
> +kunit_test_suites(&tlob_task_api_suite,
> +		  &tlob_sched_integration_suite,
> +		  &tlob_uprobe_format_suite,
> +		  &tlob_trace_output_suite,
> +		  &tlob_violation_react_suite);
> +
> +MODULE_DESCRIPTION("KUnit tests for the tlob RV monitor");
> +MODULE_LICENSE("GPL");


^ permalink raw reply

* Re: [PATCH v7 2/6] mm/memory-failure: surface unhandlable kernel pages as -ENOTRECOVERABLE
From: Breno Leitao @ 2026-05-15 13:13 UTC (permalink / raw)
  To: Lance Yang
  Cc: linmiaohe, akpm, david, ljs, vbabka, rppt, surenb, mhocko, shuah,
	nao.horiguchi, rostedt, mhiramat, mathieu.desnoyers, corbet,
	skhan, liam, linux-mm, linux-kernel, linux-doc, linux-kselftest,
	linux-trace-kernel, kernel-team
In-Reply-To: <20260515070353.87244-1-lance.yang@linux.dev>

On Fri, May 15, 2026 at 03:03:53PM +0800, Lance Yang wrote:
> 
> On Thu, May 14, 2026 at 07:37:14AM -0700, Breno Leitao wrote:
> >On Thu, May 14, 2026 at 09:28:30PM +0800, Lance Yang wrote:
> >> 
> >> On Wed, May 13, 2026 at 08:39:33AM -0700, Breno Leitao wrote:
> >> >get_any_page() collapses three different failure modes into a single
> >> >-EIO return:
> >> >
> >> >  * the put_page race in the !count_increased path;
> >> >  * the HWPoisonHandlable() rejection that bounces out of
> >> >    __get_hwpoison_page() with -EBUSY and exhausts shake_page() retries;
> >> >  * the HWPoisonHandlable() rejection that goes through the
> >> >    count_increased / put_page / shake_page retry loop.
> >> >
> >> >The first is transient (the page is racing with the allocator).  The
> >> >second can be either transient (a userspace folio briefly off LRU
> >> >during migration/compaction) or stable (slab/vmalloc/page-table/
> >> >kernel-stack pages).  The third describes a stable kernel-owned page
> >> >that the count_increased=true caller already held a reference on.
> >> >
> >> >Distinguish them on the return path: keep -EIO for both the put_page
> >> >race and the -EBUSY-after-retries branch (shake_page() cannot drag a
> >> >folio back from active migration, so we cannot prove the page is
> >> >permanently kernel-owned from there), keep -EBUSY for the allocation
> >> >race (unchanged), and return -ENOTRECOVERABLE only from the
> >> >count_increased-true HWPoisonHandlable() rejection that exhausts its
> >> >retries -- the caller's reference is structural evidence that the
> >> >page is owned by the kernel.
> >> >
> >> >Extend the unhandlable-page pr_err() to fire for either errno and
> >> >update the get_hwpoison_page() kerneldoc.
> >> >
> >> >memory_failure() still folds every negative return into
> >> >MF_MSG_GET_HWPOISON via its existing "else if (res < 0)" branch, so
> >> >this patch is a no-op for users of memory_failure() and only changes
> >> >the errno that soft_offline_page() can propagate to its callers.  A
> >> >follow-up wires the new return code through memory_failure() and
> >> >reports MF_MSG_KERNEL for the unrecoverable cases.
> >> >
> >> >Suggested-by: David Hildenbrand <david@kernel.org>
> >> >Signed-off-by: Breno Leitao <leitao@debian.org>
> >> >---
> >> > mm/memory-failure.c | 18 +++++++++++++++---
> >> > 1 file changed, 15 insertions(+), 3 deletions(-)
> >> >
> >> >diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> >> >index 49bcfbd04d213..bae883df3ccb2 100644
> >> >--- a/mm/memory-failure.c
> >> >+++ b/mm/memory-failure.c
> >> >@@ -1408,6 +1408,15 @@ static int get_any_page(struct page *p, unsigned long flags)
> >> > 				shake_page(p);
> >> > 				goto try_again;
> >> > 			}
> >> >+			/*
> >> >+			 * Return -EIO rather than -ENOTRECOVERABLE: this
> >> >+			 * branch is also reached for pages that are merely
> >> >+			 * off-LRU transiently (e.g. a folio in the middle
> >> >+			 * of migration or compaction), which shake_page()
> >> >+			 * cannot drag back.  The caller cannot prove the
> >> >+			 * page is permanently kernel-owned from here, so
> >> >+			 * keep it on the recoverable errno.
> >> >+			 */
> >> > 			ret = -EIO;
> >> > 			goto out;
> >> > 		}
> >> >@@ -1427,10 +1436,10 @@ static int get_any_page(struct page *p, unsigned long flags)
> >> > 			goto try_again;
> >> > 		}
> >> > 		put_page(p);
> >> >-		ret = -EIO;
> >> >+		ret = -ENOTRECOVERABLE;
> >> > 	}
> >> > out:
> >> >-	if (ret == -EIO)
> >> >+	if (ret == -EIO || ret == -ENOTRECOVERABLE)
> >> > 		pr_err("%#lx: unhandlable page.\n", page_to_pfn(p));
> >> > 
> >> > 	return ret;
> >> >@@ -1487,7 +1496,10 @@ static int __get_unpoison_page(struct page *page)
> >> >  *         -EIO for pages on which we can not handle memory errors,
> >> >  *         -EBUSY when get_hwpoison_page() has raced with page lifecycle
> >> >  *         operations like allocation and free,
> >> >- *         -EHWPOISON when the page is hwpoisoned and taken off from buddy.
> >> >+ *         -EHWPOISON when the page is hwpoisoned and taken off from buddy,
> >> >+ *         -ENOTRECOVERABLE for stable kernel-owned pages the handler
> >> >+ *         cannot recover (PG_reserved, slab, vmalloc, page tables,
> >> >+ *         kernel stacks, and similar non-LRU/non-buddy pages).
> >> 
> >> Did you test this patch series? I don't see how we ever get to
> >> -ENOTRECOVERABLE there ...
> >
> >Yes, I did. I am using the following test case:
> 
> Okay.
> 
> >https://github.com/leitao/linux/commit/cfebe84ddeab5ac34ed456331db980d57e7025dc
> >
> >	# RUN_DESTRUCTIVE=1 tools/testing/selftests/mm/hwpoison-panic.sh
> >	# enabling /proc/sys/vm/panic_on_unrecoverable_memory_failure
> >	# injecting hwpoison at phys 0x2a00000 (Kernel rodata)
> >	# expecting kernel panic: 'Memory failure: <pfn>: unrecoverable page'
> >	[  501.113256] Memory failure: 0x2a00: recovery action for reserved kernel page: Ignored
> >	[  501.113956] Kernel panic - not syncing: Memory failure: 0x2a00: unrecoverable page
> >
> >
> >> Even with MF_COUNT_INCREASED, the first pass does:
> >> 
> >> 	if (flags & MF_COUNT_INCREASED)
> >> 		count_increased = true;
> >> 
> >> 	[...]
> >> 
> >> 	if (PageHuge(p) || HWPoisonHandlable(p, flags)) {
> >> 		ret = 1;
> >> 	} else {
> >> 		if (pass++ < GET_PAGE_MAX_RETRY_NUM) { <-
> >> 			put_page(p);
> >> 			shake_page(p);
> >> 			count_increased = false;
> >> 			goto try_again; <-
> >> 		}
> >> 		put_page(p);
> >> 		ret = -ENOTRECOVERABLE;
> >> 	}
> >> 
> >> Then we come back with count_increased=false:
> >> 
> >> try_again:
> >> 	if (!count_increased) {
> >> 		ret = __get_hwpoison_page(p, flags); <-
> >> 		if (!ret) {
> >> 		[...]
> >> 		} else if (ret == -EBUSY) { <-
> >> 		[...]
> >> 			ret = -EIO;
> >> 			goto out; <-
> >> 		}
> >> 	}
> >> 
> >> For slab/vmalloc/page-table pages, __get_hwpoison_page() returns -EBUSY:
> >> 
> >> 	if (!HWPoisonHandlable(&folio->page, flags))
> >> 		return -EBUSY;
> >> 
> >> so they still seem to end up as -EIO ... Am I missing something?
> >
> >You are not, and thanks for catching this. I traced it again and the
> >-ENOTRECOVERABLE branch is unreachable for slab/vmalloc/page-table pages
> >exactly as you described. The __get_hwpoison_page() → -EBUSY → shake → retry
> >loop catches them first and they exit as -EIO.
> 
> Wonder if it would be simpler to just do a positive check near the top
> of get_any_page() instead. Something like:
> 
> static bool hwpoison_unrecoverable_kernel_page(struct page *page,
> 						unsigned long flags)

Ack. We probably want to call it something like HWPoisonKernelOwned() to
follow the naming convention of the existing helpers, such as
HWPoisonHandlable().
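
For reference, the control flow traced above can be modelled in plain
userspace C. This is only a toy sketch, not the kernel code: struct
model_page, model_get_hwpoison_page() and model_get_any_page() are made-up
names, shake_page() and the refcounting are left out, and the only property
modelled is whether HWPoisonHandlable() would accept the page.

```c
#include <errno.h>
#include <stdbool.h>

#define GET_PAGE_MAX_RETRY_NUM 3

/* Hypothetical stand-in for struct page; only "handlable" is modelled. */
struct model_page {
	bool handlable;
};

/* Models __get_hwpoison_page(): -EBUSY when the page is not handlable. */
static int model_get_hwpoison_page(const struct model_page *p)
{
	return p->handlable ? 1 : -EBUSY;
}

/* Models the retry structure of get_any_page() as quoted in this thread. */
static int model_get_any_page(const struct model_page *p, bool count_increased)
{
	int ret = 0, pass = 0;

try_again:
	if (!count_increased) {
		ret = model_get_hwpoison_page(p);
		if (ret == -EBUSY) {
			if (pass++ < GET_PAGE_MAX_RETRY_NUM)
				goto try_again;
			return -EIO;	/* retries exhausted */
		}
	}

	if (p->handlable)
		return 1;

	if (pass++ < GET_PAGE_MAX_RETRY_NUM) {
		/* The first retry drops the caller's reference ... */
		count_increased = false;
		goto try_again;		/* ... so we re-enter the block above */
	}
	return -ENOTRECOVERABLE;	/* never reached for unhandlable pages */
}
```

In this model an unhandlable page passed with count_increased=true comes
back as -EIO, never -ENOTRECOVERABLE, because the first retry clears
count_increased and the -EBUSY branch then consumes the remaining passes.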

By the way, I will add the selftest back to this patch series. If it turns
out not to be useful, we can simply not merge it.

Thanks for the review,
--breno

^ permalink raw reply

* Re: [RFC PATCH v2 10/10] selftests/verification: add tlob selftests
From: Gabriele Monaco @ 2026-05-15 13:23 UTC (permalink / raw)
  To: wen.yang; +Cc: linux-trace-kernel, linux-kernel, Steven Rostedt
In-Reply-To: <8148267505ef90175b6b69e1ffb3aa560ff42d35.1778522945.git.wen.yang@linux.dev>



On Tue, 2026-05-12 at 02:24 +0800, wen.yang@linux.dev wrote:
> From: Wen Yang <wen.yang@linux.dev>
> 
> Add selftest coverage for the tlob RV monitor in
> tools/testing/selftests/verification/.
> 
> Two helper binaries are built by tlob/Makefile: tlob_helper for the
> ioctl interface (/dev/rv) and tlob_uprobe_target for the uprobe tests.
> The top-level Makefile delegates to tlob/ via a generic MONITOR_SUBDIRS
> pattern so monitor-specific build details stay within each monitor's
> own subdirectory.
> 
> Eight test files cover the tracefs control interface (tracefs.tc), the
> ioctl self-instrumentation interface (ioctl.tc, 8 scenarios), and the
> uprobe external monitoring interface (uprobe_bind.tc, uprobe_violation.tc,
> uprobe_no_event.tc, uprobe_multi.tc, uprobe_detail_sleeping.tc,
> uprobe_detail_waiting.tc).
> 
> Tested on x86_64 with vng (virtme-ng):
> 
>   TAP version 13
>   1..12
>   ok 1 Test monitor enable/disable
>   ok 2 Test monitor reactor setting
>   ok 3 Check available monitors
>   ok 4 Test wwnr monitor with printk reactor
>   ok 5 Test tlob ioctl self-instrumentation (within/over-budget, error paths)
>   ok 6 Test tlob monitor tracefs interface (enable/disable and files)

This should be tested together with the other monitors (enable/disable). At
most we could expand those tests with check_requires, though that seems to be
meant for ftracetest's internals.

Let's focus on tlob-only features in this patch.

Thanks,
Gabriele

>   ok 7 uprobe binding: visible in monitor file, removable, duplicate offset rejected
>   ok 8 uprobe detail sleeping: sleeping_ns dominates when task blocks between probes
>   ok 9 uprobe detail waiting: waiting_ns dominates when task is preempted between probes
>   ok 10 Two bindings on same binary with different offsets and budgets fire independently
>   ok 11 Verify no spurious error_env_tlob events without an active uprobe binding
>   ok 12 uprobe violation: error_env_tlob and detail_env_tlob fire with correct fields
>   # Totals: pass:12 fail:0 xfail:0 xpass:0 skip:0 error:0
> 
> Suggested-by: Gabriele Monaco <gmonaco@redhat.com> 
> Signed-off-by: Wen Yang <wen.yang@linux.dev>
> ---
>  tools/testing/selftests/verification/Makefile |  21 +-
>  .../verification/test.d/tlob/ioctl.tc         |  36 +
>  .../verification/test.d/tlob/tracefs.tc       |  17 +
>  .../verification/test.d/tlob/uprobe_bind.tc   |  34 +
>  .../test.d/tlob/uprobe_detail_sleeping.tc     |  47 ++
>  .../test.d/tlob/uprobe_detail_waiting.tc      |  60 ++
>  .../verification/test.d/tlob/uprobe_multi.tc  |  60 ++
>  .../test.d/tlob/uprobe_no_event.tc            |  19 +
>  .../test.d/tlob/uprobe_violation.tc           |  60 ++
>  .../selftests/verification/tlob/Makefile      |  21 +
>  .../selftests/verification/tlob/tlob_ioctl.c  | 626 ++++++++++++++++++
>  .../selftests/verification/tlob/tlob_target.c | 138 ++++
>  12 files changed, 1138 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/verification/test.d/tlob/ioctl.tc
>  create mode 100644 tools/testing/selftests/verification/test.d/tlob/tracefs.tc
>  create mode 100644 tools/testing/selftests/verification/test.d/tlob/uprobe_bind.tc
>  create mode 100644 tools/testing/selftests/verification/test.d/tlob/uprobe_detail_sleeping.tc
>  create mode 100644 tools/testing/selftests/verification/test.d/tlob/uprobe_detail_waiting.tc
>  create mode 100644 tools/testing/selftests/verification/test.d/tlob/uprobe_multi.tc
>  create mode 100644 tools/testing/selftests/verification/test.d/tlob/uprobe_no_event.tc
>  create mode 100644 tools/testing/selftests/verification/test.d/tlob/uprobe_violation.tc
>  create mode 100644 tools/testing/selftests/verification/tlob/Makefile
>  create mode 100644 tools/testing/selftests/verification/tlob/tlob_ioctl.c
>  create mode 100644 tools/testing/selftests/verification/tlob/tlob_target.c
> 
> diff --git a/tools/testing/selftests/verification/Makefile b/tools/testing/selftests/verification/Makefile
> index aa8790c22a71..b5584fd3762d 100644
> --- a/tools/testing/selftests/verification/Makefile
> +++ b/tools/testing/selftests/verification/Makefile
> @@ -1,8 +1,27 @@
>  # SPDX-License-Identifier: GPL-2.0
> -all:
>  
>  TEST_PROGS := verificationtest-ktap
>  TEST_FILES := test.d settings
>  EXTRA_CLEAN := $(OUTPUT)/logs/*
>  
> +# Subdirectories that provide helper binaries for the test runner.
> +# Each entry must contain a Makefile that accepts OUTDIR= and deposits
> +# its binaries there; verificationtest-ktap adds OUTDIR to PATH so
> +# the ftracetest require-checks resolve the binaries by name.
> +MONITOR_SUBDIRS := tlob
> +
>  include ../lib.mk
> +
> +# Build and clean each monitor subdirectory.
> +all: $(patsubst %,_build_%,$(MONITOR_SUBDIRS))
> +
> +clean: $(patsubst %,_clean_%,$(MONITOR_SUBDIRS))
> +
> +.PHONY: $(patsubst %,_build_%,$(MONITOR_SUBDIRS)) \
> +        $(patsubst %,_clean_%,$(MONITOR_SUBDIRS))
> +
> +$(patsubst %,_build_%,$(MONITOR_SUBDIRS)): _build_%:
> +	$(MAKE) -C $* OUTDIR="$(OUTPUT)" TOOLS_INCLUDES="$(TOOLS_INCLUDES)"
> +
> +$(patsubst %,_clean_%,$(MONITOR_SUBDIRS)): _clean_%:
> +	$(MAKE) -C $* OUTDIR="$(OUTPUT)" clean
> diff --git a/tools/testing/selftests/verification/test.d/tlob/ioctl.tc b/tools/testing/selftests/verification/test.d/tlob/ioctl.tc
> new file mode 100644
> index 000000000000..54ae249af9a6
> --- /dev/null
> +++ b/tools/testing/selftests/verification/test.d/tlob/ioctl.tc
> @@ -0,0 +1,36 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +# description: Test tlob ioctl self-instrumentation (within/over-budget, error paths)
> +# requires: tlob:monitor tlob_ioctl:program
> +
> +TLOB_HELPER=$(command -v tlob_ioctl)
> +
> +[ -c /dev/rv ] || exit_unsupported
> +
> +echo 1 > monitors/tlob/enable
> +
> +# within budget: 50 ms threshold, 10 ms workload
> +"$TLOB_HELPER" within_budget
> +
> +# over budget in running state: 1 ms threshold, 100 ms busy-spin
> +"$TLOB_HELPER" over_budget_running
> +
> +# over budget in sleeping state: 3 ms threshold, 50 ms sleep
> +"$TLOB_HELPER" over_budget_sleeping
> +
> +# over budget in waiting state: 1 us threshold, sched_yield
> +"$TLOB_HELPER" over_budget_waiting
> +
> +# error paths
> +"$TLOB_HELPER" double_start
> +"$TLOB_HELPER" stop_no_start
> +
> +# per-thread isolation
> +"$TLOB_HELPER" multi_thread
> +
> +# bind against disabled monitor must return ENODEV, not crash
> +echo 0 > monitors/tlob/enable
> +"$TLOB_HELPER" not_enabled
> +echo 1 > monitors/tlob/enable
> +
> +echo 0 > monitors/tlob/enable
> diff --git a/tools/testing/selftests/verification/test.d/tlob/tracefs.tc b/tools/testing/selftests/verification/test.d/tlob/tracefs.tc
> new file mode 100644
> index 000000000000..5d1e7cc02498
> --- /dev/null
> +++ b/tools/testing/selftests/verification/test.d/tlob/tracefs.tc
> @@ -0,0 +1,17 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +# description: Test tlob monitor tracefs interface (enable/disable and files)
> +# requires: tlob:monitor
> +
> +check_requires monitors/tlob/enable monitors/tlob/desc monitors/tlob/monitor
> +
> +# enable / disable via the enable file
> +echo 1 > monitors/tlob/enable
> +grep -q 1 monitors/tlob/enable
> +echo "tlob" >> enabled_monitors
> +grep -q tlob enabled_monitors
> +
> +echo 0 > monitors/tlob/enable
> +grep -q 0 monitors/tlob/enable
> +echo "!tlob" >> enabled_monitors
> +! grep -q "^tlob$" enabled_monitors
> diff --git a/tools/testing/selftests/verification/test.d/tlob/uprobe_bind.tc b/tools/testing/selftests/verification/test.d/tlob/uprobe_bind.tc
> new file mode 100644
> index 000000000000..41e20d593855
> --- /dev/null
> +++ b/tools/testing/selftests/verification/test.d/tlob/uprobe_bind.tc
> @@ -0,0 +1,34 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +# description: Test uprobe binding (visible in monitor file, removable, duplicate rejected)
> +# requires: tlob:monitor tlob_ioctl:program tlob_target:program
> +
> +TLOB_HELPER=$(command -v tlob_ioctl)
> +UPROBE_TARGET=$(command -v tlob_target)
> +TLOB_MONITOR=monitors/tlob/monitor
> +
> +busy_offset=$("$TLOB_HELPER" sym_offset "$UPROBE_TARGET" tlob_busy_work 2>/dev/null)
> +stop_offset=$("$TLOB_HELPER" sym_offset "$UPROBE_TARGET" tlob_busy_work_done 2>/dev/null)
> +[ -n "$busy_offset" ] || exit_unsupported
> +[ -n "$stop_offset" ] || exit_unsupported
> +
> +"$UPROBE_TARGET" 30000 &
> +busy_pid=$!
> +sleep 0.05
> +
> +echo 1 > monitors/tlob/enable
> +echo "p ${UPROBE_TARGET}:${busy_offset} ${stop_offset} threshold=5000000" > "$TLOB_MONITOR"
> +
> +# Binding must appear in monitor file with canonical hex-offset format.
> +grep -qE "^p ${UPROBE_TARGET}:0x[0-9a-f]+ 0x[0-9a-f]+ threshold=[0-9]+$" "$TLOB_MONITOR"
> +grep -q "threshold=5000000" "$TLOB_MONITOR"
> +
> +# Duplicate offset_start must be rejected.
> +! echo "p ${UPROBE_TARGET}:${busy_offset} ${stop_offset} threshold=9999" > "$TLOB_MONITOR" 2>/dev/null
> +
> +# Remove the binding; it must no longer appear.
> +echo "-${UPROBE_TARGET}:${busy_offset}" > "$TLOB_MONITOR"
> +! grep -q "^p .*:0x${busy_offset#0x} " "$TLOB_MONITOR"
> +
> +kill "$busy_pid" 2>/dev/null; wait "$busy_pid" 2>/dev/null || true
> +echo 0 > monitors/tlob/enable
> diff --git a/tools/testing/selftests/verification/test.d/tlob/uprobe_detail_sleeping.tc b/tools/testing/selftests/verification/test.d/tlob/uprobe_detail_sleeping.tc
> new file mode 100644
> index 000000000000..2b8656e0fef1
> --- /dev/null
> +++ b/tools/testing/selftests/verification/test.d/tlob/uprobe_detail_sleeping.tc
> @@ -0,0 +1,47 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +# description: Test uprobe detail sleeping (sleeping_ns dominates when task blocks between probes)
> +# requires: tlob:monitor tlob_ioctl:program tlob_target:program
> +
> +TLOB_HELPER=$(command -v tlob_ioctl)
> +UPROBE_TARGET=$(command -v tlob_target)
> +TLOB_MONITOR=monitors/tlob/monitor
> +
> +start_offset=$("$TLOB_HELPER" sym_offset "$UPROBE_TARGET" tlob_sleep_work 2>/dev/null)
> +stop_offset=$("$TLOB_HELPER" sym_offset "$UPROBE_TARGET" tlob_sleep_work_done 2>/dev/null)
> +[ -n "$start_offset" ] || exit_unsupported
> +[ -n "$stop_offset" ] || exit_unsupported
> +
> +"$UPROBE_TARGET" 5000 sleep &
> +busy_pid=$!
> +sleep 0.05
> +
> +echo 1 > /sys/kernel/tracing/events/rv/detail_env_tlob/enable
> +echo 1 > /sys/kernel/tracing/tracing_on
> +echo 1 > monitors/tlob/enable
> +echo > /sys/kernel/tracing/trace
> +
> +# 50 ms budget; task sleeps 200 ms per iteration -> sleeping_ns dominates.
> +echo "p ${UPROBE_TARGET}:${start_offset} ${stop_offset} threshold=50000" > "$TLOB_MONITOR"
> +
> +found=0; i=0
> +while [ "$i" -lt 30 ]; do
> +	sleep 0.1
> +	grep -q "detail_env_tlob" /sys/kernel/tracing/trace && { found=1; break; }
> +	i=$((i+1))
> +done
> +
> +echo "-${UPROBE_TARGET}:${start_offset}" > "$TLOB_MONITOR" 2>/dev/null
> +kill "$busy_pid" 2>/dev/null; wait "$busy_pid" 2>/dev/null || true
> +echo 0 > /sys/kernel/tracing/events/rv/detail_env_tlob/enable
> +echo 0 > monitors/tlob/enable
> +
> +[ "$found" = "1" ]
> +
> +line=$(grep "detail_env_tlob" /sys/kernel/tracing/trace | head -n 1)
> +running=$(echo "$line" | sed 's/.*running_ns=\([0-9]*\).*/\1/')
> +waiting=$(echo "$line" | sed 's/.*waiting_ns=\([0-9]*\).*/\1/')
> +sleeping=$(echo "$line" | sed 's/.*sleeping_ns=\([0-9]*\).*/\1/')
> +[ "$sleeping" -gt "$((running + waiting))" ]
> +
> +echo > /sys/kernel/tracing/trace
> diff --git a/tools/testing/selftests/verification/test.d/tlob/uprobe_detail_waiting.tc b/tools/testing/selftests/verification/test.d/tlob/uprobe_detail_waiting.tc
> new file mode 100644
> index 000000000000..0705854f24df
> --- /dev/null
> +++ b/tools/testing/selftests/verification/test.d/tlob/uprobe_detail_waiting.tc
> @@ -0,0 +1,60 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +# description: Test uprobe detail waiting (waiting_ns dominates when task is preempted between probes)
> +# requires: tlob:monitor tlob_ioctl:program tlob_target:program
> +
> +TLOB_HELPER=$(command -v tlob_ioctl)
> +UPROBE_TARGET=$(command -v tlob_target)
> +TLOB_MONITOR=monitors/tlob/monitor
> +
> +command -v chrt    > /dev/null || exit_unsupported
> +command -v taskset > /dev/null || exit_unsupported
> +
> +start_offset=$("$TLOB_HELPER" sym_offset "$UPROBE_TARGET" tlob_preempt_work 2>/dev/null)
> +stop_offset=$("$TLOB_HELPER" sym_offset "$UPROBE_TARGET" tlob_preempt_work_done 2>/dev/null)
> +[ -n "$start_offset" ] || exit_unsupported
> +[ -n "$stop_offset" ]  || exit_unsupported
> +
> +cpu=0
> +
> +echo 1 > /sys/kernel/tracing/events/rv/detail_env_tlob/enable
> +echo 1 > /sys/kernel/tracing/tracing_on
> +echo 1 > monitors/tlob/enable
> +echo > /sys/kernel/tracing/trace
> +
> +# Register probe before the target starts so the start uprobe fires on the
> +# first entry to tlob_preempt_work. Budget: 500 ms.
> +echo "p ${UPROBE_TARGET}:${start_offset} ${stop_offset} threshold=500000" > "$TLOB_MONITOR"
> +
> +# Target starts; start probe fires on tlob_preempt_work entry.
> +taskset -c "$cpu" "$UPROBE_TARGET" 5000 preempt &
> +busy_pid=$!
> +sleep 0.05
> +
> +# RT hog on the same CPU preempts the target; target stays in waiting state
> +# (runnable, off-CPU) until the budget expires -> waiting_ns dominates.
> +chrt -f 99 taskset -c "$cpu" sh -c 'while true; do :; done' 2>/dev/null &
> +hog_pid=$!
> +
> +found=0; i=0
> +while [ "$i" -lt 30 ]; do
> +	sleep 0.1
> +	grep -q "detail_env_tlob" /sys/kernel/tracing/trace && { found=1; break; }
> +	i=$((i+1))
> +done
> +
> +echo "-${UPROBE_TARGET}:${start_offset}" > "$TLOB_MONITOR" 2>/dev/null
> +kill "$hog_pid" 2>/dev/null; wait "$hog_pid" 2>/dev/null || true
> +kill "$busy_pid" 2>/dev/null; wait "$busy_pid" 2>/dev/null || true
> +echo 0 > /sys/kernel/tracing/events/rv/detail_env_tlob/enable
> +echo 0 > monitors/tlob/enable
> +
> +[ "$found" = "1" ]
> +
> +line=$(grep "detail_env_tlob" /sys/kernel/tracing/trace | head -n 1)
> +running=$(echo "$line" | sed 's/.*running_ns=\([0-9]*\).*/\1/')
> +sleeping=$(echo "$line" | sed 's/.*sleeping_ns=\([0-9]*\).*/\1/')
> +waiting=$(echo "$line" | sed 's/.*waiting_ns=\([0-9]*\).*/\1/')
> +[ "$waiting" -gt "$((running + sleeping))" ]
> +
> +echo > /sys/kernel/tracing/trace
> diff --git a/tools/testing/selftests/verification/test.d/tlob/uprobe_multi.tc b/tools/testing/selftests/verification/test.d/tlob/uprobe_multi.tc
> new file mode 100644
> index 000000000000..c4b8f7108ae9
> --- /dev/null
> +++ b/tools/testing/selftests/verification/test.d/tlob/uprobe_multi.tc
> @@ -0,0 +1,60 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +# description: Test two uprobe bindings on same binary (different offsets fire independently)
> +# requires: tlob:monitor tlob_ioctl:program tlob_target:program
> +
> +TLOB_HELPER=$(command -v tlob_ioctl)
> +UPROBE_TARGET=$(command -v tlob_target)
> +TLOB_MONITOR=monitors/tlob/monitor
> +
> +busy_offset=$("$TLOB_HELPER" sym_offset "$UPROBE_TARGET" tlob_busy_work 2>/dev/null)
> +busy_stop=$("$TLOB_HELPER" sym_offset "$UPROBE_TARGET" tlob_busy_work_done 2>/dev/null)
> +sleep_offset=$("$TLOB_HELPER" sym_offset "$UPROBE_TARGET" tlob_sleep_work 2>/dev/null)
> +sleep_stop=$("$TLOB_HELPER" sym_offset "$UPROBE_TARGET" tlob_sleep_work_done 2>/dev/null)
> +[ -n "$busy_offset" ]  || exit_unsupported
> +[ -n "$busy_stop" ]    || exit_unsupported
> +[ -n "$sleep_offset" ] || exit_unsupported
> +[ -n "$sleep_stop" ]   || exit_unsupported
> +
> +"$UPROBE_TARGET" 30000 &       # busy mode: tlob_busy_work fires every 200 ms
> +busy_pid=$!
> +"$UPROBE_TARGET" 30000 sleep & # sleep mode: tlob_sleep_work fires every 200 ms
> +sleep_pid=$!
> +sleep 0.05
> +
> +echo 1 > /sys/kernel/tracing/events/rv/error_env_tlob/enable
> +echo 1 > /sys/kernel/tracing/events/rv/detail_env_tlob/enable
> +echo 1 > /sys/kernel/tracing/tracing_on
> +echo 1 > monitors/tlob/enable
> +echo > /sys/kernel/tracing/trace
> +
> +# Binding A: 5 s budget on the busy probe - must not fire in 200 ms loops.
> +echo "p ${UPROBE_TARGET}:${busy_offset} ${busy_stop} threshold=5000000" > "$TLOB_MONITOR"
> +# Binding B: 10 us budget on the sleep probe - fires on first invocation.
> +echo "p ${UPROBE_TARGET}:${sleep_offset} ${sleep_stop} threshold=10" > "$TLOB_MONITOR"
> +
> +# Wait up to 2 s for error_env_tlob from binding B.
> +found=0; i=0
> +while [ "$i" -lt 20 ]; do
> +	sleep 0.1
> +	grep -q "error_env_tlob" /sys/kernel/tracing/trace && { found=1; break; }
> +	i=$((i+1))
> +done
> +
> +echo "-${UPROBE_TARGET}:${busy_offset}" > "$TLOB_MONITOR" 2>/dev/null
> +echo "-${UPROBE_TARGET}:${sleep_offset}" > "$TLOB_MONITOR" 2>/dev/null
> +kill "$sleep_pid" 2>/dev/null; wait "$sleep_pid" 2>/dev/null || true
> +kill "$busy_pid" 2>/dev/null; wait "$busy_pid" 2>/dev/null || true
> +
> +echo 0 > monitors/tlob/enable
> +echo 0 > /sys/kernel/tracing/events/rv/error_env_tlob/enable
> +echo 0 > /sys/kernel/tracing/events/rv/detail_env_tlob/enable
> +
> +[ "$found" = "1" ]
> +# error_env_tlob payload: label and clock variable must be present.
> +grep "error_env_tlob" /sys/kernel/tracing/trace | head -n 1 | grep -q "budget_exceeded"
> +grep "error_env_tlob" /sys/kernel/tracing/trace | head -n 1 | grep -q "clk_elapsed="
> +# detail_env_tlob must appear alongside the error.
> +grep -q "detail_env_tlob" /sys/kernel/tracing/trace
> +
> +echo > /sys/kernel/tracing/trace
> diff --git a/tools/testing/selftests/verification/test.d/tlob/uprobe_no_event.tc b/tools/testing/selftests/verification/test.d/tlob/uprobe_no_event.tc
> new file mode 100644
> index 000000000000..4a74853346e3
> --- /dev/null
> +++ b/tools/testing/selftests/verification/test.d/tlob/uprobe_no_event.tc
> @@ -0,0 +1,19 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +# description: Test no spurious error_env_tlob events without an active uprobe binding
> +# requires: tlob:monitor tlob_ioctl:program
> +
> +TLOB_MONITOR=monitors/tlob/monitor
> +
> +echo 1 > /sys/kernel/tracing/events/rv/error_env_tlob/enable
> +echo 1 > /sys/kernel/tracing/tracing_on
> +echo 1 > monitors/tlob/enable
> +echo > /sys/kernel/tracing/trace
> +
> +sleep 0.5
> +
> +! grep -q "error_env_tlob" /sys/kernel/tracing/trace
> +
> +echo 0 > monitors/tlob/enable
> +echo 0 > /sys/kernel/tracing/events/rv/error_env_tlob/enable
> +echo > /sys/kernel/tracing/trace
> diff --git a/tools/testing/selftests/verification/test.d/tlob/uprobe_violation.tc b/tools/testing/selftests/verification/test.d/tlob/uprobe_violation.tc
> new file mode 100644
> index 000000000000..624fdb950f6b
> --- /dev/null
> +++ b/tools/testing/selftests/verification/test.d/tlob/uprobe_violation.tc
> @@ -0,0 +1,60 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +# description: Test uprobe violation (error_env_tlob and detail_env_tlob fire with correct fields)
> +# requires: tlob:monitor tlob_ioctl:program tlob_target:program
> +
> +TLOB_HELPER=$(command -v tlob_ioctl)
> +UPROBE_TARGET=$(command -v tlob_target)
> +TLOB_MONITOR=monitors/tlob/monitor
> +
> +busy_offset=$("$TLOB_HELPER" sym_offset "$UPROBE_TARGET" tlob_busy_work 2>/dev/null)
> +stop_offset=$("$TLOB_HELPER" sym_offset "$UPROBE_TARGET" tlob_busy_work_done 2>/dev/null)
> +[ -n "$busy_offset" ] || exit_unsupported
> +[ -n "$stop_offset" ] || exit_unsupported
> +
> +"$UPROBE_TARGET" 30000 &
> +busy_pid=$!
> +sleep 0.05
> +
> +echo 1 > /sys/kernel/tracing/events/rv/error_env_tlob/enable
> +echo 1 > /sys/kernel/tracing/events/rv/detail_env_tlob/enable
> +echo 1 > /sys/kernel/tracing/tracing_on
> +echo 1 > monitors/tlob/enable
> +echo > /sys/kernel/tracing/trace
> +
> +# 10 us budget - fires almost immediately; task is busy-spinning on-CPU.
> +echo "p ${UPROBE_TARGET}:${busy_offset} ${stop_offset} threshold=10" > "$TLOB_MONITOR"
> +
> +# wait up to 2 s for detail_env_tlob
> +found=0; i=0
> +while [ "$i" -lt 20 ]; do
> +	sleep 0.1
> +	grep -q "detail_env_tlob" /sys/kernel/tracing/trace && { found=1; break; }
> +	i=$((i+1))
> +done
> +
> +echo "-${UPROBE_TARGET}:${busy_offset}" > "$TLOB_MONITOR" 2>/dev/null
> +kill "$busy_pid" 2>/dev/null; wait "$busy_pid" 2>/dev/null || true
> +echo 0 > /sys/kernel/tracing/events/rv/error_env_tlob/enable
> +echo 0 > /sys/kernel/tracing/events/rv/detail_env_tlob/enable
> +echo 0 > monitors/tlob/enable
> +
> +[ "$found" = "1" ]
> +
> +# error_env_tlob event label must be budget_exceeded
> +grep "error_env_tlob" /sys/kernel/tracing/trace | head -n 1 | grep -q "budget_exceeded"
> +
> +# detail_env_tlob must have all five fields with the correct threshold
> +line=$(grep "detail_env_tlob" /sys/kernel/tracing/trace | head -n 1)
> +echo "$line" | grep -q "pid="
> +echo "$line" | grep -q "threshold_us=10"
> +echo "$line" | grep -q "running_ns="
> +echo "$line" | grep -q "waiting_ns="
> +echo "$line" | grep -q "sleeping_ns="
> +
> +# Busy-spin keeps the task on-CPU: running_ns must exceed sleeping_ns.
> +running=$(echo "$line" | sed 's/.*running_ns=\([0-9]*\).*/\1/')
> +sleeping=$(echo "$line" | sed 's/.*sleeping_ns=\([0-9]*\).*/\1/')
> +[ "$running" -gt "$sleeping" ]
> +
> +echo > /sys/kernel/tracing/trace
> diff --git a/tools/testing/selftests/verification/tlob/Makefile b/tools/testing/selftests/verification/tlob/Makefile
> new file mode 100644
> index 000000000000..1bedf946cb34
> --- /dev/null
> +++ b/tools/testing/selftests/verification/tlob/Makefile
> @@ -0,0 +1,21 @@
> +# SPDX-License-Identifier: GPL-2.0
> +# Builds tlob selftest helper binaries.
> +#
> +# Invoked by ../Makefile; pass OUTDIR to control the output directory
> +# and TOOLS_INCLUDES for the in-tree UAPI -isystem flag.
> +
> +OUTDIR ?= $(CURDIR)/..
> +CFLAGS += $(TOOLS_INCLUDES)
> +
> +.PHONY: all
> +all: $(OUTDIR)/tlob_ioctl $(OUTDIR)/tlob_target
> +
> +$(OUTDIR)/tlob_ioctl: tlob_ioctl.c
> +	$(CC) $(CFLAGS) -o $@ $< -lpthread
> +
> +$(OUTDIR)/tlob_target: tlob_target.c
> +	$(CC) $(CFLAGS) -o $@ $<
> +
> +.PHONY: clean
> +clean:
> +	$(RM) $(OUTDIR)/tlob_ioctl $(OUTDIR)/tlob_target
> diff --git a/tools/testing/selftests/verification/tlob/tlob_ioctl.c b/tools/testing/selftests/verification/tlob/tlob_ioctl.c
> new file mode 100644
> index 000000000000..abb4e2e80a2c
> --- /dev/null
> +++ b/tools/testing/selftests/verification/tlob/tlob_ioctl.c
> @@ -0,0 +1,626 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * tlob_ioctl.c - ioctl test driver and ELF utility for tlob selftests
> + *
> + * Usage: tlob_ioctl <subcommand> [args...]
> + *
> + *   not_enabled          - bind without monitor enabled -> ENODEV
> + *   within_budget        - sleep within budget -> 0
> + *   over_budget_running  - busy-spin past budget -> EOVERFLOW
> + *   over_budget_sleeping - sleep past budget -> EOVERFLOW
> + *   over_budget_waiting  - sched_yield into waiting state -> EOVERFLOW
> + *   double_start         - two starts without stop -> EALREADY
> + *   stop_no_start        - stop without start -> EINVAL
> + *   multi_thread         - two fds: thread A within budget, thread B over
> + *   bench                - TRACE_START/STOP latency (TAP output, always passes)
> + *   sym_offset <binary> <symbol> - print ELF file offset of symbol
> + *
> + * Exit: 0 = pass, 1 = fail, 2 = skip (device not available).
> + */
> +#define _GNU_SOURCE
> +#include <elf.h>
> +#include <errno.h>
> +#include <fcntl.h>
> +#include <pthread.h>
> +#include <sched.h>
> +#include <stdint.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <sys/ioctl.h>
> +#include <sys/mman.h>
> +#include <sys/stat.h>
> +#include <time.h>
> +#include <unistd.h>
> +
> +#include <linux/rv.h>
> +
> +static int rv_fd = -1;
> +
> +static int open_rv(void)
> +{
> +	struct rv_bind_args bind = { .monitor_name = "tlob" };
> +
> +	rv_fd = open("/dev/rv", O_RDWR);
> +	if (rv_fd < 0) {
> +		fprintf(stderr, "open /dev/rv: %s\n", strerror(errno));
> +		return -1;
> +	}
> +	if (ioctl(rv_fd, RV_IOCTL_BIND_MONITOR, &bind) < 0) {
> +		fprintf(stderr, "bind tlob: %s\n", strerror(errno));
> +		close(rv_fd);
> +		rv_fd = -1;
> +		return -1;
> +	}
> +	return 0;
> +}
> +
> +static void busy_spin_us(unsigned long us)
> +{
> +	struct timespec start, now;
> +	unsigned long elapsed;
> +
> +	clock_gettime(CLOCK_MONOTONIC, &start);
> +	do {
> +		clock_gettime(CLOCK_MONOTONIC, &now);
> +		elapsed = (unsigned long)(now.tv_sec - start.tv_sec)
> +			  * 1000000000UL
> +			+ (unsigned long)(now.tv_nsec - start.tv_nsec);
> +	} while (elapsed < us * 1000UL);
> +}
> +
> +static int trace_start(uint64_t threshold_us)
> +{
> +	struct tlob_start_args args = {
> +		.threshold_us = threshold_us,
> +	};
> +
> +	return ioctl(rv_fd, TLOB_IOCTL_TRACE_START, &args);
> +}
> +
> +static int trace_stop(void)
> +{
> +	return ioctl(rv_fd, TLOB_IOCTL_TRACE_STOP, NULL);
> +}
> +
> +/* Synchronous TRACE_START / TRACE_STOP tests */
> +
> +/* Binding to a disabled monitor must return ENODEV without crashing */
> +static int test_not_enabled(void)
> +{
> +	struct rv_bind_args bind = { .monitor_name = "tlob" };
> +	int fd;
> +	int ret;
> +
> +	fd = open("/dev/rv", O_RDWR);
> +	if (fd < 0) {
> +		fprintf(stderr, "open /dev/rv: %s\n", strerror(errno));
> +		return 2; /* skip */
> +	}
> +
> +	ret = ioctl(fd, RV_IOCTL_BIND_MONITOR, &bind);
> +	close(fd);
> +
> +	if (ret == 0) {
> +		fprintf(stderr, "RV_IOCTL_BIND_MONITOR: expected ENODEV, got success\n");
> +		return 1;
> +	}
> +	if (errno != ENODEV) {
> +		fprintf(stderr, "RV_IOCTL_BIND_MONITOR: expected ENODEV, got %s\n",
> +			strerror(errno));
> +		return 1;
> +	}
> +	return 0;
> +}
> +
> +static int test_within_budget(void)
> +{
> +	int ret;
> +
> +	/* 50 ms budget */
> +	if (trace_start(50000) < 0) {
> +		fprintf(stderr, "TRACE_START: %s\n", strerror(errno));
> +		return 1;
> +	}
> +	usleep(10000); /* 10 ms */
> +	ret = trace_stop();
> +	if (ret != 0) {
> +		fprintf(stderr, "TRACE_STOP: expected 0, got %d errno=%s\n",
> +			ret, strerror(errno));
> +		return 1;
> +	}
> +	return 0;
> +}
> +
> +static int test_over_budget_running(void)
> +{
> +	int ret;
> +
> +	/* 1 ms budget */
> +	if (trace_start(1000) < 0) {
> +		fprintf(stderr, "TRACE_START: %s\n", strerror(errno));
> +		return 1;
> +	}
> +	busy_spin_us(100000); /* 100 ms */
> +	ret = trace_stop();
> +	if (ret == 0) {
> +		fprintf(stderr, "TRACE_STOP: expected EOVERFLOW, got 0\n");
> +		return 1;
> +	}
> +	if (errno != EOVERFLOW) {
> +		fprintf(stderr, "TRACE_STOP: expected EOVERFLOW, got %s\n",
> +			strerror(errno));
> +		return 1;
> +	}
> +	return 0;
> +}
> +
> +static int test_over_budget_sleeping(void)
> +{
> +	int ret;
> +
> +	/* 3 ms budget */
> +	if (trace_start(3000) < 0) {
> +		fprintf(stderr, "TRACE_START: %s\n", strerror(errno));
> +		return 1;
> +	}
> +	usleep(50000); /* 50 ms; sleeping time counts toward budget */
> +	ret = trace_stop();
> +	if (ret == 0) {
> +		fprintf(stderr, "TRACE_STOP: expected EOVERFLOW, got 0\n");
> +		return 1;
> +	}
> +	if (errno != EOVERFLOW) {
> +		fprintf(stderr, "TRACE_STOP: expected EOVERFLOW, got %s\n",
> +			strerror(errno));
> +		return 1;
> +	}
> +	return 0;
> +}
> +
> +static int test_over_budget_waiting(void)
> +{
> +	int ret;
> +
> +	/* 1 us budget */
> +	if (trace_start(1) < 0) {
> +		fprintf(stderr, "TRACE_START: %s\n", strerror(errno));
> +		return 1;
> +	}
> +	sched_yield(); /* running -> waiting -> running */
> +	busy_spin_us(10); /* 10 us >> 1 us budget; hrtimer fires during spin */
> +	ret = trace_stop();
> +	if (ret == 0) {
> +		fprintf(stderr, "TRACE_STOP: expected EOVERFLOW, got 0\n");
> +		return 1;
> +	}
> +	if (errno != EOVERFLOW) {
> +		fprintf(stderr, "TRACE_STOP: expected EOVERFLOW, got %s\n",
> +			strerror(errno));
> +		return 1;
> +	}
> +	return 0;
> +}
> +
> +/* Error-handling tests */
> +
> +static int test_double_start(void)
> +{
> +	int ret;
> +
> +	/* 10 s: large enough the hrtimer won't fire during the test */
> +	if (trace_start(10000000ULL) < 0) {
> +		fprintf(stderr, "first TRACE_START: %s\n", strerror(errno));
> +		return 1;
> +	}
> +	ret = trace_start(10000000);
> +	if (ret == 0) {
> +		fprintf(stderr, "second TRACE_START: expected EALREADY, got 0\n");
> +		trace_stop();
> +		return 1;
> +	}
> +	if (errno != EALREADY) {
> +		fprintf(stderr, "second TRACE_START: expected EALREADY, got %s\n",
> +			strerror(errno));
> +		trace_stop();
> +		return 1;
> +	}
> +	trace_stop();
> +	return 0;
> +}
> +
> +static int test_stop_no_start(void)
> +{
> +	int ret;
> +
> +	/* Ensure clean state: ignore error from a stale entry */
> +	trace_stop();
> +
> +	ret = trace_stop();
> +	if (ret == 0) {
> +		fprintf(stderr, "TRACE_STOP: expected EINVAL, got 0\n");
> +		return 1;
> +	}
> +	if (errno != EINVAL) {
> +		fprintf(stderr, "TRACE_STOP: expected EINVAL, got %s\n",
> +			strerror(errno));
> +		return 1;
> +	}
> +	return 0;
> +}
> +
> +/* Two threads, each with its own fd: A within budget, B over budget. */
> +
> +struct mt_thread_args {
> +	uint64_t      threshold_us;
> +	unsigned long workload_us;
> +	int           busy;
> +	int           expect_eoverflow;
> +	int           result;
> +};
> +
> +static void *mt_thread_fn(void *arg)
> +{
> +	struct mt_thread_args *a = arg;
> +	struct tlob_start_args args = { .threshold_us = a->threshold_us };
> +	struct rv_bind_args bind = { .monitor_name = "tlob" };
> +	int fd;
> +	int ret;
> +
> +	fd = open("/dev/rv", O_RDWR);
> +	if (fd < 0) {
> +		fprintf(stderr, "thread open /dev/rv: %s\n", strerror(errno));
> +		a->result = 1;
> +		return NULL;
> +	}
> +	if (ioctl(fd, RV_IOCTL_BIND_MONITOR, &bind) < 0) {
> +		fprintf(stderr, "thread bind tlob: %s\n", strerror(errno));
> +		close(fd);
> +		a->result = 1;
> +		return NULL;
> +	}
> +
> +	ret = ioctl(fd, TLOB_IOCTL_TRACE_START, &args);
> +	if (ret < 0) {
> +		fprintf(stderr, "thread TRACE_START: %s\n", strerror(errno));
> +		close(fd);
> +		a->result = 1;
> +		return NULL;
> +	}
> +
> +	if (a->busy)
> +		busy_spin_us(a->workload_us);
> +	else
> +		usleep(a->workload_us);
> +
> +	ret = ioctl(fd, TLOB_IOCTL_TRACE_STOP, NULL);
> +	if (a->expect_eoverflow) {
> +		if (ret == 0 || errno != EOVERFLOW) {
> +			fprintf(stderr, "thread: expected EOVERFLOW, got ret=%d errno=%s\n",
> +				ret, strerror(errno));
> +			close(fd);
> +			a->result = 1;
> +			return NULL;
> +		}
> +	} else {
> +		if (ret != 0) {
> +			fprintf(stderr, "thread: expected 0, got ret=%d errno=%s\n",
> +				ret, strerror(errno));
> +			close(fd);
> +			a->result = 1;
> +			return NULL;
> +		}
> +	}
> +	close(fd);
> +	a->result = 0;
> +	return NULL;
> +}
> +
> +static int test_multi_thread(void)
> +{
> +	pthread_t ta, tb;
> +	struct mt_thread_args a = {
> +		.threshold_us     = 20000,   /* 20 ms */
> +		.workload_us      = 5000,    /* 5 ms sleep -> within budget */
> +		.busy             = 0,
> +		.expect_eoverflow = 0,
> +	};
> +	struct mt_thread_args b = {
> +		.threshold_us     = 3000,    /* 3 ms */
> +		.workload_us      = 30000,   /* 30 ms spin -> over budget */
> +		.busy             = 1,
> +		.expect_eoverflow = 1,
> +	};
> +
> +	pthread_create(&ta, NULL, mt_thread_fn, &a);
> +	pthread_create(&tb, NULL, mt_thread_fn, &b);
> +	pthread_join(ta, NULL);
> +	pthread_join(tb, NULL);
> +
> +	return (a.result || b.result) ? 1 : 0;
> +}
> +
> +/*
> + * Benchmark TRACE_START, TRACE_STOP, and round-trip ioctls.
> + * Output uses TAP '#' prefix; always returns 0.
> + */
> +#define BENCH_WARMUP  32
> +#define BENCH_N      1000
> +
> +static long long timespec_diff_ns(const struct timespec *a,
> +				   const struct timespec *b)
> +{
> +	return (long long)(b->tv_sec - a->tv_sec) * 1000000000LL
> +		+ (b->tv_nsec - a->tv_nsec);
> +}
> +
> +static int test_bench(void)
> +{
> +	struct tlob_start_args args = {
> +		.threshold_us = 10000000ULL, /* 10 s */
> +	};
> +	struct timespec t0, t1;
> +	long long total_start_ns = 0, total_stop_ns = 0, total_rt_ns = 0;
> +	int i;
> +
> +	/* warm up */
> +	for (i = 0; i < BENCH_WARMUP; i++) {
> +		if (ioctl(rv_fd, TLOB_IOCTL_TRACE_START, &args) == 0)
> +			ioctl(rv_fd, TLOB_IOCTL_TRACE_STOP, NULL);
> +	}
> +
> +	/* start only */
> +	for (i = 0; i < BENCH_N; i++) {
> +		clock_gettime(CLOCK_MONOTONIC, &t0);
> +		ioctl(rv_fd, TLOB_IOCTL_TRACE_START, &args);
> +		clock_gettime(CLOCK_MONOTONIC, &t1);
> +		total_start_ns += timespec_diff_ns(&t0, &t1);
> +		ioctl(rv_fd, TLOB_IOCTL_TRACE_STOP, NULL);
> +	}
> +
> +	/* stop only */
> +	for (i = 0; i < BENCH_N; i++) {
> +		ioctl(rv_fd, TLOB_IOCTL_TRACE_START, &args);
> +		clock_gettime(CLOCK_MONOTONIC, &t0);
> +		ioctl(rv_fd, TLOB_IOCTL_TRACE_STOP, NULL);
> +		clock_gettime(CLOCK_MONOTONIC, &t1);
> +		total_stop_ns += timespec_diff_ns(&t0, &t1);
> +	}
> +
> +	/* round-trip */
> +	clock_gettime(CLOCK_MONOTONIC, &t0);
> +	for (i = 0; i < BENCH_N; i++) {
> +		ioctl(rv_fd, TLOB_IOCTL_TRACE_START, &args);
> +		ioctl(rv_fd, TLOB_IOCTL_TRACE_STOP, NULL);
> +	}
> +	clock_gettime(CLOCK_MONOTONIC, &t1);
> +	total_rt_ns = timespec_diff_ns(&t0, &t1);
> +
> +	printf("# start ioctl only:      %lld ns/iter (N=%d, includes syscall)\n",
> +	       total_start_ns / BENCH_N, BENCH_N);
> +	printf("# stop ioctl only:       %lld ns/iter (N=%d, includes syscall)\n",
> +	       total_stop_ns / BENCH_N, BENCH_N);
> +	printf("# start+stop roundtrip:  %lld ns/iter (N=%d, includes 2 syscalls)\n",
> +	       total_rt_ns / BENCH_N, BENCH_N);
> +	return 0;
> +}
> +
> +/*
> + * Print the ELF file offset of <symname> in <binary>.  Walks .symtab
> + * (falling back to .dynsym) and converts vaddr to file offset via PT_LOAD.
> + * Supports 32- and 64-bit ELF.
> + */
> +static int sym_offset(const char *binary, const char *symname)
> +{
> +	int fd;
> +	struct stat st;
> +	void *map;
> +	Elf64_Ehdr *ehdr;
> +	Elf32_Ehdr *ehdr32;
> +	int is64;
> +	uint64_t sym_vaddr = 0;
> +	int found = 0;
> +	uint64_t file_offset = 0;
> +
> +	fd = open(binary, O_RDONLY);
> +	if (fd < 0) {
> +		fprintf(stderr, "open %s: %s\n", binary, strerror(errno));
> +		return 1;
> +	}
> +	if (fstat(fd, &st) < 0) {
> +		close(fd);
> +		return 1;
> +	}
> +	map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
> +	close(fd);
> +	if (map == MAP_FAILED) {
> +		fprintf(stderr, "mmap: %s\n", strerror(errno));
> +		return 1;
> +	}
> +
> +	ehdr = (Elf64_Ehdr *)map;
> +	ehdr32 = (Elf32_Ehdr *)map;
> +	if (st.st_size < 4 ||
> +	    ehdr->e_ident[EI_MAG0] != ELFMAG0 ||
> +	    ehdr->e_ident[EI_MAG1] != ELFMAG1 ||
> +	    ehdr->e_ident[EI_MAG2] != ELFMAG2 ||
> +	    ehdr->e_ident[EI_MAG3] != ELFMAG3) {
> +		fprintf(stderr, "%s: not an ELF file\n", binary);
> +		munmap(map, (size_t)st.st_size);
> +		return 1;
> +	}
> +	is64 = (ehdr->e_ident[EI_CLASS] == ELFCLASS64);
> +
> +	if (is64) {
> +		Elf64_Shdr *shdrs = (Elf64_Shdr *)((char *)map + ehdr->e_shoff);
> +		Elf64_Shdr *shstrtab_hdr = &shdrs[ehdr->e_shstrndx];
> +		const char *shstrtab = (char *)map + shstrtab_hdr->sh_offset;
> +		int si;
> +
> +		/* prefer .symtab; fall back to .dynsym */
> +		for (int pass = 0; pass < 2 && !found; pass++) {
> +			const char *target = pass ? ".dynsym" : ".symtab";
> +
> +			for (si = 0; si < ehdr->e_shnum && !found; si++) {
> +				Elf64_Shdr *sh = &shdrs[si];
> +				const char *name = shstrtab + sh->sh_name;
> +
> +				if (strcmp(name, target) != 0)
> +					continue;
> +
> +				Elf64_Shdr *strtab_sh = &shdrs[sh->sh_link];
> +				const char *strtab = (char *)map + strtab_sh->sh_offset;
> +				Elf64_Sym *syms = (Elf64_Sym *)((char *)map + sh->sh_offset);
> +				uint64_t nsyms = sh->sh_size / sizeof(Elf64_Sym);
> +				uint64_t j;
> +
> +				for (j = 0; j < nsyms; j++) {
> +					if (strcmp(strtab + syms[j].st_name, symname) == 0) {
> +						sym_vaddr = syms[j].st_value;
> +						found = 1;
> +						break;
> +					}
> +				}
> +			}
> +		}
> +
> +		if (!found) {
> +			fprintf(stderr, "symbol '%s' not found in %s\n", symname, binary);
> +			munmap(map, (size_t)st.st_size);
> +			return 1;
> +		}
> +
> +		/* Convert vaddr to file offset via PT_LOAD segments */
> +		Elf64_Phdr *phdrs = (Elf64_Phdr *)((char *)map + ehdr->e_phoff);
> +		int pi;
> +
> +		for (pi = 0; pi < ehdr->e_phnum; pi++) {
> +			Elf64_Phdr *ph = &phdrs[pi];
> +
> +			if (ph->p_type != PT_LOAD)
> +				continue;
> +			if (sym_vaddr >= ph->p_vaddr &&
> +			    sym_vaddr < ph->p_vaddr + ph->p_filesz) {
> +				file_offset = sym_vaddr - ph->p_vaddr + ph->p_offset;
> +				break;
> +			}
> +		}
> +	} else {
> +		/* 32-bit ELF */
> +		Elf32_Shdr *shdrs = (Elf32_Shdr *)((char *)map + ehdr32->e_shoff);
> +		Elf32_Shdr *shstrtab_hdr = &shdrs[ehdr32->e_shstrndx];
> +		const char *shstrtab = (char *)map + shstrtab_hdr->sh_offset;
> +		int si;
> +		uint32_t sym_vaddr32 = 0;
> +
> +		for (int pass = 0; pass < 2 && !found; pass++) {
> +			const char *target = pass ? ".dynsym" : ".symtab";
> +
> +			for (si = 0; si < ehdr32->e_shnum && !found; si++) {
> +				Elf32_Shdr *sh = &shdrs[si];
> +				const char *name = shstrtab + sh->sh_name;
> +
> +				if (strcmp(name, target) != 0)
> +					continue;
> +
> +				Elf32_Shdr *strtab_sh = &shdrs[sh->sh_link];
> +				const char *strtab = (char *)map + strtab_sh->sh_offset;
> +				Elf32_Sym *syms = (Elf32_Sym *)((char *)map + sh->sh_offset);
> +				uint32_t nsyms = sh->sh_size / sizeof(Elf32_Sym);
> +				uint32_t j;
> +
> +				for (j = 0; j < nsyms; j++) {
> +					if (strcmp(strtab + syms[j].st_name, symname) == 0) {
> +						sym_vaddr32 = syms[j].st_value;
> +						found = 1;
> +						break;
> +					}
> +				}
> +			}
> +		}
> +
> +		if (!found) {
> +			fprintf(stderr, "symbol '%s' not found in %s\n", symname, binary);
> +			munmap(map, (size_t)st.st_size);
> +			return 1;
> +		}
> +
> +		Elf32_Phdr *phdrs = (Elf32_Phdr *)((char *)map + ehdr32->e_phoff);
> +		int pi;
> +
> +		for (pi = 0; pi < ehdr32->e_phnum; pi++) {
> +			Elf32_Phdr *ph = &phdrs[pi];
> +
> +			if (ph->p_type != PT_LOAD)
> +				continue;
> +			if (sym_vaddr32 >= ph->p_vaddr &&
> +			    sym_vaddr32 < ph->p_vaddr + ph->p_filesz) {
> +				file_offset = sym_vaddr32 - ph->p_vaddr + ph->p_offset;
> +				break;
> +			}
> +		}
> +		sym_vaddr = sym_vaddr32;
> +	}
> +
> +	munmap(map, (size_t)st.st_size);
> +
> +	if (!file_offset && sym_vaddr) {
> +		fprintf(stderr, "could not map vaddr 0x%lx to file offset\n",
> +			(unsigned long)sym_vaddr);
> +		return 1;
> +	}
> +
> +	printf("0x%lx\n", (unsigned long)file_offset);
> +	return 0;
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +	int rc;
> +
> +	if (argc < 2) {
> +		fprintf(stderr, "Usage: %s <subcommand> [args...]\n", argv[0]);
> +		return 1;
> +	}
> +
> +	/* sym_offset does not need /dev/rv */
> +	if (strcmp(argv[1], "sym_offset") == 0) {
> +		if (argc < 4) {
> +			fprintf(stderr, "Usage: %s sym_offset <binary> <symbol>\n",
> +				argv[0]);
> +			return 1;
> +		}
> +		return sym_offset(argv[2], argv[3]);
> +	}
> +
> +	/* not_enabled: monitor is disabled; bind must return ENODEV without open_rv() */
> +	if (strcmp(argv[1], "not_enabled") == 0)
> +		return test_not_enabled();
> +
> +	if (open_rv() < 0)
> +		return 2; /* skip */
> +
> +	if (strcmp(argv[1], "bench") == 0)
> +		rc = test_bench();
> +	else if (strcmp(argv[1], "within_budget") == 0)
> +		rc = test_within_budget();
> +	else if (strcmp(argv[1], "over_budget_running") == 0)
> +		rc = test_over_budget_running();
> +	else if (strcmp(argv[1], "over_budget_sleeping") == 0)
> +		rc = test_over_budget_sleeping();
> +	else if (strcmp(argv[1], "over_budget_waiting") == 0)
> +		rc = test_over_budget_waiting();
> +	else if (strcmp(argv[1], "double_start") == 0)
> +		rc = test_double_start();
> +	else if (strcmp(argv[1], "stop_no_start") == 0)
> +		rc = test_stop_no_start();
> +	else if (strcmp(argv[1], "multi_thread") == 0)
> +		rc = test_multi_thread();
> +	else {
> +		fprintf(stderr, "Unknown test: %s\n", argv[1]);
> +		rc = 1;
> +	}
> +
> +	close(rv_fd);
> +	return rc;
> +}
> diff --git a/tools/testing/selftests/verification/tlob/tlob_target.c b/tools/testing/selftests/verification/tlob/tlob_target.c
> new file mode 100644
> index 000000000000..0fdbc575d71d
> --- /dev/null
> +++ b/tools/testing/selftests/verification/tlob/tlob_target.c
> @@ -0,0 +1,138 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * tlob_target.c - uprobe target binary for tlob selftests.
> + *
> + * Provides three start/stop probe pairs, each designed to exercise a
> + * different dominant component of the detail_env_tlob ns breakdown:
> + *
> + *   tlob_busy_work    / tlob_busy_work_done    - busy-spin: running_ns dominates
> + *   tlob_sleep_work   / tlob_sleep_work_done   - nanosleep: sleeping_ns dominates
> + *   tlob_preempt_work / tlob_preempt_work_done - busy-spin: waiting_ns dominates
> + *                                                (needs an RT competitor on the same CPU)
> + *
> + * Usage: tlob_target <duration_ms> [mode]
> + *
> + * mode is one of: busy (default), sleep, preempt.
> + * Loops in 200 ms iterations until <duration_ms> has elapsed
> + * (0 = run for ~24 hours).
> + */
> +#define _GNU_SOURCE
> +#include <stdint.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <time.h>
> +
> +#ifndef noinline
> +#define noinline __attribute__((noinline))
> +#endif
> +
> +static inline int timespec_before(const struct timespec *a,
> +				   const struct timespec *b)
> +{
> +	return a->tv_sec < b->tv_sec ||
> +	       (a->tv_sec == b->tv_sec && a->tv_nsec < b->tv_nsec);
> +}
> +
> +static void timespec_add_ms(struct timespec *ts, unsigned long ms)
> +{
> +	ts->tv_sec  += ms / 1000;
> +	ts->tv_nsec += (long)(ms % 1000) * 1000000L;
> +	if (ts->tv_nsec >= 1000000000L) {
> +		ts->tv_sec++;
> +		ts->tv_nsec -= 1000000000L;
> +	}
> +}
> +
> +/* stop probe; noinline keeps the entry point visible to uprobes */
> +noinline void tlob_busy_work_done(void)
> +{
> +	/* empty: uprobe fires on entry */
> +}
> +
> +/* start probe; busy-spin so running_ns dominates */
> +noinline void tlob_busy_work(unsigned long duration_ns)
> +{
> +	struct timespec start, now;
> +	unsigned long elapsed;
> +
> +	clock_gettime(CLOCK_MONOTONIC, &start);
> +	do {
> +		clock_gettime(CLOCK_MONOTONIC, &now);
> +		elapsed = (unsigned long)(now.tv_sec - start.tv_sec)
> +			  * 1000000000UL
> +			+ (unsigned long)(now.tv_nsec - start.tv_nsec);
> +	} while (elapsed < duration_ns);
> +
> +	tlob_busy_work_done();
> +}
> +
> +/* stop probe; noinline keeps the entry point visible to uprobes */
> +noinline void tlob_sleep_work_done(void)
> +{
> +	/* empty: uprobe fires on entry */
> +}
> +
> +/* start probe; nanosleep so sleeping_ns dominates */
> +noinline void tlob_sleep_work(unsigned long duration_ms)
> +{
> +	struct timespec ts = {
> +		.tv_sec  = duration_ms / 1000,
> +		.tv_nsec = (long)(duration_ms % 1000) * 1000000L,
> +	};
> +	nanosleep(&ts, NULL);
> +	tlob_sleep_work_done();
> +}
> +
> +/* stop probe; noinline keeps the entry point visible to uprobes */
> +noinline void tlob_preempt_work_done(void)
> +{
> +	/* empty: uprobe fires on entry */
> +}
> +
> +/*
> + * start probe; busy-spin so an RT competitor on the same CPU drives
> + * waiting_ns (prev_state==0 -> preempt event, task stays runnable off-CPU).
> + */
> +noinline void tlob_preempt_work(unsigned long duration_ms)
> +{
> +	struct timespec start, now;
> +	unsigned long elapsed;
> +
> +	clock_gettime(CLOCK_MONOTONIC, &start);
> +	do {
> +		clock_gettime(CLOCK_MONOTONIC, &now);
> +		elapsed = (unsigned long)(now.tv_sec - start.tv_sec)
> +			  * 1000000000UL
> +			+ (unsigned long)(now.tv_nsec - start.tv_nsec);
> +	} while (elapsed < duration_ms * 1000000UL);
> +
> +	tlob_preempt_work_done();
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +	unsigned long duration_ms = 0;
> +	const char *mode = "busy";
> +	struct timespec deadline, now;
> +
> +	if (argc >= 2)
> +		duration_ms = strtoul(argv[1], NULL, 10);
> +	if (argc >= 3)
> +		mode = argv[2];
> +
> +	clock_gettime(CLOCK_MONOTONIC, &deadline);
> +	timespec_add_ms(&deadline, duration_ms ? duration_ms : 86400000UL);
> +
> +	do {
> +		if (strcmp(mode, "sleep") == 0)
> +			tlob_sleep_work(200);
> +		else if (strcmp(mode, "preempt") == 0)
> +			tlob_preempt_work(200);
> +		else
> +			tlob_busy_work(200 * 1000000UL);
> +		clock_gettime(CLOCK_MONOTONIC, &now);
> +	} while (timespec_before(&now, &deadline));
> +
> +	return 0;
> +}



* [PATCH v3 01/11] io_uring: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Pillai (Google) @ 2026-05-15 13:59 UTC (permalink / raw)
  To: Jens Axboe
  Cc: io-uring, Steven Rostedt, linux-trace-kernel, Vineeth Pillai,
	Peter Zijlstra

From: Vineeth Pillai <vineeth@bitbyteword.org>

Replace trace_foo() with the new trace_call__foo() at sites already
guarded by trace_foo_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__foo() calls the tracepoint callbacks directly without
utilizing the static branch again.

Original v2 series:
https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/

Parts of the original v2 series have already been merged in mainline.
This patch is being reposted as a follow-up cleanup for the remaining
unmerged pieces.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Assisted-by: Claude:claude-sonnet-4-6
---
 io_uring/io_uring.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index e612a66ee80e..1b657b714373 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -312,7 +312,7 @@ static __always_inline bool io_fill_cqe_req(struct io_ring_ctx *ctx,
 	}
 
 	if (trace_io_uring_complete_enabled())
-		trace_io_uring_complete(req->ctx, req, cqe);
+		trace_call__io_uring_complete(req->ctx, req, cqe);
 	return true;
 }
 
-- 
2.54.0



* [PATCH v3 02/11] net: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Pillai (Google) @ 2026-05-15 13:59 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Aaron Conole, Eelco Chaudron, Ilya Maximets,
	Marcelo Ricardo Leitner, Xin Long, Jon Maloy
  Cc: netdev, bpf, dev, linux-sctp, tipc-discussion, Steven Rostedt,
	linux-trace-kernel, Vineeth Pillai, Peter Zijlstra

From: Vineeth Pillai <vineeth@bitbyteword.org>

Replace trace_foo() with the new trace_call__foo() at sites already
guarded by trace_foo_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__foo() calls the tracepoint callbacks directly without
utilizing the static branch again.

Original v2 series:
https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/

Parts of the original v2 series have already been merged in mainline.
This patch is being reposted as a follow-up cleanup for the remaining
unmerged pieces.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Assisted-by: Claude:claude-sonnet-4-6
---
 net/core/dev.c             | 2 +-
 net/core/xdp.c             | 2 +-
 net/openvswitch/actions.c  | 2 +-
 net/openvswitch/datapath.c | 2 +-
 net/sctp/outqueue.c        | 2 +-
 net/tipc/node.c            | 2 +-
 6 files changed, 6 insertions(+), 6 deletions(-)
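
Two of the sites touched here, netif_receive_skb_list() and
sctp_outq_sack(), call the tracepoint inside a loop that sits under a single
guard. A rough userspace sketch of that shape follows. The names bar_key,
bar_checks, bar_events, trace_bar*() and struct item are hypothetical
stand-ins, and a plain bool models the static key; none of this is the
kernel's implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical singly linked list element standing in for an skb. */
struct item {
	struct item *next;
};

static bool bar_key;            /* models the tracepoint's static key */
static unsigned int bar_checks; /* number of times the key was tested */
static unsigned int bar_events; /* number of events emitted */

/* Models trace_bar_enabled(): one test of the key. */
static bool trace_bar_enabled(void)
{
	bar_checks++;
	return bar_key;
}

/* Models trace_call__bar(): emits the event without testing the key. */
static void trace_call__bar(const struct item *it)
{
	(void)it;
	bar_events++;
}

/* Models trace_bar(): tests the key, then emits the event. */
static void trace_bar(const struct item *it)
{
	if (trace_bar_enabled())
		trace_call__bar(it);
}

/* Before: one guard test plus one more test per list element. */
void walk_before(const struct item *head)
{
	if (trace_bar_enabled()) {
		for (const struct item *it = head; it; it = it->next)
			trace_bar(it);
	}
}

/* After: the guard test is the only test, whatever the list length. */
void walk_after(const struct item *head)
{
	if (trace_bar_enabled()) {
		for (const struct item *it = head; it; it = it->next)
			trace_call__bar(it);
	}
}
```

For a three-element list with the model's key set, walk_before() tests the
key four times and walk_after() tests it once, while both emit three events.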

diff --git a/net/core/dev.c b/net/core/dev.c
index 8bfa8313ef62..12a583ce4d95 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6482,7 +6482,7 @@ void netif_receive_skb_list(struct list_head *head)
 		return;
 	if (trace_netif_receive_skb_list_entry_enabled()) {
 		list_for_each_entry(skb, head, list)
-			trace_netif_receive_skb_list_entry(skb);
+			trace_call__netif_receive_skb_list_entry(skb);
 	}
 	netif_receive_skb_list_internal(head);
 	trace_netif_receive_skb_list_exit(0);
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 9890a30584ba..3003e5c57419 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -362,7 +362,7 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
 		xsk_pool_set_rxq_info(allocator, xdp_rxq);
 
 	if (trace_mem_connect_enabled() && xdp_alloc)
-		trace_mem_connect(xdp_alloc, xdp_rxq);
+		trace_call__mem_connect(xdp_alloc, xdp_rxq);
 	return 0;
 }
 
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 140388a18ae0..7b7c93c3bde4 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -1260,7 +1260,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
 		int err = 0;
 
 		if (trace_ovs_do_execute_action_enabled())
-			trace_ovs_do_execute_action(dp, skb, key, a, rem);
+			trace_call__ovs_do_execute_action(dp, skb, key, a, rem);
 
 		/* Actions that rightfully have to consume the skb should do it
 		 * and return directly.
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index bbbde50fc649..f2b6688f18d6 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -335,7 +335,7 @@ int ovs_dp_upcall(struct datapath *dp, struct sk_buff *skb,
 	int err;
 
 	if (trace_ovs_dp_upcall_enabled())
-		trace_ovs_dp_upcall(dp, skb, key, upcall_info);
+		trace_call__ovs_dp_upcall(dp, skb, key, upcall_info);
 
 	if (upcall_info->portid == 0) {
 		err = -ENOTCONN;
diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index f6b8c13dafa4..4025d863ffc8 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -1267,7 +1267,7 @@ int sctp_outq_sack(struct sctp_outq *q, struct sctp_chunk *chunk)
 	/* SCTP path tracepoint for congestion control debugging. */
 	if (trace_sctp_probe_path_enabled()) {
 		list_for_each_entry(transport, transport_list, transports)
-			trace_sctp_probe_path(transport, asoc);
+			trace_call__sctp_probe_path(transport, asoc);
 	}
 
 	sack_ctsn = ntohl(sack->cum_tsn_ack);
diff --git a/net/tipc/node.c b/net/tipc/node.c
index 97aa970a0d83..6cfe4c40c82b 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1943,7 +1943,7 @@ static bool tipc_node_check_state(struct tipc_node *n, struct sk_buff *skb,
 
 	if (trace_tipc_node_check_state_enabled()) {
 		trace_tipc_skb_dump(skb, false, "skb for node state check");
-		trace_tipc_node_check_state(n, true, " ");
+		trace_call__tipc_node_check_state(n, true, " ");
 	}
 	l = n->links[bearer_id].link;
 	if (!l)
-- 
2.54.0



* [PATCH v3 03/11] accel/habanalabs: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Pillai (Google) @ 2026-05-15 13:59 UTC (permalink / raw)
  To: Koby Elbaz, Konstantin Sinyuk, Oded Gabbay
  Cc: dri-devel, Steven Rostedt, linux-trace-kernel, Vineeth Pillai,
	Peter Zijlstra

From: Vineeth Pillai <vineeth@bitbyteword.org>

Replace trace_foo() with the new trace_call__foo() at sites already
guarded by trace_foo_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__foo() calls the tracepoint callbacks directly without
utilizing the static branch again.

Original v2 series:
https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/

Parts of the original v2 series have already been merged in mainline.
This patch is being reposted as a follow-up cleanup for the remaining
unmerged pieces.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Assisted-by: Claude:claude-sonnet-4-6
---
 drivers/accel/habanalabs/common/device.c  | 9 +++++----
 drivers/accel/habanalabs/common/mmu/mmu.c | 3 ++-
 drivers/accel/habanalabs/common/pci/pci.c | 6 ++++--
 3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/accel/habanalabs/common/device.c b/drivers/accel/habanalabs/common/device.c
index 09b27bac3a31..68c6f973d53f 100644
--- a/drivers/accel/habanalabs/common/device.c
+++ b/drivers/accel/habanalabs/common/device.c
@@ -132,8 +132,9 @@ static void *hl_dma_alloc_common(struct hl_device *hdev, size_t size, dma_addr_t
 	}
 
 	if (trace_habanalabs_dma_alloc_enabled() && !ZERO_OR_NULL_PTR(ptr))
-		trace_habanalabs_dma_alloc(&(hdev)->pdev->dev, (u64) (uintptr_t) ptr, *dma_handle,
-						size, caller);
+		trace_call__habanalabs_dma_alloc(&(hdev)->pdev->dev,
+						 (u64) (uintptr_t) ptr,
+						 *dma_handle, size, caller);
 
 	return ptr;
 }
@@ -2656,7 +2657,7 @@ inline u32 hl_rreg(struct hl_device *hdev, u32 reg)
 	u32 val = readl(hdev->rmmio + reg);
 
 	if (unlikely(trace_habanalabs_rreg32_enabled()))
-		trace_habanalabs_rreg32(&(hdev)->pdev->dev, reg, val);
+		trace_call__habanalabs_rreg32(&(hdev)->pdev->dev, reg, val);
 
 	return val;
 }
@@ -2674,7 +2675,7 @@ inline u32 hl_rreg(struct hl_device *hdev, u32 reg)
 inline void hl_wreg(struct hl_device *hdev, u32 reg, u32 val)
 {
 	if (unlikely(trace_habanalabs_wreg32_enabled()))
-		trace_habanalabs_wreg32(&(hdev)->pdev->dev, reg, val);
+		trace_call__habanalabs_wreg32(&(hdev)->pdev->dev, reg, val);
 
 	writel(val, hdev->rmmio + reg);
 }
diff --git a/drivers/accel/habanalabs/common/mmu/mmu.c b/drivers/accel/habanalabs/common/mmu/mmu.c
index 6c7c4ff8a8a9..dd8b6fb3aa1f 100644
--- a/drivers/accel/habanalabs/common/mmu/mmu.c
+++ b/drivers/accel/habanalabs/common/mmu/mmu.c
@@ -263,7 +263,8 @@ int hl_mmu_unmap_page(struct hl_ctx *ctx, u64 virt_addr, u32 page_size, bool flu
 		mmu_funcs->flush(ctx);
 
 	if (trace_habanalabs_mmu_unmap_enabled() && !rc)
-		trace_habanalabs_mmu_unmap(&hdev->pdev->dev, virt_addr, 0, page_size, flush_pte);
+		trace_call__habanalabs_mmu_unmap(&hdev->pdev->dev, virt_addr, 0,
+						 page_size, flush_pte);
 
 	return rc;
 }
diff --git a/drivers/accel/habanalabs/common/pci/pci.c b/drivers/accel/habanalabs/common/pci/pci.c
index 81cbd8697d4c..12663e8e12e0 100644
--- a/drivers/accel/habanalabs/common/pci/pci.c
+++ b/drivers/accel/habanalabs/common/pci/pci.c
@@ -123,7 +123,8 @@ int hl_pci_elbi_read(struct hl_device *hdev, u64 addr, u32 *data)
 		pci_read_config_dword(pdev, mmPCI_CONFIG_ELBI_DATA, data);
 
 		if (unlikely(trace_habanalabs_elbi_read_enabled()))
-			trace_habanalabs_elbi_read(&hdev->pdev->dev, (u32) addr, val);
+			trace_call__habanalabs_elbi_read(&hdev->pdev->dev,
+							 (u32) addr, val);
 
 		return 0;
 	}
@@ -186,7 +187,8 @@ static int hl_pci_elbi_write(struct hl_device *hdev, u64 addr, u32 data)
 
 	if ((val & PCI_CONFIG_ELBI_STS_MASK) == PCI_CONFIG_ELBI_STS_DONE) {
 		if (unlikely(trace_habanalabs_elbi_write_enabled()))
-			trace_habanalabs_elbi_write(&hdev->pdev->dev, (u32) addr, val);
+			trace_call__habanalabs_elbi_write(&hdev->pdev->dev,
+							  (u32) addr, val);
 		return 0;
 	}
 
-- 
2.54.0



* [PATCH v3 04/11] devfreq: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Pillai (Google) @ 2026-05-15 13:59 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi
  Cc: linux-pm, Steven Rostedt, linux-trace-kernel, Vineeth Pillai,
	Peter Zijlstra

From: Vineeth Pillai <vineeth@bitbyteword.org>

Replace trace_foo() with the new trace_call__foo() at sites already
guarded by trace_foo_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__foo() calls the tracepoint callbacks directly without
utilizing the static branch again.

Original v2 series:
https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/

Parts of the original v2 series have already been merged in mainline.
This patch is being reposted as a follow-up cleanup for the remaining
unmerged pieces.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Assisted-by: Claude:claude-sonnet-4-6
---
 drivers/devfreq/devfreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 82dd9a43dc62..9f71d9dc4a70 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -370,7 +370,7 @@ static int devfreq_set_target(struct devfreq *devfreq, unsigned long new_freq,
 	 * change order of between devfreq device and passive devfreq device.
 	 */
 	if (trace_devfreq_frequency_enabled() && new_freq != cur_freq)
-		trace_devfreq_frequency(devfreq, new_freq, cur_freq);
+		trace_call__devfreq_frequency(devfreq, new_freq, cur_freq);
 
 	freqs.new = new_freq;
 	devfreq_notify_transition(devfreq, &freqs, DEVFREQ_POSTCHANGE);
-- 
2.54.0



* [PATCH v3 05/11] dma-buf: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Pillai (Google) @ 2026-05-15 13:59 UTC (permalink / raw)
  To: Sumit Semwal, Christian König
  Cc: linux-media, dri-devel, linaro-mm-sig, Steven Rostedt,
	linux-trace-kernel, Vineeth Pillai, Peter Zijlstra

From: Vineeth Pillai <vineeth@bitbyteword.org>

Replace trace_foo() with the new trace_call__foo() at sites already
guarded by trace_foo_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__foo() calls the tracepoint callbacks directly without
utilizing the static branch again.

Original v2 series:
https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/

Parts of the original v2 series have already been merged in mainline.
This patch is being reposted as a follow-up cleanup for the remaining
unmerged pieces.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Assisted-by: Claude:claude-sonnet-4-6
---
 drivers/dma-buf/dma-fence.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index a2aa82f4eedd..a41cdd9c9343 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -553,7 +553,7 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
 	}
 	if (trace_dma_fence_wait_end_enabled()) {
 		rcu_read_lock();
-		trace_dma_fence_wait_end(fence);
+		trace_call__dma_fence_wait_end(fence);
 		rcu_read_unlock();
 	}
 	return ret;
-- 
2.54.0



* [PATCH v3 06/11] drm: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Pillai (Google) @ 2026-05-15 13:59 UTC (permalink / raw)
  To: Alex Deucher, Christian König, David Airlie, Simona Vetter,
	Harry Wentland, Leo Li, Matthew Brost, Danilo Krummrich,
	Philipp Stanner, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann
  Cc: amd-gfx, dri-devel, Steven Rostedt, linux-trace-kernel,
	Vineeth Pillai, Peter Zijlstra

From: Vineeth Pillai <vineeth@bitbyteword.org>

Replace trace_foo() with the new trace_call__foo() at sites already
guarded by trace_foo_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__foo() calls the tracepoint callbacks directly without
utilizing the static branch again.

Original v2 series:
https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/

Parts of the original v2 series have already been merged in mainline.
This patch is being reposted as a follow-up cleanup for the remaining
unmerged pieces.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Assisted-by: Claude:claude-sonnet-4-6
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c            |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |  4 ++--
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 +++++-----
 drivers/gpu/drm/scheduler/sched_entity.c          |  5 +++--
 4 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index b24d5d21be5f..cb0b5cb07d57 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1004,7 +1004,7 @@ static void trace_amdgpu_cs_ibs(struct amdgpu_cs_parser *p)
 		struct amdgpu_job *job = p->jobs[i];
 
 		for (j = 0; j < job->num_ibs; ++j)
-			trace_amdgpu_cs(p, job, &job->ibs[j]);
+			trace_call__amdgpu_cs(p, job, &job->ibs[j]);
 	}
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 9ba9de16a27a..a36ae94c425f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1415,7 +1415,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
 
 	if (trace_amdgpu_vm_bo_mapping_enabled()) {
 		list_for_each_entry(mapping, &bo_va->valids, list)
-			trace_amdgpu_vm_bo_mapping(mapping);
+			trace_call__amdgpu_vm_bo_mapping(mapping);
 	}
 
 error_free:
@@ -2183,7 +2183,7 @@ void amdgpu_vm_bo_trace_cs(struct amdgpu_vm *vm, struct ww_acquire_ctx *ticket)
 				continue;
 		}
 
-		trace_amdgpu_vm_bo_cs(mapping);
+		trace_call__amdgpu_vm_bo_cs(mapping);
 	}
 }
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 5fc5d5608506..fbdc12cdd6bb 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -5263,11 +5263,11 @@ static void amdgpu_dm_backlight_set_level(struct amdgpu_display_manager *dm,
 	}
 
 	if (trace_amdgpu_dm_brightness_enabled()) {
-		trace_amdgpu_dm_brightness(__builtin_return_address(0),
-					   user_brightness,
-					   brightness,
-					   caps->aux_support,
-					   power_supply_is_system_supplied() > 0);
+		trace_call__amdgpu_dm_brightness(__builtin_return_address(0),
+						 user_brightness,
+						 brightness,
+						 caps->aux_support,
+						 power_supply_is_system_supplied() > 0);
 	}
 
 	if (caps->aux_support) {
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index fe174a4857be..185a2636b599 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -429,7 +429,8 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity,
 
 	if (trace_drm_sched_job_unschedulable_enabled() &&
 	    !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &entity->dependency->flags))
-		trace_drm_sched_job_unschedulable(sched_job, entity->dependency);
+		trace_call__drm_sched_job_unschedulable(sched_job,
+							entity->dependency);
 
 	if (!dma_fence_add_callback(entity->dependency, &entity->cb,
 				    drm_sched_entity_wakeup))
@@ -586,7 +587,7 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
 		unsigned long index;
 
 		xa_for_each(&sched_job->dependencies, index, entry)
-			trace_drm_sched_job_add_dep(sched_job, entry);
+			trace_call__drm_sched_job_add_dep(sched_job, entry);
 	}
 	atomic_inc(entity->rq->sched->score);
 	WRITE_ONCE(entity->last_user, current->group_leader);
-- 
2.54.0



* [PATCH v3 07/11] HID: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Pillai (Google) @ 2026-05-15 13:59 UTC (permalink / raw)
  To: Srinivas Pandruvada, Jiri Kosina, Benjamin Tissoires
  Cc: linux-input, Steven Rostedt, linux-trace-kernel, Vineeth Pillai,
	Peter Zijlstra

From: Vineeth Pillai <vineeth@bitbyteword.org>

Replace trace_foo() with the new trace_call__foo() at sites already
guarded by trace_foo_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__foo() calls the tracepoint callbacks directly without
utilizing the static branch again.

Original v2 series:
https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/

Parts of the original v2 series have already been merged in mainline.
This patch is being reposted as a follow-up cleanup for the remaining
unmerged pieces.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Assisted-by: Claude:claude-sonnet-4-6
---
 drivers/hid/intel-ish-hid/ipc/pci-ish.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hid/intel-ish-hid/ipc/pci-ish.c b/drivers/hid/intel-ish-hid/ipc/pci-ish.c
index ed3405c05e73..8d36ae96a3ee 100644
--- a/drivers/hid/intel-ish-hid/ipc/pci-ish.c
+++ b/drivers/hid/intel-ish-hid/ipc/pci-ish.c
@@ -110,7 +110,7 @@ void ish_event_tracer(struct ishtp_device *dev, const char *format, ...)
 		vsnprintf(tmp_buf, sizeof(tmp_buf), format, args);
 		va_end(args);
 
-		trace_ishtp_dump(tmp_buf);
+		trace_call__ishtp_dump(tmp_buf);
 	}
 }
 
-- 
2.54.0



* [PATCH v3 08/11] scsi: ufs: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Pillai (Google) @ 2026-05-15 13:59 UTC (permalink / raw)
  To: James E.J. Bottomley, Martin K. Petersen
  Cc: linux-scsi, Steven Rostedt, linux-trace-kernel, Vineeth Pillai,
	Peter Zijlstra

From: Vineeth Pillai <vineeth@bitbyteword.org>

Replace trace_foo() with the new trace_call__foo() at sites already
guarded by trace_foo_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__foo() calls the tracepoint callbacks directly without
utilizing the static branch again.

Original v2 series:
https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/

Parts of the original v2 series have already been merged in mainline.
This patch is being reposted as a follow-up cleanup for the remaining
unmerged pieces.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Assisted-by: Claude:claude-sonnet-4-6
---
 drivers/ufs/core/ufshcd.c | 37 +++++++++++++++++++------------------
 1 file changed, 19 insertions(+), 18 deletions(-)

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index c3f08957d179..07f3126d2a94 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -421,8 +421,8 @@ static void ufshcd_add_cmd_upiu_trace(struct ufs_hba *hba,
 	else
 		header = &lrb->ucd_rsp_ptr->header;
 
-	trace_ufshcd_upiu(hba, str_t, header, &rq->sc.cdb,
-			  UFS_TSF_CDB);
+	trace_call__ufshcd_upiu(hba, str_t, header, &rq->sc.cdb,
+				UFS_TSF_CDB);
 }
 
 static void ufshcd_add_query_upiu_trace(struct ufs_hba *hba,
@@ -432,8 +432,8 @@ static void ufshcd_add_query_upiu_trace(struct ufs_hba *hba,
 	if (!trace_ufshcd_upiu_enabled())
 		return;
 
-	trace_ufshcd_upiu(hba, str_t, &rq_rsp->header,
-			  &rq_rsp->qr, UFS_TSF_OSF);
+	trace_call__ufshcd_upiu(hba, str_t, &rq_rsp->header,
+				&rq_rsp->qr, UFS_TSF_OSF);
 }
 
 static void ufshcd_add_tm_upiu_trace(struct ufs_hba *hba, unsigned int tag,
@@ -445,15 +445,15 @@ static void ufshcd_add_tm_upiu_trace(struct ufs_hba *hba, unsigned int tag,
 		return;
 
 	if (str_t == UFS_TM_SEND)
-		trace_ufshcd_upiu(hba, str_t,
-				  &descp->upiu_req.req_header,
-				  &descp->upiu_req.input_param1,
-				  UFS_TSF_TM_INPUT);
+		trace_call__ufshcd_upiu(hba, str_t,
+					&descp->upiu_req.req_header,
+					&descp->upiu_req.input_param1,
+					UFS_TSF_TM_INPUT);
 	else
-		trace_ufshcd_upiu(hba, str_t,
-				  &descp->upiu_rsp.rsp_header,
-				  &descp->upiu_rsp.output_param1,
-				  UFS_TSF_TM_OUTPUT);
+		trace_call__ufshcd_upiu(hba, str_t,
+					&descp->upiu_rsp.rsp_header,
+					&descp->upiu_rsp.output_param1,
+					UFS_TSF_TM_OUTPUT);
 }
 
 static void ufshcd_add_uic_command_trace(struct ufs_hba *hba,
@@ -470,10 +470,10 @@ static void ufshcd_add_uic_command_trace(struct ufs_hba *hba,
 	else
 		cmd = ufshcd_readl(hba, REG_UIC_COMMAND);
 
-	trace_ufshcd_uic_command(hba, str_t, cmd,
-				 ufshcd_readl(hba, REG_UIC_COMMAND_ARG_1),
-				 ufshcd_readl(hba, REG_UIC_COMMAND_ARG_2),
-				 ufshcd_readl(hba, REG_UIC_COMMAND_ARG_3));
+	trace_call__ufshcd_uic_command(hba, str_t, cmd,
+				       ufshcd_readl(hba, REG_UIC_COMMAND_ARG_1),
+				       ufshcd_readl(hba, REG_UIC_COMMAND_ARG_2),
+				       ufshcd_readl(hba, REG_UIC_COMMAND_ARG_3));
 }
 
 static void ufshcd_add_command_trace(struct ufs_hba *hba, struct scsi_cmnd *cmd,
@@ -522,8 +522,9 @@ static void ufshcd_add_command_trace(struct ufs_hba *hba, struct scsi_cmnd *cmd,
 	} else {
 		doorbell = ufshcd_readl(hba, REG_UTP_TRANSFER_REQ_DOOR_BELL);
 	}
-	trace_ufshcd_command(cmd->device, hba, str_t, tag, doorbell, hwq_id,
-			     transfer_len, intr, lba, opcode, group_id);
+	trace_call__ufshcd_command(cmd->device, hba, str_t, tag, doorbell,
+				   hwq_id, transfer_len, intr, lba, opcode,
+				   group_id);
 }
 
 static void ufshcd_print_clk_freqs(struct ufs_hba *hba)
-- 
2.54.0



* [PATCH v3 09/11] net: devlink: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Pillai (Google) @ 2026-05-15 13:59 UTC (permalink / raw)
  To: Jiri Pirko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: netdev, Steven Rostedt, linux-trace-kernel, Vineeth Pillai,
	Peter Zijlstra

From: Vineeth Pillai <vineeth@bitbyteword.org>

Replace trace_foo() with the new trace_call__foo() at sites already
guarded by trace_foo_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__foo() calls the tracepoint callbacks directly without
utilizing the static branch again.

Original v2 series:
https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/

Parts of the original v2 series have already been merged in mainline.
This patch is being reposted as a follow-up cleanup for the remaining
unmerged pieces.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Assisted-by: Claude:claude-sonnet-4-6
---
 net/devlink/trap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/devlink/trap.c b/net/devlink/trap.c
index 8edb31654a68..d54276dcd62f 100644
--- a/net/devlink/trap.c
+++ b/net/devlink/trap.c
@@ -1497,7 +1497,7 @@ void devlink_trap_report(struct devlink *devlink, struct sk_buff *skb,
 
 		devlink_trap_report_metadata_set(&metadata, trap_item,
 						 in_devlink_port, fa_cookie);
-		trace_devlink_trap_report(devlink, skb, &metadata);
+		trace_call__devlink_trap_report(devlink, skb, &metadata);
 	}
 }
 EXPORT_SYMBOL_GPL(devlink_trap_report);
-- 
2.54.0



* [PATCH v3 10/11] kernel: time, trace: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Pillai (Google) @ 2026-05-15 13:59 UTC (permalink / raw)
  To: Anna-Maria Behnsen, Frederic Weisbecker, Ingo Molnar,
	Thomas Gleixner, Steven Rostedt, Masami Hiramatsu
  Cc: linux-kernel, linux-trace-kernel, Vineeth Pillai, Peter Zijlstra

From: Vineeth Pillai <vineeth@bitbyteword.org>

Replace trace_foo() with the new trace_call__foo() at sites already
guarded by trace_foo_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__foo() calls the tracepoint callbacks directly without
utilizing the static branch again.

Original v2 series:
https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/

Parts of the original v2 series have already been merged in mainline.
This patch is being reposted as a follow-up cleanup for the remaining
unmerged pieces.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Assisted-by: Claude:claude-sonnet-4-6
---
 kernel/time/tick-sched.c       | 12 ++++++------
 kernel/trace/trace_benchmark.c |  2 +-
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index cbbb87a0c6e7..3b42ee75f48c 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -348,32 +348,32 @@ static bool check_tick_dependency(atomic_t *dep)
 		return !!val;
 
 	if (val & TICK_DEP_MASK_POSIX_TIMER) {
-		trace_tick_stop(0, TICK_DEP_MASK_POSIX_TIMER);
+		trace_call__tick_stop(0, TICK_DEP_MASK_POSIX_TIMER);
 		return true;
 	}
 
 	if (val & TICK_DEP_MASK_PERF_EVENTS) {
-		trace_tick_stop(0, TICK_DEP_MASK_PERF_EVENTS);
+		trace_call__tick_stop(0, TICK_DEP_MASK_PERF_EVENTS);
 		return true;
 	}
 
 	if (val & TICK_DEP_MASK_SCHED) {
-		trace_tick_stop(0, TICK_DEP_MASK_SCHED);
+		trace_call__tick_stop(0, TICK_DEP_MASK_SCHED);
 		return true;
 	}
 
 	if (val & TICK_DEP_MASK_CLOCK_UNSTABLE) {
-		trace_tick_stop(0, TICK_DEP_MASK_CLOCK_UNSTABLE);
+		trace_call__tick_stop(0, TICK_DEP_MASK_CLOCK_UNSTABLE);
 		return true;
 	}
 
 	if (val & TICK_DEP_MASK_RCU) {
-		trace_tick_stop(0, TICK_DEP_MASK_RCU);
+		trace_call__tick_stop(0, TICK_DEP_MASK_RCU);
 		return true;
 	}
 
 	if (val & TICK_DEP_MASK_RCU_EXP) {
-		trace_tick_stop(0, TICK_DEP_MASK_RCU_EXP);
+		trace_call__tick_stop(0, TICK_DEP_MASK_RCU_EXP);
 		return true;
 	}
 
diff --git a/kernel/trace/trace_benchmark.c b/kernel/trace/trace_benchmark.c
index e19c32f2a938..189d383934fd 100644
--- a/kernel/trace/trace_benchmark.c
+++ b/kernel/trace/trace_benchmark.c
@@ -51,7 +51,7 @@ static void trace_do_benchmark(void)
 
 	local_irq_disable();
 	start = trace_clock_local();
-	trace_benchmark_event(bm_str, bm_last);
+	trace_call__benchmark_event(bm_str, bm_last);
 	stop = trace_clock_local();
 	local_irq_enable();
 
-- 
2.54.0



* [PATCH v3 11/11] x86: msr: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Pillai (Google) @ 2026-05-15 14:00 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
  Cc: linux-kernel, Steven Rostedt, linux-trace-kernel, Vineeth Pillai,
	Peter Zijlstra

From: Vineeth Pillai <vineeth@bitbyteword.org>

Replace trace_foo() with the new trace_call__foo() at sites already
guarded by trace_foo_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__foo() calls the tracepoint callbacks directly without
utilizing the static branch again.

Original v2 series:
https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/

Parts of the original v2 series have already been merged in mainline.
This patch is being reposted as a follow-up cleanup for the remaining
unmerged pieces.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Assisted-by: Claude:claude-sonnet-4-6
---
 arch/x86/lib/msr.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/lib/msr.c b/arch/x86/lib/msr.c
index dfdd1da89f36..14785fe5e07b 100644
--- a/arch/x86/lib/msr.c
+++ b/arch/x86/lib/msr.c
@@ -125,21 +125,21 @@ EXPORT_SYMBOL_FOR_KVM(msr_clear_bit);
 #ifdef CONFIG_TRACEPOINTS
 void do_trace_write_msr(u32 msr, u64 val, int failed)
 {
-	trace_write_msr(msr, val, failed);
+	trace_call__write_msr(msr, val, failed);
 }
 EXPORT_SYMBOL(do_trace_write_msr);
 EXPORT_TRACEPOINT_SYMBOL(write_msr);
 
 void do_trace_read_msr(u32 msr, u64 val, int failed)
 {
-	trace_read_msr(msr, val, failed);
+	trace_call__read_msr(msr, val, failed);
 }
 EXPORT_SYMBOL(do_trace_read_msr);
 EXPORT_TRACEPOINT_SYMBOL(read_msr);
 
 void do_trace_rdpmc(u32 msr, u64 val, int failed)
 {
-	trace_rdpmc(msr, val, failed);
+	trace_call__rdpmc(msr, val, failed);
 }
 EXPORT_SYMBOL(do_trace_rdpmc);
 EXPORT_TRACEPOINT_SYMBOL(rdpmc);
-- 
2.54.0



* [PATCH 1/3] cpufreq: amd-pstate: Use trace_call__##name() at guarded tracepoint call site
From: Vineeth Pillai (Google) @ 2026-05-15 14:01 UTC (permalink / raw)
  To: Huang Rui, Mario Limonciello, Rafael J. Wysocki, Viresh Kumar
  Cc: linux-pm, Steven Rostedt, linux-trace-kernel, Vineeth Pillai,
	Peter Zijlstra

From: Vineeth Pillai <vineeth@bitbyteword.org>

Replace trace_foo() with the new trace_call__foo() at sites already
guarded by trace_foo_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__foo() calls the tracepoint callbacks directly without
utilizing the static branch again.

Original v2 series:
https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/

Parts of the original v2 series have already been merged in mainline.
This patch is being reposted as a follow-up cleanup for the remaining
unmerged pieces.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Assisted-by: Claude:claude-sonnet-4-6
---
 drivers/cpufreq/amd-pstate.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 453084c67327..4722de25149b 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -368,7 +368,8 @@ static int amd_pstate_set_floor_perf(struct cpufreq_policy *policy, u8 perf)
 
 out_trace:
 	if (trace_amd_pstate_cppc_req2_enabled())
-		trace_amd_pstate_cppc_req2(cpudata->cpu, perf, changed, ret);
+		trace_call__amd_pstate_cppc_req2(cpudata->cpu, perf, changed,
+						 ret);
 	return ret;
 }
 
-- 
2.54.0



* [PATCH 2/3] drm/xe: Use trace_call__##name() at guarded tracepoint call site
From: Vineeth Pillai (Google) @ 2026-05-15 14:01 UTC (permalink / raw)
  To: Matthew Brost, Thomas Hellström, Rodrigo Vivi, David Airlie,
	Simona Vetter
  Cc: intel-xe, dri-devel, Steven Rostedt, linux-trace-kernel,
	Vineeth Pillai, Peter Zijlstra

From: Vineeth Pillai <vineeth@bitbyteword.org>

Replace trace_foo() with the new trace_call__foo() at sites already
guarded by trace_foo_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__foo() calls the tracepoint callbacks directly without
utilizing the static branch again.

Original v2 series:
https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/

Parts of the original v2 series have already been merged in mainline.
This patch is being reposted as a follow-up cleanup for the remaining
unmerged pieces.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Assisted-by: Claude:claude-sonnet-4-6
---
 drivers/gpu/drm/xe/xe_guc_ct.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
index a11cff7a20be..8a10a1ede983 100644
--- a/drivers/gpu/drm/xe/xe_guc_ct.c
+++ b/drivers/gpu/drm/xe/xe_guc_ct.c
@@ -1032,8 +1032,9 @@ static int h2g_write(struct xe_guc_ct *ct, const u32 *action, u32 len,
 	 * the fast H2G submission path when tracing is not active.
 	 */
 	if (trace_xe_guc_ctb_h2g_enabled())
-		trace_xe_guc_ctb_h2g(xe, gt->info.id, *(action - 1), full_len,
-				     desc_read(xe, h2g, head), h2g->info.tail);
+		trace_call__xe_guc_ctb_h2g(xe, gt->info.id, *(action - 1),
+					   full_len, desc_read(xe, h2g, head),
+					   h2g->info.tail);
 
 	return 0;
 
-- 
2.54.0


^ permalink raw reply related

* [PATCH 3/3] drm/panthor: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Pillai (Google) @ 2026-05-15 14:02 UTC (permalink / raw)
  To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter
  Cc: dri-devel, Steven Rostedt, linux-trace-kernel, Vineeth Pillai,
	Peter Zijlstra

From: Vineeth Pillai <vineeth@bitbyteword.org>

Replace trace_foo() with the new trace_call__foo() at sites already
guarded by trace_foo_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__foo() calls the tracepoint callbacks directly without
utilizing the static branch again.

Original v2 series:
https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/

Parts of the original v2 series have already been merged in mainline.
This patch is being reposted as a follow-up cleanup for the remaining
unmerged pieces.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Assisted-by: Claude:claude-sonnet-4-6
---
 drivers/gpu/drm/panthor/panthor_fw.c  | 4 ++--
 drivers/gpu/drm/panthor/panthor_gpu.c | 8 ++++----
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index 8886002e1d31..601d464b312c 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -1080,10 +1080,10 @@ static void panthor_job_irq_handler(struct panthor_device *ptdev, u32 status)
 
 	panthor_sched_report_fw_events(ptdev, status);
 
-	if (tracepoint_enabled(gpu_job_irq) && start) {
+	if (start) {
 		if (check_sub_overflow(ktime_get_ns(), start, &duration))
 			duration = U32_MAX;
-		trace_gpu_job_irq(ptdev->base.dev, status, duration);
+		trace_call__gpu_job_irq(ptdev->base.dev, status, duration);
 	}
 }
 PANTHOR_IRQ_HANDLER(job, JOB, panthor_job_irq_handler);
diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
index 2ab444ee8c71..b19754d7093c 100644
--- a/drivers/gpu/drm/panthor/panthor_gpu.c
+++ b/drivers/gpu/drm/panthor/panthor_gpu.c
@@ -87,10 +87,10 @@ static void panthor_gpu_irq_handler(struct panthor_device *ptdev, u32 status)
 	gpu_write(ptdev, GPU_INT_CLEAR, status);
 
 	if (tracepoint_enabled(gpu_power_status) && (status & GPU_POWER_INTERRUPTS_MASK))
-		trace_gpu_power_status(ptdev->base.dev,
-				       gpu_read64(ptdev, SHADER_READY),
-				       gpu_read64(ptdev, TILER_READY),
-				       gpu_read64(ptdev, L2_READY));
+		trace_call__gpu_power_status(ptdev->base.dev,
+					     gpu_read64(ptdev, SHADER_READY),
+					     gpu_read64(ptdev, TILER_READY),
+					     gpu_read64(ptdev, L2_READY));
 
 	if (status & GPU_IRQ_FAULT) {
 		u32 fault_status = gpu_read(ptdev, GPU_FAULT_STATUS);
-- 
2.54.0


^ permalink raw reply related
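
[Editorial sketch, not part of the posted series] The three patches above
apply the same conversion: a call site already guarded by
trace_foo_enabled() calls trace_call__foo() instead of trace_foo(), so the
enabled check is not evaluated a second time. Below is a minimal user-space
model of that idea. Everything named "demo" is hypothetical, the static key
is replaced by a plain bool plus a counter, and the real kernel macros are
not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the tracepoint's static key. */
static bool demo_key_enabled;
static unsigned int demo_key_checks;	/* times the key was tested */
static unsigned int demo_events;	/* times the callbacks ran */

/* Models trace_foo_enabled(): tests the key. */
static bool trace_demo_enabled(void)
{
	demo_key_checks++;
	return demo_key_enabled;
}

/* Models trace_call__foo(): runs the callbacks without testing the key. */
static void trace_call__demo(int value)
{
	demo_events++;
	printf("demo event: %d\n", value);
}

/* Models trace_foo(): tests the key, then runs the callbacks. */
static void trace_demo(int value)
{
	if (trace_demo_enabled())
		trace_call__demo(value);
}

/* Stand-in for argument setup the caller wants to skip when disabled. */
static int demo_expensive_arg(void)
{
	return 42;
}

/* Before the conversion: the key is tested twice when tracing is on. */
static void demo_site_before(void)
{
	if (trace_demo_enabled())
		trace_demo(demo_expensive_arg());
}

/* After the conversion: the key is tested once. */
static void demo_site_after(void)
{
	if (trace_demo_enabled())
		trace_call__demo(demo_expensive_arg());
}
```

With the key enabled, demo_site_before() tests it twice per event and
demo_site_after() tests it once; with the key disabled, neither variant
runs the callbacks or evaluates the argument.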

* Re: (subset) [PATCH v3 01/11] io_uring: Use trace_call__##name() at guarded tracepoint call sites
From: Jens Axboe @ 2026-05-15 14:02 UTC (permalink / raw)
  To: Vineeth Pillai (Google)
  Cc: io-uring, Steven Rostedt, linux-trace-kernel, Peter Zijlstra
In-Reply-To: <20260515135903.2238731-1-vineeth@bitbyteword.org>


On Fri, 15 May 2026 09:59:03 -0400, Vineeth Pillai (Google) wrote:
> Replace trace_foo() with the new trace_call__foo() at sites already
> guarded by trace_foo_enabled(), avoiding a redundant
> static_branch_unlikely() re-evaluation inside the tracepoint.
> trace_call__foo() calls the tracepoint callbacks directly without
> utilizing the static branch again.
> 
> Original v2 series:
> https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/
> 
> [...]

Applied, thanks!

[01/11] io_uring: Use trace_call__##name() at guarded tracepoint call sites
        commit: cf9a29544a01ff818c7f0a01716dc5e48f8ad7b5

Best regards,
-- 
Jens Axboe




^ permalink raw reply

* Re: [PATCH v3 01/11] io_uring: Use trace_call__##name() at guarded tracepoint call sites
From: Steven Rostedt @ 2026-05-15 14:04 UTC (permalink / raw)
  To: Vineeth Pillai (Google)
  Cc: Jens Axboe, io-uring, linux-trace-kernel, Peter Zijlstra
In-Reply-To: <20260515135903.2238731-1-vineeth@bitbyteword.org>

On Fri, 15 May 2026 09:59:03 -0400
"Vineeth Pillai (Google)" <vineeth@bitbyteword.org> wrote:

> From: Vineeth Pillai <vineeth@bitbyteword.org>
> 

Hi Vineeth,

> Replace trace_foo() with the new trace_call__foo() at sites already
> guarded by trace_foo_enabled(), avoiding a redundant
> static_branch_unlikely() re-evaluation inside the tracepoint.
> trace_call__foo() calls the tracepoint callbacks directly without
> utilizing the static branch again.
> 

> Original v2 series:
> https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/
> 
> Parts of the original v2 series have already been merged in mainline.
> This patch is being reposted as a follow-up cleanup for the remaining
> unmerged pieces.

This part should go below the '---'. There's no reason to add it to the git
change log.

You should probably also state that these can now go in individually as all
the dependencies are upstream.

> 
> Suggested-by: Steven Rostedt <rostedt@goodmis.org>
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
> Assisted-by: Claude:claude-sonnet-4-6
> ---

  <<here>>

Thanks,

-- Steve

>  io_uring/io_uring.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
> index e612a66ee80e..1b657b714373 100644
> --- a/io_uring/io_uring.h
> +++ b/io_uring/io_uring.h
> @@ -312,7 +312,7 @@ static __always_inline bool io_fill_cqe_req(struct io_ring_ctx *ctx,
>  	}
>  
>  	if (trace_io_uring_complete_enabled())
> -		trace_io_uring_complete(req->ctx, req, cqe);
> +		trace_call__io_uring_complete(req->ctx, req, cqe);
>  	return true;
>  }
>  


^ permalink raw reply

* Re: [PATCH v3 01/11] io_uring: Use trace_call__##name() at guarded tracepoint call sites
From: Jens Axboe @ 2026-05-15 14:06 UTC (permalink / raw)
  To: Steven Rostedt, Vineeth Pillai (Google)
  Cc: io-uring, linux-trace-kernel, Peter Zijlstra
In-Reply-To: <20260515100448.715589f6@gandalf.local.home>

On 5/15/26 8:04 AM, Steven Rostedt wrote:
> On Fri, 15 May 2026 09:59:03 -0400
> "Vineeth Pillai (Google)" <vineeth@bitbyteword.org> wrote:
> 
>> From: Vineeth Pillai <vineeth@bitbyteword.org>
>>
> 
> Hi Vineeth,
> 
>> Replace trace_foo() with the new trace_call__foo() at sites already
>> guarded by trace_foo_enabled(), avoiding a redundant
>> static_branch_unlikely() re-evaluation inside the tracepoint.
>> trace_call__foo() calls the tracepoint callbacks directly without
>> utilizing the static branch again.
>>
> 
>> Original v2 series:
>> https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/
>>
>> Parts of the original v2 series have already been merged in mainline.
>> This patch is being reposted as a follow-up cleanup for the remaining
>> unmerged pieces.
> 
> This part should go below the '---'. There's no reason to add it to the git
> change log.

I pruned it.

> You should probably also state that these can now go in individually as all
> the dependencies are upstream.

I think he did, at least that's how I read it.


-- 
Jens Axboe


^ permalink raw reply

* Re: [PATCH v3 01/11] io_uring: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Remanan Pillai @ 2026-05-15 14:14 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Steven Rostedt, io-uring, linux-trace-kernel, Peter Zijlstra
In-Reply-To: <49e77605-6227-426e-8103-329474bf88f9@kernel.dk>

On Fri, May 15, 2026 at 10:06 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 5/15/26 8:04 AM, Steven Rostedt wrote:
> > On Fri, 15 May 2026 09:59:03 -0400
> > "Vineeth Pillai (Google)" <vineeth@bitbyteword.org> wrote:
> >
> >> From: Vineeth Pillai <vineeth@bitbyteword.org>
> >>
> >
> > Hi Vineeth,
> >
> >> Replace trace_foo() with the new trace_call__foo() at sites already
> >> guarded by trace_foo_enabled(), avoiding a redundant
> >> static_branch_unlikely() re-evaluation inside the tracepoint.
> >> trace_call__foo() calls the tracepoint callbacks directly without
> >> utilizing the static branch again.
> >>
> >
> >> Original v2 series:
> >> https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/
> >>
> >> Parts of the original v2 series have already been merged in mainline.
> >> This patch is being reposted as a follow-up cleanup for the remaining
> >> unmerged pieces.
> >
> > This part should go below the '---'. There's no reason to add it to the git
> > change log.
>
Ahh sorry about this.

> I pruned it.
>
Thanks Jen :-). I can probably send a follow-up email directly to the
maintainers to prune this part, similar to what Jen did. I guess one
more version might feel like spam.


> > You should probably also state that these can now go in individually as all
> > the dependencies are upstream.
>
> I think he did, at least that's how I read it.
>
Yeah, that was my intention; I'm not sure I worded it correctly. I will
include this in the follow-up email to the maintainers for the rest of
the patches.

Thanks,
Vineeth

^ permalink raw reply

* Re: [PATCH v3 01/11] io_uring: Use trace_call__##name() at guarded tracepoint call sites
From: Jens Axboe @ 2026-05-15 14:25 UTC (permalink / raw)
  To: Vineeth Remanan Pillai
  Cc: Steven Rostedt, io-uring, linux-trace-kernel, Peter Zijlstra
In-Reply-To: <CAO7JXPg+MJXF8smC9qXs93YziJT_amQwWKVW38L7F5XdS9-SaA@mail.gmail.com>

On 5/15/26 8:14 AM, Vineeth Remanan Pillai wrote:
> Thanks Jen :-). I can probably send a follow-up email directly to the
> maintainers to prune this part, similar to what Jen did. I guess one
> more version might feel like spam.

Jens...

-- 
Jens Axboe

^ permalink raw reply
