Linux Trace Kernel
 help / color / mirror / Atom feed
From: Gabriele Monaco <gmonaco@redhat.com>
To: wen.yang@linux.dev, Steven Rostedt <rostedt@goodmis.org>
Cc: linux-trace-kernel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v2 08/10] rv/tlob: add tlob hybrid automaton monitor
Date: Fri, 15 May 2026 15:08:35 +0200	[thread overview]
Message-ID: <9afb6ae43ebde1308eab97a8e8025d4ab5d6da45.camel@redhat.com> (raw)
In-Reply-To: <fe5ed6a9a0a911e6ec74dc06c453786a2c4fb6d1.1778522945.git.wen.yang@linux.dev>

On Tue, 2026-05-12 at 02:24 +0800, wen.yang@linux.dev wrote:
> From: Wen Yang <wen.yang@linux.dev>
> 
> diff --git a/Documentation/trace/rv/monitor_tlob.rst
> b/Documentation/trace/rv/monitor_tlob.rst
> new file mode 100644
> index 000000000000..91b592630b3f
> --- /dev/null
> +++ b/Documentation/trace/rv/monitor_tlob.rst
> +Usage
> +-----
> +
> +tracefs interface (uprobe-based external monitoring)
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The ``monitor`` tracefs file instruments an unmodified binary via uprobes.
> +The format follows the ftrace ``uprobe_events`` convention (``PATH:OFFSET``
> +for the probe location, ``key=value`` for configuration parameters)::
> +
> +  p PATH:OFFSET_START OFFSET_STOP threshold=US
> +
> +The uprobe at ``OFFSET_START`` fires ``tlob_start_task()``; the uprobe at
> +``OFFSET_STOP`` fires ``tlob_stop_task()``.  Both offsets are ELF file
> +offsets of entry points in ``PATH``.  ``PATH`` may contain ``:``; the last
> +``:`` in the ``PATH:OFFSET_START`` token is the separator.
> +
> +To remove a binding, use ``-PATH:OFFSET_START``::
> +
> +  echo 1 > /sys/kernel/tracing/rv/monitors/tlob/enable
> +
> +  echo "p /usr/bin/myapp:0x12a0 0x12f0 threshold=5000" \
> +      > /sys/kernel/tracing/rv/monitors/tlob/monitor
> +
> +  # Remove a binding
> +  echo "-/usr/bin/myapp:0x12a0" >
> /sys/kernel/tracing/rv/monitors/tlob/monitor
> +
> +  # List registered bindings
> +  cat /sys/kernel/tracing/rv/monitors/tlob/monitor
> +
> +  # Read violations from the trace buffer
> +  cat /sys/kernel/tracing/trace
> +
> +ioctl self-instrumentation (/dev/rv)
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I'm not particularly fond of ioctls, they aren't that flexible and in
this way I don't really see an added value.

In short, you're adding this so a program could instrument itself using
ioctls instead of using uprobes, cannot the same thing be achieved using
uprobes alone, e.g. by registering a function address or the current
instruction pointer?

If you really cannot do it with uprobes alone, wouldn't a sysfs/tracefs file
achieve a similar purpose without much of the boilerplate code?

> +
> +``/dev/rv`` is a shared RV character device.  Before using any monitor-
> specific
> +ioctl, the fd must be bound to a monitor via ``RV_IOCTL_BIND_MONITOR``.  Each
> +open fd has independent per-fd monitoring state::
> +
> +  int fd = open("/dev/rv", O_RDWR);
> +
> +  /* Bind this fd to the tlob monitor. */
> +  struct rv_bind_args bind = { .monitor_name = "tlob" };
> +  ioctl(fd, RV_IOCTL_BIND_MONITOR, &bind);
> +
> +  struct tlob_start_args args = {
> +      .threshold_us = 50000,   /* 50 ms in microseconds */
> +  };
> +  ioctl(fd, TLOB_IOCTL_TRACE_START, &args);
> +
> +  /* ... code path under observation ... */
> +
> +  int ret = ioctl(fd, TLOB_IOCTL_TRACE_STOP, NULL);
> +  /* ret == 0:          within budget  */
> +  /* ret == -EOVERFLOW: budget exceeded */
> +
> +  close(fd);
> +
> +``TRACE_STOP`` returns ``-EOVERFLOW`` whenever the budget was exceeded.
> +The HA timer calls ``da_monitor_reset()`` (storage remains); the
> +synchronous ``ha_cancel_timer_sync()`` in ``tlob_stop_task()`` ensures the
> +callback has completed before checking ``da_monitoring()``.
> +
> +Violation events
> +~~~~~~~~~~~~~~~~

Since you are not documenting the detail_env_tlob tracepoint, is it
something really required?
It's deviating from the original RV purpose (run a model and spot
violations) by adding further accounting, I'm fine with that if there is a
documented need.
In such case I would at the very least document its usage (thought I'd
really like to be rid of it and let the curious user implement the
accounting themselves).
> +
> +Budget violations are always reported via the ``error_env_tlob`` RV
> +tracepoint (HA clock-invariant violation), regardless of which interface
> +triggered them::
> +
> +  cat /sys/kernel/tracing/trace
> +
> +To capture violations in a file::
> +
> +  trace-cmd record -e error_env_tlob &
> +  # ... run workload ...
> +  trace-cmd report
> +

This is standard tracepoints usage, there's nothing about tlob we should
document here. If you feel the existing RV documentation should expand
this subject, feel free to contribute there.

> +tracefs files
> +-------------
> +
> +The following files are created under
> +``/sys/kernel/tracing/rv/monitors/tlob/``:
> +
> +``enable`` (rw)
> +  Write ``1`` to enable the monitor; write ``0`` to disable it.
> +
> +``desc`` (ro)
> +  Human-readable description of the monitor.

Same here, standard RV.

> +
> +``monitor`` (rw)
> +  Write ``p PATH:OFFSET_START OFFSET_STOP threshold=US``
> +  to bind two entry uprobes.  Write ``-PATH:OFFSET_START`` to remove a
> +  binding.  Read to list registered bindings in the same format.

And this is duplicating what mentioned above about uprobes, isn't it?

> +
> +Kernel API
> +----------
> +
> +.. kernel-doc:: kernel/trace/rv/monitors/tlob/tlob.c
> +   :functions: tlob_start_task tlob_stop_task
> +
> +``tlob_start_task(task, threshold_us)``
> +  Begin monitoring *task* with a total latency budget of *threshold_us*
> +  microseconds.  Allocates per-task state, sets initial DA state to
> +  ``running``, resets ``clk_elapsed``, and arms the HA budget timer.
> +  Returns 0, -ENODEV (monitor disabled), -ERANGE (zero threshold),
> +  -EALREADY (already monitoring), -ENOSPC (at capacity), or -ENOMEM.
> +
> +``tlob_stop_task(task)``
> +  Stop monitoring *task*.  Synchronously cancels the HA timer via
> +  ``ha_cancel_timer_sync()``, checks ``da_monitoring()`` to determine
> outcome.
> +  Returns 0 (clean stop, within budget), -EOVERFLOW (budget was exceeded),
> +  -ESRCH (not monitored), or -EAGAIN (concurrent stop racing).
> +

Is kernel code going to use this API? RV monitors are meant to be
enabled by userspace. What's the use-case here?

> +Design notes
> +------------
> +
> +State transitions are driven by two tracepoints:
> +
> +- ``sched_switch``: ``prev_state == 0`` (``TASK_RUNNING``, preempted,
> +  stays on runqueue) → running→waiting; ``prev_state != 0`` (voluntarily
> +  blocked, leaves runqueue) → running→sleeping; ``next`` pointer →
> +  waiting→running.
> +- ``sched_wakeup``: task moves back onto the runqueue → sleeping→waiting.
> +
> +No ``waiting → sleeping`` edge exists because a task can only block
> +itself while executing on CPU.  ``try_to_wake_up()`` is also a no-op
> +when ``__state == TASK_RUNNING``, so ``sched_wakeup`` never fires while
> +the task is in ``waiting`` state.

That's probably a bit too detailed for this page.
If you really want this information somewhere couldn't it stay in the
code?

Thanks,
Gabriele


  parent reply	other threads:[~2026-05-15 13:08 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-11 18:24 [RFC PATCH v2 00/10] rv/tlob: Add task latency over budget RV monitor wen.yang
2026-05-11 18:24 ` [RFC PATCH v2 01/10] rv/da: fix monitor start ordering and memory ordering for monitoring flag wen.yang
2026-05-13 12:39   ` Gabriele Monaco
2026-05-11 18:24 ` [RFC PATCH v2 02/10] rv/da: fix per-task da_monitor_destroy() ordering and sync wen.yang
2026-05-12  8:27   ` Gabriele Monaco
2026-05-12  9:09     ` Gabriele Monaco
2026-05-13  5:32       ` Wen Yang
2026-05-13  9:31         ` Gabriele Monaco
2026-05-11 18:24 ` [RFC PATCH v2 03/10] selftests/verification: fix verificationtest-ktap for out-of-tree execution wen.yang
2026-05-13  8:32   ` Gabriele Monaco
2026-05-11 18:24 ` [RFC PATCH v2 04/10] rv/da: add pre-allocated storage pool for per-object monitors wen.yang
2026-05-13 13:47   ` Gabriele Monaco
2026-05-13 13:50   ` Gabriele Monaco
2026-05-13 14:01   ` Gabriele Monaco
2026-05-15  8:30     ` [PATCH] Re: " Gabriele Monaco
2026-05-11 18:24 ` [RFC PATCH v2 05/10] rv: add generic uprobe infrastructure for RV monitors wen.yang
2026-05-11 18:24 ` [RFC PATCH v2 06/10] rvgen: support reset() on the __init arrow for global-window HA clocks wen.yang
2026-05-12 13:25   ` Gabriele Monaco
2026-05-11 18:24 ` [RFC PATCH v2 07/10] rv/tlob: add tlob model DOT file wen.yang
2026-05-11 18:24 ` [RFC PATCH v2 08/10] rv/tlob: add tlob hybrid automaton monitor wen.yang
2026-05-15  9:53   ` Gabriele Monaco
2026-05-15 13:08   ` Gabriele Monaco [this message]
2026-05-11 18:24 ` [RFC PATCH v2 09/10] rv/tlob: add KUnit tests for the tlob monitor wen.yang
2026-05-15 13:13   ` Gabriele Monaco
2026-05-11 18:24 ` [RFC PATCH v2 10/10] selftests/verification: add tlob selftests wen.yang
2026-05-13  7:46   ` Gabriele Monaco
2026-05-15 13:23   ` Gabriele Monaco

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9afb6ae43ebde1308eab97a8e8025d4ab5d6da45.camel@redhat.com \
    --to=gmonaco@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=wen.yang@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox