Linux Trace Kernel
* [RFC PATCH v3 18/28] mm/damon: trace probe_hits
From: SeongJae Park @ 2026-05-16 18:36 UTC
  Cc: SeongJae Park, Andrew Morton, Masami Hiramatsu, Mathieu Desnoyers,
	Steven Rostedt, damon, linux-kernel, linux-mm, linux-trace-kernel
In-Reply-To: <20260516183712.81393-1-sj@kernel.org>

Introduce a new tracepoint for exposing the per-region per-probe
positive sample count via tracefs.

Signed-off-by: SeongJae Park <sj@kernel.org>
---
 include/trace/events/damon.h | 38 ++++++++++++++++++++++++++++++++++++
 mm/damon/core.c              |  9 +++++++++
 2 files changed, 47 insertions(+)

diff --git a/include/trace/events/damon.h b/include/trace/events/damon.h
index 24fc402ab3c85..2fd914895c405 100644
--- a/include/trace/events/damon.h
+++ b/include/trace/events/damon.h
@@ -130,6 +130,44 @@ TRACE_EVENT(damon_monitor_intervals_tune,
 	TP_printk("sample_us=%lu", __entry->sample_us)
 );
 
+TRACE_EVENT_CONDITION(damon_region_aggregated,
+
+	TP_PROTO(unsigned int target_id, struct damon_region *r,
+		unsigned int nr_regions, unsigned int nr_probes),
+
+	TP_ARGS(target_id, r, nr_regions, nr_probes),
+
+	TP_CONDITION(nr_probes > 0),
+
+	TP_STRUCT__entry(
+		__field(unsigned long, target_id)
+		__field(unsigned long, start)
+		__field(unsigned long, end)
+		__field(unsigned int, nr_regions)
+		__field(unsigned int, nr_accesses)
+		__field(unsigned int, age)
+		__dynamic_array(unsigned char, probe_hits, nr_probes)
+	),
+
+	TP_fast_assign(
+		__entry->target_id = target_id;
+		__entry->start = r->ar.start;
+		__entry->end = r->ar.end;
+		__entry->nr_regions = nr_regions;
+		__entry->nr_accesses = r->nr_accesses;
+		__entry->age = r->age;
+		memcpy(__get_dynamic_array(probe_hits), r->probe_hits,
+			sizeof(*r->probe_hits) * nr_probes);
+	),
+
+	TP_printk("target_id=%lu nr_regions=%u %lu-%lu: %u %u probe_hits=%s",
+			__entry->target_id, __entry->nr_regions,
+			__entry->start, __entry->end,
+			__entry->nr_accesses, __entry->age,
+			__print_hex(__get_dynamic_array(probe_hits),
+				__get_dynamic_array_len(probe_hits)))
+);
+
 TRACE_EVENT(damon_aggregated,
 
 	TP_PROTO(unsigned int target_id, struct damon_region *r,
diff --git a/mm/damon/core.c b/mm/damon/core.c
index dde3c8d8fef89..11b513eb077fe 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1881,6 +1881,13 @@ static void kdamond_reset_aggregated(struct damon_ctx *c)
 {
 	struct damon_target *t;
 	unsigned int ti = 0;	/* target's index */
+	unsigned int nr_probes = 0;
+	struct damon_probe *probe;
+
+	if (trace_damon_region_aggregated_enabled()) {
+		damon_for_each_probe(probe, c)
+			nr_probes++;
+	}
 
 	damon_for_each_target(t, c) {
 		struct damon_region *r;
@@ -1889,6 +1896,8 @@ static void kdamond_reset_aggregated(struct damon_ctx *c)
 			int i;
 
 			trace_damon_aggregated(ti, r, damon_nr_regions(t));
+			trace_damon_region_aggregated(ti, r,
+					damon_nr_regions(t), nr_probes);
 			damon_warn_fix_nr_accesses_corruption(r);
 			r->last_nr_accesses = r->nr_accesses;
 			r->nr_accesses = 0;
-- 
2.47.3


* [RFC PATCH v3 00/28] mm/damon: introduce data attributes monitoring
From: SeongJae Park @ 2026-05-16 18:36 UTC
  Cc: SeongJae Park, Liam R. Howlett, Andrew Morton, David Hildenbrand,
	Jonathan Corbet, Lorenzo Stoakes, Masami Hiramatsu,
	Mathieu Desnoyers, Michal Hocko, Mike Rapoport, Shuah Khan,
	Shuah Khan, Steven Rostedt, Suren Baghdasaryan, Vlastimil Babka,
	damon, linux-doc, linux-kernel, linux-kselftest, linux-mm,
	linux-trace-kernel

TL;DR
=====

Extend DAMON for monitoring general data attributes other than accesses.
The short term motivation is lightweight page type (e.g., belonging
cgroup) aware monitoring.  In the long term, this will help extend DAMON
to multiple access event capture primitives (e.g., page faults and PMU)
and eventually pivot DAMON to a "Data Attributes Monitoring and
Operations eNgine".

Background: High Cost of Page Level Properties Monitoring
=========================================================

DAMON was initially introduced as a Data Access MONitor.  It has since
been extended to cover not only access monitoring but also data
access-aware system operations (DAMOS).  But the monitoring part still
covers only data accesses.

Data access patterns are good information, but some users need more
holistic views.  In particular, users want to see the access pattern
information together with the types of the memory.  For example, users
who work on using huge pages efficiently want to know how much of the
DAMON-found hot/cold regions are backed by huge pages.  Users who run
multiple workloads in different cgroups want to know how much of the
DAMON-found hot/cold regions belong to specific cgroups.

To meet this demand, we developed a DAMOS extension for page level
properties based monitoring [1], which landed in 6.14.  Using the
feature, users can specify the page level data properties that they are
interested in, in a flexible format that uses DAMOS filters.  Then,
DAMON applies the filters to each folio of the entire DAMON region and
lets users know how many bytes of memory in each DAMON region passed the
given filters.

This gives page level detailed and deterministic information to users.
But, because the operation is done at page level, the overhead is
proportional to the memory size.  It was useful for test or debugging
purposes on a small number of machines.  But it was obviously too heavy
to be always enabled on all machines running real user workloads.
For real world workloads, it was recommended to use the feature with
user-space controlled sampling approaches.  For example, users could do
the page level monitoring only once per hour, on a randomly selected one
percent of the machines of their fleet.  If the runtime is long enough
and the fleet is big enough, it should provide statistically meaningful
data.

But users are too busy to implement such controls on their own.

Data Attributes Monitoring
==========================

Extend DAMON to monitor not only data accesses, but also general data
attributes.  Do the extension while keeping the main promise of DAMON,
the bounded and best-effort minimum overhead.

Allow users to specify what data attributes, in addition to data
accesses, they want to monitor.  Users can install one 'data probe' per
data attribute of their interest for this purpose.  The 'data probe'
should be applicable to any memory, and determine whether the given
memory has the appropriate data attribute.  For example, a probe could
determine whether the memory at physical address 42 belongs to cgroup A.
Each 'data probe' is configured with filters that are very similar to
the DAMOS filters.

When DAMON checks whether the sampling address of each region has been
accessed since the last check, it also applies the data probes, if any
are registered.  Just as it counts the access check-positive samples
(nr_accesses), it counts the probe-positive samples of each data probe
in another per-region counter array, namely 'probe_hits'.  When DAMON
resets nr_accesses every aggregation interval, it resets 'probe_hits'
together.

Users can read 'probe_hits' just before the values are reset.  In this
way, users can know how much of the hot/cold memory regions has the
data attributes of their interest.  E.g., 30 percent of this system's
hot memory belongs to cgroup A, and 80 percent of the cgroup A-belonging
hot memory is backed by huge pages.
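
The accounting described above can be sketched in plain C as follows.
This is only an illustration under stated assumptions, not the code of
the series: the sketch_* names, the function-pointer probe type, and the
toy probes are hypothetical.  Only the ideas of a per-region probe_hits
byte array, a limit of four probes, and resetting the counters together
with nr_accesses come from this cover letter and patch 18.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* The cover letter states a current limit of four data probes. */
#define SKETCH_MAX_PROBES 4

/* Hypothetical stand-in for a data probe: a predicate on an address. */
typedef bool (*sketch_probe_fn)(unsigned long addr);

/* Hypothetical, trimmed-down stand-in for a monitoring region. */
struct sketch_region {
	unsigned long start, end;
	unsigned int nr_accesses;
	unsigned char probe_hits[SKETCH_MAX_PROBES];
};

/* Toy probes used only for illustration. */
static bool sketch_probe_below_8k(unsigned long addr)
{
	return addr < 0x2000;
}

static bool sketch_probe_always(unsigned long addr)
{
	(void)addr;
	return true;
}

/* Account one sample: the access check result plus every probe. */
static void sketch_account_sample(struct sketch_region *r,
		unsigned long sample_addr, bool accessed,
		const sketch_probe_fn *probes, unsigned int nr_probes)
{
	unsigned int i;

	assert(nr_probes <= SKETCH_MAX_PROBES);
	if (accessed)
		r->nr_accesses++;
	for (i = 0; i < nr_probes; i++) {
		if (probes[i](sample_addr))
			r->probe_hits[i]++;
	}
}

/* Reset both counters together at the end of an aggregation interval. */
static void sketch_reset_aggregated(struct sketch_region *r)
{
	r->nr_accesses = 0;
	memset(r->probe_hits, 0, sizeof(r->probe_hits));
}
```

A caller would invoke sketch_account_sample() once per region per
sampling interval, read nr_accesses and probe_hits at the end of the
aggregation interval, and only then call sketch_reset_aggregated().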

Patches Sequence
================

The first eight patches implement the core feature, interface and the
working support.  Patch 1 introduces the data probe data structure,
namely damon_probe.  Patch 2 extends damon_ctx for installing data
probes.  Patch 3 introduces another data structure for the filters of
each data probe, namely damon_filter.  Patch 4 updates the damon_ctx
commit function to handle the probes.  Patch 5 extends damon_region for
the per-region per-probe positive samples counter, namely probe_hits.
Patch 6 extends damon_operations for applying probes on the underlying
DAMON operations implementation.  Patch 7 updates kdamond_fn() to invoke
the probes applying callback.  Patch 8 finally implements the probes
support on paddr ops.

Ten changes for the user interface (patches 9-18) come next.  Patches
9-13 implement sysfs directories and files for setting data probes,
namely the probes directory, probe directory, filters directory, filter
directory and filter directory internal files, respectively.  Patch 14
connects the user inputs that are made via the sysfs files to DAMON
core.  The following three patches (patches 15-17) implement sysfs
directories and files for showing the probe_hits to users, namely the
probes directory, probe directory and hits files, respectively.  Patch
18 introduces a new tracepoint for showing the probe_hits via tracefs.
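
For readers who want to consume the new tracepoint from user space, a
minimal parsing sketch follows.  It assumes the TP_printk() format of
patch 18 ("target_id=%lu nr_regions=%u %lu-%lu: %u %u probe_hits=%s")
and assumes that __print_hex() renders the bytes as space-separated
two-digit hex values.  The sketch_* names are hypothetical, and the
input is expected to start at "target_id=", i.e., after the usual trace
line prefix (task, CPU, timestamp, event name) has been skipped.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* The cover letter states a current limit of four data probes. */
#define SKETCH_MAX_PROBES 4

/* Hypothetical container for one parsed event. */
struct sketch_region_event {
	unsigned long target_id, start, end;
	unsigned int nr_regions, nr_accesses, age;
	unsigned char probe_hits[SKETCH_MAX_PROBES];
	int nr_probes;
};

/*
 * Parse the payload of one event line.  Returns the number of
 * probe_hits bytes found, or -1 if the fixed part does not match.
 */
static int sketch_parse_region_event(const char *s,
		struct sketch_region_event *ev)
{
	int off = -1;	/* stays -1 if "probe_hits=" is never reached */
	char *end;

	assert(s && ev);
	if (sscanf(s, "target_id=%lu nr_regions=%u %lu-%lu: %u %u probe_hits=%n",
			&ev->target_id, &ev->nr_regions, &ev->start,
			&ev->end, &ev->nr_accesses, &ev->age, &off) != 6 ||
			off < 0)
		return -1;

	/* Read the hex bytes one by one until none is left. */
	ev->nr_probes = 0;
	for (s += off; ev->nr_probes < SKETCH_MAX_PROBES; s = end) {
		unsigned long v = strtoul(s, &end, 16);

		if (end == s || v > 0xff)
			break;
		ev->probe_hits[ev->nr_probes++] = (unsigned char)v;
	}
	return ev->nr_probes;
}
```

A caller reading the trace file line by line could locate the payload
with strstr(line, "target_id=") and pass the result to the function.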

Patch 19 adds a selftest for the sysfs files.

Patches 20 and 21 document the design and usage of the new feature,
respectively.

Seven additional patches (patches 22-28) for monitoring the belonging
memory cgroup follow.  Depending on the feedback, this part might be
separated into another series in the future.  Patch 22 defines the DAMON
filter type for the new attribute, namely DAMON_FILTER_TYPE_MEMCG.
Patch 23 adds the support on paddr ops.  Patch 24 updates the sysfs
interface for setup of the target memcg.  Patch 25 moves code for easy
reuse of the filter target memcg setup.  Patch 26 connects the user
input to the core layer.  Finally, patches 27 and 28 update the design
and usage documents for the memcg attribute monitoring support.

Discussions
===========

This allows page properties monitoring with overhead that is low enough
to be always enabled on real world workloads.  Because the sampling
time for the access check is reused for the data attributes check, the
upper-bounded and best-effort minimum overhead of DAMON is kept.
Because the sampling memory for the access check is reused for the data
attributes check, the additional overhead is minimal.

DAMOS-based page level properties monitoring should still be useful,
because it provides deterministic page level information.  When in
doubt about the sampling based information, running the DAMOS-based one
together and comparing the results would be useful for debugging and
tuning.

Plan for Dropping RFC tag
=========================

Making changes for feedback from myself, humans and Sashiko should be
the major remaining work.

I'm currently hoping to drop the RFC tag by 7.2-rc1.

Future Works: Mid Term
======================

This version of the implementation limits the maximum number of data
probes to four.  I will try to find a way to remove the limit in the
future.  I personally think it should be enough for common use cases,
though, and am therefore not giving it high priority at the moment.

Future Works: Long Term
=======================

There are user requests for extending DAMON with detailed access
information, for example, per-CPUs/threads/read/writes monitoring.  For
that, I was working [2] on extending DAMON to use page fault events as
another access check primitive, and making the infrastructure flexible
for future use of yet another access check primitive.  Actually there is
another ongoing work [3] for extending DAMON with PMU events.  The
motivation of that work is reducing the overhead, though.

In my work [2], I was introducing a new interface for access sampling
primitives control.  Now I think this data probe interface can be used
for that, too.  That is, data access becomes just one type of data
attribute.  Also, pg_idle-confirmed access, page fault-confirmed access,
and PMU event-confirmed access will be different types of data
attributes.

The regions adjustment mechanism is currently working based on the
access information.  That's because DAMON is designed for data access
monitoring.  That is, data access information is the primary interest,
and therefore DAMON adjusts regions in a way that can best-present the
information.

Once data access becomes just one of the data attributes, there is no
reason to treat data access as special.  There might be some users who
are not interested in access at all but want to know the location of
memory of a specific type.  The data probes interface will allow doing
that.  Further, we could extend the interface to let users set any data
attribute as the 'primary' attribute.  Then, DAMON will split and merge
regions in a way that can best-present the 'primary' attribute.

DAMOS will also be extended, to specify targets based on not only the
data access pattern, but all user-registered data attributes.  From this
stage, we may be able to call DAMON a "Data Attributes Monitoring and
Operations eNgine".

[1] https://lore.kernel.org/20250106193401.109161-1-sj@kernel.org
[2] https://lore.kernel.org/20251208062943.68824-1-sj@kernel.org/
[3] https://lore.kernel.org/20260423004211.7037-1-akinobu.mita@gmail.com

Changes from RFC v2.2
- rfc v2.2: https://lore.kernel.org/20260515004433.128933-1-sj@kernel.org
- Rename damon_aggregated_v2 trace event to damon_region_aggregated.
- Address Sashiko issues.
  - Enclose arguments on damon_for_each_{probe,filter}[_safe]() macros.
  - Fix typos in comments and documents.
  - Update probe_hits for region split and merge.
  - Add more documentation for damon_operation->apply_probes() callback.
  - Reduce unnecessary folio_{get,put}() in damon_pa_apply_probes().
  - Define damon_sysfs_probe_attrs as static.
  - Link scheme tried region sysfs dir and increase the count only after
    all internal dir population success.
  - Commit damon_filter->memcg_id for newly added filters.
Changes from RFC v2.1
- rfc v2.1: https://lore.kernel.org/20260514140904.119781-1-sj@kernel.org
- Rebase to mm-stable (7.1-rc3) to avoid Sashiko patch apply failure.
Changes from RFC v2
- rfc v2: https://lore.kernel.org/20260512143645.113201-1-sj@kernel.org
- Optimize nr_probes calculation for probe_hits tracepoint.
- Use TRACE_EVENT_CONDITION() for probe_hits tracepoint.
- Rebase to latest mm-new.
Changes from RFC
- rfc: https://lore.kernel.org/all/20260426205222.93895-1-sj@kernel.org/
- Support memcg DAMON filter.
- Use per-probe probe_hits sysfs file.
- Use dynamic_array for probe_hits tracing.
- Fix filter matching field.
- Fix folio leaking in damon_pa_filter_pass().
- Move nr_regions of damon_aggregated_v2 tracepoint after end.
- Rename DAMON_TEST_TYPE_ANON to DAMON_FILTER_TYPE_ANON.

SeongJae Park (28):
  mm/damon/core: introduce struct damon_probe
  mm/damon/core: embed damon_probe objects in damon_ctx
  mm/damon/core: introduce damon_filter
  mm/damon/core: commit probes
  mm/damon/core: introduce damon_region->probe_hits
  mm/damon/core: introduce damon_ops->apply_probes
  mm/damon/core: do data attributes monitoring
  mm/damon/paddr: support data attributes monitoring
  mm/damon/sysfs: implement probes dir
  mm/damon/sysfs: implement probe dir
  mm/damon/sysfs: implement filters directory
  mm/damon/sysfs: implement filter dir
  mm/damon/sysfs: implement filter dir files
  mm/damon/sysfs: setup probes on DAMON core API parameters
  mm/damon/sysfs-schemes: implement tried_regions/<r>/probes/
  mm/damon/sysfs-schemes: implement probe dir
  mm/damon/sysfs-schemes: implement probe/hits file
  mm/damon: trace probe_hits
  selftests/damon/sysfs.sh: test probes dir
  Docs/mm/damon/design: document data attributes monitoring
  Docs/admin-guide/mm/damon/usage: document data attributes monitoring
  mm/damon/core: introduce DAMON_FILTER_TYPE_MEMCG
  mm/damon/paddr: support DAMON_FILTER_TYPE_MEMCG
  mm/damon/sysfs: add filters/<F>/path file
  mm/damon/sysfs-schemes: move memcg_path_to_id() to sysfs-common
  mm/damon/sysfs: setup damon_filter->memcg_id from path
  Docs/mm/damon/design: update for memcg damon filter
  Docs/admin-guide/mm/damon/usage: update for memcg damon filter

 Documentation/admin-guide/mm/damon/usage.rst |  46 +-
 Documentation/mm/damon/design.rst            |  39 ++
 include/linux/damon.h                        |  69 +++
 include/trace/events/damon.h                 |  38 ++
 mm/damon/core.c                              | 211 +++++++
 mm/damon/paddr.c                             |  76 +++
 mm/damon/sysfs-common.c                      |  41 ++
 mm/damon/sysfs-common.h                      |   2 +
 mm/damon/sysfs-schemes.c                     | 226 ++++++--
 mm/damon/sysfs.c                             | 557 +++++++++++++++++++
 tools/testing/selftests/damon/sysfs.sh       |  48 ++
 11 files changed, 1305 insertions(+), 48 deletions(-)


base-commit: 5d6919055dec134de3c40167a490f33c74c12581
-- 
2.47.3


* Re: [RFC PATCH v2.2 18/28] mm/damon: trace probe_hits
From: SeongJae Park @ 2026-05-16 17:31 UTC
  To: SeongJae Park
  Cc: Andrew Morton, Masami Hiramatsu, Mathieu Desnoyers,
	Steven Rostedt, damon, linux-kernel, linux-mm, linux-trace-kernel
In-Reply-To: <20260515004433.128933-19-sj@kernel.org>

On Thu, 14 May 2026 17:44:19 -0700 SeongJae Park <sj@kernel.org> wrote:

> Introduce a new tracepoint for exposing the per-region per-probe
> positive sample count via tracefs.
> 
> Signed-off-by: SeongJae Park <sj@kernel.org>
> ---
>  include/trace/events/damon.h | 38 ++++++++++++++++++++++++++++++++++++
>  mm/damon/core.c              |  9 +++++++++
>  2 files changed, 47 insertions(+)
> 
> diff --git a/include/trace/events/damon.h b/include/trace/events/damon.h
> index 24fc402ab3c85..ec1e317923fd3 100644
> --- a/include/trace/events/damon.h
> +++ b/include/trace/events/damon.h
> @@ -130,6 +130,44 @@ TRACE_EVENT(damon_monitor_intervals_tune,
>  	TP_printk("sample_us=%lu", __entry->sample_us)
>  );
>  
> +TRACE_EVENT_CONDITION(damon_aggregated_v2,

I was thinking [1] about a better name for this tracepoint.  I will rename this
to 'damon_region_aggregated'.  And I will deprecate damon_aggregated in
multiple phases, like we did for the DAMON debugfs interface.  The idea off the
top of my head at the moment is,

1. announce it as deprecated in the document, by end of 2026
2. rename it (e.g., damon_aggregated_deprecated) by end of 2027
3. remove the code by end of 2028

The deprecation might be done faster than the current idea.

As Steven commented [2], it should be ok to immediately remove it or
extend it to have probe_hits.  But I realize I'm quite lazy at DAMON
user-space tool development, and feel more comfortable with this approach for
now.  Please let me know if anyone has a different opinion.

[1] https://lore.kernel.org/20260514000611.147809-1-sj@kernel.org
[2] https://lore.kernel.org/20260513203237.3b1b3286@gandalf.local.home


Thanks,
SJ

[...]


* Re: [RFC PATCH v3] bpf: introduce TAINT_UNSAFE_BPF for mutating helpers
From: Aaron Tomlin @ 2026-05-16 17:01 UTC
  To: Alexei Starovoitov
  Cc: Steven Rostedt, Jonathan Corbet, Song Liu, KP Singh,
	Matt Bobrowski, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Eduard, Kumar Kartikeya Dwivedi,
	Masami Hiramatsu, Shuah Khan, Jiri Olsa, Martin KaFai Lau,
	Yonghong Song, Mathieu Desnoyers, Randy Dunlap, neelx, sean,
	chjohnst, steve, mproche, nick.lange, open list:DOCUMENTATION,
	LKML, bpf, linux-trace-kernel
In-Reply-To: <CAADnVQLw+_NaOVeaKabuf085wNo_-6MAv8w0EDO3fBz3KCQT5g@mail.gmail.com>

On Wed, May 13, 2026 at 09:35:29AM -0700, Alexei Starovoitov wrote:
> On Wed, May 13, 2026 at 8:23 AM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > On Wed, 13 May 2026 08:16:07 -0700
> > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> >
> > > It's impossible to track all modifications.
> > > See what sched-ext is doing.
> > > What does it modify? Everything.
> >
> > What about just having a list of what BPF programs are loaded, what they
> > may be attached to, and what kfuncs they are calling?
> 
> Ohh. These have been available forever.
> Just bpftool prog, bpftool link, bpftool prog dump xlated

Hi Alexei,

Thank you for sharing.

Kind regards,
-- 
Aaron Tomlin


* Re: [PATCH v7 2/6] mm/memory-failure: surface unhandlable kernel pages as -ENOTRECOVERABLE
From: Lance Yang @ 2026-05-16  4:06 UTC
  To: Breno Leitao
  Cc: linmiaohe, akpm, david, ljs, vbabka, rppt, surenb, mhocko, shuah,
	nao.horiguchi, rostedt, mhiramat, mathieu.desnoyers, corbet,
	skhan, liam, linux-mm, linux-kernel, linux-doc, linux-kselftest,
	linux-trace-kernel, kernel-team
In-Reply-To: <agcbfLHT5ZWnNeN0@gmail.com>



On 2026/5/15 21:13, Breno Leitao wrote:
[...]
>>
>> Wonder if it would be simpler to just do a positive check near the top
>> of get_any_page() instead. Something like:
>>
>> static bool hwpoison_unrecoverable_kernel_page(struct page *page,
>> 						unsigned long flags)
> 
> Ack. We probably want to call it something like HWPoisonKernelOwned() to
> follow the same naming semantics of these helpers, such as HWPoisonHandlable()
> 
> By the way, I will re-include the self test back to this patch series,
> In case they are not useful, we do not merge it.
> 

Sounds good :)

Can you also test the relevant page types if possible, especially
the ones the new helper is supposed to classify?

Cheers, Lance


* Re: [PATCH v4 3/3] tracefs: make root directory world-traversable
From: Steven Rostedt @ 2026-05-15 23:16 UTC
  To: Anubhav Shelat
  Cc: mpetlan, Masami Hiramatsu, Mathieu Desnoyers, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	James Clark, Thomas Falcon, linux-kernel, linux-trace-kernel,
	linux-perf-users
In-Reply-To: <20260515194010.93725-5-ashelat@redhat.com>

On Fri, 15 May 2026 15:40:07 -0400
Anubhav Shelat <ashelat@redhat.com> wrote:

> Change the default tracefs mount mode from 0700 to 0755. This allows
> unprivileged users to access the eventfs directories underneath which
> already use 0755.
> 
> Tracing data files use mode 0440 and 0640 so they are not exposed by
> this change. Only the format and id files, which have been marked as
> world-readable, become accessible.
> 
> Directory listings of kprobes and uprobes, which contain functions or
> binaries, become visible to unprivileged users but do not contain kernel
> addresses. Admins using probes can restore the previous behavior with
> chmod or mount -o mode=700.
> 

I've been thinking about this and I believe a better approach is to
make an eventfs that is mounted at:

 /sys/kernel/events

and has the same directory structure as /sys/kernel/tracing/events but
only contains read-only files like "id" and "format". This directory
would be mounted as 555 and readable by all.

-- Steve


* Re: [PATCH 1/7] uprobes/x86: Move optimized uprobe from nop5 to nop10
From: Andrii Nakryiko @ 2026-05-15 20:31 UTC
  To: Jiri Olsa
  Cc: Oleg Nesterov, Peter Zijlstra, Ingo Molnar, Masami Hiramatsu,
	Andrii Nakryiko, bpf, linux-trace-kernel
In-Reply-To: <20260514135342.22130-2-jolsa@kernel.org>

On Thu, May 14, 2026 at 6:53 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> Andrii reported an issue with optimized uprobes [1] that can clobber
> redzone area with call instruction storing return address on stack
> where user code may keep temporary data without adjusting rsp.
>
> Fixing this by moving the optimized uprobes on top of 10-bytes nop
> instruction, so we can squeeze another instruction to escape the
> redzone area before doing the call, like:
>
>   lea -0x80(%rsp), %rsp
>   call tramp
>
> Note the lea instruction is used to adjust the rsp register without
> changing the flags.

I think it should be very loudly explained that we can't go back to
nop10 and have to do a short jump over the patched sequence (and why).

>
> The optimized uprobe performance stays the same:
>
>         uprobe-nop     :    3.129 ± 0.013M/s
>         uprobe-push    :    3.045 ± 0.006M/s
>         uprobe-ret     :    1.095 ± 0.004M/s
>   -->   uprobe-nop10   :    7.170 ± 0.020M/s
>         uretprobe-nop  :    2.143 ± 0.021M/s
>         uretprobe-push :    2.090 ± 0.000M/s
>         uretprobe-ret  :    0.942 ± 0.000M/s
>   -->   uretprobe-nop10:    3.381 ± 0.003M/s
>         usdt-nop       :    3.245 ± 0.004M/s
>   -->   usdt-nop10     :    7.256 ± 0.023M/s
>
> [1] https://lore.kernel.org/bpf/20260509003146.976844-1-andrii@kernel.org/
> Reported-by: Andrii Nakryiko <andrii@kernel.org>
> Closes: https://lore.kernel.org/bpf/20260509003146.976844-1-andrii@kernel.org/
> Fixes: ba2bfc97b462 ("uprobes/x86: Add support to optimize uprobes")
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  arch/x86/kernel/uprobes.c | 121 +++++++++++++++++++++++++++-----------
>  1 file changed, 86 insertions(+), 35 deletions(-)
>
> diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
> index ebb1baf1eb1d..f7c4101a4039 100644
> --- a/arch/x86/kernel/uprobes.c
> +++ b/arch/x86/kernel/uprobes.c
> @@ -636,9 +636,21 @@ struct uprobe_trampoline {
>         unsigned long           vaddr;
>  };
>
> +#define LEA_INSN_SIZE          5
> +#define OPT_INSN_SIZE          (LEA_INSN_SIZE + CALL_INSN_SIZE)
> +#define OPT_JMP8_OFFSET                (OPT_INSN_SIZE - JMP8_INSN_SIZE)
> +#define REDZONE_SIZE           0x80
> +
> +static const u8 lea_rsp[] = { 0x48, 0x8d, 0x64, 0x24, 0x80 };
> +
> +static bool is_lea_insn(const uprobe_opcode_t *insn)
> +{
> +       return !memcmp(insn, lea_rsp, LEA_INSN_SIZE);
> +}
> +
>  static bool is_reachable_by_call(unsigned long vtramp, unsigned long vaddr)
>  {
> -       long delta = (long)(vaddr + 5 - vtramp);
> +       long delta = (long)(vaddr + OPT_INSN_SIZE - vtramp);
>
>         return delta >= INT_MIN && delta <= INT_MAX;
>  }
> @@ -651,7 +663,7 @@ static unsigned long find_nearest_trampoline(unsigned long vaddr)
>         };
>         unsigned long low_limit, high_limit;
>         unsigned long low_tramp, high_tramp;
> -       unsigned long call_end = vaddr + 5;
> +       unsigned long call_end = vaddr + OPT_INSN_SIZE;
>
>         if (check_add_overflow(call_end, INT_MIN, &low_limit))
>                 low_limit = PAGE_SIZE;
> @@ -826,8 +838,8 @@ SYSCALL_DEFINE0(uprobe)

should we change -ENXIO to -EPROTO or some other distinct error code,
so libbpf can avoid using nop5 attachment on kernels new enough to
support nop5 optimization, but old enough to not do this properly with
nop10?

>         regs->ax  = args.ax;
>         regs->r11 = args.r11;
>         regs->cx  = args.cx;
> -       regs->ip  = args.retaddr - 5;
> -       regs->sp += sizeof(args);
> +       regs->ip  = args.retaddr - OPT_INSN_SIZE;
> +       regs->sp += sizeof(args) + REDZONE_SIZE;
>         regs->orig_ax = -1;
>
>         sp = regs->sp;

[...]
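
For readers following the quoted hunks, the reachability and lea checks
can be sketched in user-space C as below.  This is only an illustration
that mirrors the arithmetic of the quoted patch: the sketch_* names are
hypothetical, and the 5-byte size of the call instruction is an
assumption based on the usual "call rel32" encoding.  The lea bytes are
copied from the quoted lea_rsp[] array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_LEA_INSN_SIZE	5	/* from the quoted patch */
#define SKETCH_CALL_INSN_SIZE	5	/* assumption: call rel32 */
#define SKETCH_OPT_INSN_SIZE	(SKETCH_LEA_INSN_SIZE + SKETCH_CALL_INSN_SIZE)

/* lea -0x80(%rsp), %rsp, as in the quoted lea_rsp[] array */
static const uint8_t sketch_lea_rsp[SKETCH_LEA_INSN_SIZE] = {
	0x48, 0x8d, 0x64, 0x24, 0x80
};

/* True if the bytes at insn are exactly the red zone escaping lea. */
static bool sketch_is_lea_insn(const uint8_t *insn)
{
	assert(insn);
	return !memcmp(insn, sketch_lea_rsp, SKETCH_LEA_INSN_SIZE);
}

/*
 * True if the distance between the end of the lea+call pair placed at
 * vaddr and the trampoline at vtramp fits in a signed 32-bit value.
 */
static bool sketch_is_reachable_by_call(uint64_t vtramp, uint64_t vaddr)
{
	int64_t delta = (int64_t)(vaddr + SKETCH_OPT_INSN_SIZE - vtramp);

	return delta >= INT32_MIN && delta <= INT32_MAX;
}
```

The only change relative to the nop5 variant in this sketch is that the
distance is measured from vaddr + 10 instead of vaddr + 5, because the
call now sits behind the lea.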


* [PATCH] tracing: Fix desc in error path for the trace remote test module
From: Vincent Donnefort @ 2026-05-15 20:16 UTC
  To: rostedt, mhiramat, mathieu.desnoyers, linux-trace-kernel
  Cc: kernel-team, linux-kernel, Vincent Donnefort, Sashiko

During initialisation in remote_test_load(), if one of the
simple_ring_buffers fails to initialise, the error path attempts to
roll back the initialised buffers. However, the rollback incorrectly uses
the global pointer to the trace descriptor, which is only set upon
successful load completion. Fix the error path by using the local
pointer to the descriptor.

Fixes: ea908a2b79c8 ("tracing: Add a trace remote module for testing")
Reported-by: Sashiko <sashiko-bot@kernel.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>

diff --git a/kernel/trace/remote_test.c b/kernel/trace/remote_test.c
index 6c1b7701ddae..a3e2c9b606eb 100644
--- a/kernel/trace/remote_test.c
+++ b/kernel/trace/remote_test.c
@@ -110,9 +110,9 @@ static struct trace_buffer_desc *remote_test_load(unsigned long size, void *unus
 	return remote_test_buffer_desc;
 
 err_unload:
-	for_each_ring_buffer_desc(rb_desc, cpu, remote_test_buffer_desc)
+	for_each_ring_buffer_desc(rb_desc, cpu, desc)
 		remote_test_unload_simple_rb(rb_desc->cpu);
-	trace_remote_free_buffer(remote_test_buffer_desc);
+	trace_remote_free_buffer(desc);
 
 err_free_desc:
 	kfree(desc);

base-commit: 5d6919055dec134de3c40167a490f33c74c12581
-- 
2.54.0.563.g4f69b47b94-goog



* [PATCH v4 2/3] perf: enable unprivileged syscall tracing with perf trace
From: Anubhav Shelat @ 2026-05-15 19:40 UTC
  To: mpetlan, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, James Clark, Thomas Falcon,
	linux-kernel, linux-trace-kernel, linux-perf-users
  Cc: Anubhav Shelat
In-Reply-To: <20260515194010.93725-2-ashelat@redhat.com>

Allow unprivileged users to trace their own processes' syscalls using
perf trace, similar to strace without the intrusive overhead of ptrace().

Currently, perf trace requires CAP_PERFMON or paranoid level ≤ 1 even
though the kernel has existing infrastructure (TRACE_EVENT_FL_CAP_ANY)
specifically designed to mark syscall tracepoints as safe for
unprivileged access. To fix this:

1. Loosen the condition in perf_event_open() which requires privileges
   for all events with exclude_kernel=0. This allows perf_event_open() to
   bypass the paranoid check for task-attached tracepoint events. Ensure
   that sample types which can expose kernel addresses to unprivileged
   users are blocked. Ensure the PERF_SECURITY_KERNEL LSM hook is
   preserved.

2. Make the format and id tracefs files world-readable only for tracepoints
   with TRACE_EVENT_FL_CAP_ANY, allowing unprivileged users to see syscall
   tracepoint ids without exposing sensitive information.

3. Add a check to perf_trace_event_perm() to block PERF_SAMPLE_IP on
   kernel tracepoints for unprivileged users to prevent KASLR bypass. We do
   this here rather than in kaddr_leak because perf_trace_event_perm() can
   distinguish between kernel tracepoints and uprobe tracepoints, where the
   IP is a safe user space address and is necessary for uprobe
   functionality.

4. Restrict pure counting events (no PERF_SAMPLE_RAW) to
   TRACE_EVENT_FL_CAP_ANY tracepoints preventing unprivileged users from
   counting internal kernel tracepoints while preserving current
   behavior for exclude_kernel=1 events.

Example usage after this change:
  $ perf trace ls          # works as unprivileged user
  $ perf trace             # system-wide, still requires privileges
  $ perf trace -p 1234     # requires ptrace permission on pid 1234
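
The decision described in item 1 can be sketched as a small stand-alone
C function.  This is an illustration of the logic in the
perf_event_open() hunk of this patch, not kernel code: the sketch_*
names and the bit values are hypothetical (real code would use the
PERF_SAMPLE_* constants from the UAPI header), and the sketch leaves
out the LSM hook, which the patch still calls when the paranoid check
is bypassed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; not the real PERF_SAMPLE_* values. */
enum sketch_sample_bits {
	SKETCH_SAMPLE_ADDR		= 1u << 0,
	SKETCH_SAMPLE_CALLCHAIN		= 1u << 1,
	SKETCH_SAMPLE_RAW		= 1u << 2,
	SKETCH_SAMPLE_BRANCH_STACK	= 1u << 3,
	SKETCH_SAMPLE_REGS_INTR		= 1u << 4,
};

/* Hypothetical subset of the event attributes the check looks at. */
struct sketch_attr {
	bool is_tracepoint;	/* attr.type == PERF_TYPE_TRACEPOINT */
	bool exclude_kernel;
	uint64_t sample_type;
};

/* Returns true when the paranoid level check would still be applied. */
static bool sketch_needs_paranoid_check(const struct sketch_attr *attr,
		int pid)
{
	/* Sample types the patch treats as leaking kernel addresses. */
	const uint64_t kaddr_leak = SKETCH_SAMPLE_CALLCHAIN |
				    SKETCH_SAMPLE_BRANCH_STACK |
				    SKETCH_SAMPLE_ADDR |
				    SKETCH_SAMPLE_REGS_INTR;

	assert(attr);
	/* exclude_kernel=1 events never reach this check. */
	if (attr->exclude_kernel)
		return false;
	/* Task-attached tracepoints without leaky sample types bypass it. */
	if (attr->is_tracepoint && pid != -1 &&
	    !(attr->sample_type & kaddr_leak))
		return false;
	return true;
}
```

Under these assumptions, "perf trace ls" maps to a task-attached
tracepoint event (pid != -1) and skips the paranoid check, while
system-wide "perf trace" (pid == -1) does not.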

Assisted-by: Claude:claude-sonnet-4.5
Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
---
 kernel/events/core.c            | 28 +++++++++++++++++++++++++---
 kernel/trace/trace_event_perf.c | 21 ++++++++++++++++++++-
 kernel/trace/trace_events.c     | 16 ++++++++++++++--
 3 files changed, 59 insertions(+), 6 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7935d5663944..ff2d1e9a0b79 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -13873,9 +13873,31 @@ SYSCALL_DEFINE5(perf_event_open,
 		return err;
 
 	if (!attr.exclude_kernel) {
-		err = perf_allow_kernel();
-		if (err)
-			return err;
+		bool tp_bypass = false;
+
+		/* Check unprivileged tracepoints */
+		if (attr.type == PERF_TYPE_TRACEPOINT && pid != -1) {
+			/*
+			 * Block sample types that expose kernel addresses to
+			 * prevent KASLR bypass
+			 */
+			u64 kaddr_leak = PERF_SAMPLE_CALLCHAIN |
+					 PERF_SAMPLE_BRANCH_STACK |
+					 PERF_SAMPLE_ADDR |
+					 PERF_SAMPLE_REGS_INTR;
+
+			tp_bypass = !(attr.sample_type & kaddr_leak);
+		}
+
+		if (!tp_bypass) {
+			err = perf_allow_kernel();
+			if (err)
+				return err;
+		} else {
+			err = security_perf_event_open(PERF_SECURITY_KERNEL);
+			if (err)
+				return err;
+		}
 	}
 
 	if (attr.namespaces) {
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index a6bb7577e8c5..466007ed2869 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -72,9 +72,28 @@ static int perf_trace_event_perm(struct trace_event_call *tp_event,
 			return -EINVAL;
 	}
 
+	/*
+	 * PERF_SAMPLE_IP on kernel tracepoints exposes a kernel text
+	 * address, weakening KASLR. Block for unprivileged users unless
+	 * the tracepoint is a uprobe (userspace IP, safe to expose).
+	 */
+	if ((p_event->attr.sample_type & PERF_SAMPLE_IP) &&
+	    !p_event->attr.exclude_kernel &&
+	    !(tp_event->flags & TRACE_EVENT_FL_UPROBE) &&
+	    sysctl_perf_event_paranoid > 1 && !perfmon_capable())
+		return -EACCES;
+
 	/* No tracing, just counting, so no obvious leak */
-	if (!(p_event->attr.sample_type & PERF_SAMPLE_RAW))
+	if (!(p_event->attr.sample_type & PERF_SAMPLE_RAW)) {
+		/* Prevent unprivileged users from counting kernel tracepoints */
+		if (!p_event->attr.exclude_kernel &&
+		    sysctl_perf_event_paranoid > 1 && !perfmon_capable()) {
+			if (!(p_event->attach_state == PERF_ATTACH_TASK &&
+			      (tp_event->flags & TRACE_EVENT_FL_CAP_ANY)))
+				return -EACCES;
+		}
 		return 0;
+	}
 
 	/* Some events are ok to be traced by non-root users... */
 	if (p_event->attach_state == PERF_ATTACH_TASK) {
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index c46e623e7e0d..cbd07e2ec528 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -3050,7 +3050,13 @@ static int event_callback(const char *name, umode_t *mode, void **data,
 	struct trace_event_call *call = file->event_call;
 
 	if (strcmp(name, "format") == 0) {
-		*mode = TRACE_MODE_READ;
+		/*
+		 * Make format tracefs file world readable for tracepoints with
+		 * TRACE_EVENT_FL_CAP_ANY
+		 */
+		*mode = (call->flags & TRACE_EVENT_FL_CAP_ANY) ?
+			(TRACE_MODE_READ | 0004) :
+			TRACE_MODE_READ;
 		*fops = &ftrace_event_format_fops;
 		return 1;
 	}
@@ -3086,7 +3092,13 @@ static int event_callback(const char *name, umode_t *mode, void **data,
 #ifdef CONFIG_PERF_EVENTS
 	if (call->event.type && call->class->reg &&
 	    strcmp(name, "id") == 0) {
-		*mode = TRACE_MODE_READ;
+		/*
+		 * Make id tracefs file world readable for tracepoints with
+		 * TRACE_EVENT_FL_CAP_ANY
+		 */
+		*mode = (call->flags & TRACE_EVENT_FL_CAP_ANY) ?
+			(TRACE_MODE_READ | 0004) :
+			TRACE_MODE_READ;
 		*data = (void *)(long)call->event.type;
 		*fops = &ftrace_event_id_fops;
 		return 1;
-- 
2.54.0



* [PATCH v4 3/3] tracefs: make root directory world-traversable
From: Anubhav Shelat @ 2026-05-15 19:40 UTC (permalink / raw)
  To: mpetlan, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, James Clark, Thomas Falcon,
	linux-kernel, linux-trace-kernel, linux-perf-users
  Cc: Anubhav Shelat
In-Reply-To: <20260515194010.93725-2-ashelat@redhat.com>

Change the default tracefs mount mode from 0700 to 0755. This allows
unprivileged users to access the eventfs directories underneath which
already use 0755.

Tracing data files use mode 0440 and 0640 so they are not exposed by
this change. Only the format and id files, which have been marked as
world-readable, become accessible.

Directory listings of kprobes and uprobes, which contain functions or
binaries, become visible to unprivileged users but do not contain kernel
addresses. Admins using probes can restore the previous behavior with
chmod or mount -o mode=700.
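
The effect of the mode bits discussed above can be checked with a short
sketch in Python. It only does arithmetic on permission bits and touches
no real tracefs mount. The helper names are hypothetical, and the value
0440 for TRACE_MODE_READ is taken from the description above.

```python
import stat


def other_can_traverse(mode):
    """True if users outside owner and group may search a directory."""
    return bool(mode & stat.S_IXOTH)


def other_can_read(mode):
    """True if users outside owner and group may read a file."""
    return bool(mode & stat.S_IROTH)


OLD_ROOT_MODE = 0o700          # previous TRACEFS_DEFAULT_MODE
NEW_ROOT_MODE = 0o755          # TRACEFS_DEFAULT_MODE after this patch
TRACE_MODE_READ = 0o440        # assumed from the description above
CAP_ANY_FILE_MODE = TRACE_MODE_READ | 0o004   # format/id for CAP_ANY events
```

With these values, only the new root mode lets an unprivileged user
descend into the tree, and only files carrying the extra 0004 bit become
readable to them.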

Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
---
 fs/tracefs/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index f3d6188a3b7b..3a6a0c800a8b 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -23,7 +23,7 @@
 #include <linux/slab.h>
 #include "internal.h"
 
-#define TRACEFS_DEFAULT_MODE	0700
+#define TRACEFS_DEFAULT_MODE	0755
 static struct kmem_cache *tracefs_inode_cachep __ro_after_init;
 
 static struct vfsmount *tracefs_mount;
-- 
2.54.0



* [PATCH v4 1/3] perf evsel: don't set PERF_SAMPLE_IP for unprivileged tracepoints
From: Anubhav Shelat @ 2026-05-15 19:40 UTC (permalink / raw)
  To: mpetlan, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, James Clark, Thomas Falcon,
	linux-kernel, linux-trace-kernel, linux-perf-users
  Cc: Anubhav Shelat
In-Reply-To: <20260515194010.93725-2-ashelat@redhat.com>

For tracepoint events the IP is a static kernel address. It doesn't
vary by sample and provides no useful information to unprivileged
users. Leaving PERF_SAMPLE_IP unset for unprivileged tracepoints avoids
exposing a kernel address that reveals the KASLR base offset.

Make an exception for uprobes, which are registered as
PERF_TYPE_TRACEPOINT, because the IP is important for their
functionality and is a safe userspace address. Detect them with
__probe_ip (entry) and __probe_ret_ip (return) using evsel__field().

Assisted-by: Claude:claude-sonnet-4.5
Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
---
 tools/perf/util/evsel.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 2ee87fd84d3e..bf66e0c78451 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1509,7 +1509,19 @@ void evsel__config(struct evsel *evsel, const struct record_opts *opts,
 	attr->write_backward = opts->overwrite ? 1 : 0;
 	attr->read_format   = PERF_FORMAT_LOST;
 
-	evsel__set_sample_bit(evsel, IP);
+	/*
+	 * Don't set PERF_SAMPLE_IP for unprivileged kernel tracepoints to
+	 * avoid exposing kernel addresses. Uprobes expose only userspace
+	 * addresses so they're safe. Detect entry and return uprobes.
+	 */
+	if (attr->type != PERF_TYPE_TRACEPOINT || perf_event_paranoid_check(1)
+#ifdef HAVE_LIBTRACEEVENT
+	    || evsel__field(evsel, "__probe_ip")
+	    || evsel__field(evsel, "__probe_ret_ip")
+#endif
+	    )
+		evsel__set_sample_bit(evsel, IP);
+
 	evsel__set_sample_bit(evsel, TID);
 
 	if (evsel->sample_read) {
-- 
2.54.0



* [PATCH v4 0/3] Enable perf tracing for unprivileged users
From: Anubhav Shelat @ 2026-05-15 19:40 UTC (permalink / raw)
  To: mpetlan, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, James Clark, Thomas Falcon,
	linux-kernel, linux-trace-kernel, linux-perf-users
  Cc: Anubhav Shelat

Enable users to use perf-trace to trace their own processes, like strace
but without the overhead of ptrace(). Ensure that users cannot access
other users' or systemwide tracing data.

Changes in v4:
- Preserve security_perf_event_open(PERF_SECURITY_KERNEL) LSM hook in
  the tp_bypass path.
- Lift the PERF_SAMPLE_IP check out of the tp_bypass path above the
  PERF_SAMPLE_RAW branch so it applies to counting and sampling. This
  also allows us to ensure PERF_SAMPLE_IP is set for uprobes.
- Block counting path for TRACE_EVENT_FL_CAP_ANY for unprivileged users
  with sysctl_perf_event_paranoid > 1.

Changes in v3:
- Don't set PERF_SAMPLE_IP for unprivileged tracepoints. This allows us
  to exclude PERF_SAMPLE_IP from kaddr_leak without weakening KASLR.
- Mount tracefs as world-traversable so users can access eventfs
  directories.

Anubhav Shelat (3):
  perf evsel: don't set PERF_SAMPLE_IP for unprivileged tracepoints
  perf: enable unprivileged syscall tracing with perf trace
  tracefs: make root directory world-traversable

 fs/tracefs/inode.c              |  2 +-
 kernel/events/core.c            | 28 +++++++++++++++++++++++++---
 kernel/trace/trace_event_perf.c | 21 ++++++++++++++++++++-
 kernel/trace/trace_events.c     | 16 ++++++++++++++--
 tools/perf/util/evsel.c         | 14 +++++++++++++-
 5 files changed, 73 insertions(+), 8 deletions(-)

-- 
2.54.0



* Re: [PATCH 06/13] verification/rvgen: Convert __fill_verify_guards_func() to Lark
From: Wander Lairson Costa @ 2026-05-15 19:35 UTC (permalink / raw)
  To: Nam Cao; +Cc: Gabriele Monaco, Steven Rostedt, linux-trace-kernel, linux-kernel
In-Reply-To: <e8a636c8ea6da554fd51b1241b9181f65af420c8.1777962130.git.namcao@linutronix.de>

On Tue, May 05, 2026 at 08:59:27AM +0200, Nam Cao wrote:
> Prepare to remove self.guards and self.__parse_constraints(), convert
> __fill_verify_guards_func() to use the parsed transitions from Lark.
> 
> Signed-off-by: Nam Cao <namcao@linutronix.de>
> ---
>  tools/verification/rvgen/rvgen/dot2k.py | 39 ++++++++++++++++++++-----
>  1 file changed, 31 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/verification/rvgen/rvgen/dot2k.py b/tools/verification/rvgen/rvgen/dot2k.py
> index 3a39ae29e41e..cf7e5ddc649c 100644
> --- a/tools/verification/rvgen/rvgen/dot2k.py
> +++ b/tools/verification/rvgen/rvgen/dot2k.py
> @@ -221,6 +221,20 @@ class ha2k(dot2k):
>      def __parse_single_constraint(self, rule: dict, value: str) -> str:
>          return f"ha_get_env(ha_mon, {rule["env"]}{self.enum_suffix}, time_ns) {rule["op"]} {value}"
>  
> +    def __parse_guard_rule(self, rule) -> str:
> +        buff = []
> +        for c, sep in rule.rules:
> +            env = c.env + self.enum_suffix
> +            op = c.op
> +            val = self.__adjust_value(c.val, c.unit)
> +
> +            cond = f"ha_get_env(ha_mon, {env}, time_ns) {op} {val}"
> +            if sep:
> +                cond += f" {sep}"
> +            buff.append(cond)
> +        buff[-1] += ';'
> +        return buff
> +
>      def __get_constraint_env(self, constr: str) -> str:
>          """Extract the second argument from an ha_ function"""
>          env = constr.split("(")[1].split()[1].rstrip(")").rstrip(",")
> @@ -398,8 +412,9 @@ f"""static inline void ha_convert_inv_guard(struct ha_monitor *ha_mon,
>  
>      def __fill_verify_guards_func(self) -> list[str]:
>          buff = []
> -        if not self.guards:
> -            return []
> +
> +        if not self.has_guard:
> +            return

The function's signature says it returns a list, but this path now
returns None.
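
A minimal sketch of why the difference matters, using hypothetical
stand-in functions rather than the rvgen code itself: a caller that
extends a list with the result works with an empty list but fails on
None.

```python
def guards_bare_return(has_guard: bool) -> list[str]:
    """Stand-in for the posted hunk: bare return on the early exit."""
    if not has_guard:
        return  # evaluates to None despite the annotation
    return ["static inline bool ha_verify_guards(...)"]


def guards_empty_list(has_guard: bool) -> list[str]:
    """Stand-in for the previous behaviour: empty list on the early exit."""
    if not has_guard:
        return []
    return ["static inline bool ha_verify_guards(...)"]


def emit(fill) -> list[str]:
    """Hypothetical caller that concatenates generated lines."""
    out = ["/* header */"]
    out += fill(False)  # raises TypeError if fill() returned None
    return out
```

Whether the real callers concatenate the result this way is an
assumption; the point is only that the annotation and the early return
disagree.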

>  
>          buff.append(
>  f"""static inline bool ha_verify_guards(struct ha_monitor *ha_mon,
> @@ -410,14 +425,22 @@ f"""static inline bool ha_verify_guards(struct ha_monitor *ha_mon,
>  """)
>  
>          _else = ""
> -        for edge, constr in sorted(self.guards.items()):
> +        for transition in self.transitions:
> +            if not transition.rule and not transition.reset:
> +                continue
> +
>              buff.append(f"\t{_else}if (curr_state == "
> -                        f"{self.states[edge[0]]}{self.enum_suffix} && "
> -                        f"event == {self.events[edge[1]]}{self.enum_suffix})")
> -            if constr.count(";") > 0:
> +                        f"{transition.src}{self.enum_suffix} && "
> +                        f"event == {transition.event}{self.enum_suffix})")
> +            rule = transition.rule
> +            reset = transition.reset
> +            if rule and reset:
>                  buff[-1] += " {"
> -            buff += [f"\t\t{c};" for c in constr.split(";")]
> -            if constr.count(";") > 0:
> +            if rule:
> +                buff.append("\t\t" + self.__format_guard_rules(self.__parse_guard_rule(rule))[0])
> +            if reset:
> +                buff.append(f"\t\tha_reset_env(ha_mon, {reset.env}{self.enum_suffix}, time_ns);")
> +            if rule and reset:
>                  _else = "} else "
>              else:
>                  _else = "else "
> -- 
> 2.47.3
> 



* Re: [PATCH v3 08/11] scsi: ufs: Use trace_call__##name() at guarded tracepoint call sites
From: Bart Van Assche @ 2026-05-15 19:21 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Vineeth Pillai (Google), James E.J. Bottomley, Martin K. Petersen,
	linux-scsi, linux-trace-kernel, Peter Zijlstra
In-Reply-To: <20260515145048.1c021bc9@gandalf.local.home>

On 5/15/26 11:50 AM, Steven Rostedt wrote:
> On Fri, 15 May 2026 08:27:27 -0700
> Bart Van Assche <bvanassche@acm.org> wrote:
> 
>> On 5/15/26 6:59 AM, Vineeth Pillai (Google) wrote:
>>>    static void ufshcd_add_query_upiu_trace(struct ufs_hba *hba,
>>> @@ -432,8 +432,8 @@ static void ufshcd_add_query_upiu_trace(struct ufs_hba *hba,
>>>    	if (!trace_ufshcd_upiu_enabled())
>>>    		return;
>>>    
>>> -	trace_ufshcd_upiu(hba, str_t, &rq_rsp->header,
>>> -			  &rq_rsp->qr, UFS_TSF_OSF);
>>> +	trace_call__ufshcd_upiu(hba, str_t, &rq_rsp->header,
>>> +			       &rq_rsp->qr, UFS_TSF_OSF);
>>>    }
>>
>> Instead of making this change, please remove the
>> trace_ufshcd_upiu_enabled() call because it is redundant.
> 
> You mean to remove the ufshcd_add_query_upiu_trace() function and just use
> a tracepoint where it is called?

That would be even better.

>>>    static void ufshcd_add_tm_upiu_trace(struct ufs_hba *hba, unsigned int tag,
>>> @@ -445,15 +445,15 @@ static void ufshcd_add_tm_upiu_trace(struct ufs_hba *hba, unsigned int tag,
>>>    		return;
>>>    
>>>    	if (str_t == UFS_TM_SEND)
>>> -		trace_ufshcd_upiu(hba, str_t,
>>> -				  &descp->upiu_req.req_header,
>>> -				  &descp->upiu_req.input_param1,
>>> -				  UFS_TSF_TM_INPUT);
>>> +		trace_call__ufshcd_upiu(hba, str_t,
>>> +					&descp->upiu_req.req_header,
>>> +					&descp->upiu_req.input_param1,
>>> +					UFS_TSF_TM_INPUT);
>>>    	else
>>> -		trace_ufshcd_upiu(hba, str_t,
>>> -				  &descp->upiu_rsp.rsp_header,
>>> -				  &descp->upiu_rsp.output_param1,
>>> -				  UFS_TSF_TM_OUTPUT);
>>> +		trace_call__ufshcd_upiu(hba, str_t,
>>> +					&descp->upiu_rsp.rsp_header,
>>> +					&descp->upiu_rsp.output_param1,
>>> +					UFS_TSF_TM_OUTPUT);
>>>    }
>>
>> Same comment here: I think it would be better to remove the
>> trace_ufshcd_upiu_enabled() call rather than
>> changing trace_ufshcd_upiu() into trace_call__ufshcd_upiu().
> 
> Well, removing it here would mean placing the if (str_t == UFS_TM_SEND) into
> the code and processing it even when tracing is disabled. With the
> trace_*_enabled() helper, it's all a nop.

The ufshcd_add_tm_upiu_trace() function is only called from the UFS
error handler and hence is not performance sensitive. The execution of
an additional if-test in this function is not a concern at all.

Thanks,

Bart.


* Re: [PATCH 03/13] verification/rvgen: Implement state and transition parser based on Lark
From: Wander Lairson Costa @ 2026-05-15 19:07 UTC (permalink / raw)
  To: Nam Cao; +Cc: Gabriele Monaco, Steven Rostedt, linux-trace-kernel, linux-kernel
In-Reply-To: <361efb610ba7c06b3668a953a6847ea80453c2e3.1777962130.git.namcao@linutronix.de>

On Tue, May 05, 2026 at 08:59:24AM +0200, Nam Cao wrote:
> The DOT parsing scripts directly parse the raw text and they are quite
> fragile. If the input dot files' formats are slightly changed (for
> instance, by breaking some long lines, which is allowed by the DOT
> language), the scripts would fail.
> 
> Prepare to move away from the raw text processing, implement parsers based
> on Lark which parse states, transitions and constraints.
> 
> The parse results are not used yet. The existing scripts will be converted
> one by one to them, and the raw text processing will eventually be removed.
> 
> Signed-off-by: Nam Cao <namcao@linutronix.de>
> ---
>  tools/verification/rvgen/rvgen/automata.py | 207 +++++++++++++++++++++
>  1 file changed, 207 insertions(+)
> 
> diff --git a/tools/verification/rvgen/rvgen/automata.py b/tools/verification/rvgen/rvgen/automata.py
> index 4e3d719a0952..32c16736a41b 100644
> --- a/tools/verification/rvgen/rvgen/automata.py
> +++ b/tools/verification/rvgen/rvgen/automata.py
> @@ -194,6 +194,155 @@ class ParseTree:
>          self.node_attrs = attributes_parser.node_attrs
>          self.edge_attrs = attributes_parser.edge_attrs
>  
> +class ConstraintCondition:
> +    def __init__(self, env: str, op: str, val: str, unit=None):
> +        self.env = env
> +        self.op = op
> +        self.val = val
> +        self.unit = unit
> +        if unit is None:
> +            # try to infer unit from constants or parameters
> +            val_for_unit = val.lower().replace("()", "")
> +            if val_for_unit.endswith("_ns"):
> +                self.unit = "ns"
> +            if val_for_unit.endswith("_jiffies"):
> +                self.unit = "j"
> +
> +class ConstraintRule:
> +    grammar = r'''
> +        rule: condition (OP condition)*
> +
> +        OP: "&&" | "||"
> +
> +        condition: ENV CMP_OP VAL UNIT?
> +
> +        ENV: CNAME
> +
> +        CMP_OP: "==" | "<=" | "<" | ">=" | ">"
> +
> +        VAL: /[0-9]+/
> +           | /[A-Z_]+\(\)/
> +           | /[A-Z_]+/
> +           | /[a-z_]+\(\)/
> +           | /[a-z_]+/
> +
> +        UNIT: "ns" | "us" | "ms" | "s"
> +    '''
> +
> +    def __init__(self, c: ConstraintCondition):
> +        '''
> +        A list of pairs of
> +          - the condition (e.g. is_constr_dl == 1)
> +          - the logical operator ("||" or "&&") combining this
> +            condition with the next one if it exists, otherwise None
> +
> +        TODO: Perhaps use an abstract syntax tree instead, because
> +              this representation cannot capture precedence
> +        '''
> +        self.rules = [[c, None]]

Here self.rules is a list of lists...

> +
> +    def chain(self, op: str, c: ConstraintCondition):
> +        self.rules[-1][1] = op
> +        self.rules.append((c, None))

... but here it is a list of tuples.
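
The mismatch is more than cosmetic. A minimal sketch, with plain strings
standing in for ConstraintCondition objects and hypothetical class names:
the first chain() call succeeds, but the second one tries to assign into
the tuple appended by the first and raises TypeError.

```python
class RuleAsPosted:
    """Mirrors the posted code: list first, tuples afterwards."""

    def __init__(self, c):
        self.rules = [[c, None]]

    def chain(self, op, c):
        self.rules[-1][1] = op        # fails once rules[-1] is a tuple
        self.rules.append((c, None))


class RuleWithLists:
    """Same logic, but every entry is a mutable two-element list."""

    def __init__(self, c):
        self.rules = [[c, None]]

    def chain(self, op, c):
        self.rules[-1][1] = op
        self.rules.append([c, None])
```

So any guard with three or more conditions would hit the failure;
appending a list in chain() is one way to avoid it.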

> +
> +class ConstraintReset:
> +    def __init__(self, env):
> +        self.env = env
> +
> +class StateLabelParser:
> +    grammar = r'''
> +    label: CNAME ("\\n" condition)?
> +
> +    %import common.CNAME
> +    %import common.WS
> +    %ignore WS
> +    ''' + ConstraintRule.grammar
> +
> +    def __init__(self, label: str):
> +        parser = lark.Lark(self.grammar, parser='lalr', start="label")
> +        tree = parser.parse(label)
> +
> +        self.state = tree.children[0]
> +        self.constraint = None
> +
> +        if len(tree.children) == 2:
> +            self.constraint = ConstraintCondition(*tree.children[1].children)
> +            if self.constraint.op not in ("<", "<="):
> +                raise AutomataError("State constraints must be clock expirations like"
> +                                    f" clk<N ({label})")
> +
> +class EventLabelParser:
> +    grammar = r'''
> +    events: event ("\\n" event)*
> +
> +    event: name (";" guard)*
> +
> +    guard: reset
> +         | rule
> +         | rule reset
> +         | reset rule
> +
> +    name: CNAME
> +
> +    reset: "reset" "(" ENV ")"
> +
> +    %import common.CNAME
> +    %import common.WS
> +    %ignore WS
> +    ''' + ConstraintRule.grammar
> +
> +    class GetEvents(lark.visitors.Transformer):
> +        def guard(self, args):
> +            reset = None
> +            rule = None
> +            for arg in args:
> +                if arg.data == "reset":
> +                    reset = ConstraintReset(arg.children[0])
> +                elif arg.data == "rule":
> +                    conditions = arg.children
> +                    rule = ConstraintRule(conditions[0])
> +                    for i in range(1, len(conditions), 2):
> +                        rule.chain(conditions[i], conditions[i + 1])
> +            return reset, rule
> +
> +        def OP(self, args):
> +            return args
> +
> +        def condition(self, args):
> +            return ConstraintCondition(*args)
> +
> +        def event(self, args):
> +            name = args[0]
> +            rule, reset = None, None
> +            if len(args) == 2:
> +                reset, rule = args[1]
> +            return name, reset, rule
> +
> +        def events(self, args):
> +            return args
> +
> +        def name(self, args):
> +            return args[0]
> +
> +    def __init__(self, label: str):
> +        parser = lark.Lark(self.grammar, parser='lalr', start="events")
> +        tree = parser.parse(label)
> +        self.events = self.GetEvents().transform(tree)
> +
> +class Transition:
> +    def __init__(self, src: str, dst: str, event: str,
> +                 reset: ConstraintReset, rule: ConstraintRule):
> +        self.src = src
> +        self.dst = dst
> +        self.event = event
> +        self.rule = rule
> +        self.reset = reset
> +
> +class State:
> +    def __init__(self, name: str, inv: ConstraintRule):
> +        self.name = name
> +        self.inv = inv
> +
>  class _ConstraintKey:
>      """Base class for constraint keys."""
>  
> @@ -248,6 +397,8 @@ class Automata:
>          self.name = model_name or self.__get_model_name()
>          self.__dot_lines = self.__open_dot()
>          self.__parse_tree = ParseTree(file_path)
> +        self.transitions = self.__parse_transitions()
> +        self._states, self._initial_state, self._final_states = self.__parse_states()
>          self.states, self.initial_state, self.final_states = self.__get_state_variables()
>          self.env_types = {}
>          self.env_stored = set()
> @@ -323,6 +474,62 @@ class Automata:
>  
>          return cursor
>  
> +    def __parse_transitions(self):
> +        transitions = []
> +
> +        for edge in self.__parse_tree.edges:
> +            attr = self.__parse_tree.edge_attrs.get(edge)
> +            if not attr:
> +                continue
> +
> +            label = attr.get("label")
> +
> +            src, dst = edge
> +
> +            parser = EventLabelParser(label)
> +            for event, reset, rule in parser.events:
> +                transitions.append(Transition(src, dst, event, reset, rule))
> +
> +        transitions.sort(key=lambda t : (t.src, t.event))
> +        return transitions
> +
> +    def __parse_states(self):
> +        initial_state = ""
> +        states = []
> +        final_states = []
> +
> +        for node in self.__parse_tree.nodes:
> +            attr = self.__parse_tree.node_attrs[node]
> +            label = attr["label"]
> +
> +            if node.startswith(Automata.init_marker):
> +                initial_state = node[len(Automata.init_marker):]
> +
> +            if not label:
> +                continue
> +
> +            parser = StateLabelParser(attr["label"])
> +            state = State(parser.state, parser.constraint)
> +
> +            states.append(state)
> +
> +            shape = attr.get("shape")
> +            if shape in ("doublecircle", "ellipse"):
> +                final_states.append(state)
> +
> +
> +        initial_state = next((s for s in states if s.name == initial_state), None)
> +        if not initial_state:
> +            raise AutomataError("The automaton doesn't have an initial state")
> +
> +        if not final_states:
> +            final_states.append(initial_state)
> +
> +        states.remove(initial_state)
> +        states.sort(key=lambda s : s.name)
> +        states.insert(0, initial_state)
> +        return states, initial_state, final_states
> +
>      def __get_state_variables(self) -> tuple[list[str], str, list[str]]:
>          # wait for node declaration
>          states = []
> -- 
> 2.47.3
> 



* Re: [PATCH v3 08/11] scsi: ufs: Use trace_call__##name() at guarded tracepoint call sites
From: Steven Rostedt @ 2026-05-15 18:50 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Vineeth Pillai (Google), James E.J. Bottomley, Martin K. Petersen,
	linux-scsi, linux-trace-kernel, Peter Zijlstra
In-Reply-To: <9fde73e7-0108-48d7-a1a0-ccc9776beb5c@acm.org>

On Fri, 15 May 2026 08:27:27 -0700
Bart Van Assche <bvanassche@acm.org> wrote:

> On 5/15/26 6:59 AM, Vineeth Pillai (Google) wrote:
> >   static void ufshcd_add_query_upiu_trace(struct ufs_hba *hba,
> > @@ -432,8 +432,8 @@ static void ufshcd_add_query_upiu_trace(struct ufs_hba *hba,
> >   	if (!trace_ufshcd_upiu_enabled())
> >   		return;
> >   
> > -	trace_ufshcd_upiu(hba, str_t, &rq_rsp->header,
> > -			  &rq_rsp->qr, UFS_TSF_OSF);
> > +	trace_call__ufshcd_upiu(hba, str_t, &rq_rsp->header,
> > +			       &rq_rsp->qr, UFS_TSF_OSF);
> >   }  
> 
> Instead of making this change, please remove the 
> trace_ufshcd_upiu_enabled() call because it is redundant.

You mean to remove the ufshcd_add_query_upiu_trace() function and just use
a tracepoint where it is called?

Makes sense.

> 
> >   static void ufshcd_add_tm_upiu_trace(struct ufs_hba *hba, unsigned int tag,
> > @@ -445,15 +445,15 @@ static void ufshcd_add_tm_upiu_trace(struct ufs_hba *hba, unsigned int tag,
> >   		return;
> >   
> >   	if (str_t == UFS_TM_SEND)
> > -		trace_ufshcd_upiu(hba, str_t,
> > -				  &descp->upiu_req.req_header,
> > -				  &descp->upiu_req.input_param1,
> > -				  UFS_TSF_TM_INPUT);
> > +		trace_call__ufshcd_upiu(hba, str_t,
> > +					&descp->upiu_req.req_header,
> > +					&descp->upiu_req.input_param1,
> > +					UFS_TSF_TM_INPUT);
> >   	else
> > -		trace_ufshcd_upiu(hba, str_t,
> > -				  &descp->upiu_rsp.rsp_header,
> > -				  &descp->upiu_rsp.output_param1,
> > -				  UFS_TSF_TM_OUTPUT);
> > +		trace_call__ufshcd_upiu(hba, str_t,
> > +					&descp->upiu_rsp.rsp_header,
> > +					&descp->upiu_rsp.output_param1,
> > +					UFS_TSF_TM_OUTPUT);
> >   }  
> 
> Same comment here: I think it would be better to remove the 
> trace_ufshcd_upiu_enabled() call rather than
> changing trace_ufshcd_upiu() into trace_call__ufshcd_upiu().

Well, removing it here would mean placing the if (str_t == UFS_TM_SEND) into
the code and processing it even when tracing is disabled. With the
trace_*_enabled() helper, it's all a nop.

-- Steve




* Re: [PATCH v3 07/11] HID: Use trace_call__##name() at guarded tracepoint call sites
From: Steven Rostedt @ 2026-05-15 18:43 UTC (permalink / raw)
  To: srinivas pandruvada
  Cc: Vineeth Pillai (Google), Jiri Kosina, Benjamin Tissoires,
	linux-input, linux-trace-kernel, Peter Zijlstra
In-Reply-To: <fbc8c9659f707f46b5d8a6479fc42d5bb1d0efcd.camel@linux.intel.com>

On Fri, 15 May 2026 08:09:25 -0700
srinivas pandruvada <srinivas.pandruvada@linux.intel.com> wrote:

> On Fri, 2026-05-15 at 09:59 -0400, Vineeth Pillai (Google) wrote:
> > From: Vineeth Pillai <vineeth@bitbyteword.org>
> > 
> > Replace trace_foo() with the new trace_call__foo() at sites already
> > guarded by trace_foo_enabled(), avoiding a redundant
> > static_branch_unlikely() re-evaluation inside the tracepoint.
> > trace_call__foo() calls the tracepoint callbacks directly without
> > utilizing the static branch again.
> > 
> > Original v2 series:
> > https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/
> > 
> > Parts of the original v2 series have already been merged in mainline.
> > This patch is being reposted as a follow-up cleanup for the remaining
> > unmerged pieces.
> > 
> > Suggested-by: Steven Rostedt <rostedt@goodmis.org>
> > Suggested-by: Peter Zijlstra <peterz@infradead.org>
> > Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
> > Assisted-by: Claude:claude-sonnet-4-6  
> 
>     Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> 

Thanks, I'll take this through my tree.

-- Steve


* Re: [PATCH 1/3] cpufreq: amd-pstate: Use trace_call__##name() at guarded tracepoint call site
From: Steven Rostedt @ 2026-05-15 18:40 UTC (permalink / raw)
  To: Mario Limonciello
  Cc: Vineeth Pillai (Google), Huang Rui, Rafael J. Wysocki,
	Viresh Kumar, linux-pm, linux-trace-kernel, Peter Zijlstra
In-Reply-To: <00e25aaf-6b85-4717-9b63-2c607a446a77@amd.com>

On Fri, 15 May 2026 13:29:33 -0500
Mario Limonciello <mario.limonciello@amd.com> wrote:

> No concerns with this going through another tree together.
> 
> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>

I'll apply it to my tree.

Thanks,

-- Steve


* Re: [PATCH 02/13] verification/rvgen: Introduce a parse tree for automata using Lark
From: Wander Lairson Costa @ 2026-05-15 18:37 UTC (permalink / raw)
  To: Nam Cao; +Cc: Gabriele Monaco, Steven Rostedt, linux-trace-kernel, linux-kernel
In-Reply-To: <050ac3d7aeb1ece12a4deb91fc173de24ad147de.1777962130.git.namcao@linutronix.de>

On Tue, May 05, 2026 at 08:59:23AM +0200, Nam Cao wrote:
> The DOT parsing scripts directly parse the raw text and they are quite
> fragile. If the input dot files' formats are slightly changed (for
> instance, by breaking some long lines, which is allowed by the DOT language
> defined by graphviz), the scripts would fail.
> 
> To make the scripts robust, the parser should be implemented based on the
> dot language specification, not based on how the existing dot files look.
> 
> As a first step, use Lark to implement a Parser based on the graphviz dot
> language specification. The resulting parse tree is not used yet, but the
> existing scripts will be converted one by one to use this new parse tree in
> the follow-up commits.
> 
> Signed-off-by: Nam Cao <namcao@linutronix.de>
> ---
>  tools/verification/rvgen/rvgen/automata.py | 182 +++++++++++++++++++++
>  1 file changed, 182 insertions(+)
> 
> diff --git a/tools/verification/rvgen/rvgen/automata.py b/tools/verification/rvgen/rvgen/automata.py
> index b9f8149f7118..4e3d719a0952 100644
> --- a/tools/verification/rvgen/rvgen/automata.py
> +++ b/tools/verification/rvgen/rvgen/automata.py
> @@ -13,6 +13,187 @@ import re
>  from typing import Iterator
>  from itertools import islice
>  
> +import lark
> +
> +class ParseTree:
> +    # based on https://graphviz.org/doc/info/lang.html
> +    # with the irrelevant stuffs (port and compass) removed
> +    grammar = r'''
> +    start: "strict"? ("graph" | "digraph") ID? "{" stmt_list "}"
> +
> +    stmt_list: (stmt ";"? stmt_list)?
> +
> +    stmt: node_stmt
> +        | edge_stmt
> +        | attr_stmt
> +        | ID "=" ID
> +        | subgraph
> +
> +    attr_stmt: attr_type attr_list
> +
> +    attr_type: "graph" -> graph
> +            | "node"  -> node
> +            | "edge"  -> edge
> +
> +    attr_list: "[" a_list? "]" attr_list?
> +
> +    a_list: ID "=" ID (";" | ",")? a_list?
> +
> +    edge_stmt: (node_id | subgraph) edgerhs attr_list?
> +
> +    edgerhs: edgeop (node_id | subgraph) edgerhs?
> +
> +    edgeop: "->" | "--"
> +
> +    node_stmt: node_id attr_list?
> +
> +    node_id: ID
> +
> +    subgraph: ("subgraph" ID?)? "{" stmt_list "}"
> +
> +    ID: /[_a-zA-Z][_a-zA-Z0-9]+/

This regex rejects single-character symbols. Is that intentional?
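
A minimal sketch with Python's re module, checking only the regular
expression and not Lark's lexer: the "+" quantifier requires at least
two characters, whereas "*" also accepts one-character names such as
"a" or "_". The variable names are hypothetical.

```python
import re

# First alternative of the ID terminal as posted.
ID_AS_POSTED = re.compile(r"[_a-zA-Z][_a-zA-Z0-9]+")
# Same pattern with "*" so the trailing part may be empty.
ID_WITH_STAR = re.compile(r"[_a-zA-Z][_a-zA-Z0-9]*")


def accepts(pattern, text):
    """True if the whole string is a valid identifier for this pattern."""
    return pattern.fullmatch(text) is not None
```

The graphviz grammar allows single-character IDs, so a node named "a"
would only parse here if written as a quoted string.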

> +      | /-?(\.[0-9]+|[0-9]+(\.[0-9]*))/
> +      | /".*?"/
> +
> +    %import common.WS
> +    %ignore WS
> +    '''
> +
> +    @staticmethod
> +    def parse_edge(tree: lark.Tree) -> tuple[str, str]:
> +        # only support a simple node-to-node edge
> +        nodes = []
> +        for node in tree.iter_subtrees_topdown():
> +            if node.data == "node_id":
> +                nodes.append(node.children[0].strip('"'))
> +
> +        if len(nodes) != 2:
> +            raise AutomataError("Only state-to-state transition is supported")
> +
> +        return tuple(nodes)
> +
> +    class ParseNodes(lark.visitors.Visitor):
> +        def __init__(self, *args, **kwargs):
> +            self.nodes = set()
> +            super().__init__(*args, **kwargs)
> +
> +        def node_stmt(self, tree):
> +            node_id = tree.children[0]
> +            node = node_id.children[0].strip('"')
> +            self.nodes.add(node)
> +
> +    class ParseEdges(lark.visitors.Visitor):
> +        def __init__(self, *args, **kwargs):
> +            self.edges = set()
> +            super().__init__(*args, **kwargs)
> +
> +        def edge_stmt(self, tree):
> +            edge = ParseTree.parse_edge(tree)
> +            self.edges.add(edge)
> +
> +    class ParseAttributes(lark.visitors.Interpreter):
> +        def __init__(self, *args, **kwargs):
> +            '''
> +            Stacks of default attributes. [0] is the default
> +            attributes for the outermost scope, while [-1] is the
> +            default attributes for the current scope.
> +            '''
> +            self.default_node_attrs = [{}]
> +            self.default_edge_attrs = [{}]
> +
> +            self.node_attrs = {}
> +            self.edge_attrs = {}
> +
> +            super().__init__(*args, **kwargs)
> +
> +        @staticmethod
> +        def __get_attrs(stmt: lark.Tree) -> dict[str, str]:
> +            attrs = {}
> +
> +            for node in stmt.iter_subtrees():
> +                if node.data == "a_list":
> +                    attrs[node.children[0]] = node.children[1].strip('"')
> +
> +            return attrs
> +
> +
> +        def subgraph(self, tree):
> +            # We are entering a new scope, inherit the default
> +            # attributes of the outer scope
> +            self.default_node_attrs.append(self.default_node_attrs[-1].copy())
> +            self.default_edge_attrs.append(self.default_edge_attrs[-1].copy())
> +
> +            children = self.visit_children(tree)
> +
> +            # Exiting the scope
> +            del self.default_node_attrs[-1]
> +            del self.default_edge_attrs[-1]
> +
> +            return children
> +
> +        def node_stmt(self, tree):
> +            node_id = tree.children[0]
> +            node = node_id.children[0].strip('"')
> +
> +            attrs = self.default_node_attrs[-1].copy()
> +            attrs |= self.__get_attrs(tree)
> +
> +            if attrs:
> +                if node in self.node_attrs:
> +                    self.node_attrs[node] = attrs | self.node_attrs[node]
> +                else:
> +                    self.node_attrs[node] = attrs
> +
> +            return self.visit_children(tree)
> +
> +        def edge_stmt(self, tree):
> +            edge = ParseTree.parse_edge(tree)
> +
> +            attrs = self.default_edge_attrs[-1].copy()
> +            attrs |= self.__get_attrs(tree)
> +
> +            if attrs:
> +                if edge in self.edge_attrs:
> +                    self.edge_attrs[edge] = attrs | self.edge_attrs[edge]
> +                else:
> +                    self.edge_attrs[edge] = attrs
> +
> +            return self.visit_children(tree)
> +
> +        def attr_stmt(self, tree):
> +            attr_type = tree.children[0].data
> +            attrs = self.__get_attrs(tree)
> +
> +            if attr_type == "node":
> +                self.default_node_attrs[-1] |= attrs
> +            elif attr_type == "edge":
> +                self.default_edge_attrs[-1] |= attrs
> +            else:
> +                # graph attributes are irrelevant
> +                pass
> +
> +            self.visit_children(tree)
> +
> +    def __init__(self, dot_file):
> +        parser = lark.Lark(self.grammar, parser='lalr')
> +        node_parser = self.ParseNodes()
> +        edge_parser = self.ParseEdges()
> +        attributes_parser = self.ParseAttributes()
> +
> +        try:
> +            with open(dot_file, "r") as dot_file:
> +                tree = parser.parse(dot_file.read())
> +                attributes_parser.visit(tree)
> +                node_parser.visit(tree)
> +                edge_parser.visit(tree)
> +        except OSError as exc:
> +            raise AutomataError(exc.strerror) from exc
> +
> +        self.nodes = node_parser.nodes
> +        self.edges = edge_parser.edges
> +        self.node_attrs = attributes_parser.node_attrs
> +        self.edge_attrs = attributes_parser.edge_attrs
> +
>  class _ConstraintKey:
>      """Base class for constraint keys."""
>  
> @@ -66,6 +247,7 @@ class Automata:
>          self.__dot_path = file_path
>          self.name = model_name or self.__get_model_name()
>          self.__dot_lines = self.__open_dot()
> +        self.__parse_tree = ParseTree(file_path)
>          self.states, self.initial_state, self.final_states = self.__get_state_variables()
>          self.env_types = {}
>          self.env_stored = set()
> -- 
> 2.47.3
> 


^ permalink raw reply

* Re: [PATCH 1/3] cpufreq: amd-pstate: Use trace_call__##name() at guarded tracepoint call site
From: Mario Limonciello @ 2026-05-15 18:29 UTC (permalink / raw)
  To: Vineeth Pillai (Google), Huang Rui, Rafael J. Wysocki,
	Viresh Kumar
  Cc: linux-pm, Steven Rostedt, linux-trace-kernel, Peter Zijlstra
In-Reply-To: <20260515140121.2239414-1-vineeth@bitbyteword.org>



On 5/15/26 09:01, Vineeth Pillai (Google) wrote:
> From: Vineeth Pillai <vineeth@bitbyteword.org>
> 
> Replace trace_foo() with the new trace_call__foo() at sites already
> guarded by trace_foo_enabled(), avoiding a redundant
> static_branch_unlikely() re-evaluation inside the tracepoint.
> trace_call__foo() calls the tracepoint callbacks directly without
> utilizing the static branch again.
> 
> Original v2 series:
> https://lore.kernel.org/linux-trace-kernel/20260323160052.17528-1-vineeth@bitbyteword.org/
> 
> Parts of the original v2 series have already been merged in mainline.
> This patch is being reposted as a follow-up cleanup for the remaining
> unmerged pieces.
> 
> Suggested-by: Steven Rostedt <rostedt@goodmis.org>
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
> Assisted-by: Claude:claude-sonnet-4-6

No concerns with this going through another tree together.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
>   drivers/cpufreq/amd-pstate.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 453084c67327..4722de25149b 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -368,7 +368,8 @@ static int amd_pstate_set_floor_perf(struct cpufreq_policy *policy, u8 perf)
>   
>   out_trace:
>   	if (trace_amd_pstate_cppc_req2_enabled())
> -		trace_amd_pstate_cppc_req2(cpudata->cpu, perf, changed, ret);
> +		trace_call__amd_pstate_cppc_req2(cpudata->cpu, perf, changed,
> +						 ret);
>   	return ret;
>   }
>   


^ permalink raw reply

* Re: (subset) [PATCH v3 00/28] vfs/nfsd: add support for CB_NOTIFY callbacks in directory delegations
From: Christian Brauner @ 2026-05-15 17:26 UTC (permalink / raw)
  To: Jeff Layton, Chuck Lever
  Cc: Christian Brauner, Alexander Viro, Jan Kara, Alexander Aring,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, NeilBrown, Olga Kornievskaia,
	Dai Ngo, Tom Talpey, Trond Myklebust, Anna Schumaker,
	Amir Goldstein, Calum Mackay, linux-fsdevel, linux-kernel,
	linux-trace-kernel, linux-doc, linux-nfs
In-Reply-To: <20260428-dir-deleg-v3-0-5a0780ba9def@kernel.org>

On Tue, 28 Apr 2026 08:09:44 +0100, Jeff Layton wrote:
> Re-posting the set per Christian's request. The only difference in this
> version is a small error handling fix in alloc_init_dir_deleg(). The old
> version could crash since release_pages() can't handle an array with
> NULL pointers in it.
> 
> ---------------------------------8<------------------------------------
> 
> [...]

@Chuck, @Jeff, I've only merged the vfs-specific changes into a stable branch.
You can pull it; I won't touch it again. You can pull the nfsd work in
whatever form you like. Same procedure I use with io_uring et al.

Let me know if that works for you.

---

Applied to the vfs-7.2.directory.delegations branch of the vfs/vfs.git tree.
Patches in the vfs-7.2.directory.delegations branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs-7.2.directory.delegations

[01/28] filelock: pass current blocking lease to trace_break_lease_block() rather than "new_fl"
        https://git.kernel.org/vfs/vfs/c/89330d3a60f7
[02/28] filelock: add support for ignoring deleg breaks for dir change events
        https://git.kernel.org/vfs/vfs/c/24cbf43337f4
[03/28] filelock: add a tracepoint to start of break_lease()
        https://git.kernel.org/vfs/vfs/c/e39026a86b48
[04/28] filelock: add an inode_lease_ignore_mask helper
        https://git.kernel.org/vfs/vfs/c/95825fdcc0b0
[05/28] fsnotify: new tracepoint in fsnotify()
        https://git.kernel.org/vfs/vfs/c/ad4489dcd08d
[06/28] fsnotify: add fsnotify_modify_mark_mask()
        https://git.kernel.org/vfs/vfs/c/12ffbb117b64
[07/28] fsnotify: add FSNOTIFY_EVENT_RENAME data type
        https://git.kernel.org/vfs/vfs/c/010043003c0c

^ permalink raw reply

* Re: [PATCH v3 1/2] tracing: Return ERR_PTR() from expr_str()
From: Steven Rostedt @ 2026-05-15 16:24 UTC (permalink / raw)
  To: Pengpeng Hou
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-trace-kernel,
	linux-kernel
In-Reply-To: <20260417223002.1-tracing-expr-v3-pengpeng@iscas.ac.cn>

On Fri, 17 Apr 2026 20:24:00 +0800
Pengpeng Hou <pengpeng@iscas.ac.cn> wrote:

> diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
> index 73ea180cad55..954e0beb7f0a 100644
> --- a/kernel/trace/trace_events_hist.c
> +++ b/kernel/trace/trace_events_hist.c
> @@ -1764,13 +1764,14 @@ static void expr_field_str(struct hist_field *field, char *expr)
>  static char *expr_str(struct hist_field *field, unsigned int level)
>  {
>  	char *expr;
> +	int ret = -EINVAL;

No need for the ret value, use:

	char *expr __free(kfree) = NULL;

instead.

>  
>  	if (level > 1)
> -		return NULL;
> +		return ERR_PTR(-EINVAL);
>  
>  	expr = kzalloc(MAX_FILTER_STR_VAL, GFP_KERNEL);
>  	if (!expr)
> -		return NULL;
> +		return ERR_PTR(-ENOMEM);
>  
>  	if (!field->operands[0]) {
>  		expr_field_str(field, expr);
> @@ -1782,9 +1783,9 @@ static char *expr_str(struct hist_field *field, unsigned int level)
>  
>  		strcat(expr, "-(");
>  		subexpr = expr_str(field->operands[0], ++level);
> -		if (!subexpr) {
> -			kfree(expr);
> -			return NULL;
> +		if (IS_ERR(subexpr)) {
> +			ret = PTR_ERR(subexpr);
> +			goto free;

The above could then be:

		if (IS_ERR(subexpr))
			return subexpr;

>  		}
>  		strcat(expr, subexpr);
>  		strcat(expr, ")");
> @@ -1810,13 +1811,16 @@ static char *expr_str(struct hist_field *field, unsigned int level)
>  		strcat(expr, "*");
>  		break;
>  	default:
> -		kfree(expr);
> -		return NULL;

This could be just:

		return ERR_PTR(-EINVAL);


> +		goto free;
>  	}
>  
>  	expr_field_str(field->operands[1], expr);
>  
>  	return expr;

And the above would be:

	return_ptr(expr);

> +
> +free:
> +	kfree(expr);
> +	return ERR_PTR(ret);

And the above isn't needed.

-- Steve

>  }
>  

^ permalink raw reply

* Re: [PATCH v3 2/2] tracing: Bound histogram expression strings with seq_buf
From: Steven Rostedt @ 2026-05-15 16:16 UTC (permalink / raw)
  To: Pengpeng Hou
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-trace-kernel,
	linux-kernel
In-Reply-To: <20260417223002.2-tracing-expr-v3-pengpeng@iscas.ac.cn>

On Fri, 17 Apr 2026 20:28:00 +0800
Pengpeng Hou <pengpeng@iscas.ac.cn> wrote:

> expr_str() allocates a fixed MAX_FILTER_STR_VAL buffer and then builds
> expression names with a series of raw strcat() appends. Nested operands,
> constants and field flags can push the rendered string past that fixed
> limit before the name is attached to the hist field.
> 
> Build the expression strings with seq_buf and return -E2BIG when the
> rendered name would exceed MAX_FILTER_STR_VAL.
> 
> Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
> ---

Things have changed and this no longer applies cleanly. Can you send a v4
rebased on top of v7.1-rc3?

Also, make sure it's a new thread and not a reply to this patch series.

You can add a

 Changes since v3: https://lore.kernel.org/all/20260417223002.2-tracing-expr-v3-pengpeng@iscas.ac.cn/

to that patch too.

-- Steve


> Changes since v2:
> - split the ERR_PTR() conversion into patch 1/2 as requested by Steven
>   Rostedt
> - keep this patch focused on the seq_buf conversion and overflow
>   detection
> 

^ permalink raw reply

* Re: [PATCH 01/13] verification/rvgen: Switch LTL parser to Lark
From: Wander Lairson Costa @ 2026-05-15 15:55 UTC (permalink / raw)
  To: Nam Cao; +Cc: Gabriele Monaco, Steven Rostedt, linux-trace-kernel, linux-kernel
In-Reply-To: <85aaa8cacb31cfc78619a07aeae9a86d059a4cc1.1777962130.git.namcao@linutronix.de>

On Tue, May 05, 2026 at 08:59:22AM +0200, Nam Cao wrote:
> The LTL parser is built using Ply. However, Ply is no longer
> maintained [1].
> 
> Switch to use Lark instead. In addition to being actively maintained, Lark
> also offers additional features (namely, automatically creating the
> abstract syntax tree) which make the parser simpler.
> 
> Link: https://github.com/dabeaz/ply/commit/9d7c40099e23ff78f9d86ef69a26c1e8a83e706a [1]
> Signed-off-by: Nam Cao <namcao@linutronix.de>
> ---
>  tools/verification/rvgen/rvgen/ltl2ba.py | 189 +++++++++--------------
>  1 file changed, 70 insertions(+), 119 deletions(-)
> 
> diff --git a/tools/verification/rvgen/rvgen/ltl2ba.py b/tools/verification/rvgen/rvgen/ltl2ba.py
> index 7f538598a868..b2dee2dbe257 100644
> --- a/tools/verification/rvgen/rvgen/ltl2ba.py
> +++ b/tools/verification/rvgen/rvgen/ltl2ba.py
> @@ -7,8 +7,7 @@
>  # https://doi.org/10.1007/978-0-387-34892-6_1
>  # With extra optimizations
>  
> -from ply.lex import lex
> -from ply.yacc import yacc
> +import lark
>  from .automata import AutomataError
>  
>  # Grammar:
> @@ -30,42 +29,38 @@ from .automata import AutomataError
>  #       imply
>  #       equivalent
>  
> -tokens = (
> -   'AND',
> -   'OR',
> -   'IMPLY',
> -   'UNTIL',
> -   'ALWAYS',
> -   'EVENTUALLY',
> -   'NEXT',
> -   'VARIABLE',
> -   'LITERAL',
> -   'NOT',
> -   'LPAREN',
> -   'RPAREN',
> -   'ASSIGN',
> -)
> -
> -t_AND = r'and'
> -t_OR = r'or'
> -t_IMPLY = r'imply'
> -t_UNTIL = r'until'
> -t_ALWAYS = r'always'
> -t_NEXT = r'next'
> -t_EVENTUALLY = r'eventually'
> -t_VARIABLE = r'[A-Z_0-9]+'
> -t_LITERAL = r'true|false'
> -t_NOT = r'not'
> -t_LPAREN = r'\('
> -t_RPAREN = r'\)'
> -t_ASSIGN = r'='
> -t_ignore_COMMENT = r'\#.*'
> -t_ignore = ' \t\n'
> -
> -def t_error(t):
> -    raise AutomataError(f"Illegal character '{t.value[0]}'")
> -
> -lexer = lex()
> +GRAMMAR = r'''
> +start: assign+
> +
> +assign: VARIABLE "=" _ltl
> +
> +_ltl: _opd | binop | unop
> +
> +_opd : VARIABLE
> +     | LITERAL
> +     | "(" _ltl ")"
> +
> +unop: UNOP _ltl
> +UNOP: "always"
> +     | "eventually"
> +     | "next"
> +     | "not"
> +
> +binop: _opd BINOP _ltl
> +BINOP: "until"
> +      | "and"
> +      | "or"
> +      | "imply"
> +
> +VARIABLE: /[A-Z_0-9]+/

Is it ok to start variable names with a number? (unless I am reading the
regex wrong).
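
A quick check of the pattern with Python's re module, as a sketch only. The
STRICT pattern is a hypothetical alternative and is not part of the patch; the
sample names are made up. Note that the removed Ply token t_VARIABLE used the
same regex, so the patch does not change this behaviour.

```python
import re

# VARIABLE terminal as written in the patch.
VARIABLE = re.compile(r"[A-Z_0-9]+")
# Hypothetical stricter form that forbids a leading digit (not in the patch).
STRICT = re.compile(r"[A-Z_][A-Z_0-9]*")


def accepts(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


# Made-up names, for illustration only.
for name in ("RULE_1", "1RULE", "42"):
    print(f"{name!r}: patch={accepts(VARIABLE, name)} "
          f"strict={accepts(STRICT, name)}")
```

With the pattern from the patch, a name made only of digits is also a valid
variable.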

> +LITERAL: "true" | "false"
> +
> +COMMENT: "#" /.*/ "\n"
> +%ignore COMMENT
> +
> +%import common.WS
> +%ignore WS
> +'''
>  
>  class GraphNode:
>      uid = 0
> @@ -432,90 +427,46 @@ class Literal:
>          node.old |= {n}
>          return node.expand(node_set)
>  
> -def p_spec(p):
> -    '''
> -    spec : assign
> -         | assign spec
> -    '''
> -    if len(p) == 3:
> -        p[2].append(p[1])
> -        p[0] = p[2]
> -    else:
> -        p[0] = [p[1]]
> -
> -def p_assign(p):
> -    '''
> -    assign : VARIABLE ASSIGN ltl
> -    '''
> -    p[0] = (p[1], p[3])
> -
> -def p_ltl(p):
> -    '''
> -    ltl : opd
> -        | binop
> -        | unop
> -    '''
> -    p[0] = p[1]
> -
> -def p_opd(p):
> -    '''
> -    opd : VARIABLE
> -        | LITERAL
> -        | LPAREN ltl RPAREN
> -    '''
> -    if p[1] == "true":
> -        p[0] = ASTNode(Literal(True))
> -    elif p[1] == "false":
> -        p[0] = ASTNode(Literal(False))
> -    elif p[1] == '(':
> -        p[0] = p[2]
> -    else:
> -        p[0] = ASTNode(Variable(p[1]))
> -
> -def p_unop(p):
> -    '''
> -    unop : ALWAYS ltl
> -         | EVENTUALLY ltl
> -         | NEXT ltl
> -         | NOT ltl
> -    '''
> -    if p[1] == "always":
> -        op = AlwaysOp(p[2])
> -    elif p[1] == "eventually":
> -        op = EventuallyOp(p[2])
> -    elif p[1] == "next":
> -        op = NextOp(p[2])
> -    elif p[1] == "not":
> -        op = NotOp(p[2])
> -    else:
> -        raise AutomataError(f"Invalid unary operator {p[1]}")
> -
> -    p[0] = ASTNode(op)
> -
> -def p_binop(p):
> -    '''
> -    binop : opd UNTIL ltl
> -          | opd AND ltl
> -          | opd OR ltl
> -          | opd IMPLY ltl
> -    '''
> -    if p[2] == "and":
> -        op = AndOp(p[1], p[3])
> -    elif p[2] == "until":
> -        op = UntilOp(p[1], p[3])
> -    elif p[2] == "or":
> -        op = OrOp(p[1], p[3])
> -    elif p[2] == "imply":
> -        op = ImplyOp(p[1], p[3])
> -    else:
> -        raise AutomataError(f"Invalid binary operator {p[2]}")
> -
> -    p[0] = ASTNode(op)
> -
> -parser = yacc()
> +class Transform(lark.visitors.Transformer):
> +    def unop(self, node):
> +        if node[0] == "always":
> +            return ASTNode(AlwaysOp(node[1]))
> +        if node[0] == "eventually":
> +            return ASTNode(EventuallyOp(node[1]))
> +        if node[0] == "next":
> +            return ASTNode(NextOp(node[1]))
> +        if node[0] == "not":
> +            return ASTNode(NotOp(node[1]))
> +        raise ValueError("Unknown operator %s" % node[1])
> +
> +    def binop(self, node):
> +        if node[1] == "until":
> +            return ASTNode(UntilOp(node[0], node[2]))
> +        if node[1] == "and":
> +            return ASTNode(AndOp(node[0], node[2]))
> +        if node[1] == "or":
> +            return ASTNode(OrOp(node[0], node[2]))
> +        if node[1] == "imply":
> +            return ASTNode(ImplyOp(node[0], node[2]))
> +        raise ValueError("Unknown operator %s" % node[1])
> +
> +    def VARIABLE(self, args):
> +        return ASTNode(Variable(args))
> +
> +    def LITERAL(self, args):
> +        return ASTNode(Variable(args))

Shouldn't this be `return ASTNode(Literal(args == "true"))`?
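
A minimal sketch of the mapping I would expect here, matching what the removed
p_opd() did with Literal(True) and Literal(False). The Literal class below is
a stand-in written for this example, not the one in ltl2ba.py, and
literal_from_token() is a hypothetical helper.

```python
class Literal:
    """Stand-in for the real Literal class; it only stores a boolean."""

    def __init__(self, value: bool):
        self.value = value


def literal_from_token(token: str) -> Literal:
    # Lark tokens are str subclasses, so a plain str stands in for one here.
    # Compare the token text first, then wrap the resulting bool.
    return Literal(token == "true")


for text in ("true", "false"):
    print(text, "->", literal_from_token(text).value)
```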

> +
> +    def start(self, node):
> +        return node
> +
> +    def assign(self, node):
> +        return node[0].op.name, node[1]
> +
> +parser = lark.Lark(GRAMMAR)
>  
>  def parse_ltl(s: str) -> ASTNode:
>      spec = parser.parse(s)
> +    spec = Transform().transform(spec)
>  
>      rule = None
>      subexpr = {}
> -- 
> 2.47.3
> 


^ permalink raw reply

* Re: [PATCH v3 08/11] scsi: ufs: Use trace_call__##name() at guarded tracepoint call sites
From: Bart Van Assche @ 2026-05-15 15:27 UTC (permalink / raw)
  To: Vineeth Pillai (Google), James E.J. Bottomley, Martin K. Petersen
  Cc: linux-scsi, Steven Rostedt, linux-trace-kernel, Peter Zijlstra
In-Reply-To: <20260515135946.2238888-1-vineeth@bitbyteword.org>


On 5/15/26 6:59 AM, Vineeth Pillai (Google) wrote:
>   static void ufshcd_add_query_upiu_trace(struct ufs_hba *hba,
> @@ -432,8 +432,8 @@ static void ufshcd_add_query_upiu_trace(struct ufs_hba *hba,
>   	if (!trace_ufshcd_upiu_enabled())
>   		return;
>   
> -	trace_ufshcd_upiu(hba, str_t, &rq_rsp->header,
> -			  &rq_rsp->qr, UFS_TSF_OSF);
> +	trace_call__ufshcd_upiu(hba, str_t, &rq_rsp->header,
> +			       &rq_rsp->qr, UFS_TSF_OSF);
>   }

Instead of making this change, please remove the 
trace_ufshcd_upiu_enabled() call because it is redundant.

>   static void ufshcd_add_tm_upiu_trace(struct ufs_hba *hba, unsigned int tag,
> @@ -445,15 +445,15 @@ static void ufshcd_add_tm_upiu_trace(struct ufs_hba *hba, unsigned int tag,
>   		return;
>   
>   	if (str_t == UFS_TM_SEND)
> -		trace_ufshcd_upiu(hba, str_t,
> -				  &descp->upiu_req.req_header,
> -				  &descp->upiu_req.input_param1,
> -				  UFS_TSF_TM_INPUT);
> +		trace_call__ufshcd_upiu(hba, str_t,
> +					&descp->upiu_req.req_header,
> +					&descp->upiu_req.input_param1,
> +					UFS_TSF_TM_INPUT);
>   	else
> -		trace_ufshcd_upiu(hba, str_t,
> -				  &descp->upiu_rsp.rsp_header,
> -				  &descp->upiu_rsp.output_param1,
> -				  UFS_TSF_TM_OUTPUT);
> +		trace_call__ufshcd_upiu(hba, str_t,
> +					&descp->upiu_rsp.rsp_header,
> +					&descp->upiu_rsp.output_param1,
> +					UFS_TSF_TM_OUTPUT);
>   }

Same comment here: I think it would be better to remove the 
trace_ufshcd_upiu_enabled() call rather than
changing trace_ufshcd_upiu() into trace_call__ufshcd_upiu().

Thanks,

Bart.

^ permalink raw reply

