Linux Trace Kernel


* [PATCH 2/5] docs: fix repeated word 'that' across documentation
From: Adrien Reynard @ 2026-05-08 16:37 UTC (permalink / raw)
  To: Paul E. McKenney, Frederic Weisbecker, Neeraj Upadhyay,
	Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Jonathan Corbet, Shuah Khan, Greg Kroah-Hartman,
	Rafael J. Wysocki, Danilo Krummrich, David Howells,
	Paulo Alcantara, Masami Hiramatsu,
	open list:READ-COPY UPDATE (RCU), open list:DOCUMENTATION,
	open list, open list:DRIVER CORE, KOBJECTS, DEBUGFS AND SYSFS,
	open list:FILESYSTEMS [NETFS LIBRARY],
	open list:FILESYSTEMS [NETFS LIBRARY], open list:TRACING
  Cc: Adrien Reynard

Signed-off-by: Adrien Reynard <reynard.adrien.08@gmail.com>
---
 Documentation/RCU/rcu.rst                          | 2 +-
 Documentation/driver-api/driver-model/overview.rst | 2 +-
 Documentation/filesystems/netfs_library.rst        | 2 +-
 Documentation/trace/histogram-design.rst           | 2 +-
 Documentation/trace/histogram.rst                  | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/RCU/rcu.rst b/Documentation/RCU/rcu.rst
index bf6617b330a7..320ad3292b75 100644
--- a/Documentation/RCU/rcu.rst
+++ b/Documentation/RCU/rcu.rst
@@ -32,7 +32,7 @@ Frequently Asked Questions
   Just as with spinlocks, RCU readers are not permitted to
   block, switch to user-mode execution, or enter the idle loop.
   Therefore, as soon as a CPU is seen passing through any of these
-  three states, we know that that CPU has exited any previous RCU
+  three states, we know that CPU has exited any previous RCU
   read-side critical sections.  So, if we remove an item from a
   linked list, and then wait until all CPUs have switched context,
   executed in user mode, or executed in the idle loop, we can
diff --git a/Documentation/driver-api/driver-model/overview.rst b/Documentation/driver-api/driver-model/overview.rst
index b3f447bf9f07..c1966d506d55 100644
--- a/Documentation/driver-api/driver-model/overview.rst
+++ b/Documentation/driver-api/driver-model/overview.rst
@@ -55,7 +55,7 @@ struct pci_dev now looks like this::
 Note first that the struct device dev within the struct pci_dev is
 statically allocated. This means only one allocation on device discovery.
 
-Note also that that struct device dev is not necessarily defined at the
+Note also that struct device dev is not necessarily defined at the
 front of the pci_dev structure.  This is to make people think about what
 they're doing when switching between the bus driver and the global driver,
 and to discourage meaningless and incorrect casts between the two.
diff --git a/Documentation/filesystems/netfs_library.rst b/Documentation/filesystems/netfs_library.rst
index ddd799df6ce3..4033de4535ac 100644
--- a/Documentation/filesystems/netfs_library.rst
+++ b/Documentation/filesystems/netfs_library.rst
@@ -626,7 +626,7 @@ A number of members are available for access/use by the filesystem:
 
    These are set by the filesystem or the cache in ->prepare_read() or
    ->prepare_write() for each subrequest to indicate the maximum number of
-   bytes and, optionally, the maximum number of segments (if not 0) that that
+   bytes and, optionally, the maximum number of segments (if not 0) that
    subrequest can support.
 
  * ``submit_extendable_to``
diff --git a/Documentation/trace/histogram-design.rst b/Documentation/trace/histogram-design.rst
index e92f56ebd0b5..949bbfdb0f16 100644
--- a/Documentation/trace/histogram-design.rst
+++ b/Documentation/trace/histogram-design.rst
@@ -738,7 +738,7 @@ creates its own variable, wakeup_lat, but nothing yet uses it::
 
 Looking at the sched_waking 'hist_debug' output, in addition to the
 normal key and value hist_fields, in the val fields section we see a
-field with the HIST_FIELD_FL_VAR flag, which indicates that that field
+field with the HIST_FIELD_FL_VAR flag, which indicates that field
 represents a variable.  Note that in addition to the variable name,
 contained in the var.name field, it includes the var.idx, which is the
 index into the tracing_map_elt.vars[] array of the actual variable
diff --git a/Documentation/trace/histogram.rst b/Documentation/trace/histogram.rst
index 340bcb5099e7..5b303fabdf32 100644
--- a/Documentation/trace/histogram.rst
+++ b/Documentation/trace/histogram.rst
@@ -1700,7 +1700,7 @@ to that rule is that any variable used in an expression is essentially
 'read-once' - once it's used by an expression in a subsequent event,
 it's reset to its 'unset' state, which means it can't be used again
 unless it's set again.  This ensures not only that an event doesn't
-use an uninitialized variable in a calculation, but that that variable
+use an uninitialized variable in a calculation, but that variable
 is used only once and not for any unrelated subsequent match.
 
 The basic syntax for saving a variable is to simply prefix a unique
-- 
2.54.0



* Re: [PATCH 2/5] docs: fix repeated word 'that' across documentation
From: Shuah Khan @ 2026-05-08 17:15 UTC (permalink / raw)
  To: Adrien Reynard, Paul E. McKenney, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Uladzislau Rezki, Steven Rostedt, Mathieu Desnoyers,
	Lai Jiangshan, Zqiang, Jonathan Corbet, Greg Kroah-Hartman,
	Rafael J. Wysocki, Danilo Krummrich, David Howells,
	Paulo Alcantara, Masami Hiramatsu,
	open list:READ-COPY UPDATE (RCU), open list:DOCUMENTATION,
	open list, open list:DRIVER CORE, KOBJECTS, DEBUGFS AND SYSFS,
	open list:FILESYSTEMS [NETFS LIBRARY],
	open list:FILESYSTEMS [NETFS LIBRARY], open list:TRACING,
	Shuah Khan
In-Reply-To: <20260508163759.16231-1-reynard.adrien.08@gmail.com>

On 5/8/26 10:37, Adrien Reynard wrote:

Missing commit log in all your patches - I don't see patch 1/5 in
my Inbox.

> Signed-off-by: Adrien Reynard <reynard.adrien.08@gmail.com>
> ---
>   Documentation/RCU/rcu.rst                          | 2 +-
>   Documentation/driver-api/driver-model/overview.rst | 2 +-
>   Documentation/filesystems/netfs_library.rst        | 2 +-
>   Documentation/trace/histogram-design.rst           | 2 +-
>   Documentation/trace/histogram.rst                  | 2 +-
>   5 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/RCU/rcu.rst b/Documentation/RCU/rcu.rst
> index bf6617b330a7..320ad3292b75 100644
> --- a/Documentation/RCU/rcu.rst
> +++ b/Documentation/RCU/rcu.rst
> @@ -32,7 +32,7 @@ Frequently Asked Questions
>     Just as with spinlocks, RCU readers are not permitted to
>     block, switch to user-mode execution, or enter the idle loop.
>     Therefore, as soon as a CPU is seen passing through any of these
> -  three states, we know that that CPU has exited any previous RCU
> +  three states, we know that CPU has exited any previous RCU

The original intent might have been to say "that CPU", so adding
the missing comma after the first "that" or changing "that" to "the"
would make sense.


>     read-side critical sections.  So, if we remove an item from a
>     linked list, and then wait until all CPUs have switched context,
>     executed in user mode, or executed in the idle loop, we can
> diff --git a/Documentation/driver-api/driver-model/overview.rst b/Documentation/driver-api/driver-model/overview.rst
> index b3f447bf9f07..c1966d506d55 100644
> --- a/Documentation/driver-api/driver-model/overview.rst
> +++ b/Documentation/driver-api/driver-model/overview.rst
> @@ -55,7 +55,7 @@ struct pci_dev now looks like this::
>   Note first that the struct device dev within the struct pci_dev is
>   statically allocated. This means only one allocation on device discovery.
>   
> -Note also that that struct device dev is not necessarily defined at the
> +Note also that struct device dev is not necessarily defined at the

Same comment here: replace "that" with "the" or add the missing comma.

>   front of the pci_dev structure.  This is to make people think about what
>   they're doing when switching between the bus driver and the global driver,
>   and to discourage meaningless and incorrect casts between the two.
> diff --git a/Documentation/filesystems/netfs_library.rst b/Documentation/filesystems/netfs_library.rst
> index ddd799df6ce3..4033de4535ac 100644
> --- a/Documentation/filesystems/netfs_library.rst
> +++ b/Documentation/filesystems/netfs_library.rst
> @@ -626,7 +626,7 @@ A number of members are available for access/use by the filesystem:
>   
>      These are set by the filesystem or the cache in ->prepare_read() or
>      ->prepare_write() for each subrequest to indicate the maximum number of
> -   bytes and, optionally, the maximum number of segments (if not 0) that that
> +   bytes and, optionally, the maximum number of segments (if not 0) that

Same here.

>      subrequest can support.
>   
>    * ``submit_extendable_to``
> diff --git a/Documentation/trace/histogram-design.rst b/Documentation/trace/histogram-design.rst
> index e92f56ebd0b5..949bbfdb0f16 100644
> --- a/Documentation/trace/histogram-design.rst
> +++ b/Documentation/trace/histogram-design.rst
> @@ -738,7 +738,7 @@ creates its own variable, wakeup_lat, but nothing yet uses it::
>   
>   Looking at the sched_waking 'hist_debug' output, in addition to the
>   normal key and value hist_fields, in the val fields section we see a
> -field with the HIST_FIELD_FL_VAR flag, which indicates that that field
> +field with the HIST_FIELD_FL_VAR flag, which indicates that field

Same here


>   represents a variable.  Note that in addition to the variable name,
>   contained in the var.name field, it includes the var.idx, which is the
>   index into the tracing_map_elt.vars[] array of the actual variable
> diff --git a/Documentation/trace/histogram.rst b/Documentation/trace/histogram.rst
> index 340bcb5099e7..5b303fabdf32 100644
> --- a/Documentation/trace/histogram.rst
> +++ b/Documentation/trace/histogram.rst
> @@ -1700,7 +1700,7 @@ to that rule is that any variable used in an expression is essentially
>   'read-once' - once it's used by an expression in a subsequent event,
>   it's reset to its 'unset' state, which means it can't be used again
>   unless it's set again.  This ensures not only that an event doesn't
> -use an uninitialized variable in a calculation, but that that variable
> +use an uninitialized variable in a calculation, but that variable

Same here

>   is used only once and not for any unrelated subsequent match.
>   
>   The basic syntax for saving a variable is to simply prefix a unique

thanks,
-- Shuah



* Re: [PATCH v1 1/2] serial: qcom-geni: trace: Add tracepoint support for Qualcomm GENI serial
From: Steven Rostedt @ 2026-05-08 17:25 UTC (permalink / raw)
  To: Praveen Talari
  Cc: Masami Hiramatsu, Mathieu Desnoyers, Greg Kroah-Hartman,
	Jiri Slaby, Konrad Dybcio, linux-kernel, linux-trace-kernel,
	linux-arm-msm, linux-serial, Mukesh Kumar Savaliya,
	Aniket Randive, chandana.chiluveru, jyothi.seerapu
In-Reply-To: <20260506-add-tracepoints-for-qcom-geni-serial-v1-1-544b22612e08@oss.qualcomm.com>

On Wed, 06 May 2026 22:54:44 +0530
Praveen Talari <praveen.talari@oss.qualcomm.com> wrote:

> +TRACE_EVENT(geni_serial_tx_data,
> +	    TP_PROTO(struct device *dev, const u8 *buf, unsigned int len),
> +	    TP_ARGS(dev, buf, len),
> +
> +	    TP_STRUCT__entry(__string(name, dev_name(dev))
> +			     __field(unsigned int, len)
> +			     __dynamic_array(u8, data, len)
> +	    ),
> +
> +	    TP_fast_assign(__assign_str(name);
> +			   __entry->len = len;
> +			   memcpy(__get_dynamic_array(data), buf, len);
> +	    ),
> +
> +	    TP_printk("%s: tx_len=%u data=%s",
> +		      __get_str(name), __entry->len,
> +		      __print_hex(__get_dynamic_array(data), __entry->len))
> +);
> +
> +TRACE_EVENT(geni_serial_rx_data,
> +	    TP_PROTO(struct device *dev, const u8 *buf, unsigned int len),
> +	    TP_ARGS(dev, buf, len),
> +
> +	    TP_STRUCT__entry(__string(name, dev_name(dev))
> +			     __field(unsigned int, len)
> +			     __dynamic_array(u8, data, len)
> +	    ),
> +
> +	    TP_fast_assign(__assign_str(name);
> +			   __entry->len = len;
> +			   memcpy(__get_dynamic_array(data), buf, len);
> +	    ),
> +
> +	    TP_printk("%s: rx_len=%u data=%s",

Do you really need to say "tx_len" and "rx_len"? Could it just be "len",
with the name of the tracepoint showing which it is?

Each TRACE_EVENT() is really just a:

  DECLARE_EVENT_CLASS() followed by a DEFINE_EVENT()

underneath.

And each TRACE_EVENT() costs around 5K in size, where most of that is in
the DECLARE_EVENT_CLASS() portion. Thus, you can save some memory by using
DECLARE_EVENT_CLASS() and then define the above two events with
DEFINE_EVENT().
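
For illustration only, a rough sketch of what that split could look like.
This is not from the posted patch: the class name geni_serial_data is a
hypothetical choice, and the fragment belongs inside a trace header with
the usual TRACE_SYSTEM boilerplate, so it does not build on its own. The
two event names and the fields are taken from the quoted patch, with the
printk label shortened to "len":

```c
/* Sketch only: the class name geni_serial_data is hypothetical. */
DECLARE_EVENT_CLASS(geni_serial_data,
	    TP_PROTO(struct device *dev, const u8 *buf, unsigned int len),
	    TP_ARGS(dev, buf, len),

	    TP_STRUCT__entry(__string(name, dev_name(dev))
			     __field(unsigned int, len)
			     __dynamic_array(u8, data, len)
	    ),

	    TP_fast_assign(__assign_str(name);
			   __entry->len = len;
			   memcpy(__get_dynamic_array(data), buf, len);
	    ),

	    /* The event name already says tx or rx, so the label is just "len". */
	    TP_printk("%s: len=%u data=%s",
		      __get_str(name), __entry->len,
		      __print_hex(__get_dynamic_array(data), __entry->len))
);

/* Each event reuses the class and only repeats the prototype and args. */
DEFINE_EVENT(geni_serial_data, geni_serial_tx_data,
	    TP_PROTO(struct device *dev, const u8 *buf, unsigned int len),
	    TP_ARGS(dev, buf, len)
);

DEFINE_EVENT(geni_serial_data, geni_serial_rx_data,
	    TP_PROTO(struct device *dev, const u8 *buf, unsigned int len),
	    TP_ARGS(dev, buf, len)
);
```

The callers would keep using trace_geni_serial_tx_data() and
trace_geni_serial_rx_data() unchanged; only the definitions are shared.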

-- Steve


> +		      __get_str(name), __entry->len,
> +		      __print_hex(__get_dynamic_array(data), __entry->len))
> +);
> +



* Re: [PATCH 1/2] mm/page_alloc: add tracepoints for zone->lock acquisitions
From: Andrew Morton @ 2026-05-08 17:29 UTC (permalink / raw)
  To: hawk
  Cc: linux-mm, Vlastimil Babka, Steven Rostedt, Suren Baghdasaryan,
	Michal Hocko, Zi Yan, David Hildenbrand, Lorenzo Stoakes,
	Shuah Khan, linux-kernel, linux-trace-kernel, kernel-team
In-Reply-To: <20260508162207.3315781-1-hawk@kernel.org>

On Fri,  8 May 2026 18:22:06 +0200 hawk@kernel.org wrote:

> Add tracepoints to the page allocator fast paths that acquire
> zone->lock, allowing diagnosis of lock contention in production.

Thanks, I'm surprised we haven't done this yet.

Unfortunately "mm: use spinlock guards for zone lock" messed this up
(https://lore.kernel.org/all/cover.1777462630.git.d@ilvokhin.com/).

So please let's give it a few days for reviewers to comment then redo
against mm.git's mm-unstable branch?


* Re: [PATCH 1/2] mm/page_alloc: add tracepoints for zone->lock acquisitions
From: Vlastimil Babka (SUSE) @ 2026-05-08 17:38 UTC (permalink / raw)
  To: Andrew Morton, hawk, Dmitry Ilvokhin, Matthew Wilcox
  Cc: linux-mm, Steven Rostedt, Suren Baghdasaryan, Michal Hocko,
	Zi Yan, David Hildenbrand, Lorenzo Stoakes, Shuah Khan,
	linux-kernel, linux-trace-kernel, kernel-team
In-Reply-To: <20260508102948.b1c687e623fabec65580f258@linux-foundation.org>

On 5/8/26 7:29 PM, Andrew Morton wrote:
> On Fri,  8 May 2026 18:22:06 +0200 hawk@kernel.org wrote:
> 
>> Add tracepoints to the page allocator fast paths that acquire
>> zone->lock, allowing diagnosis of lock contention in production.
> 
> Thanks, I'm surprised we haven't done this yet.

There was a recent attempt [1]. It wasn't welcomed because it wasn't a
generic solution.

[1] https://lore.kernel.org/all/cover.1772206930.git.d@ilvokhin.com/

> Unfortunately "mm: use spinlock guards for zone lock" messed this up
> (https://lore.kernel.org/all/cover.1777462630.git.d@ilvokhin.com/).
> 
> So please let's give it a few days for reviewers to comment then redo
> against mm.git's mm-unstable branch?



* Re: [PATCH 1/2] mm/page_alloc: add tracepoints for zone->lock acquisitions
From: Vlastimil Babka (SUSE) @ 2026-05-08 17:40 UTC (permalink / raw)
  To: Andrew Morton, hawk, Dmitry Ilvokhin, Matthew Wilcox
  Cc: linux-mm, Steven Rostedt, Suren Baghdasaryan, Michal Hocko,
	Zi Yan, David Hildenbrand, Lorenzo Stoakes, Shuah Khan,
	linux-kernel, linux-trace-kernel, kernel-team
In-Reply-To: <832e4333-4079-4865-8ad8-3dd8868fb964@kernel.org>

On 5/8/26 7:38 PM, Vlastimil Babka (SUSE) wrote:
> On 5/8/26 7:29 PM, Andrew Morton wrote:
>> On Fri,  8 May 2026 18:22:06 +0200 hawk@kernel.org wrote:
>>
>>> Add tracepoints to the page allocator fast paths that acquire
>>> zone->lock, allowing diagnosis of lock contention in production.
>>
>> Thanks, I'm surprised we haven't done this yet.
> 
> There was a recent attempt [1]. Not being a generic solution wasn't welcome.
> 
> [1] https://lore.kernel.org/all/cover.1772206930.git.d@ilvokhin.com/

And this is the generic solution I think?

https://lore.kernel.org/all/cover.1777999826.git.d@ilvokhin.com/

>> Unfortunately "mm: use spinlock guards for zone lock" messed this up
>> (https://lore.kernel.org/all/cover.1777462630.git.d@ilvokhin.com/).
>>
>> So please let's give it a few days for reviewers to comment then redo
>> against mm.git's mm-unstable branch?
> 



* Re: [PATCH RFC v4 10/44] KVM: guest_memfd: Add support for KVM_SET_MEMORY_ATTRIBUTES2
From: Sean Christopherson @ 2026-05-08 17:40 UTC (permalink / raw)
  To: Michael Roth
  Cc: Ackerley Tng, aik, andrew.jones, binbin.wu, brauner, chao.p.peng,
	david, ira.weiny, jmattson, jthoughton, oupton, pankaj.gupta,
	qperret, rick.p.edgecombe, rientjes, shivankg, steven.price,
	tabba, willy, wyihan, yan.y.zhao, forkloop, pratyush,
	suzuki.poulose, aneesh.kumar, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Baoquan He, Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm
In-Reply-To: <foi2zvv5qrfdcspnx4fstrvzl74m6xp6zrsw5omlbprxh4jrhx@vxnwk7fr46gu>

On Wed, Apr 29, 2026, Michael Roth wrote:
> On Fri, Apr 24, 2026 at 12:08:45PM -0700, Ackerley Tng wrote:
> > Michael Roth <michael.roth@amd.com> writes:
> > 
> > Thank you for your patches!
> > 
> > >
> > > [...snip...]
> > >
> > >>
> > >> I also did some minor updates (prefixed with a "[squash]" tag) to advertise
> > >> the KVM_SET_MEMORY_ATTRIBUTES2_PRESERVED flag so it can be used by
> > >
> > > Though I'm not sure how we deal with it if SNP/TDX at some point become
> > > capable of using the PRESERVED flag *after* populate... but maybe that's
> > > too unlikely to worry about? If we wanted to address it though, we could
> > > have both PRESERVED and PRESERVED_BEFORE_LAUNCH so they can be
> > > enumerated separately from the start.
> > >
> > 
> > Not sure how likely it is, but if SNP and TDX can honor PRESERVE
> > semantics after populate, I think we could implement support under a new
> > flag like CIPHER.
> 
> That works, but it still makes things *slightly* awkward due to special-casing
> the PRESERVE semantics for 1 guest type vs. another.

Summarizing this week's PUCK call[*]:

Scrap PRESERVE and ZERO, and simply rely on vendor specific semantics.

My desire to enforce PRESERVE and ZERO semantics and avoid relying on vendor specific
behavior (i.e. on trusted firmware semantics) is a pipe dream.  Unless KVM does
a truly insane amount of per-gfn tracking, KVM can't know the state of memory for
a given page, and so can't guarantee PRESERVE or ZERO will be honored.

If userspace requests PRESERVE, just because it's _possible_ to preserve contents
(e.g. during the pre-boot phase on TDX), doesn't mean the contents are _guaranteed_
to be preserved.  If userspace doesn't actually ADD the memory to the guest's
initial image, then the contents won't be preserved.  Ditto for SNP.

To guarantee PRESERVE, KVM would need to track per-gfn information to know if the
memory was actually preserved.  And enforcing PRESERVE would be all kinds of crazy;
KVM would have to kill the VM or something?  And that would still require userspace
to be aware of vendor specific details.

The same holds true for ZERO.  On a private=>shared conversion, KVM can't guarantee
the memory is zeroed by trusted firmware unless KVM tracks, per-gfn, whether or
not the memory was actually fully assigned to the guest.  E.g. if userspace does
shared=>private and then private=>shared(ZERO), without the memory being faulted
into the guest, then the TDX-Module won't have "seen" the page and so won't have
zeroed it on the private=>shared conversion.

And trying to special case SNP's "validated CPUID" behavior, where memory can be
preserved on private=>shared after a failed shared=>private, would also require
tracking that the page was never actually converted to private.

Note, regarding ZERO, someone (Mike? Ackerley?) pointed out that userspace typically
doesn't rely on the kernel to zero memory, and so supporting ZERO for private=>shared
isn't really all that valuable in the first place.

[*] https://drive.google.com/file/d/1w0ifzh5PmNViJ1SKru9jK9x52MybXSNa/view?usp=drive_link


* Re: [PATCH 2/5] docs: fix repeated word 'that' across documentation
From: Randy Dunlap @ 2026-05-08 17:40 UTC (permalink / raw)
  To: Shuah Khan, Adrien Reynard, Paul E. McKenney, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Uladzislau Rezki, Steven Rostedt, Mathieu Desnoyers,
	Lai Jiangshan, Zqiang, Jonathan Corbet, Greg Kroah-Hartman,
	Rafael J. Wysocki, Danilo Krummrich, David Howells,
	Paulo Alcantara, Masami Hiramatsu,
	open list:READ-COPY UPDATE (RCU), open list:DOCUMENTATION,
	open list, open list:DRIVER CORE, KOBJECTS, DEBUGFS AND SYSFS,
	open list:FILESYSTEMS [NETFS LIBRARY],
	open list:FILESYSTEMS [NETFS LIBRARY], open list:TRACING
In-Reply-To: <1501caea-8cff-4968-aca6-e8d4b20e0e80@linuxfoundation.org>



On 5/8/26 10:15 AM, Shuah Khan wrote:
> On 5/8/26 10:37, Adrien Reynard wrote:
> 
> Missing commit log in all your patches - I don't patch 1/5 in
> my Inbox.
> 
>> Signed-off-by: Adrien Reynard <reynard.adrien.08@gmail.com>
>> ---
>>   Documentation/RCU/rcu.rst                          | 2 +-
>>   Documentation/driver-api/driver-model/overview.rst | 2 +-
>>   Documentation/filesystems/netfs_library.rst        | 2 +-
>>   Documentation/trace/histogram-design.rst           | 2 +-
>>   Documentation/trace/histogram.rst                  | 2 +-
>>   5 files changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/Documentation/RCU/rcu.rst b/Documentation/RCU/rcu.rst
>> index bf6617b330a7..320ad3292b75 100644
>> --- a/Documentation/RCU/rcu.rst
>> +++ b/Documentation/RCU/rcu.rst
>> @@ -32,7 +32,7 @@ Frequently Asked Questions
>>     Just as with spinlocks, RCU readers are not permitted to
>>     block, switch to user-mode execution, or enter the idle loop.
>>     Therefore, as soon as a CPU is seen passing through any of these
>> -  three states, we know that that CPU has exited any previous RCU
>> +  three states, we know that CPU has exited any previous RCU
> 
> The original intent might have been to say, "that cpu", so adding
> the missing comma after the first "that" or change "that" to "the"
> would make sense.

Not a comma, please.
I don't see a problem with "that that," but "that the" could also be OK.

> 
> 
>>     read-side critical sections.  So, if we remove an item from a
>>     linked list, and then wait until all CPUs have switched context,
>>     executed in user mode, or executed in the idle loop, we can


-- 
~Randy



* Re: [PATCH 2/5] docs: fix repeated word 'that' across documentation
From: Paul E. McKenney @ 2026-05-08 17:52 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Shuah Khan, Adrien Reynard, Frederic Weisbecker, Neeraj Upadhyay,
	Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Danilo Krummrich, David Howells, Paulo Alcantara,
	Masami Hiramatsu, open list:READ-COPY UPDATE (RCU),
	open list:DOCUMENTATION, open list,
	open list:DRIVER CORE, KOBJECTS, DEBUGFS AND SYSFS,
	open list:FILESYSTEMS [NETFS LIBRARY],
	open list:FILESYSTEMS [NETFS LIBRARY], open list:TRACING
In-Reply-To: <5f68ac30-21ac-494b-a140-2307e236f0a2@infradead.org>

On Fri, May 08, 2026 at 10:40:49AM -0700, Randy Dunlap wrote:
> 
> 
> On 5/8/26 10:15 AM, Shuah Khan wrote:
> > On 5/8/26 10:37, Adrien Reynard wrote:
> > 
> > Missing commit log in all your patches - I don't patch 1/5 in
> > my Inbox.
> > 
> >> Signed-off-by: Adrien Reynard <reynard.adrien.08@gmail.com>
> >> ---
> >>   Documentation/RCU/rcu.rst                          | 2 +-
> >>   Documentation/driver-api/driver-model/overview.rst | 2 +-
> >>   Documentation/filesystems/netfs_library.rst        | 2 +-
> >>   Documentation/trace/histogram-design.rst           | 2 +-
> >>   Documentation/trace/histogram.rst                  | 2 +-
> >>   5 files changed, 5 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/Documentation/RCU/rcu.rst b/Documentation/RCU/rcu.rst
> >> index bf6617b330a7..320ad3292b75 100644
> >> --- a/Documentation/RCU/rcu.rst
> >> +++ b/Documentation/RCU/rcu.rst
> >> @@ -32,7 +32,7 @@ Frequently Asked Questions
> >>     Just as with spinlocks, RCU readers are not permitted to
> >>     block, switch to user-mode execution, or enter the idle loop.
> >>     Therefore, as soon as a CPU is seen passing through any of these
> >> -  three states, we know that that CPU has exited any previous RCU
> >> +  three states, we know that CPU has exited any previous RCU
> > 
> > The original intent might have been to say, "that cpu", so adding
> > the missing comma after the first "that" or change "that" to "the"
> > would make sense.
> 
> Not a comma, please.
> I don't see a problem with "that that," but "that the" could also be OK.

This CPU was already mentioned.  So if for whatever reason we cannot
stomach "that that", then "that this" would be better than "that the".

I suppose that false positives from simple grammar checkers might be
sufficient reason, but in this brave new world of LLMs, shouldn't we
be hoping for better?  ;-)

						Thanx, Paul

> >>     read-side critical sections.  So, if we remove an item from a
> >>     linked list, and then wait until all CPUs have switched context,
> >>     executed in user mode, or executed in the idle loop, we can
> 
> 
> -- 
> ~Randy
> 


* Re: [PATCH 1/2] mm/page_alloc: add tracepoints for zone->lock acquisitions
From: Dmitry Ilvokhin @ 2026-05-08 18:07 UTC (permalink / raw)
  To: Vlastimil Babka (SUSE)
  Cc: Andrew Morton, hawk, Matthew Wilcox, linux-mm, Steven Rostedt,
	Suren Baghdasaryan, Michal Hocko, Zi Yan, David Hildenbrand,
	Lorenzo Stoakes, Shuah Khan, linux-kernel, linux-trace-kernel,
	kernel-team
In-Reply-To: <4f61457e-deff-430f-8a1e-d3c33c925db3@kernel.org>

On Fri, May 08, 2026 at 07:40:51PM +0200, Vlastimil Babka (SUSE) wrote:
> On 5/8/26 7:38 PM, Vlastimil Babka (SUSE) wrote:
> > On 5/8/26 7:29 PM, Andrew Morton wrote:
> >> On Fri,  8 May 2026 18:22:06 +0200 hawk@kernel.org wrote:
> >>
> >>> Add tracepoints to the page allocator fast paths that acquire
> >>> zone->lock, allowing diagnosis of lock contention in production.
> >>
> >> Thanks, I'm surprised we haven't done this yet.
> > 
> > There was a recent attempt [1]. Not being a generic solution wasn't welcome.
> > 
> > [1] https://lore.kernel.org/all/cover.1772206930.git.d@ilvokhin.com/
> 
> And this is the generic solution I think?
> 
> https://lore.kernel.org/all/cover.1777999826.git.d@ilvokhin.com/

Thanks for cc'ing me, Vlastimil.

Yes, this is an attempt at a generic solution for tracing contended
locks, including spinlocks, so it should also cover the use case
proposed in this patchset.

In fact, zone->lock contention was one of the primary motivations for
this work.


* Re: [PATCH 2/5] docs: fix repeated word 'that' across documentation
From: David Laight @ 2026-05-08 18:26 UTC (permalink / raw)
  To: Shuah Khan
  Cc: Adrien Reynard, Paul E. McKenney, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Uladzislau Rezki, Steven Rostedt, Mathieu Desnoyers,
	Lai Jiangshan, Zqiang, Jonathan Corbet, Greg Kroah-Hartman,
	Rafael J. Wysocki, Danilo Krummrich, David Howells,
	Paulo Alcantara, Masami Hiramatsu,
	open list:READ-COPY UPDATE (RCU), open list:DOCUMENTATION,
	open list, open list:DRIVER CORE, KOBJECTS, DEBUGFS AND SYSFS,
	open list:FILESYSTEMS [NETFS LIBRARY],
	open list:FILESYSTEMS [NETFS LIBRARY], open list:TRACING
In-Reply-To: <1501caea-8cff-4968-aca6-e8d4b20e0e80@linuxfoundation.org>

On Fri, 8 May 2026 11:15:28 -0600
Shuah Khan <skhan@linuxfoundation.org> wrote:

> On 5/8/26 10:37, Adrien Reynard wrote:
> 
> Missing commit log in all your patches - I don't patch 1/5 in
> my Inbox.
> 
> > Signed-off-by: Adrien Reynard <reynard.adrien.08@gmail.com>
> > ---
> >   Documentation/RCU/rcu.rst                          | 2 +-
> >   Documentation/driver-api/driver-model/overview.rst | 2 +-
> >   Documentation/filesystems/netfs_library.rst        | 2 +-
> >   Documentation/trace/histogram-design.rst           | 2 +-
> >   Documentation/trace/histogram.rst                  | 2 +-
> >   5 files changed, 5 insertions(+), 5 deletions(-)
> > 
> > diff --git a/Documentation/RCU/rcu.rst b/Documentation/RCU/rcu.rst
> > index bf6617b330a7..320ad3292b75 100644
> > --- a/Documentation/RCU/rcu.rst
> > +++ b/Documentation/RCU/rcu.rst
> > @@ -32,7 +32,7 @@ Frequently Asked Questions
> >     Just as with spinlocks, RCU readers are not permitted to
> >     block, switch to user-mode execution, or enter the idle loop.
> >     Therefore, as soon as a CPU is seen passing through any of these
> > -  three states, we know that that CPU has exited any previous RCU
> > +  three states, we know that CPU has exited any previous RCU  
> 
> The original intent might have been to say, "that cpu", so adding
> the missing comma after the first "that" or change "that" to "the"
> would make sense.
...

I don't think adding a comma would be correct.
The clause splits as 'we know that' 'that CPU' and the repeated 'that'
is absolutely correct.
Maybe 'that CPU' could be replaced by 'it'; but it can be difficult to
work out what back references like 'it' refer to.

You can re-order it, as (say):
	Therefore we know that as soon as a CPU is seen passing through any of these
	three states it has exited any previous RCU read-side critical sections.

But just because some grammar book says you shouldn't have repeated words
doesn't mean there aren't exceptions.

The sign writer was doing a new sign for the 'Pig and Whistle'.
Unfortunately the gaps between Pig and and and and and Whistle
ended up visibly different.

-- David




* Re: [PATCH 2/5] docs: fix repeated word 'that' across documentation
From: David Howells @ 2026-05-08 19:43 UTC (permalink / raw)
  To: Adrien Reynard
  Cc: Paul E. McKenney, Frederic Weisbecker, Neeraj Upadhyay,
	Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Jonathan Corbet, Shuah Khan, Greg Kroah-Hartman,
	Rafael J. Wysocki, Danilo Krummrich, David Howells,
	Paulo Alcantara, Masami Hiramatsu,
	open list:READ-COPY UPDATE (RCU), open list:DOCUMENTATION,
	open list, open list:DRIVER CORE, KOBJECTS, DEBUGFS AND SYSFS,
	open list:FILESYSTEMS [NETFS LIBRARY],
	open list:FILESYSTEMS [NETFS LIBRARY], open list:TRACING
In-Reply-To: <20260508163759.16231-1-reynard.adrien.08@gmail.com>

Adrien Reynard <reynard.adrien.08@gmail.com> wrote:

> -  three states, we know that that CPU has exited any previous RCU

This is arguably correct.  The two 'that' words are functionally different.
If you look at another language, say Hungarian, they are different words
(e.g. 'hogy' vs 'az').

David



* [PATCH] tracing: Avoid NULL return from hist_field_name() on truncation
From: David Carlier @ 2026-05-08 19:57 UTC (permalink / raw)
  To: linux-trace-kernel
  Cc: rostedt, mhiramat, mathieu.desnoyers, zanussi, pengpeng,
	linux-kernel, David Carlier

hist_field_name() returns "" everywhere except the fully-qualified
VAR_REF/EXPR case, where snprintf() truncation returns NULL early
and bypasses the bottom NULL->"" guard. Callers don't expect NULL:
strcat(expr, hist_field_name(field, 0)) at trace_events_hist.c:1758
and the strcmp() in the sort-key match loop at :4804 both deref it.

system and event_name are bounded by MAX_EVENT_NAME_LEN, but the
field name on a VAR_REF is kstrdup'd from a histogram variable
name parsed out of the trigger string and has no length cap, so
a long enough var name in a fully qualified reference can reach
the truncation path.

Keep the length check but leave field_name as "" on overflow.

Fixes: 5ec1d1e97de1 ("tracing: Rebuild full_name on each hist_field_name() call")
Signed-off-by: David Carlier <devnexen@gmail.com>
---
 kernel/trace/trace_events_hist.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 0dbbf6cca9bc..eb2c2bc8bc3d 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -1369,10 +1369,8 @@ static const char *hist_field_name(struct hist_field *field,
 			len = snprintf(full_name, sizeof(full_name), fmt,
 				       field->system, field->event_name,
 				       field->name);
-			if (len >= sizeof(full_name))
-				return NULL;
-
-			field_name = full_name;
+			if (len < sizeof(full_name))
+				field_name = full_name;
 		} else
 			field_name = field->name;
 	} else if (field->flags & HIST_FIELD_FL_TIMESTAMP)
-- 
2.53.0

^ permalink raw reply related

* Re: [PATCH 2/2] selftests/mm: add zone->lock tracepoint verification test
From: David Hildenbrand (Arm) @ 2026-05-08 20:15 UTC (permalink / raw)
  To: hawk, Andrew Morton, linux-mm
  Cc: Vlastimil Babka, Steven Rostedt, Suren Baghdasaryan, Michal Hocko,
	Zi Yan, Lorenzo Stoakes, Shuah Khan, linux-kernel,
	linux-trace-kernel, kernel-team
In-Reply-To: <20260508162207.3315781-2-hawk@kernel.org>

On 5/8/26 18:22, hawk@kernel.org wrote:
> From: Jesper Dangaard Brouer <hawk@kernel.org>
> 
> Add a selftest to verify the kmem:mm_zone_lock_contended,
> kmem:mm_zone_locked, and kmem:mm_zone_lock_unlock tracepoints.
> 
> The test has two components:
> 
> zone_lock_contention.c - a workload that spawns threads doing rapid
> page allocation and freeing to generate zone->lock contention. It
> shrinks PCP lists via percpu_pagelist_high_fraction to force frequent
> free_pcppages_bulk() and rmqueue_bulk() calls.
> 
> test_zone_lock_tracepoints.sh - uses bpftrace to verify tracepoints
> exist, have the expected fields, fire under load, and that wait_ns
> is populated when contention occurs.
> 
> Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
> ---
>  tools/testing/selftests/mm/Makefile           |   2 +
>  .../mm/test_zone_lock_tracepoints.sh          | 212 ++++++++++++++++++
>  .../selftests/mm/zone_lock_contention.c       | 166 ++++++++++++++

This really looks excessive and ... not really how we usually treat tracepoints?

I don't know about others, but I don't think this is really what we want as a MM
selftest.

-- 
Cheers,

David

^ permalink raw reply

* Re: [PATCH RFC v4 10/44] KVM: guest_memfd: Add support for KVM_SET_MEMORY_ATTRIBUTES2
From: Ackerley Tng @ 2026-05-08 21:21 UTC (permalink / raw)
  To: Sean Christopherson, Michael Roth
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	ira.weiny, jmattson, jthoughton, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, tabba, willy,
	wyihan, yan.y.zhao, forkloop, pratyush, suzuki.poulose,
	aneesh.kumar, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Vishal Annapurve, Andrew Morton, Chris Li,
	Kairui Song, Kemeng Shi, Nhat Pham, Baoquan He, Barry Song,
	Axel Rasmussen, Yuanchu Xie, Wei Xu, Jason Gunthorpe,
	Vlastimil Babka, kvm, linux-kernel, linux-trace-kernel, linux-doc,
	linux-kselftest, linux-mm
In-Reply-To: <af4gJ6xZ3e7UXOuO@google.com>

Sean Christopherson <seanjc@google.com> writes:

>
> [...snip...]
>
>
> Summarizing this week's PUCK call[*]:
>
> Scrap PRESERVE and ZERO, and simply rely on vendor specific semantics.
>
>
> [...snip...]
>

Thanks for the summary! Please see v6 here:

https://lore.kernel.org/all/20260507-gmem-inplace-conversion-v6-0-91ab5a8b19a4@google.com/T/

^ permalink raw reply

* Re: [PATCH 0/2] tools/bootconfig: render kernel.* subtree as a cmdline string
From: Andrew Morton @ 2026-05-08 21:56 UTC (permalink / raw)
  To: Breno Leitao
  Cc: Masami Hiramatsu, linux-kernel, linux-trace-kernel, paulmck, oss,
	kernel-team
In-Reply-To: <20260508-bootconfig_using_tools-v1-0-1132219aa773@debian.org>

On Fri, 08 May 2026 06:55:02 -0700 Breno Leitao <leitao@debian.org> wrote:

> Add a bootconfig -> kernel cmdline rendering capability shared between
> the kernel parser library and the userspace tools/bootconfig binary.
> 
> The new userspace mode "tools/bootconfig -C <file>" walks a bootconfig
> file's "kernel" subtree and prints it as a flat, space-separated
> cmdline string suitable for direct use as (or appending to) a kernel
> command line.
> 
> This series prepares tools/bootconfig and lib/bootconfig.c for an
> upcoming feature that lets the kernel build render an embedded
> bootconfig file's "kernel" subtree to a flat cmdline string and embed
> it in the kernel image.
> 
> The follow-up series (sent separately) wires this into setup_arch() so
> early_param() handlers see values supplied via CONFIG_BOOT_CONFIG_EMBED_FILE,
> following Masami suggestion in [1]
> 
> These two patches are pure groundwork. They add no kernel feature,
> change no runtime behavior, and are useful on their own (the new
> "tools/bootconfig -C" mode lets anyone render a .bootconfig file to
> a cmdline string from the shell).
> 
> Landing them independently lets the follow-up series focus on the
> kernel-side plumbing without dragging the refactor and tool addition
> through the same review cycle.

I'll assume that Masami will process this, although
`scripts/get_maintainer.pl lib/bootconfig.c' doesn't mention a git
tree.

https://sashiko.dev/#/patchset/20260508-bootconfig_using_tools-v1-0-1132219aa773@debian.org
says a bunch of picky things which seem pretty ignorable to me.  Your
call ;)

^ permalink raw reply

* Re: [PATCH v1 1/2] spi: qcom-geni: trace: Add trace events for Qualcomm GENI SPI
From: Trilok Soni @ 2026-05-08 23:14 UTC (permalink / raw)
  To: Mark Brown, Praveen Talari
  Cc: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel, linux-arm-msm, linux-spi,
	Mukesh Kumar Savaliya, Aniket Randive,
	chandana.chiluveru, jyothi.seerapu
In-Reply-To: <af3spostNgoRU0Vv@sirena.co.uk>

On 5/8/2026 7:01 AM, Mark Brown wrote:
> On Thu, May 07, 2026 at 11:03:39PM +0530, Praveen Talari wrote:
>> On 07-05-2026 13:43, Mark Brown wrote:
> 
>>> By generic I mean this should not be driver specific at all.
> 
>> I hope these changes are fine. Please let me know if you have any concerns
>> or feedback.
> 
> The data tracepoints look plausible but I would expect them to be
> generated by the core, they'll be there for everything so I'd expect
> them to work for everything.

I agree here. Praveen - this is similar to the suggestion I had for i2c
internally.

---Trilok Soni


^ permalink raw reply

* Re: [PATCH v6 01/43] KVM: guest_memfd: Introduce per-gmem attributes, use to guard user mappings
From: Ackerley Tng @ 2026-05-08 23:36 UTC (permalink / raw)
  To: Ackerley Tng via B4 Relay, aik, andrew.jones, binbin.wu, brauner,
	chao.p.peng, david, ira.weiny, jmattson, jthoughton, michael.roth,
	oupton, pankaj.gupta, qperret, rick.p.edgecombe, rientjes,
	shivankg, steven.price, tabba, willy, wyihan, yan.y.zhao,
	forkloop, pratyush, suzuki.poulose, aneesh.kumar, liam,
	Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Vishal Annapurve, Andrew Morton, Chris Li,
	Kairui Song, Kemeng Shi, Nhat Pham, Baoquan He, Barry Song,
	Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park, Qi Zheng,
	Shakeel Butt, Kiryl Shutsemau, Jason Gunthorpe, Vlastimil Babka
  Cc: kvm, linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
	linux-mm, linux-coco
In-Reply-To: <20260507-gmem-inplace-conversion-v6-1-91ab5a8b19a4@google.com>

Ackerley Tng via B4 Relay <devnull+ackerleytng.google.com@kernel.org>
writes:

>
> [...snip...]
>
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index 69c9d6d546b28..5011d38820d0d 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -4,6 +4,7 @@
>  #include <linux/falloc.h>
>  #include <linux/fs.h>
>  #include <linux/kvm_host.h>
> +#include <linux/maple_tree.h>
>  #include <linux/mempolicy.h>
>  #include <linux/pseudo_fs.h>
>  #include <linux/pagemap.h>
> @@ -33,6 +34,13 @@ struct gmem_inode {
>  	struct list_head gmem_file_list;
>
>  	u64 flags;
> +	/*
> +	 * Every index in this inode, whether memory is populated or
> +	 * not, is tracked in attributes. The entire range of indices,
> +	 * corresponding to the size of this inode, is represented in
> +	 * this maple tree.

Concretely, if the entire guest_memfd is 2M in size, indices [0, 511] are
represented with some value, either 0 (SHARED) or
KVM_MEMORY_ATTRIBUTE_PRIVATE. [512, ULONG_MAX] is also defined in the
tree, as NULL.

Since guest_memfd uses xa_mk_value(0) to store the value 0 ("SHARED"),
that makes 0 distinct from NULL, which works for guest_memfd.


(Liam and I discussed this off-list due to an email configuration issue)

> +	 */
> +	struct maple_tree attributes;
>  };
>
>
> [...snip...]
>

^ permalink raw reply

* Re: [PATCH v2] mm: vmscan: rework lru_shrink and write_folio tracepoints
From: Andrew Morton @ 2026-05-08 23:47 UTC (permalink / raw)
  To: qiwu.chen
  Cc: rostedt, mhiramat, hannes, david, mhocko, willy,
	linux-trace-kernel, linux-mm, qiwu.chen
In-Reply-To: <20260506083652.100160-1-qiwu.chen@transsion.com>

On Wed,  6 May 2026 16:36:52 +0800 "qiwu.chen" <qiwuchen55@gmail.com> wrote:

> From: "qiwu.chen" <qiwuchen55@gmail.com>
> Signed-off-by: qiwu.chen <qiwu.chen@transsion.com>

Which should we use?  If it's the transsion.com address (which I
assumed) then this can be communicated by placing an explicit From:
line at start-of-changelog.


^ permalink raw reply

* Re: [PATCH v5 2/2] blk-mq: expose tag starvation counts via debugfs
From: kernel test robot @ 2026-05-09  0:12 UTC (permalink / raw)
  To: Aaron Tomlin, axboe, rostedt, mhiramat, mathieu.desnoyers
  Cc: llvm, oe-kbuild-all, bvanassche, johannes.thumshirn, kch, dlemoal,
	ritesh.list, loberman, neelx, sean, mproche, chjohnst,
	linux-block, linux-kernel, linux-trace-kernel
In-Reply-To: <20260427020142.358912-3-atomlin@atomlin.com>

Hi Aaron,

kernel test robot noticed the following build errors:

[auto build test ERROR on axboe/for-next]
[also build test ERROR on trace/for-next linus/master v7.1-rc2 next-20260508]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Aaron-Tomlin/blk-mq-add-tracepoint-block_rq_tag_wait/20260428-013259
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux.git for-next
patch link:    https://lore.kernel.org/r/20260427020142.358912-3-atomlin%40atomlin.com
patch subject: [PATCH v5 2/2] blk-mq: expose tag starvation counts via debugfs
config: x86_64-rhel-9.4-rust (https://download.01.org/0day-ci/archive/20260509/202605090233.BZVa3x9s-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
rustc: rustc 1.88.0 (6b00bc388 2025-06-23)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260509/202605090233.BZVa3x9s-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202605090233.BZVa3x9s-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from block/blk-core.c:45:
   In file included from include/trace/events/block.h:726:
   In file included from include/trace/define_trace.h:132:
   In file included from include/trace/trace_events.h:468:
>> include/trace/events/block.h:255:47: error: expected ':'
     255 |                 __entry->dev            = q->disk ? disk_devt(q->disk);
         |                                                                       ^
         |                                                                       : 
   include/trace/stages/stage6_event_callback.h:133:33: note: expanded from macro 'TP_fast_assign'
     133 | #define TP_fast_assign(args...) args
         |                                 ^
   include/trace/trace_events.h:44:16: note: expanded from macro 'TRACE_EVENT'
      44 |                              PARAMS(assign),                   \
         |                                     ^
   include/linux/tracepoint.h:160:25: note: expanded from macro 'PARAMS'
     160 | #define PARAMS(args...) args
         |                         ^
   include/trace/trace_events.h:435:16: note: expanded from macro 'DECLARE_EVENT_CLASS'
     435 |                       PARAMS(assign), PARAMS(print))                    \
         |                              ^
   include/linux/tracepoint.h:160:25: note: expanded from macro 'PARAMS'
     160 | #define PARAMS(args...) args
         |                         ^
   include/trace/trace_events.h:427:4: note: expanded from macro '\
   __DECLARE_EVENT_CLASS'
     427 |         { assign; }                                                     \
         |           ^
   include/trace/events/block.h:255:27: note: to match this '?'
     255 |                 __entry->dev            = q->disk ? disk_devt(q->disk);
         |                                                   ^
>> include/trace/events/block.h:255:47: error: expected expression
     255 |                 __entry->dev            = q->disk ? disk_devt(q->disk);
         |                                                                       ^
   In file included from block/blk-core.c:45:
   In file included from include/trace/events/block.h:726:
   In file included from include/trace/define_trace.h:133:
   In file included from include/trace/perf.h:110:
>> include/trace/events/block.h:255:47: error: expected ':'
     255 |                 __entry->dev            = q->disk ? disk_devt(q->disk);
         |                                                                       ^
         |                                                                       : 
   include/trace/stages/stage6_event_callback.h:133:33: note: expanded from macro 'TP_fast_assign'
     133 | #define TP_fast_assign(args...) args
         |                                 ^
   include/trace/trace_events.h:44:16: note: expanded from macro 'TRACE_EVENT'
      44 |                              PARAMS(assign),                   \
         |                                     ^
   include/linux/tracepoint.h:160:25: note: expanded from macro 'PARAMS'
     160 | #define PARAMS(args...) args
         |                         ^
   include/trace/perf.h:67:16: note: expanded from macro 'DECLARE_EVENT_CLASS'
      67 |                       PARAMS(assign), PARAMS(print))                    \
         |                              ^
   include/linux/tracepoint.h:160:25: note: expanded from macro 'PARAMS'
     160 | #define PARAMS(args...) args
         |                         ^
   include/trace/perf.h:51:4: note: expanded from macro '\
   __DECLARE_EVENT_CLASS'
      51 |         { assign; }                                                     \
         |           ^
   include/trace/events/block.h:255:27: note: to match this '?'
     255 |                 __entry->dev            = q->disk ? disk_devt(q->disk);
         |                                                   ^
>> include/trace/events/block.h:255:47: error: expected expression
     255 |                 __entry->dev            = q->disk ? disk_devt(q->disk);
         |                                                                       ^
   4 errors generated.


vim +255 include/trace/events/block.h

   242	
   243		TP_PROTO(struct request_queue *q, struct blk_mq_hw_ctx *hctx, bool is_sched_tag),
   244	
   245		TP_ARGS(q, hctx, is_sched_tag),
   246	
   247		TP_STRUCT__entry(
   248			__field( dev_t,		dev			)
   249			__field( u32,		hctx_id			)
   250			__field( u32,		nr_tags			)
   251			__field( bool,		is_sched_tag		)
   252		),
   253	
   254		TP_fast_assign(
 > 255			__entry->dev		= q->disk ? disk_devt(q->disk);
   256			__entry->hctx_id	= hctx->queue_num;
   257			__entry->is_sched_tag	= is_sched_tag;
   258	
   259			if (is_sched_tag)
   260				__entry->nr_tags = hctx->sched_tags->nr_tags;
   261			else
   262				__entry->nr_tags = hctx->tags->nr_tags;
   263		),
   264	
   265		TP_printk("%d,%d hctx=%u starved on %s tags (depth=%u)",
   266			  MAJOR(__entry->dev), MINOR(__entry->dev),
   267			  __entry->hctx_id,
   268			  __entry->is_sched_tag ? "scheduler" : "hardware",
   269			  __entry->nr_tags)
   270	);
   271	

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply

* [PATCH bpf 1/2] uprobes/x86: Fix red zone clobbering in nop5 optimization
From: Andrii Nakryiko @ 2026-05-09  0:30 UTC (permalink / raw)
  To: bpf
  Cc: linux-trace-kernel, jolsa, oleg, peterz, mingo, mhiramat,
	Andrii Nakryiko

The x86 uprobe nop5 optimization currently replaces a 5-byte NOP at the
probe site with a CALL into a uprobe trampoline. CALL pushes a return
address to [rsp-8]. On x86-64 this is inside the 128-byte red zone, where
user code may keep temporary data without adjusting rsp.

Use a 5-byte JMP instead. JMP does not write to the user stack, but it
also does not provide a return address. Replace the single trampoline
entry with a page of 16-byte slots. Each optimized probe jumps to its
assigned slot, the slot moves rsp below the red zone, saves the registers
clobbered by syscall, and invokes the uprobe syscall:

  Probe site:   jmp slot_N              (5B, replaces nop5)

  Slot N:       lea  -128(%rsp), %rsp   (5B)  skip red zone
                push %rcx               (1B)  save (syscall clobbers)
                push %r11               (2B)  save (syscall clobbers)
                push %rax               (1B)  save (syscall uses for nr)
                mov  $336, %eax         (5B)  uprobe syscall number
                syscall                 (2B)

All slots contain identical code at different offsets, so the trampoline
page is generated once at boot and mapped read-execute into each process.
The syscall handler identifies the slot from regs->ip, which points just
after the syscall instruction, and uses a per-mm slot table to recover the
original probe address.

The uprobe syscall does not return to the trampoline slot. The handler
restores the probe-site register state, runs the uprobe consumers, sets
pt_regs to continue at probe_addr + 5 unless a consumer redirected
execution, and returns directly through the IRET path. This preserves
general purpose registers, including rcx and r11, without requiring any
post-syscall cleanup code in the trampoline and avoids call/ret, RSB, and
shadow stack concerns.

Protect the per-mm trampoline list with RCU and free trampoline metadata
with kfree_rcu(). This lets the syscall path resolve trampoline slots
without taking mmap_lock. The optimized-instruction detection path also
walks the trampoline list under an RCU read-side lock. Since that path
starts from the JMP target, it translates the slot start to the post-syscall
IP expected by the shared resolver before checking the trampoline mapping.

Each trampoline page provides 256 slots. Slots stay permanently assigned
to their first probe address and are reused only when the same address is
probed again. Reassigning detached slots is deliberately avoided because a
thread can remain in a trampoline for an unbounded time due to ptrace,
interrupts, or scheduling delays. If a reachable trampoline page runs out
of slots, probes that cannot allocate a slot fall back to the slower INT3
path.

Require the entire trampoline page to be reachable by a rel32 JMP before
reusing it for a probe. This keeps every slot in the page within the range
that can be encoded at the probe site.

Change the error code returned when the uprobe syscall is invoked outside
a kernel-generated trampoline from -ENXIO to -EPROTO. This lets libbpf and
similar libraries distinguish fixed kernels from kernels with the
red-zone-clobbering implementation and enable nop5 optimization only on
fixed kernels.

Performance (usdt single-thread, M/s):

                  usdt-nop  usdt-nop5-base  usdt-nop5-fix  nop5-change  iret%
  Skylake          3.149        6.422          4.865         -24.3%     39.1%
  Milan            2.910        3.443          3.820         +11.0%     24.3%
  Sapphire Rapids  1.896        4.023          3.693          -8.2%     24.9%
  Bergamo          3.393        3.895          3.849          -1.2%     24.5%

The fixed nop5 path remains faster than the non-optimized INT3 path on all
measured systems. The regression relative to the old CALL-based trampoline
comes from IRET being more expensive than SYSRET, most noticeably on older
Intel Skylake. Newer Intel CPUs and tested AMD CPUs have lower IRET cost,
and AMD Milan improves because removing mmap_lock from the hot path more
than offsets the IRET cost.

Multi-threaded throughput scales nearly linearly with the number of CPUs, like
it used to, thanks to lockless RCU-protected uprobe trampoline lookup.

Fixes: ba2bfc97b462 ("uprobes/x86: Add support to optimize uprobes")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 arch/x86/include/asm/uprobes.h                |  18 ++
 arch/x86/kernel/uprobes.c                     | 262 ++++++++++--------
 tools/lib/bpf/features.c                      |   8 +-
 .../selftests/bpf/prog_tests/uprobe_syscall.c |   5 +-
 tools/testing/selftests/bpf/prog_tests/usdt.c |   2 +-
 5 files changed, 181 insertions(+), 114 deletions(-)

diff --git a/arch/x86/include/asm/uprobes.h b/arch/x86/include/asm/uprobes.h
index 362210c79998..a7cf5c92d95a 100644
--- a/arch/x86/include/asm/uprobes.h
+++ b/arch/x86/include/asm/uprobes.h
@@ -25,6 +25,24 @@ enum {
 	ARCH_UPROBE_FLAG_OPTIMIZE_FAIL  = 1,
 };
 
+/*
+ * Trampoline page layout: identical 16-byte slots, each containing:
+ *   lea  -128(%rsp), %rsp (5B)  skip red zone
+ *   push %rcx             (1B)  save (syscall clobbers)
+ *   push %r11             (2B)  save (syscall clobbers)
+ *   push %rax             (1B)  save (syscall uses for nr)
+ *   mov  $336, %eax       (5B)  uprobe syscall number
+ *   syscall               (2B)
+ *                        = 16B, no padding needed
+ *
+ * The handler identifies which probe fired from regs->ip (each
+ * slot is at a unique offset), looks up the probe address from a
+ * per-process table, and returns directly to probe_addr+5 via iret
+ * with all registers restored.
+ */
+#define UPROBE_TRAMP_SLOT_SIZE	16
+#define UPROBE_TRAMP_MAX_SLOTS	(PAGE_SIZE / UPROBE_TRAMP_SLOT_SIZE)
+
 struct uprobe_xol_ops;
 
 struct arch_uprobe {
diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index ebb1baf1eb1d..7e1f14200bbb 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -633,16 +633,25 @@ static struct vm_special_mapping tramp_mapping = {
 
 struct uprobe_trampoline {
 	struct hlist_node	node;
+	struct rcu_head		rcu;
 	unsigned long		vaddr;
+	unsigned long		probe_addrs[UPROBE_TRAMP_MAX_SLOTS];
 };
 
-static bool is_reachable_by_call(unsigned long vtramp, unsigned long vaddr)
+
+static bool is_reachable_by_jmp(unsigned long dst, unsigned long src)
 {
-	long delta = (long)(vaddr + 5 - vtramp);
+	long delta = (long)(dst - (src + JMP32_INSN_SIZE));
 
 	return delta >= INT_MIN && delta <= INT_MAX;
 }
 
+static bool is_reachable_by_trampoline(unsigned long vtramp, unsigned long vaddr)
+{
+	return is_reachable_by_jmp(vtramp, vaddr) &&
+	       is_reachable_by_jmp(vtramp + PAGE_SIZE - 1, vaddr);
+}
+
 static unsigned long find_nearest_trampoline(unsigned long vaddr)
 {
 	struct vm_unmapped_area_info info = {
@@ -711,6 +720,21 @@ static struct uprobe_trampoline *create_uprobe_trampoline(unsigned long vaddr)
 	return tramp;
 }
 
+static int tramp_alloc_slot(struct uprobe_trampoline *tramp, unsigned long probe_addr)
+{
+	int i;
+
+	for (i = 0; i < UPROBE_TRAMP_MAX_SLOTS; i++) {
+		if (tramp->probe_addrs[i] == probe_addr)
+			return i;
+		if (tramp->probe_addrs[i] == 0) {
+			tramp->probe_addrs[i] = probe_addr;
+			return i;
+		}
+	}
+	return -ENOSPC;
+}
+
 static struct uprobe_trampoline *get_uprobe_trampoline(unsigned long vaddr, bool *new)
 {
 	struct uprobes_state *state = &current->mm->uprobes_state;
@@ -720,7 +744,7 @@ static struct uprobe_trampoline *get_uprobe_trampoline(unsigned long vaddr, bool
 		return NULL;
 
 	hlist_for_each_entry(tramp, &state->head_tramps, node) {
-		if (is_reachable_by_call(tramp->vaddr, vaddr)) {
+		if (is_reachable_by_trampoline(tramp->vaddr, vaddr)) {
 			*new = false;
 			return tramp;
 		}
@@ -731,7 +755,7 @@ static struct uprobe_trampoline *get_uprobe_trampoline(unsigned long vaddr, bool
 		return NULL;
 
 	*new = true;
-	hlist_add_head(&tramp->node, &state->head_tramps);
+	hlist_add_head_rcu(&tramp->node, &state->head_tramps);
 	return tramp;
 }
 
@@ -742,8 +766,8 @@ static void destroy_uprobe_trampoline(struct uprobe_trampoline *tramp)
 	 * because there's no easy way to make sure none of the threads
 	 * is still inside the trampoline.
 	 */
-	hlist_del(&tramp->node);
-	kfree(tramp);
+	hlist_del_rcu(&tramp->node);
+	kfree_rcu(tramp, rcu);
 }
 
 void arch_uprobe_init_state(struct mm_struct *mm)
@@ -761,147 +785,153 @@ void arch_uprobe_clear_state(struct mm_struct *mm)
 		destroy_uprobe_trampoline(tramp);
 }
 
-static bool __in_uprobe_trampoline(unsigned long ip)
+/*
+ * Find the trampoline containing @ip. If @probe_addr is non-NULL, also
+ * resolve the slot index from @ip and return the probe address.
+ *
+ * @ip is expected to point right after the syscall instruction, i.e.,
+ * at the end of the slot (slot_start + UPROBE_TRAMP_SLOT_SIZE).
+ */
+static bool resolve_uprobe_addr(unsigned long ip, unsigned long *probe_addr)
 {
-	struct vm_area_struct *vma = vma_lookup(current->mm, ip);
+	struct uprobes_state *state = &current->mm->uprobes_state;
+	struct uprobe_trampoline *tramp;
 
-	return vma && vma_is_special_mapping(vma, &tramp_mapping);
-}
+	hlist_for_each_entry_rcu(tramp, &state->head_tramps, node) {
+		/*
+		 * ip points to after syscall, so it's on 16 byte boundary,
+		 * which means that valid ip can point right after the page
+		 * and should never be at zero offset within the page
+		 */
+		if (ip <= tramp->vaddr || ip > tramp->vaddr + PAGE_SIZE)
+			continue;
 
-static bool in_uprobe_trampoline(unsigned long ip)
-{
-	struct mm_struct *mm = current->mm;
-	bool found, retry = true;
-	unsigned int seq;
+		if (probe_addr) {
+			/* we already validated ip is within expected range */
+			unsigned int slot = (ip - tramp->vaddr - 1) / UPROBE_TRAMP_SLOT_SIZE;
+			unsigned long addr = tramp->probe_addrs[slot];
 
-	rcu_read_lock();
-	if (mmap_lock_speculate_try_begin(mm, &seq)) {
-		found = __in_uprobe_trampoline(ip);
-		retry = mmap_lock_speculate_retry(mm, seq);
-	}
-	rcu_read_unlock();
+			*probe_addr = addr;
+			if (addr == 0)
+				return false;
+		}
 
-	if (retry) {
-		mmap_read_lock(mm);
-		found = __in_uprobe_trampoline(ip);
-		mmap_read_unlock(mm);
+		return true;
 	}
-	return found;
+	return false;
+}
+
+static bool in_uprobe_trampoline(unsigned long ip, unsigned long *probe_addr)
+{
+	guard(rcu)();
+	return resolve_uprobe_addr(ip, probe_addr);
 }
 
 /*
- * See uprobe syscall trampoline; the call to the trampoline will push
- * the return address on the stack, the trampoline itself then pushes
- * cx, r11 and ax.
+ * The trampoline slot pushes cx, r11, ax (the registers syscall clobbers)
+ * before doing the uprobe syscall. No return address is pushed — the
+ * probe site uses jmp, not call.
  */
 struct uprobe_syscall_args {
 	unsigned long ax;
 	unsigned long r11;
 	unsigned long cx;
-	unsigned long retaddr;
 };
 
+#define UPROBE_TRAMP_REDZONE 128
+
 SYSCALL_DEFINE0(uprobe)
 {
 	struct pt_regs *regs = task_pt_regs(current);
 	struct uprobe_syscall_args args;
-	unsigned long ip, sp, sret;
+	unsigned long probe_addr;
 	int err;
 
 	/* Allow execution only from uprobe trampolines. */
-	if (!in_uprobe_trampoline(regs->ip))
-		return -ENXIO;
+	if (!in_uprobe_trampoline(regs->ip, &probe_addr))
+		return -EPROTO;
 
 	err = copy_from_user(&args, (void __user *)regs->sp, sizeof(args));
 	if (err)
 		goto sigill;
 
-	ip = regs->ip;
-
 	/*
-	 * expose the "right" values of ax/r11/cx/ip/sp to uprobe_consumer/s, plus:
-	 * - adjust ip to the probe address, call saved next instruction address
-	 * - adjust sp to the probe's stack frame (check trampoline code)
+	 * Restore the register state as it was at the probe site:
+	 * - ax/r11/cx from the trampoline-saved copies on user stack
+	 * - adjust ip to the probe address based on matching slot
+	 * - adjust sp to skip red zone and pushed args
 	 */
 	regs->ax  = args.ax;
 	regs->r11 = args.r11;
 	regs->cx  = args.cx;
-	regs->ip  = args.retaddr - 5;
-	regs->sp += sizeof(args);
+	regs->ip  = probe_addr;
+	regs->sp += sizeof(args) + UPROBE_TRAMP_REDZONE;
 	regs->orig_ax = -1;
 
-	sp = regs->sp;
-
-	err = shstk_pop((u64 *)&sret);
-	if (err == -EFAULT || (!err && sret != args.retaddr))
-		goto sigill;
-
-	handle_syscall_uprobe(regs, regs->ip);
+	handle_syscall_uprobe(regs, probe_addr);
 
 	/*
-	 * Some of the uprobe consumers has changed sp, we can do nothing,
-	 * just return via iret.
+	 * Skip the jmp instruction at the probe site (5 bytes) unless
+	 * a consumer redirected execution elsewhere.
 	 */
-	if (regs->sp != sp) {
-		/* skip the trampoline call */
-		if (args.retaddr - 5 == regs->ip)
-			regs->ip += 5;
-		return regs->ax;
-	}
+	if (regs->ip == probe_addr)
+		regs->ip = probe_addr + 5;
 
-	regs->sp -= sizeof(args);
-
-	/* for the case uprobe_consumer has changed ax/r11/cx */
-	args.ax  = regs->ax;
-	args.r11 = regs->r11;
-	args.cx  = regs->cx;
-
-	/* keep return address unless we are instructed otherwise */
-	if (args.retaddr - 5 != regs->ip)
-		args.retaddr = regs->ip;
-
-	if (shstk_push(args.retaddr) == -EFAULT)
-		goto sigill;
-
-	regs->ip = ip;
-
-	err = copy_to_user((void __user *)regs->sp, &args, sizeof(args));
-	if (err)
-		goto sigill;
-
-	/* ensure sysret, see do_syscall_64() */
-	regs->r11 = regs->flags;
-	regs->cx  = regs->ip;
-	return 0;
+	/*
+	 * Return via iret by returning regs->ax. This preserves all
+	 * GP registers (including cx and r11) without needing any
+	 * user-space cleanup code. The iret path is used because we
+	 * don't set up cx/r11 for sysret.
+	 */
+	return regs->ax;
 
 sigill:
 	force_sig(SIGILL);
 	return -1;
 }
 
+/*
+ * All uprobe trampoline slots are identical: skip the red zone,
+ * save the three registers that syscall clobbers, then invoke
+ * the uprobe syscall. The handler returns directly to the probe
+ * caller via iret. Execution never returns to the trampoline.
+ */
 asm (
 	".pushsection .rodata\n"
-	".balign " __stringify(PAGE_SIZE) "\n"
-	"uprobe_trampoline_entry:\n"
+	".balign " __stringify(UPROBE_TRAMP_SLOT_SIZE) "\n"
+	"uprobe_trampoline_slot:\n"
+	"lea -128(%rsp), %rsp\n"
 	"push %rcx\n"
 	"push %r11\n"
 	"push %rax\n"
-	"mov $" __stringify(__NR_uprobe) ", %rax\n"
+	"mov $" __stringify(__NR_uprobe) ", %eax\n"
 	"syscall\n"
-	"pop %rax\n"
-	"pop %r11\n"
-	"pop %rcx\n"
-	"ret\n"
-	"int3\n"
-	".balign " __stringify(PAGE_SIZE) "\n"
+	"uprobe_trampoline_slot_end:\n"
 	".popsection\n"
 );
 
-extern u8 uprobe_trampoline_entry[];
+extern u8 uprobe_trampoline_slot[];
+extern u8 uprobe_trampoline_slot_end[];
 
 static int __init arch_uprobes_init(void)
 {
-	tramp_mapping_pages[0] = virt_to_page(uprobe_trampoline_entry);
+	unsigned int slot_size = uprobe_trampoline_slot_end - uprobe_trampoline_slot;
+	struct page *page;
+	u8 *page_addr;
+	int i;
+
+	BUILD_BUG_ON(UPROBE_TRAMP_SLOT_SIZE != 16);
+	WARN_ON_ONCE(slot_size != UPROBE_TRAMP_SLOT_SIZE);
+
+	page = alloc_page(GFP_KERNEL);
+	if (!page)
+		return -ENOMEM;
+
+	page_addr = page_address(page);
+	for (i = 0; i < UPROBE_TRAMP_MAX_SLOTS; i++)
+		memcpy(page_addr + i * UPROBE_TRAMP_SLOT_SIZE, uprobe_trampoline_slot, slot_size);
+
+	tramp_mapping_pages[0] = page;
 	return 0;
 }
 
@@ -909,7 +939,7 @@ late_initcall(arch_uprobes_init);
 
 enum {
 	EXPECT_SWBP,
-	EXPECT_CALL,
+	EXPECT_JMP,
 };
 
 struct write_opcode_ctx {
@@ -917,14 +947,14 @@ struct write_opcode_ctx {
 	int expect;
 };
 
-static int is_call_insn(uprobe_opcode_t *insn)
+static int is_jmp_insn(uprobe_opcode_t *insn)
 {
-	return *insn == CALL_INSN_OPCODE;
+	return *insn == JMP32_INSN_OPCODE;
 }
 
 /*
  * Verification callback used by int3_update uprobe_write calls to make sure
- * the underlying instruction is as expected - either int3 or call.
+ * the underlying instruction is as expected - either int3 or jmp.
  */
 static int verify_insn(struct page *page, unsigned long vaddr, uprobe_opcode_t *new_opcode,
 		       int nbytes, void *data)
@@ -939,8 +969,8 @@ static int verify_insn(struct page *page, unsigned long vaddr, uprobe_opcode_t *
 		if (is_swbp_insn(&old_opcode[0]))
 			return 1;
 		break;
-	case EXPECT_CALL:
-		if (is_call_insn(&old_opcode[0]))
+	case EXPECT_JMP:
+		if (is_jmp_insn(&old_opcode[0]))
 			return 1;
 		break;
 	}
@@ -978,7 +1008,7 @@ static int int3_update(struct arch_uprobe *auprobe, struct vm_area_struct *vma,
 	 * so we can skip this step for optimize == true.
 	 */
 	if (!optimize) {
-		ctx.expect = EXPECT_CALL;
+		ctx.expect = EXPECT_JMP;
 		err = uprobe_write(auprobe, vma, vaddr, &int3, 1, verify_insn,
 				   true /* is_register */, false /* do_update_ref_ctr */,
 				   &ctx);
@@ -1015,13 +1045,13 @@ static int int3_update(struct arch_uprobe *auprobe, struct vm_area_struct *vma,
 }
 
 static int swbp_optimize(struct arch_uprobe *auprobe, struct vm_area_struct *vma,
-			 unsigned long vaddr, unsigned long tramp)
+			 unsigned long vaddr, unsigned long slot_vaddr)
 {
-	u8 call[5];
+	u8 jmp[5];
 
-	__text_gen_insn(call, CALL_INSN_OPCODE, (const void *) vaddr,
-			(const void *) tramp, CALL_INSN_SIZE);
-	return int3_update(auprobe, vma, vaddr, call, true /* optimize */);
+	__text_gen_insn(jmp, JMP32_INSN_OPCODE, (const void *) vaddr,
+			(const void *) slot_vaddr, JMP32_INSN_SIZE);
+	return int3_update(auprobe, vma, vaddr, jmp, true /* optimize */);
 }
 
 static int swbp_unoptimize(struct arch_uprobe *auprobe, struct vm_area_struct *vma,
@@ -1049,11 +1079,17 @@ static bool __is_optimized(uprobe_opcode_t *insn, unsigned long vaddr)
 	struct __packed __arch_relative_insn {
 		u8 op;
 		s32 raddr;
-	} *call = (struct __arch_relative_insn *) insn;
+	} *jmp = (struct __arch_relative_insn *) insn;
 
-	if (!is_call_insn(insn))
+	if (!is_jmp_insn(&jmp->op))
 		return false;
-	return __in_uprobe_trampoline(vaddr + 5 + call->raddr);
+
+	guard(rcu)();
+	/*
+	 * resolve_uprobe_addr() expects IP pointing after syscall instruction
+	 * (after the slot, basically), so adjust jump target address accordingly
+	 */
+	return resolve_uprobe_addr(vaddr + 5 + jmp->raddr + UPROBE_TRAMP_SLOT_SIZE, NULL);
 }
 
 static int is_optimized(struct mm_struct *mm, unsigned long vaddr)
@@ -1113,8 +1149,9 @@ static int __arch_uprobe_optimize(struct arch_uprobe *auprobe, struct mm_struct
 {
 	struct uprobe_trampoline *tramp;
 	struct vm_area_struct *vma;
+	unsigned long slot_vaddr;
 	bool new = false;
-	int err = 0;
+	int slot, err;
 
 	vma = find_vma(mm, vaddr);
 	if (!vma)
@@ -1122,8 +1159,17 @@ static int __arch_uprobe_optimize(struct arch_uprobe *auprobe, struct mm_struct
 	tramp = get_uprobe_trampoline(vaddr, &new);
 	if (!tramp)
 		return -EINVAL;
-	err = swbp_optimize(auprobe, vma, vaddr, tramp->vaddr);
-	if (WARN_ON_ONCE(err) && new)
+
+	slot = tramp_alloc_slot(tramp, vaddr);
+	if (slot < 0) {
+		if (new)
+			destroy_uprobe_trampoline(tramp);
+		return slot;
+	}
+
+	slot_vaddr = tramp->vaddr + slot * UPROBE_TRAMP_SLOT_SIZE;
+	err = swbp_optimize(auprobe, vma, vaddr, slot_vaddr);
+	if (err && new)
 		destroy_uprobe_trampoline(tramp);
 	return err;
 }
diff --git a/tools/lib/bpf/features.c b/tools/lib/bpf/features.c
index 4f19a0d79b0c..1b6c113357b2 100644
--- a/tools/lib/bpf/features.c
+++ b/tools/lib/bpf/features.c
@@ -577,10 +577,12 @@ static int probe_ldimm64_full_range_off(int token_fd)
 static int probe_uprobe_syscall(int token_fd)
 {
 	/*
-	 * If kernel supports uprobe() syscall, it will return -ENXIO when called
-	 * from the outside of a kernel-generated uprobe trampoline.
+	 * If kernel supports uprobe() syscall, it will return -EPROTO when
+	 * called from outside a kernel-generated uprobe trampoline.
+	 * Older kernels with the red-zone-clobbering bug return -ENXIO;
+	 * we only enable the nop5 optimization on fixed kernels.
 	 */
-	return syscall(__NR_uprobe) < 0 && errno == ENXIO;
+	return syscall(__NR_uprobe) < 0 && errno == EPROTO;
 }
 #else
 static int probe_uprobe_syscall(int token_fd)
diff --git a/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c b/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
index 955a37751b52..0d5eb4cd1ddf 100644
--- a/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
+++ b/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
@@ -422,7 +422,8 @@ static void *check_attach(struct uprobe_syscall_executed *skel, trigger_t trigge
 	/* .. and check the trampoline is as expected. */
 	call = (struct __arch_relative_insn *) addr;
 	tramp = (void *) (call + 1) + call->raddr;
-	ASSERT_EQ(call->op, 0xe8, "call");
+	tramp = (void *)((unsigned long)tramp & ~(getpagesize() - 1UL));
+	ASSERT_EQ(call->op, 0xe9, "jmp");
 	ASSERT_OK(find_uprobes_trampoline(tramp), "uprobes_trampoline");
 
 	return tramp;
@@ -762,7 +763,7 @@ static void test_uprobe_error(void)
 	long err = syscall(__NR_uprobe);
 
 	ASSERT_EQ(err, -1, "error");
-	ASSERT_EQ(errno, ENXIO, "errno");
+	ASSERT_EQ(errno, EPROTO, "errno");
 }
 
 static void __test_uprobe_syscall(void)
diff --git a/tools/testing/selftests/bpf/prog_tests/usdt.c b/tools/testing/selftests/bpf/prog_tests/usdt.c
index 69759b27794d..9d3744d4e936 100644
--- a/tools/testing/selftests/bpf/prog_tests/usdt.c
+++ b/tools/testing/selftests/bpf/prog_tests/usdt.c
@@ -329,7 +329,7 @@ static void subtest_optimized_attach(void)
 	ASSERT_EQ(*addr_2, 0x90, "nop");
 
 	/* call is on addr_2 + 1 address */
-	ASSERT_EQ(*(addr_2 + 1), 0xe8, "call");
+	ASSERT_EQ(*(addr_2 + 1), 0xe9, "jmp");
 	ASSERT_EQ(skel->bss->executed, 4, "executed");
 
 cleanup:
-- 
2.53.0-Meta
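
The address arithmetic used by __is_optimized() and the check_attach()
selftest above can be modelled in user space. The following is only a
minimal sketch, not kernel code: the helper names, TOY_PAGE_SIZE and the
addresses are hypothetical, and only the 0xe9 opcode, the 5-byte
instruction length and the 16-byte slot size are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define JMP32_OPCODE  0xe9     /* jmp rel32, as checked by the selftest */
#define JMP32_SIZE    5        /* opcode byte + 32-bit displacement */
#define SLOT_SIZE     16       /* mirrors BUILD_BUG_ON(UPROBE_TRAMP_SLOT_SIZE != 16) */
#define TOY_PAGE_SIZE 4096UL   /* assumption: 4 KiB pages */

/* Encode "jmp rel32" placed at src and targeting dst (must be within +/-2 GiB). */
static void encode_jmp32(uint8_t insn[JMP32_SIZE], uint64_t src, uint64_t dst)
{
	int64_t diff = (int64_t)dst - (int64_t)(src + JMP32_SIZE);
	int32_t rel = (int32_t)diff;

	insn[0] = JMP32_OPCODE;
	memcpy(&insn[1], &rel, sizeof(rel));
}

/* Decode the jump target; returns 0 if insn is not a jmp rel32. */
static uint64_t decode_jmp32(const uint8_t insn[JMP32_SIZE], uint64_t src)
{
	int32_t rel;

	if (insn[0] != JMP32_OPCODE)
		return 0;
	memcpy(&rel, &insn[1], sizeof(rel));
	/* the displacement is relative to the end of the instruction */
	return src + JMP32_SIZE + (int64_t)rel;
}

/* Page that holds the slot, as the selftest computes by masking the target. */
static uint64_t tramp_page(uint64_t target)
{
	return target & ~(TOY_PAGE_SIZE - 1);
}

/* Index of the slot within its page. */
static unsigned int tramp_slot(uint64_t target)
{
	return (unsigned int)((target & (TOY_PAGE_SIZE - 1)) / SLOT_SIZE);
}
```

Because every probe jumps to its own slot rather than to the start of
the page, a test that looks up the trampoline mapping has to mask the
decoded target down to the page boundary first.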



* [PATCH bpf 2/2] selftests/bpf: Add tests for uprobe nop5 red zone clobbering
From: Andrii Nakryiko @ 2026-05-09  0:30 UTC (permalink / raw)
  To: bpf
  Cc: linux-trace-kernel, jolsa, oleg, peterz, mingo, mhiramat,
	Andrii Nakryiko
In-Reply-To: <20260509003146.976844-1-andrii@kernel.org>

The uprobe nop5 optimization used to replace a 5-byte NOP with a 5-byte
CALL to a trampoline. The CALL pushes a return address onto the stack at
[rsp-8], clobbering whatever was stored there.

On x86-64, the red zone is the 128 bytes below rsp that user code may use
for temporary storage without adjusting rsp. Compilers can place USDT
argument operands there, generating specs like "8@-8(%rbp)" when rbp ==
rsp. With the CALL-based optimization, the return address overwrites that
argument before the BPF-side USDT argument fetch runs.
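
The clobbering can be illustrated with a toy user-space model of the
stack. This is only a sketch under stated assumptions: struct toy_stack
and the helper names are hypothetical, the register values are
placeholders, and the model assumes the JMP-based path leaves rsp
unchanged from the probed code's point of view.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES    512
#define RED_ZONE_BYTES 128

/* Toy model of a user stack: rsp is a byte offset into mem[]. */
struct toy_stack {
	uint8_t mem[STACK_BYTES];
	uint64_t rsp;
};

static void store64(struct toy_stack *s, uint64_t off, uint64_t val)
{
	memcpy(&s->mem[off], &val, sizeof(val));
}

static uint64_t load64(const struct toy_stack *s, uint64_t off)
{
	uint64_t val;

	memcpy(&val, &s->mem[off], sizeof(val));
	return val;
}

/* Models an x86-64 PUSH: decrement rsp by 8, then store. */
static void push64(struct toy_stack *s, uint64_t val)
{
	s->rsp -= 8;
	store64(s, s->rsp, val);
}

/* CALL-based probe: the return address lands at the old [rsp-8]. */
static void probe_via_call(struct toy_stack *s, uint64_t retaddr)
{
	push64(s, retaddr);
	s->rsp += 8;			/* the trampoline's RET pops it again */
}

/* JMP-based probe: skip the red zone, then save three registers. */
static void probe_via_jmp(struct toy_stack *s)
{
	uint64_t saved_rsp = s->rsp;

	s->rsp -= RED_ZONE_BYTES;	/* lea -128(%rsp), %rsp */
	push64(s, 0xcccc);		/* placeholder for rcx */
	push64(s, 0xbbbb);		/* placeholder for r11 */
	push64(s, 0xaaaa);		/* placeholder for rax */
	s->rsp = saved_rsp;		/* assumption: rsp is restored on return */
}

/* Returns 1 if the value stored at -8(%rsp) survives the probe, else 0. */
static int red_zone_survives(int use_call)
{
	const uint64_t magic = 0x1111111111111111ULL;
	struct toy_stack s;

	memset(&s, 0, sizeof(s));
	s.rsp = STACK_BYTES / 2;
	store64(&s, s.rsp - 8, magic);	/* operand living in the red zone */

	if (use_call)
		probe_via_call(&s, 0x401005);
	else
		probe_via_jmp(&s);

	return load64(&s, s.rsp - 8) == magic;
}
```

In this model red_zone_survives(1) reports the operand as lost, while
red_zone_survives(0) reports it intact, because the three pushes land
below the 128-byte red zone.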

Add two tests for this case. The uprobe_syscall subtest stores known values
at -8(%rsp), -16(%rsp), and -24(%rsp), executes an optimized nop5 uprobe,
and verifies the red-zone data is still intact. The USDT subtest triggers a
probe in a function where the compiler places three USDT operands in the
red zone and verifies that all 10 optimized invocations deliver the expected
argument values to BPF.

On an unfixed kernel, the first hit goes through the INT3 path and later
hits use the optimized CALL path, so the red-zone checks fail after
optimization.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 .../selftests/bpf/prog_tests/uprobe_syscall.c | 75 ++++++++++++++++++-
 tools/testing/selftests/bpf/prog_tests/usdt.c | 46 ++++++++++++
 tools/testing/selftests/bpf/progs/test_usdt.c | 25 +++++++
 tools/testing/selftests/bpf/usdt_2.c          | 13 ++++
 4 files changed, 158 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c b/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
index 0d5eb4cd1ddf..6c651e4ff49a 100644
--- a/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
+++ b/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
@@ -357,6 +357,46 @@ __nocf_check __weak void usdt_test(void)
 	USDT(optimized_uprobe, usdt);
 }
 
+/*
+ * Assembly-level red zone clobbering test. Stores known values in the
+ * red zone (below RSP), executes a nop5 (uprobe site), and checks that
+ * the values survived. Returns 0 if intact, 1 if clobbered.
+ *
+ * If the nop5 optimization uses CALL (which pushes a return address to
+ * [rsp-8]), the value at -8(%rsp) gets overwritten.
+ */
+__attribute__((aligned(16)))
+__nocf_check __weak __naked unsigned long uprobe_red_zone_test(void)
+{
+	asm volatile (
+		"movabs $0x1111111111111111, %%rax\n"
+		"movq   %%rax, -8(%%rsp)\n"
+		"movabs $0x2222222222222222, %%rax\n"
+		"movq   %%rax, -16(%%rsp)\n"
+		"movabs $0x3333333333333333, %%rax\n"
+		"movq   %%rax, -24(%%rsp)\n"
+
+		".byte 0x0f, 0x1f, 0x44, 0x00, 0x00\n" /* nop5: uprobe site */
+
+		"movabs $0x1111111111111111, %%rax\n"
+		"cmpq   %%rax, -8(%%rsp)\n"
+		"jne    1f\n"
+		"movabs $0x2222222222222222, %%rax\n"
+		"cmpq   %%rax, -16(%%rsp)\n"
+		"jne    1f\n"
+		"movabs $0x3333333333333333, %%rax\n"
+		"cmpq   %%rax, -24(%%rsp)\n"
+		"jne    1f\n"
+
+		"xorl   %%eax, %%eax\n"
+		"retq\n"
+		"1:\n"
+		"movl   $1, %%eax\n"
+		"retq\n"
+		::: "rax", "memory"
+	);
+}
+
 static int find_uprobes_trampoline(void *tramp_addr)
 {
 	void *start, *end;
@@ -394,7 +434,7 @@ static void *find_nop5(void *fn)
 {
 	int i;
 
-	for (i = 0; i < 10; i++) {
+	for (i = 0; i < 128; i++) {
 		if (!memcmp(nop5, fn + i, 5))
 			return fn + i;
 	}
@@ -758,6 +798,37 @@ static void test_uprobe_race(void)
 #define __NR_uprobe 336
 #endif
 
+static void test_uprobe_red_zone(void)
+{
+	struct uprobe_syscall_executed *skel;
+	struct bpf_link *link;
+	void *nop5_addr;
+	size_t offset;
+	int i;
+
+	nop5_addr = find_nop5(uprobe_red_zone_test);
+	if (!ASSERT_NEQ(nop5_addr, NULL, "find_nop5"))
+		return;
+
+	skel = uprobe_syscall_executed__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "open_and_load"))
+		return;
+
+	offset = get_uprobe_offset(nop5_addr);
+	link = bpf_program__attach_uprobe_opts(skel->progs.test_uprobe,
+			0, "/proc/self/exe", offset, NULL);
+	if (!ASSERT_OK_PTR(link, "attach_uprobe"))
+		goto cleanup;
+
+	for (i = 0; i < 10; i++)
+		ASSERT_EQ(uprobe_red_zone_test(), 0, "red_zone_intact");
+
+	bpf_link__destroy(link);
+
+cleanup:
+	uprobe_syscall_executed__destroy(skel);
+}
+
 static void test_uprobe_error(void)
 {
 	long err = syscall(__NR_uprobe);
@@ -784,6 +855,8 @@ static void __test_uprobe_syscall(void)
 		test_uprobe_usdt();
 	if (test__start_subtest("uprobe_race"))
 		test_uprobe_race();
+	if (test__start_subtest("uprobe_red_zone"))
+		test_uprobe_red_zone();
 	if (test__start_subtest("uprobe_error"))
 		test_uprobe_error();
 	if (test__start_subtest("uprobe_regs_equal"))
diff --git a/tools/testing/selftests/bpf/prog_tests/usdt.c b/tools/testing/selftests/bpf/prog_tests/usdt.c
index 9d3744d4e936..5e607773d5cc 100644
--- a/tools/testing/selftests/bpf/prog_tests/usdt.c
+++ b/tools/testing/selftests/bpf/prog_tests/usdt.c
@@ -250,6 +250,7 @@ static void subtest_basic_usdt(bool optimized)
 #ifdef __x86_64__
 extern void usdt_1(void);
 extern void usdt_2(void);
+extern void usdt_red_zone_trigger(void);
 
 static unsigned char nop1[1] = { 0x90 };
 static unsigned char nop1_nop5_combo[6] = { 0x90, 0x0f, 0x1f, 0x44, 0x00, 0x00 };
@@ -335,6 +336,49 @@ static void subtest_optimized_attach(void)
 cleanup:
 	test_usdt__destroy(skel);
 }
+
+/*
+ * Test that USDT arguments survive nop5 optimization in a function where
+ * the compiler places operands in the red zone.
+ *
+ * Signal handlers are prone to having the compiler place USDT argument
+ * operands in the red zone (below rsp). When nop5 is optimized to a
+ * call instruction, the call pushes the return address to [rsp-8],
+ * potentially clobbering the argument.
+ */
+static void subtest_optimized_red_zone(void)
+{
+	struct test_usdt *skel;
+	int i;
+
+	skel = test_usdt__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "open_and_load"))
+		return;
+
+	skel->bss->expected_arg[0] = 0xDEADBEEF;
+	skel->bss->expected_arg[1] = 0xCAFEBABE;
+	skel->bss->expected_arg[2] = 0xFEEDFACE;
+	skel->bss->expected_pid = getpid();
+
+	skel->links.usdt_check_arg = bpf_program__attach_usdt(
+		skel->progs.usdt_check_arg, 0, "/proc/self/exe",
+		"optimized_attach", "usdt_red_zone", NULL);
+	if (!ASSERT_OK_PTR(skel->links.usdt_check_arg, "attach_usdt_red_zone"))
+		goto cleanup;
+
+	for (i = 0; i < 10; i++)
+		usdt_red_zone_trigger();
+
+	ASSERT_EQ(skel->bss->arg_total, 10, "arg_total");
+	ASSERT_EQ(skel->bss->arg_bad, 0, "arg_bad");
+	ASSERT_EQ(skel->bss->arg_last[0], 0xDEADBEEF, "arg_last_1");
+	ASSERT_EQ(skel->bss->arg_last[1], 0xCAFEBABE, "arg_last_2");
+	ASSERT_EQ(skel->bss->arg_last[2], 0xFEEDFACE, "arg_last_3");
+
+cleanup:
+	test_usdt__destroy(skel);
+}
+
 #endif
 
 unsigned short test_usdt_100_semaphore SEC(".probes");
@@ -608,6 +652,8 @@ void test_usdt(void)
 		subtest_basic_usdt(true);
 	if (test__start_subtest("optimized_attach"))
 		subtest_optimized_attach();
+	if (test__start_subtest("optimized_red_zone"))
+		subtest_optimized_red_zone();
 #endif
 	if (test__start_subtest("multispec"))
 		subtest_multispec_usdt();
diff --git a/tools/testing/selftests/bpf/progs/test_usdt.c b/tools/testing/selftests/bpf/progs/test_usdt.c
index f00cb52874e0..0ee78fb050a1 100644
--- a/tools/testing/selftests/bpf/progs/test_usdt.c
+++ b/tools/testing/selftests/bpf/progs/test_usdt.c
@@ -149,5 +149,30 @@ int usdt_executed(struct pt_regs *ctx)
 		executed++;
 	return 0;
 }
+
+int arg_total;
+int arg_bad;
+long arg_last[3];
+long expected_arg[3];
+int expected_pid;
+
+SEC("usdt")
+int BPF_USDT(usdt_check_arg, long arg1, long arg2, long arg3)
+{
+	if (expected_pid != (bpf_get_current_pid_tgid() >> 32))
+		return 0;
+
+	__sync_fetch_and_add(&arg_total, 1);
+	arg_last[0] = arg1;
+	arg_last[1] = arg2;
+	arg_last[2] = arg3;
+
+	if (arg1 != expected_arg[0] ||
+	    arg2 != expected_arg[1] ||
+	    arg3 != expected_arg[2])
+		__sync_fetch_and_add(&arg_bad, 1);
+
+	return 0;
+}
 #endif
 char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/usdt_2.c b/tools/testing/selftests/bpf/usdt_2.c
index 789883aaca4c..fc7e6d220a38 100644
--- a/tools/testing/selftests/bpf/usdt_2.c
+++ b/tools/testing/selftests/bpf/usdt_2.c
@@ -13,4 +13,17 @@ void usdt_2(void)
 	USDT(optimized_attach, usdt_2);
 }
 
+static volatile unsigned long usdt_red_zone_arg1 = 0xDEADBEEF;
+static volatile unsigned long usdt_red_zone_arg2 = 0xCAFEBABE;
+static volatile unsigned long usdt_red_zone_arg3 = 0xFEEDFACE;
+
+void __attribute__((noinline)) usdt_red_zone_trigger(void)
+{
+	unsigned long a1 = usdt_red_zone_arg1;
+	unsigned long a2 = usdt_red_zone_arg2;
+	unsigned long a3 = usdt_red_zone_arg3;
+
+	USDT(optimized_attach, usdt_red_zone, a1, a2, a3);
+}
+
 #endif
-- 
2.53.0-Meta



* Re: [PATCH v1 1/2] serial: qcom-geni: trace: Add tracepoint support for Qualcomm GENI serial
From: Praveen Talari @ 2026-05-09  1:19 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, Greg Kroah-Hartman,
	Jiri Slaby, Konrad Dybcio, linux-kernel, linux-trace-kernel,
	linux-arm-msm, linux-serial, Mukesh Kumar Savaliya,
	Aniket Randive, chandana.chiluveru, jyothi.seerapu
In-Reply-To: <20260508132543.4f100ae0@gandalf.local.home>


On 08-05-2026 22:55, Steven Rostedt wrote:
> On Wed, 06 May 2026 22:54:44 +0530
> Praveen Talari <praveen.talari@oss.qualcomm.com> wrote:
>
>> +TRACE_EVENT(geni_serial_tx_data,
>> +	    TP_PROTO(struct device *dev, const u8 *buf, unsigned int len),
>> +	    TP_ARGS(dev, buf, len),
>> +
>> +	    TP_STRUCT__entry(__string(name, dev_name(dev))
>> +			     __field(unsigned int, len)
>> +			     __dynamic_array(u8, data, len)
>> +	    ),
>> +
>> +	    TP_fast_assign(__assign_str(name);
>> +			   __entry->len = len;
>> +			   memcpy(__get_dynamic_array(data), buf, len);
>> +	    ),
>> +
>> +	    TP_printk("%s: tx_len=%u data=%s",
>> +		      __get_str(name), __entry->len,
>> +		      __print_hex(__get_dynamic_array(data), __entry->len))
>> +);
>> +
>> +TRACE_EVENT(geni_serial_rx_data,
>> +	    TP_PROTO(struct device *dev, const u8 *buf, unsigned int len),
>> +	    TP_ARGS(dev, buf, len),
>> +
>> +	    TP_STRUCT__entry(__string(name, dev_name(dev))
>> +			     __field(unsigned int, len)
>> +			     __dynamic_array(u8, data, len)
>> +	    ),
>> +
>> +	    TP_fast_assign(__assign_str(name);
>> +			   __entry->len = len;
>> +			   memcpy(__get_dynamic_array(data), buf, len);
>> +	    ),
>> +
>> +	    TP_printk("%s: rx_len=%u data=%s",
> Do you really need to say "tx_len" and "rx_len", could it just be "len" and
> have the name of the tracepoint show what it is?
Sure. I will review this and update it in the next patch.
>
> Each TRACE_EVENT() is really just a:
>
>    DECLARE_EVENT_CLASS() followed by a DEFINE_EVENT()
>
> underneath.
>
> And each TRACE_EVENT() costs around 5K in size, where most of that is in
> the DECLARE_EVENT_CLASS() portion. Thus, you can save some memory by using
> DECLARE_EVENT_CLASS() and then define the above two events with
> DEFINE_EVENT().

Thank you for the suggestion; I will update this in the next patch.
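
For reference, the suggested split might look roughly like the sketch
below. It is a kernel trace-header fragment, not a standalone program,
and has not been built. The class name geni_serial_data is hypothetical;
the fields and format string follow the two events quoted above, with
"len" used in place of "tx_len"/"rx_len" as suggested.

```c
/* One shared class carries the entry layout, assignment and print code. */
DECLARE_EVENT_CLASS(geni_serial_data,
	TP_PROTO(struct device *dev, const u8 *buf, unsigned int len),
	TP_ARGS(dev, buf, len),

	TP_STRUCT__entry(
		__string(name, dev_name(dev))
		__field(unsigned int, len)
		__dynamic_array(u8, data, len)
	),

	TP_fast_assign(
		__assign_str(name);
		__entry->len = len;
		memcpy(__get_dynamic_array(data), buf, len);
	),

	TP_printk("%s: len=%u data=%s",
		  __get_str(name), __entry->len,
		  __print_hex(__get_dynamic_array(data), __entry->len))
);

/* Each event only names the class and repeats the prototype. */
DEFINE_EVENT(geni_serial_data, geni_serial_tx_data,
	TP_PROTO(struct device *dev, const u8 *buf, unsigned int len),
	TP_ARGS(dev, buf, len)
);

DEFINE_EVENT(geni_serial_data, geni_serial_rx_data,
	TP_PROTO(struct device *dev, const u8 *buf, unsigned int len),
	TP_ARGS(dev, buf, len)
);
```

The direction is then visible from the event name in the trace output,
so the format string does not need to repeat it.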

Thanks,

Praveen Talari

>
> -- Steve
>
>
>> +		      __get_str(name), __entry->len,
>> +		      __print_hex(__get_dynamic_array(data), __entry->len))
>> +);
>> +


* Re: [PATCH v2] mm: vmscan: rework lru_shrink and write_folio tracepoints
From: Andrew Morton @ 2026-05-09  1:29 UTC (permalink / raw)
  To: qiwu.chen
  Cc: rostedt, mhiramat, hannes, david, mhocko, willy,
	linux-trace-kernel, linux-mm, qiwu.chen
In-Reply-To: <20260506083652.100160-1-qiwu.chen@transsion.com>

On Wed,  6 May 2026 16:36:52 +0800 "qiwu.chen" <qiwuchen55@gmail.com> wrote:

> Currently, reclaim_flags always contains RECLAIM_WB_ASYNC in lru_shrink
> tracepoints since commit 41ac1999c3e35 ("mm: vmscan: do not stall on
> writeback during memory compaction"), which is useless for debugging
> memory pressure issues. Other RECLAIM_WB_* flags are not used anywhere
> else, so they can be directly removed.
> This patch reworks the lru_shrink and write_folio tracepoints for better
> correlation and analysis:
>  - traces each folio lru type instead of reclaim_flags.
>  - traces each lru_shrink with reason.
>  - remove the printing of the unnecessary PFN for mm_vmscan_write_folio.

Applying this to 7.1-rc1, my x86_64 allmodconfig blew up.



In file included from ./include/trace/define_trace.h:132,
                 from ./include/trace/events/vmscan.h:602,
                 from mm/vmscan.c:72:
./include/trace/events/vmscan.h: In function 'trace_raw_output_mm_vmscan_write_folio':
./include/trace/events/vmscan.h:358:19: error: format '%p' expects argument of type 'void *', but argument 3 has type 'long unsigned int' [-Werror=format=]
  358 |         TP_printk("folio=%p lru=%s",
      |                   ^~~~~~~~~~~~~~~~~
./include/trace/trace_events.h:219:34: note: in definition of macro 'DECLARE_EVENT_CLASS'
  219 |         trace_event_printf(iter, print);                                \
      |                                  ^~~~~
./include/trace/trace_events.h:45:30: note: in expansion of macro 'PARAMS'
   45 |                              PARAMS(print));                   \
      |                              ^~~~~~
./include/trace/events/vmscan.h:342:1: note: in expansion of macro 'TRACE_EVENT'
  342 | TRACE_EVENT(mm_vmscan_write_folio,
      | ^~~~~~~~~~~
./include/trace/events/vmscan.h:358:9: note: in expansion of macro 'TP_printk'
  358 |         TP_printk("folio=%p lru=%s",
      |         ^~~~~~~~~
In file included from ./include/trace/trace_events.h:256:
./include/trace/events/vmscan.h:358:27: note: format string is defined here
  358 |         TP_printk("folio=%p lru=%s",
      |                          ~^
      |                           |
      |                           void *
      |                          %ld
./include/trace/events/vmscan.h: In function 'do_trace_event_raw_event_mm_vmscan_write_folio':
./include/trace/events/vmscan.h:354:32: error: assignment to 'long unsigned int' from 'struct folio *' makes integer from pointer without a cast [-Wint-conversion]
  354 |                 __entry->folio = folio;
      |                                ^
./include/trace/trace_events.h:427:11: note: in definition of macro '__DECLARE_EVENT_CLASS'
  427 |         { assign; }                                                     \
      |           ^~~~~~
./include/trace/trace_events.h:435:23: note: in expansion of macro 'PARAMS'
  435 |                       PARAMS(assign), PARAMS(print))                    \
      |                       ^~~~~~
./include/trace/trace_events.h:40:9: note: in expansion of macro 'DECLARE_EVENT_CLASS'
   40 |         DECLARE_EVENT_CLASS(name,                              \
      |         ^~~~~~~~~~~~~~~~~~~
./include/trace/trace_events.h:44:30: note: in expansion of macro 'PARAMS'
   44 |                              PARAMS(assign),                   \
      |                              ^~~~~~
./include/trace/events/vmscan.h:342:1: note: in expansion of macro 'TRACE_EVENT'
  342 | TRACE_EVENT(mm_vmscan_write_folio,
      | ^~~~~~~~~~~
./include/trace/events/vmscan.h:353:9: note: in expansion of macro 'TP_fast_assign'
  353 |         TP_fast_assign(
      |         ^~~~~~~~~~~~~~
In file included from ./include/trace/define_trace.h:133:
./include/trace/events/vmscan.h: In function 'do_perf_trace_mm_vmscan_write_folio':
./include/trace/events/vmscan.h:354:32: error: assignment to 'long unsigned int' from 'struct folio *' makes integer from pointer without a cast [-Wint-conversion]
  354 |                 __entry->folio = folio;
      |                                ^
./include/trace/perf.h:51:11: note: in definition of macro '__DECLARE_EVENT_CLASS'
   51 |         { assign; }                                                     \
      |           ^~~~~~
./include/trace/perf.h:67:23: note: in expansion of macro 'PARAMS'
   67 |                       PARAMS(assign), PARAMS(print))                    \
      |                       ^~~~~~
./include/trace/trace_events.h:40:9: note: in expansion of macro 'DECLARE_EVENT_CLASS'
   40 |         DECLARE_EVENT_CLASS(name,                              \
      |         ^~~~~~~~~~~~~~~~~~~
./include/trace/trace_events.h:44:30: note: in expansion of macro 'PARAMS'
   44 |                              PARAMS(assign),                   \
      |                              ^~~~~~
./include/trace/events/vmscan.h:342:1: note: in expansion of macro 'TRACE_EVENT'
  342 | TRACE_EVENT(mm_vmscan_write_folio,
      | ^~~~~~~~~~~
./include/trace/events/vmscan.h:353:9: note: in expansion of macro 'TP_fast_assign'
  353 |         TP_fast_assign(
      |         ^~~~~~~~~~~~~~
cc1: all warnings being treated as errors
make[3]: *** [scripts/Makefile.build:289: mm/vmscan.o] Error 1
make[2]: *** [scripts/Makefile.build:548: mm] Error 2
make[1]: *** [/usr/src/25/Makefile:2141: .] Error 2
make: *** [Makefile:248: __sub-make] Error 2



* Re: [PATCH v1 1/2] spi: qcom-geni: trace: Add trace events for Qualcomm GENI SPI
From: Praveen Talari @ 2026-05-09  2:07 UTC (permalink / raw)
  To: Mark Brown
  Cc: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel, linux-arm-msm, linux-spi,
	Mukesh Kumar Savaliya, Aniket Randive,
	chandana.chiluveru, jyothi.seerapu
In-Reply-To: <af3spostNgoRU0Vv@sirena.co.uk>


On 08-05-2026 19:31, Mark Brown wrote:
> On Thu, May 07, 2026 at 11:03:39PM +0530, Praveen Talari wrote:
>> On 07-05-2026 13:43, Mark Brown wrote:
>>> By generic I mean this should not be driver specific at all.
>> I hope these changes are fine. Please let me know if you have any concerns
>> or feedback.
> The data tracepoints look plausible but I would expect them to be
> generated by the core, they'll be there for everything so I'd expect

Thank you for the clarification. I now understand your point clearly.

Could you also please review the changes made in spi.c?
I would appreciate any feedback or suggestions you may have.


diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 91dd831d2d3b..f0d3665412fe 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1658,6 +1658,11 @@ static int spi_transfer_one_message(struct spi_controller *ctlr,

                 trace_spi_transfer_stop(msg, xfer);

+               if (spi_valid_txbuf(msg, xfer))
+                       trace_spi_tx_data(msg->spi, xfer->tx_buf, xfer->len);
+               if (spi_valid_rxbuf(msg, xfer))
+                       trace_spi_rx_data(msg->spi, xfer->rx_buf, xfer->len);
+
                 if (msg->status != -EINPROGRESS)
                         goto out;


Thanks,

Praveen Talari

> them to work for everything.

