Linux Trace Kernel

Linux Trace Kernel
 help / color / mirror / Atom feed

* Re: [PATCH] tracing: preserve module tracepoint strings
From: Petr Pavlu @ 2026-04-09 12:37 UTC (permalink / raw)
  To: Cao Ruichuang
  Cc: rostedt, mhiramat, mathieu.desnoyers, mcgrof, da.gomez,
	samitolvanen, atomlin, linux-kernel, linux-trace-kernel,
	linux-modules
In-Reply-To: <20260406170944.51047-1-create0818@163.com>

On 4/6/26 7:09 PM, Cao Ruichuang wrote:
> tracepoint_string() is documented as exporting constant strings
> through printk_formats, including when it is used from modules.
> That currently does not work.
> 
> A small test module that calls
> tracepoint_string("tracepoint_string_test_module_string") loads
> successfully and gets a pointer back, but the string never appears
> in /sys/kernel/tracing/printk_formats. The loader only collects
> __trace_printk_fmt from modules and ignores __tracepoint_str.
> 
> Collect module __tracepoint_str entries too, copy them to stable
> tracing-managed storage like module trace_printk formats, and let
> trace_is_tracepoint_string() recognize those copied strings. This
> makes module tracepoint strings visible through printk_formats and
> keeps them accepted by the trace string safety checks.

The documentation for tracepoint_string() in include/linux/tracepoint.h
contains:

 * The @str used must be a constant string and persistent as it would not
 * make sense to show a string that no longer exists. But it is still fine
 * to be used with modules, because when modules are unloaded, if they
 * had tracepoints, the ring buffers are cleared too. As long as the string
 * does not change during the life of the module, it is fine to use
 * tracepoint_string() within a module.

The implemented approach copies all strings referenced by
__tracepoint_str and keeps them for the kernel's lifetime. I wonder if
the code could directly reference the original data and handle its
removal when the module is unloaded, or if the quoted comment should be
updated to reflect the new behavior.

-- 
Thanks,
Petr

^ permalink raw reply

* Re: [PATCH v4] kernel/trace: fixed static warnings
From: Abhijith Sriram @ 2026-04-09 12:07 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, open list:TRACING,
	open list:TRACING
In-Reply-To: <20260408162459.6b8f4b01@gandalf.local.home>

On Wed, Apr 8, 2026 at 10:23 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
>
> Subject should be:
>
>  tracing: Fixed static checker warnings
>
> On Mon,  6 Apr 2026 09:28:34 +0200
> abhijithsriram95@gmail.com wrote:
>
> > From: Abhijith Sriram <abhijithsriram95@gmail.com>
> >
> > The change in the function argument description
> > was due to the static code checker script reading
> > the word filter back to back
> >
>
> The below changes should be beneath the '---'
>
> > Changes in v2:
>
> The last change should be first. In fact, I only care about the last change
> as the previous versions should have the description of what changed.
>
> > - corrected *m = file->private_data to m = file->private_data
> >
> > Changes in v3:
> > - reverted the changes for struct seq_file *m and
> >   added a new empty line instead
> >
> > Changes in v4:
>
> That said, this should really be:
>
>  Changes since v3: https://lore.kernel.org/all/20260406060046.223496-2-abhijithsriram95@gmail.com/
>
>
> > - added a new empty line before char *buf ...
> >   previously this line was relocated to avoid the
> >   static check warning.
> >
> > Signed-off-by: Abhijith Sriram <abhijithsriram95@gmail.com>
> > ---
> >  kernel/trace/trace_events_trigger.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
> > index 655db2e82513..664283bcd9ea 100644
> > --- a/kernel/trace/trace_events_trigger.c
> > +++ b/kernel/trace/trace_events_trigger.c
> > @@ -246,7 +246,7 @@ event_triggers_post_call(struct trace_event_file *file,
> >  }
> >  EXPORT_SYMBOL_GPL(event_triggers_post_call);
> >
> > -#define SHOW_AVAILABLE_TRIGGERS      (void *)(1UL)
> > +#define SHOW_AVAILABLE_TRIGGERS      ((void *)(1UL))
> >
> >  static void *trigger_next(struct seq_file *m, void *t, loff_t *pos)
> >  {
> > @@ -352,6 +352,7 @@ static int event_trigger_regex_open(struct inode *inode, struct file *file)
> >               ret = seq_open(file, &event_triggers_seq_ops);
> >               if (!ret) {
> >                       struct seq_file *m = file->private_data;
> > +
>
> This blank line makes the code look worse. Yes, we usually want a blank
> line between the variable declarations and the code, but when it comes to
> code blocks (not functions) that rule is not as strict.
>
> Get rid of this newline.
>
>
> >                       m->private = file;
> >               }
> >       }
> > @@ -390,6 +391,7 @@ static ssize_t event_trigger_regex_write(struct file *file,
> >  {
> >       struct trace_event_file *event_file;
> >       ssize_t ret;
> > +
> >       char *buf __free(kfree) = NULL;
>
> The char *buf is a declaration. It no new line is expected before it.
>
> >
> >       if (!cnt)
> > @@ -633,6 +635,7 @@ clear_event_triggers(struct trace_array *tr)
> >
> >       list_for_each_entry(file, &tr->events, list) {
> >               struct event_trigger_data *data, *n;
> > +
>
> Again, if it's in a code block, don't change it.
>
> -- Steve
>
> >               list_for_each_entry_safe(data, n, &file->triggers, list) {
> >                       trace_event_trigger_enable_disable(file, 0);
> >                       list_del_rcu(&data->list);
> > @@ -785,7 +788,7 @@ static void unregister_trigger(char *glob,
> >   *   cmd               - the trigger command name
> >   *   glob              - the trigger command name optionally prefaced with '!'
> >   *   param_and_filter  - text following cmd and ':'
> > - *   param             - text following cmd and ':' and stripped of filter
> > + *   param             - text following cmd and ':' and filter removed
> >   *   filter            - the optional filter text following (and including) 'if'
> >   *
> >   * To illustrate the use of these components, here are some concrete
>

Shall we totally scrap these changes? Apart from introducing new lines
the only other
changes are to add some brackets and reframe the comments. We are not adding
a lot of value from this change. I will take this as a positive first
time experience
into linux kernel dev and focus on other meaningful changes.

-- 
Regards
Abhijith Sriram

^ permalink raw reply

* Re: [PATCH v2 1/2] tracing/fprobe: Avoid kcalloc() in rcu_read_lock section
From: Menglong Dong @ 2026-04-09 12:05 UTC (permalink / raw)
  To: Steven Rostedt, Masami Hiramatsu, Masami Hiramatsu (Google)
  Cc: Menglong Dong, Mathieu Desnoyers, jiang.biao, linux-kernel,
	linux-trace-kernel
In-Reply-To: <177573095696.3666478.4412068539797028855.stgit@mhiramat.tok.corp.google.com>

On 2026/4/9 18:35 Masami Hiramatsu (Google) <mhiramat@kernel.org> write:
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> fprobe_remove_node_in_module() is called under RCU read locked, but
> this invokes kcalloc() if there are more than 8 fprobes installed
> on the module. Sashiko warns it because kcalloc() can sleep [1].
> 
>  [1] https://sashiko.dev/#/patchset/177552432201.853249.5125045538812833325.stgit%40mhiramat.tok.corp.google.com
> 
> To fix this issue, expand the batch size to 128 and do not expand
> the fprobe_addr_list, but just cancel walking on fprobe_ip_table,
> update fgraph/ftrace_ops and retry the loop again.
> 
> Fixes: 0de4c70d04a4 ("tracing: fprobe: use rhltable for fprobe_ip_table")
> Cc: stable@vger.kernel.org
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> ---
>  kernel/trace/fprobe.c |   53 ++++++++++++++++++-------------------------------
>  1 file changed, 19 insertions(+), 34 deletions(-)
> 
> diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> index 56d145017902..058cf6ef7ebb 100644
> --- a/kernel/trace/fprobe.c
> +++ b/kernel/trace/fprobe.c
> @@ -536,7 +536,7 @@ static void fprobe_graph_remove_ips(unsigned long *addrs, int num)

Hi, Masami. Thanks for the fixes. Overall, the whole series
LGTM.

Some nits below.

>  
>  #ifdef CONFIG_MODULES
>  
[...]
>  				  unsigned long val, void *data)
> @@ -591,6 +567,7 @@ static int fprobe_module_callback(struct notifier_block *nb,
>  	struct fprobe_hlist_node *node;
>  	struct rhashtable_iter iter;
>  	struct module *mod = data;
> +	bool retry;
>  
>  	if (val != MODULE_STATE_GOING)
>  		return NOTIFY_DONE;
> @@ -600,13 +577,19 @@ static int fprobe_module_callback(struct notifier_block *nb,
>  	if (!alist.addrs)
>  		return NOTIFY_DONE;

The "retry" confuse me a little. How about we use "again" and "more"
here:

+again:
+	more = false;

>  
> +retry:
> +	retry = false;
> +	alist.index = 0;
>  	mutex_lock(&fprobe_mutex);
>  	rhltable_walk_enter(&fprobe_ip_table, &iter);
>  	do {
>  		rhashtable_walk_start(&iter);
>  
>  		while ((node = rhashtable_walk_next(&iter)) && !IS_ERR(node))
> -			fprobe_remove_node_in_module(mod, node, &alist);
> +			if (fprobe_remove_node_in_module(mod, node, &alist) < 0) {
> +				retry = true;
> +				break;
> +			}

Wrap the code within the "while" with {}?

Thanks!
Menglong Dong

>  
>  		rhashtable_walk_stop(&iter);
>  	} while (node == ERR_PTR(-EAGAIN));
> @@ -615,6 +598,8 @@ static int fprobe_module_callback(struct notifier_block *nb,
>  	if (alist.index > 0)
>  		fprobe_set_ips(alist.addrs, alist.index, 1, 0);
>  	mutex_unlock(&fprobe_mutex);
> +	if (retry)
> +		goto retry;
>  
>  	kfree(alist.addrs);
>  
> 
> 
> 





^ permalink raw reply

* [PATCH v2 2/2] tracing/fprobe: Check the same type fprobe on table as the unregistered one
From: Masami Hiramatsu (Google) @ 2026-04-09 10:36 UTC (permalink / raw)
  To: Steven Rostedt, Masami Hiramatsu
  Cc: Menglong Dong, Mathieu Desnoyers, jiang.biao, linux-kernel,
	linux-trace-kernel
In-Reply-To: <177573094819.3666478.11900825120958856397.stgit@mhiramat.tok.corp.google.com>

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Commit 2c67dc457bc6 ("tracing: fprobe: optimization for entry only case")
introduced a different ftrace_ops for entry-only fprobes.

However, when unregistering an fprobe, the kernel only checks if another
fprobe exists at the same address, without checking which type of fprobe
it is.
If different fprobes are registered at the same address, the same address
will be registered in both fgraph_ops and ftrace_ops, but only one of
them will be deleted when unregistering. (the one removed first will not
be deleted from the ops).

This results in junk entries remaining in either fgraph_ops or ftrace_ops.
For example:
 =======
 cd /sys/kernel/tracing

 # 'Add entry and exit events on the same place'
 echo 'f:event1 vfs_read' >> dynamic_events
 echo 'f:event2 vfs_read%return' >> dynamic_events

 # 'Enable both of them'
 echo 1 > events/fprobes/enable
 cat enabled_functions
vfs_read (2)            ->arch_ftrace_ops_list_func+0x0/0x210

 # 'Disable and remove exit event'
 echo 0 > events/fprobes/event2/enable
 echo -:event2 >> dynamic_events

 # 'Disable and remove all events'
 echo 0 > events/fprobes/enable
 echo > dynamic_events

 # 'Add another event'
 echo 'f:event3 vfs_open%return' > dynamic_events
 cat dynamic_events
f:fprobes/event3 vfs_open%return

 echo 1 > events/fprobes/enable
 cat enabled_functions
vfs_open (1)            tramp: 0xffffffffa0001000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60    subops: {ent:fprobe_fgraph_entry+0x0/0x620 ret:fprobe_return+0x0/0x150}
vfs_read (1)            tramp: 0xffffffffa0001000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60    subops: {ent:fprobe_fgraph_entry+0x0/0x620 ret:fprobe_return+0x0/0x150}
 =======

As you can see, an entry for the vfs_read remains.

To fix this issue, when unregistering, the kernel should also check if
there is the same type of fprobes still exist at the same address, and
if not, delete its entry from either fgraph_ops or ftrace_ops.

Fixes: 2c67dc457bc6 ("tracing: fprobe: optimization for entry only case")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 kernel/trace/fprobe.c |   81 +++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 65 insertions(+), 16 deletions(-)

diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index 058cf6ef7ebb..5c059ec1babc 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -85,11 +85,9 @@ static int insert_fprobe_node(struct fprobe_hlist_node *node)
 	return rhltable_insert(&fprobe_ip_table, &node->hlist, fprobe_rht_params);
 }
 
-/* Return true if there are synonims */
-static bool delete_fprobe_node(struct fprobe_hlist_node *node)
+static void delete_fprobe_node(struct fprobe_hlist_node *node)
 {
 	lockdep_assert_held(&fprobe_mutex);
-	bool ret;
 
 	/* Avoid double deleting */
 	if (READ_ONCE(node->fp) != NULL) {
@@ -97,13 +95,6 @@ static bool delete_fprobe_node(struct fprobe_hlist_node *node)
 		rhltable_remove(&fprobe_ip_table, &node->hlist,
 				fprobe_rht_params);
 	}
-
-	rcu_read_lock();
-	ret = !!rhltable_lookup(&fprobe_ip_table, &node->addr,
-				fprobe_rht_params);
-	rcu_read_unlock();
-
-	return ret;
 }
 
 /* Check existence of the fprobe */
@@ -337,6 +328,32 @@ static bool fprobe_is_ftrace(struct fprobe *fp)
 	return !fp->exit_handler;
 }
 
+static bool fprobe_exists_on_hash(unsigned long ip, bool ftrace)
+{
+	struct rhlist_head *head, *pos;
+	struct fprobe_hlist_node *node;
+	struct fprobe *fp;
+
+	guard(rcu)();
+	head = rhltable_lookup(&fprobe_ip_table, &ip,
+				fprobe_rht_params);
+	if (!head)
+		return false;
+	/* We have to check the same type on the list. */
+	rhl_for_each_entry_rcu(node, pos, head, hlist) {
+		if (node->addr != ip)
+			break;
+		fp = READ_ONCE(node->fp);
+		if (likely(fp)) {
+			if ((!ftrace && fp->exit_handler) ||
+			    (ftrace && !fp->exit_handler))
+				return true;
+		}
+	}
+
+	return false;
+}
+
 #ifdef CONFIG_MODULES
 static void fprobe_set_ips(unsigned long *ips, unsigned int cnt, int remove,
 			   int reset)
@@ -360,6 +377,29 @@ static bool fprobe_is_ftrace(struct fprobe *fp)
 	return false;
 }
 
+static bool fprobe_exists_on_hash(unsigned long ip, bool ftrace __maybe_unused)
+{
+	struct rhlist_head *head, *pos;
+	struct fprobe_hlist_node *node;
+	struct fprobe *fp;
+
+	guard(rcu)();
+	head = rhltable_lookup(&fprobe_ip_table, &ip,
+				fprobe_rht_params);
+	if (!head)
+		return false;
+	/* We only need to check fp is there. */
+	rhl_for_each_entry_rcu(node, pos, head, hlist) {
+		if (node->addr != ip)
+			break;
+		fp = READ_ONCE(node->fp);
+		if (likely(fp))
+			return true;
+	}
+
+	return false;
+}
+
 #ifdef CONFIG_MODULES
 static void fprobe_set_ips(unsigned long *ips, unsigned int cnt, int remove,
 			   int reset)
@@ -547,15 +587,22 @@ struct fprobe_addr_list {
 static int fprobe_remove_node_in_module(struct module *mod, struct fprobe_hlist_node *node,
 					 struct fprobe_addr_list *alist)
 {
+	lockdep_assert_in_rcu_read_lock();
+
 	if (!within_module(node->addr, mod))
 		return 0;
 
-	if (delete_fprobe_node(node))
-		return 0;
+	delete_fprobe_node(node);
+	/*
+	 * Don't care the type here, because all fprobes on the same
+	 * address must be removed eventually.
+	 */
+	if (!rhltable_lookup(&fprobe_ip_table, &node->addr, fprobe_rht_params)) {
+		alist->addrs[alist->index++] = node->addr;
+		if (alist->index == alist->size)
+			return -ENOSPC;
+	}
 
-	alist->addrs[alist->index++] = node->addr;
-	if (alist->index == alist->size)
-		return -ENOSPC;
 	return 0;
 }
 
@@ -926,7 +973,9 @@ int unregister_fprobe(struct fprobe *fp)
 	/* Remove non-synonim ips from table and hash */
 	count = 0;
 	for (i = 0; i < hlist_array->size; i++) {
-		if (!delete_fprobe_node(&hlist_array->array[i]))
+		delete_fprobe_node(&hlist_array->array[i]);
+		if (!fprobe_exists_on_hash(hlist_array->array[i].addr,
+					   fprobe_is_ftrace(fp)))
 			addrs[count++] = hlist_array->array[i].addr;
 	}
 	del_fprobe_hash(fp);


^ permalink raw reply related

* [PATCH v2 1/2] tracing/fprobe: Avoid kcalloc() in rcu_read_lock section
From: Masami Hiramatsu (Google) @ 2026-04-09 10:35 UTC (permalink / raw)
  To: Steven Rostedt, Masami Hiramatsu
  Cc: Menglong Dong, Mathieu Desnoyers, jiang.biao, linux-kernel,
	linux-trace-kernel
In-Reply-To: <177573094819.3666478.11900825120958856397.stgit@mhiramat.tok.corp.google.com>

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

fprobe_remove_node_in_module() is called under RCU read locked, but
this invokes kcalloc() if there are more than 8 fprobes installed
on the module. Sashiko warns it because kcalloc() can sleep [1].

 [1] https://sashiko.dev/#/patchset/177552432201.853249.5125045538812833325.stgit%40mhiramat.tok.corp.google.com

To fix this issue, expand the batch size to 128 and do not expand
the fprobe_addr_list, but just cancel walking on fprobe_ip_table,
update fgraph/ftrace_ops and retry the loop again.

Fixes: 0de4c70d04a4 ("tracing: fprobe: use rhltable for fprobe_ip_table")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 kernel/trace/fprobe.c |   53 ++++++++++++++++++-------------------------------
 1 file changed, 19 insertions(+), 34 deletions(-)

diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index 56d145017902..058cf6ef7ebb 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -536,7 +536,7 @@ static void fprobe_graph_remove_ips(unsigned long *addrs, int num)
 
 #ifdef CONFIG_MODULES
 
-#define FPROBE_IPS_BATCH_INIT 8
+#define FPROBE_IPS_BATCH_INIT 128
 /* instruction pointer address list */
 struct fprobe_addr_list {
 	int index;
@@ -544,45 +544,21 @@ struct fprobe_addr_list {
 	unsigned long *addrs;
 };
 
-static int fprobe_addr_list_add(struct fprobe_addr_list *alist, unsigned long addr)
+static int fprobe_remove_node_in_module(struct module *mod, struct fprobe_hlist_node *node,
+					 struct fprobe_addr_list *alist)
 {
-	unsigned long *addrs;
-
-	/* Previously we failed to expand the list. */
-	if (alist->index == alist->size)
-		return -ENOSPC;
-
-	alist->addrs[alist->index++] = addr;
-	if (alist->index < alist->size)
+	if (!within_module(node->addr, mod))
 		return 0;
 
-	/* Expand the address list */
-	addrs = kcalloc(alist->size * 2, sizeof(*addrs), GFP_KERNEL);
-	if (!addrs)
-		return -ENOMEM;
-
-	memcpy(addrs, alist->addrs, alist->size * sizeof(*addrs));
-	alist->size *= 2;
-	kfree(alist->addrs);
-	alist->addrs = addrs;
+	if (delete_fprobe_node(node))
+		return 0;
 
+	alist->addrs[alist->index++] = node->addr;
+	if (alist->index == alist->size)
+		return -ENOSPC;
 	return 0;
 }
 
-static void fprobe_remove_node_in_module(struct module *mod, struct fprobe_hlist_node *node,
-					 struct fprobe_addr_list *alist)
-{
-	if (!within_module(node->addr, mod))
-		return;
-	if (delete_fprobe_node(node))
-		return;
-	/*
-	 * If failed to update alist, just continue to update hlist.
-	 * Therefore, at list user handler will not hit anymore.
-	 */
-	fprobe_addr_list_add(alist, node->addr);
-}
-
 /* Handle module unloading to manage fprobe_ip_table. */
 static int fprobe_module_callback(struct notifier_block *nb,
 				  unsigned long val, void *data)
@@ -591,6 +567,7 @@ static int fprobe_module_callback(struct notifier_block *nb,
 	struct fprobe_hlist_node *node;
 	struct rhashtable_iter iter;
 	struct module *mod = data;
+	bool retry;
 
 	if (val != MODULE_STATE_GOING)
 		return NOTIFY_DONE;
@@ -600,13 +577,19 @@ static int fprobe_module_callback(struct notifier_block *nb,
 	if (!alist.addrs)
 		return NOTIFY_DONE;
 
+retry:
+	retry = false;
+	alist.index = 0;
 	mutex_lock(&fprobe_mutex);
 	rhltable_walk_enter(&fprobe_ip_table, &iter);
 	do {
 		rhashtable_walk_start(&iter);
 
 		while ((node = rhashtable_walk_next(&iter)) && !IS_ERR(node))
-			fprobe_remove_node_in_module(mod, node, &alist);
+			if (fprobe_remove_node_in_module(mod, node, &alist) < 0) {
+				retry = true;
+				break;
+			}
 
 		rhashtable_walk_stop(&iter);
 	} while (node == ERR_PTR(-EAGAIN));
@@ -615,6 +598,8 @@ static int fprobe_module_callback(struct notifier_block *nb,
 	if (alist.index > 0)
 		fprobe_set_ips(alist.addrs, alist.index, 1, 0);
 	mutex_unlock(&fprobe_mutex);
+	if (retry)
+		goto retry;
 
 	kfree(alist.addrs);
 


^ permalink raw reply related

* [PATCH v2 0/2] tracing/fprobe: Fix fprobe_ip_table related bugs
From: Masami Hiramatsu (Google) @ 2026-04-09 10:35 UTC (permalink / raw)
  To: Steven Rostedt, Masami Hiramatsu
  Cc: Menglong Dong, Mathieu Desnoyers, jiang.biao, linux-kernel,
	linux-trace-kernel

Hi,

Here are patches to fix bugs in fprobe.

The previous version is here.

https://lore.kernel.org/all/177555745233.1441308.6535024692186959381.stgit@mhiramat.tok.corp.google.com/

In this version, I added another bug in module callback
which calls kcalloc() in rcu_read_lock() section[1/2] which
is reported by Sashiko[1].

[1] https://sashiko.dev/#/patchset/177555745233.1441308.6535024692186959381.stgit%40mhiramat.tok.corp.google.com

Thank you!

---

Masami Hiramatsu (Google) (2):
      tracing/fprobe: Avoid kcalloc() in rcu_read_lock section
      tracing/fprobe: Check the same type fprobe on table as the unregistered one


 kernel/trace/fprobe.c |  122 +++++++++++++++++++++++++++++++------------------
 1 file changed, 78 insertions(+), 44 deletions(-)

--
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply

* Re: [PATCH] tracing/fprobe: Check the same type fprobe on table as the unregistered one
From: Masami Hiramatsu @ 2026-04-09 10:34 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Steven Rostedt, Menglong Dong, Mathieu Desnoyers, jiang.biao,
	linux-kernel, linux-trace-kernel
In-Reply-To: <20260408110237.08d982fab08356bc8d48af45@kernel.org>

On Wed, 8 Apr 2026 11:02:37 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

> Hmm,
> 
> I also found another problem when probing module with fprobe.
> 
> /sys/kernel/tracing # insmod xt_LOG.ko 
> /sys/kernel/tracing # echo 'f:test log_tg*' > dynamic_events 
> /sys/kernel/tracing # echo 1 > events/fprobes/test/enable 
> /sys/kernel/tracing # cat enabled_functions 
> log_tg [xt_LOG] (1)             tramp: 0xffffffffa0004000 (fprobe_ftrace_entry+0x0/0x490) ->fprobe_ftrace_entry+0x0/0x490
> log_tg_check [xt_LOG] (1)               tramp: 0xffffffffa0004000 (fprobe_ftrace_entry+0x0/0x490) ->fprobe_ftrace_entry+0x0/0x490
> log_tg_destroy [xt_LOG] (1)             tramp: 0xffffffffa0004000 (fprobe_ftrace_entry+0x0/0x490) ->fprobe_ftrace_entry+0x0/0x490
> /sys/kernel/tracing # wc -l enabled_functions 
> 3 enabled_functions
> /sys/kernel/tracing # rmmod xt_LOG
> /sys/kernel/tracing # wc -l enabled_functions 
> 34085 enabled_functions
> 
> It seems to reverse the selected functions if the hash is empty...

So I think there are several ways to fix this issue.

 - Introduce a new flag for ftrace_ops, which is "empty means empty".

 - Improve ftrace to automatically remove module function entries
   from registered ftrace_ops. (currently it is done for the ftrace_ops
   in trace_array list, but not for ftrace_ops list.)

 - While walking over the fprobe_ip_tables, try to find existing
   enabled fprobe node. If it does not exist, unregister ftrace_ops.

 - Count the ftrace_ops hash table size and if it is expected to be 0,
   unregister the ftrace_ops.

The first 2 are changing ftrace (but cleaner). The latter 2 only
changes fprobes, but relatively ugly. I think the 2nd one will be
the best because we already have done for ftrace_ops of ring-buffer
instances.

Thanks,


> 
> Thanks,
> 
> On Tue,  7 Apr 2026 19:24:12 +0900
> "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:
> 
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > 
> > Commit 2c67dc457bc6 ("tracing: fprobe: optimization for entry only case")
> > introduced a different ftrace_ops for entry-only fprobes.
> > 
> > However, when unregistering an fprobe, the kernel only checks if another
> > fprobe exists at the same address, without checking which type of fprobe
> > it is.
> > If different fprobes are registered at the same address, the same address
> > will be registered in both fgraph_ops and ftrace_ops, but only one of
> > them will be deleted when unregistering. (the one removed first will not
> > be deleted from the ops).
> > 
> > This results in junk entries remaining in either fgraph_ops or ftrace_ops.
> > For example:
> >  =======
> >  cd /sys/kernel/tracing
> > 
> >  # 'Add entry and exit events on the same place'
> >  echo 'f:event1 vfs_read' >> dynamic_events
> >  echo 'f:event2 vfs_read%return' >> dynamic_events
> > 
> >  # 'Enable both of them'
> >  echo 1 > events/fprobes/enable
> >  cat enabled_functions
> > vfs_read (2)            ->arch_ftrace_ops_list_func+0x0/0x210
> > 
> >  # 'Disable and remove exit event'
> >  echo 0 > events/fprobes/event2/enable
> >  echo -:event2 >> dynamic_events
> > 
> >  # 'Disable and remove all events'
> >  echo 0 > events/fprobes/enable
> >  echo > dynamic_events
> > 
> >  # 'Add another event'
> >  echo 'f:event3 vfs_open%return' > dynamic_events
> >  cat dynamic_events
> > f:fprobes/event3 vfs_open%return
> > 
> >  echo 1 > events/fprobes/enable
> >  cat enabled_functions
> > vfs_open (1)            tramp: 0xffffffffa0001000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60    subops: {ent:fprobe_fgraph_entry+0x0/0x620 ret:fprobe_return+0x0/0x150}
> > vfs_read (1)            tramp: 0xffffffffa0001000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60    subops: {ent:fprobe_fgraph_entry+0x0/0x620 ret:fprobe_return+0x0/0x150}
> >  =======
> > 
> > As you can see, an entry for the vfs_read remains.
> > 
> > To fix this issue, when unregistering, the kernel should also check if
> > there is the same type of fprobes still exist at the same address, and
> > if not, delete its entry from either fgraph_ops or ftrace_ops.
> > 
> > Fixes: 2c67dc457bc6 ("tracing: fprobe: optimization for entry only case")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > ---
> >  kernel/trace/fprobe.c |   77 +++++++++++++++++++++++++++++++++++++++----------
> >  1 file changed, 62 insertions(+), 15 deletions(-)
> > 
> > diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> > index dcadf1d23b8a..7f75e6e4462c 100644
> > --- a/kernel/trace/fprobe.c
> > +++ b/kernel/trace/fprobe.c
> > @@ -85,11 +85,9 @@ static int insert_fprobe_node(struct fprobe_hlist_node *node)
> >  	return rhltable_insert(&fprobe_ip_table, &node->hlist, fprobe_rht_params);
> >  }
> >  
> > -/* Return true if there are synonims */
> > -static bool delete_fprobe_node(struct fprobe_hlist_node *node)
> > +static void delete_fprobe_node(struct fprobe_hlist_node *node)
> >  {
> >  	lockdep_assert_held(&fprobe_mutex);
> > -	bool ret;
> >  
> >  	/* Avoid double deleting */
> >  	if (READ_ONCE(node->fp) != NULL) {
> > @@ -97,13 +95,6 @@ static bool delete_fprobe_node(struct fprobe_hlist_node *node)
> >  		rhltable_remove(&fprobe_ip_table, &node->hlist,
> >  				fprobe_rht_params);
> >  	}
> > -
> > -	rcu_read_lock();
> > -	ret = !!rhltable_lookup(&fprobe_ip_table, &node->addr,
> > -				fprobe_rht_params);
> > -	rcu_read_unlock();
> > -
> > -	return ret;
> >  }
> >  
> >  /* Check existence of the fprobe */
> > @@ -337,6 +328,32 @@ static bool fprobe_is_ftrace(struct fprobe *fp)
> >  	return !fp->exit_handler;
> >  }
> >  
> > +static bool fprobe_exists_on_hash(unsigned long ip, bool ftrace)
> > +{
> > +	struct rhlist_head *head, *pos;
> > +	struct fprobe_hlist_node *node;
> > +	struct fprobe *fp;
> > +
> > +	guard(rcu)();
> > +	head = rhltable_lookup(&fprobe_ip_table, &ip,
> > +				fprobe_rht_params);
> > +	if (!head)
> > +		return false;
> > +	/* We have to check the same type on the list. */
> > +	rhl_for_each_entry_rcu(node, pos, head, hlist) {
> > +		if (node->addr != ip)
> > +			break;
> > +		fp = READ_ONCE(node->fp);
> > +		if (likely(fp)) {
> > +			if ((!ftrace && fp->exit_handler) ||
> > +			    (ftrace && !fp->exit_handler))
> > +				return true;
> > +		}
> > +	}
> > +
> > +	return false;
> > +}
> > +
> >  #ifdef CONFIG_MODULES
> >  static void fprobe_set_ips(unsigned long *ips, unsigned int cnt, int remove,
> >  			   int reset)
> > @@ -360,6 +377,29 @@ static bool fprobe_is_ftrace(struct fprobe *fp)
> >  	return false;
> >  }
> >  
> > +static bool fprobe_exists_on_hash(unsigned long ip, bool ftrace __maybe_unused)
> > +{
> > +	struct rhlist_head *head, *pos;
> > +	struct fprobe_hlist_node *node;
> > +	struct fprobe *fp;
> > +
> > +	guard(rcu)();
> > +	head = rhltable_lookup(&fprobe_ip_table, &ip,
> > +				fprobe_rht_params);
> > +	if (!head)
> > +		return false;
> > +	/* We only need to check fp is there. */
> > +	rhl_for_each_entry_rcu(node, pos, head, hlist) {
> > +		if (node->addr != ip)
> > +			break;
> > +		fp = READ_ONCE(node->fp);
> > +		if (likely(fp))
> > +			return true;
> > +	}
> > +
> > +	return false;
> > +}
> > +
> >  #ifdef CONFIG_MODULES
> >  static void fprobe_set_ips(unsigned long *ips, unsigned int cnt, int remove,
> >  			   int reset)
> > @@ -574,15 +614,20 @@ static int fprobe_addr_list_add(struct fprobe_addr_list *alist, unsigned long ad
> >  static void fprobe_remove_node_in_module(struct module *mod, struct fprobe_hlist_node *node,
> >  					 struct fprobe_addr_list *alist)
> >  {
> > +	lockdep_assert_in_rcu_read_lock();
> > +
> >  	if (!within_module(node->addr, mod))
> >  		return;
> > -	if (delete_fprobe_node(node))
> > -		return;
> > +
> > +	delete_fprobe_node(node);
> >  	/*
> > -	 * If failed to update alist, just continue to update hlist.
> > +	 * Ignore failure of updating alist, but continue to update hlist.
> >  	 * Therefore, at list user handler will not hit anymore.
> > +	 * And don't care the type here, because all fprobes on the same
> > +	 * address must be removed eventually.
> >  	 */
> > -	fprobe_addr_list_add(alist, node->addr);
> > +	if (!rhltable_lookup(&fprobe_ip_table, &node->addr, fprobe_rht_params))
> > +		fprobe_addr_list_add(alist, node->addr);
> >  }
> >  
> >  /* Handle module unloading to manage fprobe_ip_table. */
> > @@ -943,7 +988,9 @@ int unregister_fprobe(struct fprobe *fp)
> >  	/* Remove non-synonim ips from table and hash */
> >  	count = 0;
> >  	for (i = 0; i < hlist_array->size; i++) {
> > -		if (!delete_fprobe_node(&hlist_array->array[i]))
> > +		delete_fprobe_node(&hlist_array->array[i]);
> > +		if (!fprobe_exists_on_hash(hlist_array->array[i].addr,
> > +					   fprobe_is_ftrace(fp)))
> >  			addrs[count++] = hlist_array->array[i].addr;
> >  	}
> >  	del_fprobe_hash(fp);
> > 
> 
> 
> -- 
> Masami Hiramatsu (Google) <mhiramat@kernel.org>


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply

* Re: [RFC PATCH 0/4] trace, livepatch: Allow kprobe return overriding for livepatched functions
From: Miroslav Benes @ 2026-04-09 10:08 UTC (permalink / raw)
  To: Yafang Shao
  Cc: Song Liu, jpoimboe, jikos, pmladek, joe.lawrence, rostedt,
	mhiramat, mathieu.desnoyers, kpsingh, mattbobrowski, jolsa, ast,
	daniel, andrii, martin.lau, eddyz87, memxor, yonghong.song,
	live-patching, linux-kernel, linux-trace-kernel, bpf
In-Reply-To: <CALOAHbAmTAfamStF9sZtO6efWYJ1sbXJp3PbsVapZf7dba91ig@mail.gmail.com>

> Can we add something like ALLOW_LIVEPATCH_ERROR_INJECTION() to allow
> error injection on functions defined inside a livepatch?

No.

I am sorry but you always seem to find band aids to your set up and how 
you deal with live patches internally. While I can see that something like 
a hybrid mode might be useful to people if done right (and we are not 
there yet), the combination of it with bpf overrides or anything like that 
is not something I would like to see in upstream.

Miroslav

^ permalink raw reply

* Re: [RFC PATCH 2/4] trace: Allow kprobes to override livepatched functions
From: Miroslav Benes @ 2026-04-09  9:47 UTC (permalink / raw)
  To: Yafang Shao
  Cc: jpoimboe, jikos, pmladek, joe.lawrence, rostedt, mhiramat,
	mathieu.desnoyers, kpsingh, mattbobrowski, song, jolsa, ast,
	daniel, andrii, martin.lau, eddyz87, memxor, yonghong.song,
	live-patching, linux-kernel, linux-trace-kernel, bpf
In-Reply-To: <20260402092607.96430-3-laoar.shao@gmail.com>

Hi,

On Thu, 2 Apr 2026, Yafang Shao wrote:

> Introduce the ability for kprobes to override the return values of
> functions that have been livepatched. This functionality is guarded by the
> CONFIG_KPROBE_OVERRIDE_KLP_FUNC configuration option.

this is imprecise if I read the code correctly. You want to override live 
patch functions, not the original ones which are live patched.

I also think that if nothing else, it needs to be more specific then just 
checking mod->klp. It should check if a function itself in klp module is 
overridable to keep it as limited as possible. Even with that, the 
concerns expressed by the others still apply.

---
Miroslav

^ permalink raw reply

* [PATCH v2] tracing/hist: bound expression strings with seq_buf
From: Pengpeng Hou @ 2026-04-09  2:56 UTC (permalink / raw)
  To: Steven Rostedt, Masami Hiramatsu
  Cc: Mathieu Desnoyers, linux-trace-kernel, linux-kernel, pengpeng
In-Reply-To: <20260407153001.1-tracing-hist-expr-pengpeng@iscas.ac.cn>

expr_str() allocates a fixed MAX_FILTER_STR_VAL buffer and then builds
expression names with a series of raw strcat() appends. Nested operands,
constants and field flags can push the rendered string past that fixed
limit before the name is attached to the hist field.

Build the expression strings with seq_buf and propagate failures back to
the expression parser when the rendered name would exceed
MAX_FILTER_STR_VAL.

Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
---
Changes since v1:
- replace the previous bounded append helper and manual length tracking
  with seq_buf as suggested by Steven Rostedt
- keep the -E2BIG propagation back into parse_unary() and parse_expr()

 kernel/trace/trace_events_hist.c | 93 ++++++++++++++++++++++----------
 1 file changed, 64 insertions(+), 29 deletions(-)

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 73ea180cad55..09aaedb92993 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -8,6 +8,7 @@
 #include <linux/module.h>
 #include <linux/kallsyms.h>
 #include <linux/security.h>
+#include <linux/seq_buf.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
 #include <linux/stacktrace.h>
@@ -1738,85 +1739,104 @@ static const char *get_hist_field_flags(struct hist_field *hist_field)
 	return flags_str;
 }
 
-static void expr_field_str(struct hist_field *field, char *expr)
+static bool expr_field_str(struct hist_field *field, struct seq_buf *s)
 {
+	const char *field_name;
+
 	if (field->flags & HIST_FIELD_FL_VAR_REF)
-		strcat(expr, "$");
-	else if (field->flags & HIST_FIELD_FL_CONST) {
-		char str[HIST_CONST_DIGITS_MAX];
+		seq_buf_putc(s, '$');
+	else if (field->flags & HIST_FIELD_FL_CONST)
+		seq_buf_printf(s, "%llu", field->constant);
 
-		snprintf(str, HIST_CONST_DIGITS_MAX, "%llu", field->constant);
-		strcat(expr, str);
-	}
+	field_name = hist_field_name(field, 0);
+	if (!field_name)
+		return false;
 
-	strcat(expr, hist_field_name(field, 0));
+	seq_buf_puts(s, field_name);
 
 	if (field->flags && !(field->flags & HIST_FIELD_FL_VAR_REF)) {
 		const char *flags_str = get_hist_field_flags(field);
 
-		if (flags_str) {
-			strcat(expr, ".");
-			strcat(expr, flags_str);
-		}
+		if (flags_str)
+			seq_buf_printf(s, ".%s", flags_str);
 	}
+
+	return !seq_buf_has_overflowed(s);
 }
 
 static char *expr_str(struct hist_field *field, unsigned int level)
 {
 	char *expr;
+	struct seq_buf s;
+	int ret = -E2BIG;
 
 	if (level > 1)
-		return NULL;
+		return ERR_PTR(-EINVAL);
 
 	expr = kzalloc(MAX_FILTER_STR_VAL, GFP_KERNEL);
 	if (!expr)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
+
+	seq_buf_init(&s, expr, MAX_FILTER_STR_VAL);
 
 	if (!field->operands[0]) {
-		expr_field_str(field, expr);
+		if (!expr_field_str(field, &s))
+			goto free;
+		seq_buf_str(&s);
 		return expr;
 	}
 
 	if (field->operator == FIELD_OP_UNARY_MINUS) {
 		char *subexpr;
 
-		strcat(expr, "-(");
+		seq_buf_puts(&s, "-(");
 		subexpr = expr_str(field->operands[0], ++level);
-		if (!subexpr) {
-			kfree(expr);
-			return NULL;
+		if (IS_ERR(subexpr)) {
+			ret = PTR_ERR(subexpr);
+			goto free;
 		}
-		strcat(expr, subexpr);
-		strcat(expr, ")");
+		seq_buf_puts(&s, subexpr);
+		seq_buf_putc(&s, ')');
 
 		kfree(subexpr);
+		if (seq_buf_has_overflowed(&s))
+			goto free;
 
+		seq_buf_str(&s);
 		return expr;
 	}
 
-	expr_field_str(field->operands[0], expr);
+	if (!expr_field_str(field->operands[0], &s))
+		goto free;
 
 	switch (field->operator) {
 	case FIELD_OP_MINUS:
-		strcat(expr, "-");
+		seq_buf_putc(&s, '-');
 		break;
 	case FIELD_OP_PLUS:
-		strcat(expr, "+");
+		seq_buf_putc(&s, '+');
 		break;
 	case FIELD_OP_DIV:
-		strcat(expr, "/");
+		seq_buf_putc(&s, '/');
 		break;
 	case FIELD_OP_MULT:
-		strcat(expr, "*");
+		seq_buf_putc(&s, '*');
 		break;
 	default:
-		kfree(expr);
-		return NULL;
+		ret = -EINVAL;
+		goto free;
 	}
 
-	expr_field_str(field->operands[1], expr);
+	if (seq_buf_has_overflowed(&s) ||
+	    !expr_field_str(field->operands[1], &s))
+		goto free;
 
+	seq_buf_str(&s);
 	return expr;
+
+free:
+	kfree(expr);
+	return ERR_PTR(ret);
 }
 
 /*
@@ -2630,6 +2650,11 @@ static struct hist_field *parse_unary(struct hist_trigger_data *hist_data,
 	expr->is_signed = operand1->is_signed;
 	expr->operator = FIELD_OP_UNARY_MINUS;
 	expr->name = expr_str(expr, 0);
+	if (IS_ERR(expr->name)) {
+		ret = PTR_ERR(expr->name);
+		expr->name = NULL;
+		goto free;
+	}
 	expr->type = kstrdup_const(operand1->type, GFP_KERNEL);
 	if (!expr->type) {
 		ret = -ENOMEM;
@@ -2842,6 +2867,11 @@ static struct hist_field *parse_expr(struct hist_trigger_data *hist_data,
 		destroy_hist_field(operand1, 0);
 
 		expr->name = expr_str(expr, 0);
+		if (IS_ERR(expr->name)) {
+			ret = PTR_ERR(expr->name);
+			expr->name = NULL;
+			goto free_expr;
+		}
 	} else {
 		/* The operand sizes should be the same, so just pick one */
 		expr->size = operand1->size;
@@ -2855,6 +2885,11 @@ static struct hist_field *parse_expr(struct hist_trigger_data *hist_data,
 		}
 
 		expr->name = expr_str(expr, 0);
+		if (IS_ERR(expr->name)) {
+			ret = PTR_ERR(expr->name);
+			expr->name = NULL;
+			goto free_expr;
+		}
 	}
 
 	return expr;
-- 
2.50.1 (Apple Git-155)


^ permalink raw reply related

* Re: [PATCH] selftests/ftrace: Account for fprobe attachment at creation
From: Masami Hiramatsu @ 2026-04-09  8:29 UTC (permalink / raw)
  To: Cao Ruichuang
  Cc: rostedt, mathieu.desnoyers, shuah, linux-kernel,
	linux-trace-kernel, linux-kselftest
In-Reply-To: <177571189990.17594.14983613605049028604@163.com>

On Thu, 09 Apr 2026 13:18:19 +0800
Cao Ruichuang <create0818@163.com> wrote:

> Hi Masami,
> 
> I reran this in a clean QEMU boot environment with only the minimal
> tracefs setup and the add_remove_fprobe sequence.
> 
> I saw the same behavior there: after creating myevent1/2/3,
> enabled_functions went from 2 to 4 before any event was enabled, and
> enabling myevent1/2/3 did not increase the count further. After cleanup,
> it returned to the original baseline again.

Hmm, that does not mean test failure but could be a bug.
Can you dump the enabled_functions and share it when the
test failed?

> So this does not look like a dirty tracing environment issue. The
> failing testcase assumption appears to be that the attachment happens on
> enable, while on the current kernel it is already visible after create.

What kernel are you using? I need to check why this happens.
I guess there is a change which introduces this issue.

Thank you,

> 
> Thanks,
> Cao Ruichuang
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply

* Re: [PATCH mm-unstable v15 03/13] mm/khugepaged: generalize __collapse_huge_page_* for mTHP support
From: David Hildenbrand (Arm) @ 2026-04-09  8:14 UTC (permalink / raw)
  To: Nico Pache
  Cc: linux-doc, linux-kernel, linux-mm, linux-trace-kernel, aarcange,
	akpm, anshuman.khandual, apopple, baohua, baolin.wang, byungchul,
	catalin.marinas, cl, corbet, dave.hansen, dev.jain, gourry,
	hannes, hughd, jack, jackmanb, jannh, jglisse, joshua.hahnjy, kas,
	lance.yang, Liam.Howlett, lorenzo.stoakes, mathieu.desnoyers,
	matthew.brost, mhiramat, mhocko, peterx, pfalcato, rakie.kim,
	raquini, rdunlap, richard.weiyang, rientjes, rostedt, rppt,
	ryan.roberts, shivankg, sunnanyong, surenb, thomas.hellstrom,
	tiwai, usamaarif642, vbabka, vishal.moola, wangkefeng.wang, will,
	willy, yang, ying.huang, ziy, zokeefe
In-Reply-To: <CAA1CXcA8nE2PZrB4J1gV5v16PeQ7X2AiwjJ3gO1Q8hW7tyTtPQ@mail.gmail.com>

On 4/8/26 21:48, Nico Pache wrote:
> On Thu, Mar 12, 2026 at 2:56 PM David Hildenbrand (Arm)
> <david@kernel.org> wrote:
>>
>> On 3/12/26 21:36, David Hildenbrand (Arm) wrote:
>>>
>>> Okay, now I am confused. Why are you not taking care of
>>> collapse_scan_pmd() in the same context?
>>>
>>> Because if you make sure that we properly check against a max_ptes_swap
>>> similar as in the style above, we'd rule out swapin right from the start?
>>>
>>> Also, I would expect that all other parameters in there are similarly
>>> handled?
>>>
>>
>> Okay, I think you should add the following:
> 
> Hey! Thanks for all your reviews here.
> 
> For multiple reasons, here is the solution I developed:
> 
> Add a patch before the generalize __collapse.. patch that reworks the
> max_ptes* handling and introduces the helpers (no functional changes).

I assume that's roughly the patch I shared below? If so, sounds good to me.

-- 
Cheers,

David

^ permalink raw reply

* Re: [RFC PATCH 3/4] livepatch: Add "replaceable" attribute to klp_patch
From: Petr Mladek @ 2026-04-09  7:36 UTC (permalink / raw)
  To: Song Liu
  Cc: Yafang Shao, Joe Lawrence, Dylan Hatch, jpoimboe, jikos, mbenes,
	rostedt, mhiramat, mathieu.desnoyers, kpsingh, mattbobrowski,
	jolsa, ast, daniel, andrii, martin.lau, eddyz87, memxor,
	yonghong.song, live-patching, linux-kernel, linux-trace-kernel,
	bpf
In-Reply-To: <CAPhsuW42WqGuZ1Z-RG0yzifZ7rh=XKUa5hKb6JxLeTWdc4s4-A@mail.gmail.com>

On Wed 2026-04-08 11:19:50, Song Liu wrote:
> On Wed, Apr 8, 2026 at 4:43 AM Petr Mladek <pmladek@suse.com> wrote:
> [...]
> > > >
> > > > This is weird semantic. Which livepatch tag would be allowed to
> > > > supersede it, please?
> > > >
> > > > Do we still need this category?
> > >
> > > It can be superseded by any livepatch that has a non-zero tag set.
> >
> > And this exactly the weird thing.
> >
> > A patch with the .replace flag set is supposed to obsolete all already
> > installed livepatches. It means that it should provide all existing
> > fixes and features.
> >
> > Now, we want to introduce a replace flag/set which would allow to
> > replace/obsolete only the livepatch with the same tag/set number.
> > And we want to prevent conflicts by making sure that livepatches with
> > different tag/set number will never livepatch the same function.
> >
> > Obviously, livepatches with different tag/set number could not
> > obsolete the same no-replace livepatch. They would need to livepatch
> > the same functions touched by the no-replace livepatch and would
> > conflict.
> >
> > So, I suggest to remove the no-replace mode completely. It should
> > not be needed. A livepatch which should be installed in parallel
> > will simply use another unique tag/set number.
> 
> I think I see your point now. Existing code works as:
> - replace=false doesn't replace anything
> - replace=true replaces everything
> 
> If we assume false=0 and true=1, it is technically possible to define:
> - replace_set=0 doesn't replace anything
> - replace_set=1 replaces everything
> - replace_set=2+ only replace the same replace_set

Yes. This well describes my point.

> This is probably a little too complicated.
> 
> > > This ensures backward compatibility: while a non-atomic-replace
> > > livepatch can be superseded by an atomic-replace one, the reverse is
> > > not permitted—an atomic-replace livepatch cannot be superseded by a
> > > non-atomic one.
> >
> > IMHO, the backward compatibility would just create complexity and mess
> > in this case.
> 
> Given that livepatch is for expert users, I think we can make this work
> without backward compatibility. But breaking compatibility is always not
> preferred.

I believe that it is acceptable because:

  1. It was always hard to combine no-replace and replace livepatches.
     I wonder if anyone combines them at all.

  2. I believe that nobody tries to load the same livepatch module on
     different kernel versions. Instead, everyone prepares a custom
     livepatch module for each livepatched kernel version/release.

     And the tooling for creating livepatches will need to be updated
     to use "number" instead of "true/false" anyway.

That said, it is easier to always use "0" for non-replace patches
instead of assigning an unique "number" to avoid replacing. But
I do not think that this would justify the complexity of having
different semantic for 0, 1, and 2+ replace_set numbers.

Best Regards,
Petr

^ permalink raw reply

* Re: [PATCH] selftests/ftrace: Account for fprobe attachment at creation
From: Cao Ruichuang @ 2026-04-09  5:18 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: rostedt, mathieu.desnoyers, shuah, linux-kernel,
	linux-trace-kernel, linux-kselftest
In-Reply-To: <20260408094920.417cbb5e64220afc2bf1b2aa@kernel.org>

Hi Masami,

I reran this in a clean QEMU boot environment with only the minimal
tracefs setup and the add_remove_fprobe sequence.

I saw the same behavior there: after creating myevent1/2/3,
enabled_functions went from 2 to 4 before any event was enabled, and
enabling myevent1/2/3 did not increase the count further. After cleanup,
it returned to the original baseline again.

So this does not look like a dirty tracing environment issue. The
failing testcase assumption appears to be that the attachment happens on
enable, while on the current kernel it is already visible after create.

Thanks,
Cao Ruichuang

^ permalink raw reply

* Re: [PATCH 1/2] tracing: Store trace_marker_raw payload length in events
From: Cao Ruichuang @ 2026-04-09  5:12 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: mhiramat, mathieu.desnoyers, shuah, linux-kernel,
	linux-trace-kernel, linux-kselftest
In-Reply-To: <20260408133931.5755684f@robin>

Hi Steven,

I looked through the current trace_marker_raw users and the expected
usage more carefully.

The existing users I found treat trace_marker_raw as a binary channel
and parse the data from raw events or trace_pipe_raw, not from the
formatted trace output. Given that, I do not see a strong requirement
for TRACE_RAW_DATA itself to carry and maintain an exact payload length
just for the text printing path.

So I agree this is not really a technical limitation in the current
implementation, but a requirement question, and the requirement does
not look strong enough here.

I'll drop this series for now and close this task on my side.

Thanks,
Cao Ruichuang

^ permalink raw reply

* Re: [PATCH net-next v2 0/2] mptcp: autotune related improvement
From: patchwork-bot+netdevbpf @ 2026-04-09  2:40 UTC (permalink / raw)
  To: Matthieu Baerts
  Cc: martineau, geliang, davem, edumazet, kuba, pabeni, horms, fw,
	netdev, mptcp, linux-kernel, rostedt, mhiramat, mathieu.desnoyers,
	linux-trace-kernel
In-Reply-To: <20260407-net-next-mptcp-reduce-rbuf-v2-0-0d1d135bf6f6@kernel.org>

Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 07 Apr 2026 10:45:16 +0200 you wrote:
> Here are two patches from Paolo that have been crafted a couple of
> months ago, but needed more validation because they were indirectly
> causing instabilities in the sefltests. The root cause has been fixed in
> 'net' recently in commit 8c09412e584d ("selftests: mptcp: more stable
> simult_flows tests").
> 
> These patches refactor the receive space and RTT estimator, overall
> making DRS more correct while avoiding receive buffer drifting to
> tcp_rmem[2], which in turn makes the throughput more stable and less
> bursty, especially with high bandwidth and low delay environments.
> 
> [...]

Here is the summary with links:
  - [net-next,v2,1/2] mptcp: better mptcp-level RTT estimator
    https://git.kernel.org/netdev/net-next/c/d2000361e4dd
  - [net-next,v2,2/2] mptcp: add receive queue awareness in tcp_rcv_space_adjust()
    https://git.kernel.org/netdev/net-next/c/7272d8131a9d

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* [PATCH v3] tracing/hist: bound synthetic-field strings with seq_buf
From: Pengpeng Hou @ 2026-04-09  2:19 UTC (permalink / raw)
  To: Steven Rostedt, Masami Hiramatsu
  Cc: Tom Zanussi, Mathieu Desnoyers, linux-trace-kernel, linux-kernel,
	pengpeng
In-Reply-To: <20260401112224.85582-2-pengpeng@iscas.ac.cn>

The synthetic field helpers build a prefixed synthetic variable name and
a generated hist command in fixed MAX_FILTER_STR_VAL buffers. The
current code appends those strings with raw strcat(), so long key lists,
field names, or saved filters can run past the end of the staging
buffers.

Build both strings with seq_buf and propagate -E2BIG if either the
synthetic variable name or the generated command exceeds
MAX_FILTER_STR_VAL. This keeps the existing tracing-side limit while
using the helper intended for bounded command construction.

Fixes: 02205a6752f2 ("tracing: Add support for 'field variables'")
Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
---
Changes since v2: https://lore.kernel.org/all/20260401112224.85582-2-pengpeng@iscas.ac.cn/

- switch the synthetic name and generated command construction to seq_buf
  as suggested by Steven Rostedt
- keep MAX_FILTER_STR_VAL as the tracing-side limit and return -E2BIG on
  overflow

 kernel/trace/trace_events_hist.c | 44 ++++++++++++++++++++++----------
 1 file changed, 30 insertions(+), 14 deletions(-)

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 73ea180cad55..7c3873719beb 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -8,6 +8,7 @@
 #include <linux/module.h>
 #include <linux/kallsyms.h>
 #include <linux/security.h>
+#include <linux/seq_buf.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
 #include <linux/stacktrace.h>
@@ -2962,14 +2963,21 @@ find_synthetic_field_var(struct hist_trigger_data *target_hist_data,
 			 char *system, char *event_name, char *field_name)
 {
 	struct hist_field *event_var;
+	struct seq_buf s;
 	char *synthetic_name;
 
 	synthetic_name = kzalloc(MAX_FILTER_STR_VAL, GFP_KERNEL);
 	if (!synthetic_name)
 		return ERR_PTR(-ENOMEM);
 
-	strcpy(synthetic_name, "synthetic_");
-	strcat(synthetic_name, field_name);
+	seq_buf_init(&s, synthetic_name, MAX_FILTER_STR_VAL);
+	seq_buf_puts(&s, "synthetic_");
+	seq_buf_puts(&s, field_name);
+	seq_buf_str(&s);
+	if (seq_buf_has_overflowed(&s)) {
+		kfree(synthetic_name);
+		return ERR_PTR(-E2BIG);
+	}
 
 	event_var = find_event_var(target_hist_data, system, event_name, synthetic_name);
 
@@ -3014,6 +3022,7 @@ create_field_var_hist(struct hist_trigger_data *target_hist_data,
 	struct trace_event_file *file;
 	struct hist_field *key_field;
 	struct hist_field *event_var;
+	struct seq_buf s;
 	char *saved_filter;
 	char *cmd;
 	int ret;
@@ -3046,41 +3055,48 @@ create_field_var_hist(struct hist_trigger_data *target_hist_data,
 	/* See if a synthetic field variable has already been created */
 	event_var = find_synthetic_field_var(target_hist_data, subsys_name,
 					     event_name, field_name);
-	if (!IS_ERR_OR_NULL(event_var))
+	if (IS_ERR(event_var))
+		return event_var;
+	if (event_var)
 		return event_var;
 
 	var_hist = kzalloc_obj(*var_hist);
 	if (!var_hist)
 		return ERR_PTR(-ENOMEM);
 
+	saved_filter = find_trigger_filter(hist_data, file);
+
 	cmd = kzalloc(MAX_FILTER_STR_VAL, GFP_KERNEL);
 	if (!cmd) {
 		kfree(var_hist);
 		return ERR_PTR(-ENOMEM);
 	}
 
+	seq_buf_init(&s, cmd, MAX_FILTER_STR_VAL);
+
 	/* Use the same keys as the compatible histogram */
-	strcat(cmd, "keys=");
+	seq_buf_puts(&s, "keys=");
 
 	for_each_hist_key_field(i, hist_data) {
 		key_field = hist_data->fields[i];
 		if (!first)
-			strcat(cmd, ",");
-		strcat(cmd, key_field->field->name);
+			seq_buf_putc(&s, ',');
+		seq_buf_puts(&s, key_field->field->name);
 		first = false;
 	}
 
 	/* Create the synthetic field variable specification */
-	strcat(cmd, ":synthetic_");
-	strcat(cmd, field_name);
-	strcat(cmd, "=");
-	strcat(cmd, field_name);
+	seq_buf_printf(&s, ":synthetic_%s=%s", field_name, field_name);
 
 	/* Use the same filter as the compatible histogram */
-	saved_filter = find_trigger_filter(hist_data, file);
-	if (saved_filter) {
-		strcat(cmd, " if ");
-		strcat(cmd, saved_filter);
+	if (saved_filter)
+		seq_buf_printf(&s, " if %s", saved_filter);
+
+	seq_buf_str(&s);
+	if (seq_buf_has_overflowed(&s)) {
+		kfree(cmd);
+		kfree(var_hist);
+		return ERR_PTR(-E2BIG);
 	}
 
 	var_hist->cmd = kstrdup(cmd, GFP_KERNEL);
-- 
2.50.1 (Apple Git-155)


^ permalink raw reply related

* Re: [PATCH bpf-next v5 0/2] Reject sleepable kprobe_multi programs at attach time
From: patchwork-bot+netdevbpf @ 2026-04-09  1:20 UTC (permalink / raw)
  To: Varun R Mallya
  Cc: bpf, leon.hwang, memxor, jolsa, ast, daniel, yonghong.song,
	rostedt, linux-kernel, linux-trace-kernel
In-Reply-To: <20260408190137.101418-1-varunrmallya@gmail.com>

Hello:

This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Thu,  9 Apr 2026 00:31:35 +0530 you wrote:
> These patches fix an issue where sleepable kprobe_multi programs
> were allowed to attach, leading to "sleeping function called from invalid
> context" splats.
> 
> Because kprobe.multi programs run in atomic/RCU context, they cannot
> sleep. However, `bpf_kprobe_multi_link_attach()` previously lacked
> validation for the `prog->sleepable` flag. This allowed sleepable
> helpers, such as `bpf_copy_from_user()`, to be invoked from an invalid
> non-sleepable context.
> 
> [...]

Here is the summary with links:
  - [bpf-next,v5,1/2] bpf: Reject sleepable kprobe_multi programs at attach time
    (no matching commit)
  - [bpf-next,v5,2/2] selftests/bpf: Add test to ensure kprobe_multi is not sleepable
    https://git.kernel.org/bpf/bpf-next/c/c7cab53f9d52

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH v2 2/2] tracing/hist: reject synthetic-field strings that exceed MAX_FILTER_STR_VAL
From: Steven Rostedt @ 2026-04-08 21:31 UTC (permalink / raw)
  To: Pengpeng Hou
  Cc: mhiramat, mathieu.desnoyers, tom.zanussi, linux-kernel,
	linux-trace-kernel
In-Reply-To: <20260401112224.85582-2-pengpeng@iscas.ac.cn>

On Wed,  1 Apr 2026 19:22:24 +0800
Pengpeng Hou <pengpeng@iscas.ac.cn> wrote:

> diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
> index f9c8a4f078ea..4172c91605af 100644
> --- a/kernel/trace/trace_events_hist.c
> +++ b/kernel/trace/trace_events_hist.c
> @@ -2966,12 +2966,16 @@ find_synthetic_field_var(struct hist_trigger_data *target_hist_data,
>  	struct hist_field *event_var;
>  	char *synthetic_name;
>  
> +	if ((sizeof("synthetic_") - 1) + strlen(field_name) >=
> +	    MAX_FILTER_STR_VAL)
> +		return ERR_PTR(-E2BIG);
> +
>  	synthetic_name = kzalloc(MAX_FILTER_STR_VAL, GFP_KERNEL);
>  	if (!synthetic_name)
>  		return ERR_PTR(-ENOMEM);
>  
> -	strcpy(synthetic_name, "synthetic_");
> -	strcat(synthetic_name, field_name);
> +	scnprintf(synthetic_name, MAX_FILTER_STR_VAL, "synthetic_%s",
> +		  field_name);
>  
>  	event_var = find_event_var(target_hist_data, system, event_name, synthetic_name);
>  
> @@ -3018,6 +3022,8 @@ create_field_var_hist(struct hist_trigger_data *target_hist_data,
>  	struct hist_field *event_var;
>  	char *saved_filter;
>  	char *cmd;
> +	size_t cmdlen;
> +	size_t off;
>  	int ret;
>  
>  	if (target_hist_data->n_field_var_hists >= SYNTH_FIELDS_MAX) {
> @@ -3048,13 +3054,36 @@ create_field_var_hist(struct hist_trigger_data *target_hist_data,
>  	/* See if a synthetic field variable has already been created */
>  	event_var = find_synthetic_field_var(target_hist_data, subsys_name,
>  					     event_name, field_name);
> -	if (!IS_ERR_OR_NULL(event_var))
> +	if (IS_ERR(event_var))
> +		return event_var;
> +	if (event_var)
>  		return event_var;
>  
>  	var_hist = kzalloc_obj(*var_hist);
>  	if (!var_hist)
>  		return ERR_PTR(-ENOMEM);
>  
> +	saved_filter = find_trigger_filter(hist_data, file);
> +
> +	cmdlen = strlen("keys=") + strlen(":synthetic_") +
> +		 strlen(field_name) + strlen("=") + strlen(field_name);

Instead of doing all this complex updates, let's use seq_buf in this patch
instead. That's what it's for.

I'll take patch 1 as is, just update this patch.

Thanks,

-- Steve


> +	first = true;
> +	for_each_hist_key_field(i, hist_data) {
> +		key_field = hist_data->fields[i];
> +		if (!first)
> +			cmdlen++;
> +		cmdlen += strlen(key_field->field->name);
> +		first = false;
> +	}
> +
> +	if (saved_filter)
> +		cmdlen += strlen(" if ") + strlen(saved_filter);
> +
> +	if (cmdlen >= MAX_FILTER_STR_VAL) {
> +		kfree(var_hist);
> +		return ERR_PTR(-E2BIG);
> +	}
> +
>  	cmd = kzalloc(MAX_FILTER_STR_VAL, GFP_KERNEL);
>  	if (!cmd) {
>  		kfree(var_hist);
> @@ -3062,28 +3091,24 @@ create_field_var_hist(struct hist_trigger_data *target_hist_data,
>  	}
>  
>  	/* Use the same keys as the compatible histogram */
> -	strcat(cmd, "keys=");
> +	off = scnprintf(cmd, MAX_FILTER_STR_VAL, "keys=");
> +	first = true;
>  
>  	for_each_hist_key_field(i, hist_data) {
>  		key_field = hist_data->fields[i];
> -		if (!first)
> -			strcat(cmd, ",");
> -		strcat(cmd, key_field->field->name);
> +		off += scnprintf(cmd + off, MAX_FILTER_STR_VAL - off, "%s%s",
> +				 first ? "" : ",", key_field->field->name);
>  		first = false;
>  	}
>  
>  	/* Create the synthetic field variable specification */
> -	strcat(cmd, ":synthetic_");
> -	strcat(cmd, field_name);
> -	strcat(cmd, "=");
> -	strcat(cmd, field_name);
> +	off += scnprintf(cmd + off, MAX_FILTER_STR_VAL - off,
> +			 ":synthetic_%s=%s", field_name, field_name);
>  
>  	/* Use the same filter as the compatible histogram */
> -	saved_filter = find_trigger_filter(hist_data, file);
> -	if (saved_filter) {
> -		strcat(cmd, " if ");
> -		strcat(cmd, saved_filter);
> -	}
> +	if (saved_filter)
> +		scnprintf(cmd + off, MAX_FILTER_STR_VAL - off, " if %s",
> +			  saved_filter);
>  
>  	var_hist->cmd = kstrdup(cmd, GFP_KERNEL);
>  	if (!var_hist->cmd) {


^ permalink raw reply

* Re: [PATCH] tracing/hist: bound expression string construction
From: Steven Rostedt @ 2026-04-08 21:27 UTC (permalink / raw)
  To: Pengpeng Hou
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-trace-kernel,
	linux-kernel
In-Reply-To: <20260407153001.1-tracing-hist-expr-pengpeng@iscas.ac.cn>

On Tue, 7 Apr 2026 14:09:10 +0800
Pengpeng Hou <pengpeng@iscas.ac.cn> wrote:

> expr_str() allocates a fixed MAX_FILTER_STR_VAL buffer and then builds
> expression names with a series of raw strcat() appends. Nested operands,
> constants and field flags can push the rendered string past that fixed
> limit before the name is attached to the hist field.
> 
> Convert the construction helpers to explicit bounded appends and
> propagate failures back to the expression parser when the rendered name
> would exceed MAX_FILTER_STR_VAL.
> 
> Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
> ---
>  kernel/trace/trace_events_hist.c | 101 +++++++++++++++++++++++--------
>  1 file changed, 76 insertions(+), 25 deletions(-)
> 
> diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
> index 73ea180cad55..caaa262360d2 100644
> --- a/kernel/trace/trace_events_hist.c
> +++ b/kernel/trace/trace_events_hist.c
> @@ -1738,85 +1738,121 @@ static const char *get_hist_field_flags(struct hist_field *hist_field)
>  	return flags_str;
>  }
>  
> -static void expr_field_str(struct hist_field *field, char *expr)
> +static bool expr_append(char *expr, size_t *len, const char *str)
>  {
> -	if (field->flags & HIST_FIELD_FL_VAR_REF)
> -		strcat(expr, "$");
> -	else if (field->flags & HIST_FIELD_FL_CONST) {
> +	size_t str_len = strlen(str);
> +
> +	if (*len + str_len >= MAX_FILTER_STR_VAL)
> +		return false;
> +
> +	memcpy(expr + *len, str, str_len + 1);
> +	*len += str_len;
> +	return true;
> +}
> +

This looks like a better job for seq_buf.


> +static bool expr_field_str(struct hist_field *field, char *expr, size_t *len)
> +{

	struct seq_buf s;

	seq_buf_init(&s, expr, MAX_FILTER_STR_VAL);

> +	if (field->flags & HIST_FIELD_FL_VAR_REF) {
> +		if (!expr_append(expr, len, "$"))
> +			return false;

		seq_buf_putc(&s, '$');

> +	} else if (field->flags & HIST_FIELD_FL_CONST) {
>  		char str[HIST_CONST_DIGITS_MAX];
> +		int ret;
> +
> +		ret = snprintf(str, sizeof(str), "%llu", field->constant);
> +		if (ret >= sizeof(str))
> +			return false;
>  
> -		snprintf(str, HIST_CONST_DIGITS_MAX, "%llu", field->constant);
> -		strcat(expr, str);

		seq_buf_printf(&s, "%llu", field->constant);

> +		if (!expr_append(expr, len, str))
> +			return false;
>  	}
>  
> -	strcat(expr, hist_field_name(field, 0));

	seq_buf_puts(&s, hist_field_name(field, 0));

> +	if (!expr_append(expr, len, hist_field_name(field, 0)))
> +		return false;
>  
>  	if (field->flags && !(field->flags & HIST_FIELD_FL_VAR_REF)) {
>  		const char *flags_str = get_hist_field_flags(field);
>  
>  		if (flags_str) {
> -			strcat(expr, ".");
> -			strcat(expr, flags_str);

			seq_buf_printf(&s, ".%s", flags_str);

> +			if (!expr_append(expr, len, ".") ||
> +			    !expr_append(expr, len, flags_str))
> +				return false;
>  		}
>  	}

	/* Add terminating character */
	seq_buf_str(&s);

	return seq_buf_overflow(&s) ? false : true;

> +
> +	return true;
>  }
>  
>  static char *expr_str(struct hist_field *field, unsigned int level)
>  {
>  	char *expr;
> +	size_t len = 0;
>  

This could all be converted too.

-- Steve

>  	if (level > 1)
> -		return NULL;
> +		return ERR_PTR(-EINVAL);
>  
>  	expr = kzalloc(MAX_FILTER_STR_VAL, GFP_KERNEL);
>  	if (!expr)
> -		return NULL;
> +		return ERR_PTR(-ENOMEM);
>  
>  	if (!field->operands[0]) {
> -		expr_field_str(field, expr);
> +		if (!expr_field_str(field, expr, &len))
> +			goto free;
>  		return expr;
>  	}
>  
>  	if (field->operator == FIELD_OP_UNARY_MINUS) {
>  		char *subexpr;
>  
> -		strcat(expr, "-(");
> +		if (!expr_append(expr, &len, "-("))
> +			goto free;
>  		subexpr = expr_str(field->operands[0], ++level);
>  		if (!subexpr) {
> -			kfree(expr);
> -			return NULL;
> +			goto free;
> +		}
> +		if (!expr_append(expr, &len, subexpr) ||
> +		    !expr_append(expr, &len, ")")) {
> +			kfree(subexpr);
> +			goto free;
>  		}
> -		strcat(expr, subexpr);
> -		strcat(expr, ")");
>  
>  		kfree(subexpr);
>  
>  		return expr;
>  	}
>  
> -	expr_field_str(field->operands[0], expr);
> +	if (!expr_field_str(field->operands[0], expr, &len))
> +		goto free;
>  
>  	switch (field->operator) {
>  	case FIELD_OP_MINUS:
> -		strcat(expr, "-");
> +		if (!expr_append(expr, &len, "-"))
> +			goto free;
>  		break;
>  	case FIELD_OP_PLUS:
> -		strcat(expr, "+");
> +		if (!expr_append(expr, &len, "+"))
> +			goto free;
>  		break;
>  	case FIELD_OP_DIV:
> -		strcat(expr, "/");
> +		if (!expr_append(expr, &len, "/"))
> +			goto free;
>  		break;
>  	case FIELD_OP_MULT:
> -		strcat(expr, "*");
> +		if (!expr_append(expr, &len, "*"))
> +			goto free;
>  		break;
>  	default:
> -		kfree(expr);
> -		return NULL;
> +		goto free;
>  	}
>  
> -	expr_field_str(field->operands[1], expr);
> +	if (!expr_field_str(field->operands[1], expr, &len))
> +		goto free;
>  
>  	return expr;
> +
> +free:
> +	kfree(expr);
> +	return ERR_PTR(-E2BIG);
>  }
>  
>  /*
> @@ -2630,6 +2666,11 @@ static struct hist_field *parse_unary(struct hist_trigger_data *hist_data,
>  	expr->is_signed = operand1->is_signed;
>  	expr->operator = FIELD_OP_UNARY_MINUS;
>  	expr->name = expr_str(expr, 0);
> +	if (IS_ERR(expr->name)) {
> +		ret = PTR_ERR(expr->name);
> +		expr->name = NULL;
> +		goto free;
> +	}
>  	expr->type = kstrdup_const(operand1->type, GFP_KERNEL);
>  	if (!expr->type) {
>  		ret = -ENOMEM;
> @@ -2842,6 +2883,11 @@ static struct hist_field *parse_expr(struct hist_trigger_data *hist_data,
>  		destroy_hist_field(operand1, 0);
>  
>  		expr->name = expr_str(expr, 0);
> +		if (IS_ERR(expr->name)) {
> +			ret = PTR_ERR(expr->name);
> +			expr->name = NULL;
> +			goto free_expr;
> +		}
>  	} else {
>  		/* The operand sizes should be the same, so just pick one */
>  		expr->size = operand1->size;
> @@ -2855,6 +2901,11 @@ static struct hist_field *parse_expr(struct hist_trigger_data *hist_data,
>  		}
>  
>  		expr->name = expr_str(expr, 0);
> +		if (IS_ERR(expr->name)) {
> +			ret = PTR_ERR(expr->name);
> +			expr->name = NULL;
> +			goto free_expr;
> +		}
>  	}
>  
>  	return expr;


^ permalink raw reply

* Re: [PATCH] selftests/ftrace: Quote check_requires comparisons
From: Shuah Khan @ 2026-04-08 21:22 UTC (permalink / raw)
  To: Steven Rostedt, Cao Ruichuang
  Cc: mhiramat, mathieu.desnoyers, shuah, linux-kernel,
	linux-trace-kernel, linux-kselftest, Shuah Khan
In-Reply-To: <20260408171831.69520c50@gandalf.local.home>

On 4/8/26 15:18, Steven Rostedt wrote:
> On Wed,  8 Apr 2026 12:32:12 +0800
> Cao Ruichuang <create0818@163.com> wrote:
> 
>> check_requires() compares requirement strings that can contain shell
>> pattern characters such as '[' and ']'. Under /bin/sh, the unquoted
>> test expressions can emit 'unexpected operator' warnings while parsing
>> README-backed requirements.
>>
>> Quote the relevant comparisons and path checks so the helper handles
>> those patterns without spurious shell warnings.
>>
>> Validated by rerunning fprobe_syntax_errors.tc and confirming the
>> previous '/bin/sh: unexpected operator' lines disappear from the
>> detailed ftracetest log.
>>
>> Signed-off-by: Cao Ruichuang <create0818@163.com>
> 
> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> 
> Shuah,
> 
> Care to take this through your tree?
> 
> -- Steve

Yes - I can take this for the merge window.

thanks,
-- Shuah

^ permalink raw reply

* Re: [PATCH] selftests/ftrace: Quote check_requires comparisons
From: Steven Rostedt @ 2026-04-08 21:18 UTC (permalink / raw)
  To: Cao Ruichuang
  Cc: mhiramat, mathieu.desnoyers, shuah, linux-kernel,
	linux-trace-kernel, linux-kselftest, Shuah Khan
In-Reply-To: <20260408043212.8063-1-create0818@163.com>

On Wed,  8 Apr 2026 12:32:12 +0800
Cao Ruichuang <create0818@163.com> wrote:

> check_requires() compares requirement strings that can contain shell
> pattern characters such as '[' and ']'. Under /bin/sh, the unquoted
> test expressions can emit 'unexpected operator' warnings while parsing
> README-backed requirements.
> 
> Quote the relevant comparisons and path checks so the helper handles
> those patterns without spurious shell warnings.
> 
> Validated by rerunning fprobe_syntax_errors.tc and confirming the
> previous '/bin/sh: unexpected operator' lines disappear from the
> detailed ftracetest log.
> 
> Signed-off-by: Cao Ruichuang <create0818@163.com>

Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Shuah,

Care to take this through your tree?

-- Steve

> ---
>  tools/testing/selftests/ftrace/test.d/functions | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/testing/selftests/ftrace/test.d/functions b/tools/testing/selftests/ftrace/test.d/functions
> index e8e718139..442aa28ff 100644
> --- a/tools/testing/selftests/ftrace/test.d/functions
> +++ b/tools/testing/selftests/ftrace/test.d/functions
> @@ -145,13 +145,13 @@ check_requires() { # Check required files and tracers
>  	p=${i%:program}
>          r=${i%:README}
>          t=${i%:tracer}
> -	if [ $p != $i ]; then
> -	    if ! which $p ; then
> +	if [ "$p" != "$i" ]; then
> +	    if ! which "$p" ; then
>                  echo "Required program $p is not found."
>                  exit_unresolved
>  	    fi
> -        elif [ $t != $i ]; then
> -            if ! grep -wq $t available_tracers ; then
> +        elif [ "$t" != "$i" ]; then
> +            if ! grep -wq "$t" available_tracers ; then
>                  echo "Required tracer $t is not configured."
>                  exit_unsupported
>              fi
> @@ -162,11 +162,11 @@ check_requires() { # Check required files and tracers
>  	    else
>  		test=$TRACING_DIR
>  	    fi
> -            if ! grep -Fq "$r" $test/README ; then
> +            if ! grep -Fq "$r" "$test"/README ; then
>                  echo "Required feature pattern \"$r\" is not in README."
>                  exit_unsupported
>              fi
> -        elif [ ! -e $i ]; then
> +        elif [ ! -e "$i" ]; then
>              echo "Required feature interface $i doesn't exist."
>              exit_unsupported
>          fi
> @@ -223,4 +223,4 @@ get_mnt_options() {
>  	local opts=$(mount | grep -m1 "$mnt_point" | sed -e 's/.*(\(.*\)).*/\1/')
>  
>  	echo "$opts"
> -}
> \ No newline at end of file
> +}


^ permalink raw reply

* Re: [PATCH] tracing: preserve module tracepoint strings
From: Steven Rostedt @ 2026-04-08 20:32 UTC (permalink / raw)
  To: Cao Ruichuang
  Cc: mhiramat, mathieu.desnoyers, mcgrof, petr.pavlu, da.gomez,
	samitolvanen, atomlin, linux-kernel, linux-trace-kernel,
	linux-modules
In-Reply-To: <20260406170944.51047-1-create0818@163.com>

On Tue,  7 Apr 2026 01:09:44 +0800
Cao Ruichuang <create0818@163.com> wrote:

> tracepoint_string() is documented as exporting constant strings
> through printk_formats, including when it is used from modules.
> That currently does not work.
> 
> A small test module that calls
> tracepoint_string("tracepoint_string_test_module_string") loads
> successfully and gets a pointer back, but the string never appears
> in /sys/kernel/tracing/printk_formats. The loader only collects
> __trace_printk_fmt from modules and ignores __tracepoint_str.
> 
> Collect module __tracepoint_str entries too, copy them to stable
> tracing-managed storage like module trace_printk formats, and let
> trace_is_tracepoint_string() recognize those copied strings. This
> makes module tracepoint strings visible through printk_formats and
> keeps them accepted by the trace string safety checks.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217196
> Signed-off-by: Cao Ruichuang <create0818@163.com>
> ---


As this is not a trivial change, and affects module code, I'm going to hold
off until after v7.1-rc1 to apply it.

-- Steve

^ permalink raw reply

* [PATCH v3] seq_buf: Export seq_buf_putmem_hex() and add KUnit tests
From: Shuvam Pandey @ 2026-04-08 20:23 UTC (permalink / raw)
  To: Steven Rostedt, Andrew Morton
  Cc: Masami Hiramatsu, Mathieu Desnoyers, David Gow, linux-kernel,
	linux-trace-kernel, shuvampandey1

seq_buf: Export seq_buf_putmem_hex() and add KUnit tests

The seq_buf KUnit suite does not exercise seq_buf_putmem_hex().

Add one test for the len > 8 chunking path and one overflow test
where a later chunk no longer fits in the buffer.

Export seq_buf_putmem_hex() as well so SEQ_BUF_KUNIT_TEST=m links
cleanly. Without the export, modpost reports seq_buf_putmem_hex as
undefined when seq_buf_kunit is built as a module.

Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Shuvam Pandey <shuvampandey1@gmail.com>
---
Changes since v1: https://lore.kernel.org/all/20260406033728.25998-1-shuvampandey1@gmail.com/
- export seq_buf_putmem_hex() so SEQ_BUF_KUNIT_TEST=m links cleanly
- validate with a fresh arm64 build using CONFIG_KUNIT=y and CONFIG_SEQ_BUF_KUNIT_TEST=m
- send as a new thread
- capitalize the subject for seq_buf/tracing style

 lib/seq_buf.c             |  1 +
 lib/tests/seq_buf_kunit.c | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/lib/seq_buf.c b/lib/seq_buf.c
index f3f3436d60a9403eae5b1ef9b091b027881f14fb..b59488fa8135cdb0340fbeb43d8d74db8ae13146 100644
--- a/lib/seq_buf.c
+++ b/lib/seq_buf.c
@@ -298,6 +298,7 @@ int seq_buf_putmem_hex(struct seq_buf *s, const void *mem,
 	}
 	return 0;
 }
+EXPORT_SYMBOL_GPL(seq_buf_putmem_hex);
 
 /**
  * seq_buf_path - copy a path into the sequence buffer
diff --git a/lib/tests/seq_buf_kunit.c b/lib/tests/seq_buf_kunit.c
index 8a01579a978e655cd09024d0ea9c4c9cd095263f..eb466386bbefb1c81773cdae65a8ac3df91cd8ea 100644
--- a/lib/tests/seq_buf_kunit.c
+++ b/lib/tests/seq_buf_kunit.c
@@ -184,6 +184,38 @@ static void seq_buf_get_buf_commit_test(struct kunit *test)
 	KUNIT_EXPECT_TRUE(test, seq_buf_has_overflowed(&s));
 }
 
+static void seq_buf_putmem_hex_test(struct kunit *test)
+{
+	DECLARE_SEQ_BUF(s, 24);
+	const u8 data[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
+#ifdef __BIG_ENDIAN
+	const char *expected = "0001020304050607 0809 ";
+#else
+	const char *expected = "0706050403020100 0908 ";
+#endif
+
+	KUNIT_EXPECT_EQ(test, seq_buf_putmem_hex(&s, data, sizeof(data)), 0);
+	KUNIT_EXPECT_FALSE(test, seq_buf_has_overflowed(&s));
+	KUNIT_EXPECT_EQ(test, seq_buf_used(&s), strlen(expected));
+	KUNIT_EXPECT_STREQ(test, seq_buf_str(&s), expected);
+}
+
+static void seq_buf_putmem_hex_overflow_test(struct kunit *test)
+{
+	DECLARE_SEQ_BUF(s, 20);
+	const u8 data[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
+#ifdef __BIG_ENDIAN
+	const char *expected = "0001020304050607 ";
+#else
+	const char *expected = "0706050403020100 ";
+#endif
+
+	KUNIT_EXPECT_EQ(test, seq_buf_putmem_hex(&s, data, sizeof(data)), -1);
+	KUNIT_EXPECT_TRUE(test, seq_buf_has_overflowed(&s));
+	KUNIT_EXPECT_EQ(test, seq_buf_used(&s), 20);
+	KUNIT_EXPECT_STREQ(test, seq_buf_str(&s), expected);
+}
+
 static struct kunit_case seq_buf_test_cases[] = {
 	KUNIT_CASE(seq_buf_init_test),
 	KUNIT_CASE(seq_buf_declare_test),
@@ -194,6 +226,8 @@ static struct kunit_case seq_buf_test_cases[] = {
 	KUNIT_CASE(seq_buf_printf_test),
 	KUNIT_CASE(seq_buf_printf_overflow_test),
 	KUNIT_CASE(seq_buf_get_buf_commit_test),
+	KUNIT_CASE(seq_buf_putmem_hex_test),
+	KUNIT_CASE(seq_buf_putmem_hex_overflow_test),
 	{}
 };
 

^ permalink raw reply related

* Re: [PATCH v4] kernel/trace: fixed static warnings
From: Steven Rostedt @ 2026-04-08 20:24 UTC (permalink / raw)
  To: abhijithsriram95
  Cc: Masami Hiramatsu, Mathieu Desnoyers, open list:TRACING,
	open list:TRACING
In-Reply-To: <20260406072834.243491-2-abhijithsriram95@gmail.com>


Subject should be:

 tracing: Fixed static checker warnings

On Mon,  6 Apr 2026 09:28:34 +0200
abhijithsriram95@gmail.com wrote:

> From: Abhijith Sriram <abhijithsriram95@gmail.com>
> 
> The change in the function argument description
> was due to the static code checker script reading
> the word filter back to back
> 

The below changes should be beneath the '---'

> Changes in v2:

The last change should be first. In fact, I only care about the last change
as the previous versions should have the description of what changed.

> - corrected *m = file->private_data to m = file->private_data
> 
> Changes in v3:
> - reverted the changes for struct seq_file *m and
>   added a new empty line instead
> 
> Changes in v4:

That said, this should really be:

 Changes since v3: https://lore.kernel.org/all/20260406060046.223496-2-abhijithsriram95@gmail.com/


> - added a new empty line before char *buf ...
>   previously this line was relocated to avoid the
>   static check warning.
> 
> Signed-off-by: Abhijith Sriram <abhijithsriram95@gmail.com>
> ---
>  kernel/trace/trace_events_trigger.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
> index 655db2e82513..664283bcd9ea 100644
> --- a/kernel/trace/trace_events_trigger.c
> +++ b/kernel/trace/trace_events_trigger.c
> @@ -246,7 +246,7 @@ event_triggers_post_call(struct trace_event_file *file,
>  }
>  EXPORT_SYMBOL_GPL(event_triggers_post_call);
>  
> -#define SHOW_AVAILABLE_TRIGGERS	(void *)(1UL)
> +#define SHOW_AVAILABLE_TRIGGERS	((void *)(1UL))
>  
>  static void *trigger_next(struct seq_file *m, void *t, loff_t *pos)
>  {
> @@ -352,6 +352,7 @@ static int event_trigger_regex_open(struct inode *inode, struct file *file)
>  		ret = seq_open(file, &event_triggers_seq_ops);
>  		if (!ret) {
>  			struct seq_file *m = file->private_data;
> +

This blank line makes the code look worse. Yes, we usually want a blank
line between the variable declarations and the code, but when it comes to
code blocks (not functions) that rule is not as strict.

Get rid of this newline.


>  			m->private = file;
>  		}
>  	}
> @@ -390,6 +391,7 @@ static ssize_t event_trigger_regex_write(struct file *file,
>  {
>  	struct trace_event_file *event_file;
>  	ssize_t ret;
> +
>  	char *buf __free(kfree) = NULL;

The char *buf is a declaration. It no new line is expected before it.

>  
>  	if (!cnt)
> @@ -633,6 +635,7 @@ clear_event_triggers(struct trace_array *tr)
>  
>  	list_for_each_entry(file, &tr->events, list) {
>  		struct event_trigger_data *data, *n;
> +

Again, if it's in a code block, don't change it.

-- Steve

>  		list_for_each_entry_safe(data, n, &file->triggers, list) {
>  			trace_event_trigger_enable_disable(file, 0);
>  			list_del_rcu(&data->list);
> @@ -785,7 +788,7 @@ static void unregister_trigger(char *glob,
>   *   cmd               - the trigger command name
>   *   glob              - the trigger command name optionally prefaced with '!'
>   *   param_and_filter  - text following cmd and ':'
> - *   param             - text following cmd and ':' and stripped of filter
> + *   param             - text following cmd and ':' and filter removed
>   *   filter            - the optional filter text following (and including) 'if'
>   *
>   * To illustrate the use of these components, here are some concrete


^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox