linux-trace-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/7] tracing: fprobe-events: Register fprobe only when the event is enabled
@ 2025-03-31 15:35 Masami Hiramatsu (Google)
  2025-03-31 15:35 ` [PATCH v2 1/7] tracing: fprobe events: Fix possible UAF on modules Masami Hiramatsu (Google)
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Masami Hiramatsu (Google) @ 2025-03-31 15:35 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel

Hi,
Here is the 2nd version of the series to register the fprobe only when
the fprobe event is enabled. The previous version is here:

https://lore.kernel.org/all/174212767109.348872.18231451508464729427.stgit@devnote2/

In this version, I added fixes ([1/7] and [2/7]) related to module
unloading ([1/7] should be handled separately) and updates based on
the previous thread.

Steve pointed out that fprobe events are always registered when they
are defined, even if they are disabled, and that this adds system
overhead. This series registers the fprobe only when the event is
enabled.
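
For example, with this series applied, defining an fprobe event via
tracefs should no longer register the fprobe; only enabling the event
does. A rough sketch using the tracefs files exercised by the selftest
in [6/7] (vfs_read is just an illustrative target):

  cd /sys/kernel/tracing
  echo "f:myevent vfs_read" >> dynamic_events   # define only; no fprobe registered yet
  echo 1 > events/fprobes/myevent/enable        # the fprobe is registered here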

Thank you,

---

Masami Hiramatsu (Google) (7):
      tracing: fprobe events: Fix possible UAF on modules
      tracing: fprobe: Cleanup fprobe hash when module unloading
      tracing: tprobe-events: Remove mod field from tprobe-event
      tracing: tprobe-events: Support multiple tprobes on the same tracepoint
      tracing: fprobe-events: Register fprobe-events only when it is enabled
      selftests: tracing: Enable fprobe events before checking enable_functions
      tracing: tprobe-events: Register tracepoint when enable tprobe event


 include/linux/fprobe.h                             |    5 
 include/linux/module.h                             |    4 
 kernel/trace/fprobe.c                              |  108 ++++
 kernel/trace/trace_fprobe.c                        |  593 +++++++++++++-------
 .../ftrace/test.d/dynevent/add_remove_fprobe.tc    |   30 +
 5 files changed, 531 insertions(+), 209 deletions(-)

--
Masami Hiramatsu (Google) <mhiramat@kernel.org>


* [PATCH v2 1/7] tracing: fprobe events: Fix possible UAF on modules
  2025-03-31 15:35 [PATCH v2 0/7] tracing: fprobe-events: Register fprobe only when the event is enabled Masami Hiramatsu (Google)
@ 2025-03-31 15:35 ` Masami Hiramatsu (Google)
  2025-03-31 15:38   ` Masami Hiramatsu
  2025-03-31 15:35 ` [PATCH v2 2/7] tracing: fprobe: Cleanup fprobe hash when module unloading Masami Hiramatsu (Google)
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 10+ messages in thread
From: Masami Hiramatsu (Google) @ 2025-03-31 15:35 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Commit ac91052f0ae5 ("tracing: tprobe-events: Fix leakage of module
refcount") moved try_module_get() from __find_tracepoint_module_cb()
to the find_tracepoint() caller, but that introduced a possible UAF
because the module can be unloaded before try_module_get() is called.
In that case, the module object may already be freed, so
try_module_get() does not simply fail but may access the freed object.

To avoid that, call try_module_get() in __find_tracepoint_module_cb()
again.

Fixes: ac91052f0ae5 ("tracing: tprobe-events: Fix leakage of module refcount")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 kernel/trace/trace_fprobe.c |   26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
index 985ff98272da..2cd9ff1049f1 100644
--- a/kernel/trace/trace_fprobe.c
+++ b/kernel/trace/trace_fprobe.c
@@ -919,9 +919,15 @@ static void __find_tracepoint_module_cb(struct tracepoint *tp, struct module *mo
 	struct __find_tracepoint_cb_data *data = priv;
 
 	if (!data->tpoint && !strcmp(data->tp_name, tp->name)) {
-		data->tpoint = tp;
-		if (!data->mod)
+		/* If module is not specified, try getting module refcount. */
+		if (!data->mod && mod) {
+			/* If failed to get refcount, ignore this tracepoint. */
+			if (!try_module_get(mod))
+				return;
+
 			data->mod = mod;
+		}
+		data->tpoint = tp;
 	}
 }
 
@@ -933,7 +939,11 @@ static void __find_tracepoint_cb(struct tracepoint *tp, void *priv)
 		data->tpoint = tp;
 }
 
-/* Find a tracepoint from kernel and module. */
+/*
+ * Find a tracepoint from kernel and module. If the tracepoint is on the module,
+ * the module's refcount is incremented and returned as *@tp_mod. Thus, if it is
+ * not NULL, the caller must call module_put(*tp_mod) after using the tracepoint.
+ */
 static struct tracepoint *find_tracepoint(const char *tp_name,
 					  struct module **tp_mod)
 {
@@ -962,7 +972,10 @@ static void reenable_trace_fprobe(struct trace_fprobe *tf)
 	}
 }
 
-/* Find a tracepoint from specified module. */
+/*
+ * Find a tracepoint from specified module. In this case, this does not get the
+ * module's refcount. The caller must ensure the module is not freed.
+ */
 static struct tracepoint *find_tracepoint_in_module(struct module *mod,
 						    const char *tp_name)
 {
@@ -1169,11 +1182,6 @@ static int trace_fprobe_create_internal(int argc, const char *argv[],
 	if (is_tracepoint) {
 		ctx->flags |= TPARG_FL_TPOINT;
 		tpoint = find_tracepoint(symbol, &tp_mod);
-		/* lock module until register this tprobe. */
-		if (tp_mod && !try_module_get(tp_mod)) {
-			tpoint = NULL;
-			tp_mod = NULL;
-		}
 		if (tpoint) {
 			ctx->funcname = kallsyms_lookup(
 				(unsigned long)tpoint->probestub,



* [PATCH v2 2/7] tracing: fprobe: Cleanup fprobe hash when module unloading
  2025-03-31 15:35 [PATCH v2 0/7] tracing: fprobe-events: Register fprobe only when the event is enabled Masami Hiramatsu (Google)
  2025-03-31 15:35 ` [PATCH v2 1/7] tracing: fprobe events: Fix possible UAF on modules Masami Hiramatsu (Google)
@ 2025-03-31 15:35 ` Masami Hiramatsu (Google)
  2025-04-07 23:52   ` Masami Hiramatsu
  2025-03-31 15:35 ` [PATCH v2 3/7] tracing: tprobe-events: Remove mod field from tprobe-event Masami Hiramatsu (Google)
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 10+ messages in thread
From: Masami Hiramatsu (Google) @ 2025-03-31 15:35 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Clean up the fprobe address hash table on module unloading because the
target symbols disappear when the module is unloaded and there is no
guarantee that the same symbols will be mapped at the same addresses
again.

Note that this at least disables the fprobe if some of its target
symbols are in the unloaded module. Unlike kprobes, fprobe does not
re-enable the probe point by itself. To do that, the caller should
take care of registering/unregistering the fprobe when modules are
loaded/unloaded. This simplifies the fprobe state management related
to module loading/unloading.
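
A rough illustration of the resulting behavior from the tracefs side
(the module and symbol names below are hypothetical):

  cd /sys/kernel/tracing
  echo "f:myevent some_module_func" >> dynamic_events
  echo 1 > events/fprobes/myevent/enable   # fprobe registered on the module symbol
  rmmod some_module                        # notifier drops the address from the fprobe hash
  modprobe some_module                     # the fprobe is NOT re-armed automatically;
                                           # the fprobe user must re-register it if needed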

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 kernel/trace/fprobe.c |  103 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 101 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index cb86f90d4b1e..95c6e3473a76 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -89,8 +89,11 @@ static bool delete_fprobe_node(struct fprobe_hlist_node *node)
 {
 	lockdep_assert_held(&fprobe_mutex);
 
-	WRITE_ONCE(node->fp, NULL);
-	hlist_del_rcu(&node->hlist);
+	/* Avoid double deleting */
+	if (READ_ONCE(node->fp) != NULL) {
+		WRITE_ONCE(node->fp, NULL);
+		hlist_del_rcu(&node->hlist);
+	}
 	return !!find_first_fprobe_node(node->addr);
 }
 
@@ -411,6 +414,102 @@ static void fprobe_graph_remove_ips(unsigned long *addrs, int num)
 		ftrace_set_filter_ips(&fprobe_graph_ops.ops, addrs, num, 1, 0);
 }
 
+#ifdef CONFIG_MODULES
+
+#define FPROBE_IPS_BATCH_INIT 8
+/* instruction pointer address list */
+struct fprobe_addr_list {
+	int index;
+	int size;
+	unsigned long *addrs;
+};
+
+static int fprobe_addr_list_add(struct fprobe_addr_list *alist, unsigned long addr)
+{
+	unsigned long *addrs;
+
+	if (alist->index >= alist->size)
+		return -ENOMEM;
+
+	alist->addrs[alist->index++] = addr;
+	if (alist->index < alist->size)
+		return 0;
+
+	/* Expand the address list */
+	addrs = kcalloc(alist->size * 2, sizeof(*addrs), GFP_KERNEL);
+	if (!addrs)
+		return -ENOMEM;
+
+	memcpy(addrs, alist->addrs, alist->size * sizeof(*addrs));
+	alist->size *= 2;
+	kfree(alist->addrs);
+	alist->addrs = addrs;
+
+	return 0;
+}
+
+static void fprobe_remove_node_in_module(struct module *mod, struct hlist_head *head,
+					struct fprobe_addr_list *alist)
+{
+	struct fprobe_hlist_node *node;
+	int ret = 0;
+
+	hlist_for_each_entry_rcu(node, head, hlist) {
+		if (!within_module(node->addr, mod))
+			continue;
+		if (delete_fprobe_node(node))
+			continue;
+		/*
+		 * If we failed to update alist, just continue updating hlist.
+		 * Therefore, at least the user handler will not be hit anymore.
+		 */
+		if (!ret)
+			ret = fprobe_addr_list_add(alist, node->addr);
+	}
+}
+
+/* Handle module unloading to manage fprobe_ip_table. */
+static int fprobe_module_callback(struct notifier_block *nb,
+				  unsigned long val, void *data)
+{
+	struct fprobe_addr_list alist = {.size = FPROBE_IPS_BATCH_INIT};
+	struct module *mod = data;
+	int i;
+
+	if (val != MODULE_STATE_GOING)
+		return NOTIFY_DONE;
+
+	alist.addrs = kcalloc(alist.size, sizeof(*alist.addrs), GFP_KERNEL);
+	/* If failed to alloc memory, we can not remove ips from hash. */
+	if (!alist.addrs)
+		return NOTIFY_DONE;
+
+	mutex_lock(&fprobe_mutex);
+	for (i = 0; i < FPROBE_IP_TABLE_SIZE; i++)
+		fprobe_remove_node_in_module(mod, &fprobe_ip_table[i], &alist);
+
+	if (alist.index < alist.size && alist.index > 0)
+		ftrace_set_filter_ips(&fprobe_graph_ops.ops,
+				      alist.addrs, alist.index, 1, 0);
+	mutex_unlock(&fprobe_mutex);
+
+	kfree(alist.addrs);
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block fprobe_module_nb = {
+	.notifier_call = fprobe_module_callback,
+	.priority = 0,
+};
+
+static int __init init_fprobe_module(void)
+{
+	return register_module_notifier(&fprobe_module_nb);
+}
+early_initcall(init_fprobe_module);
+#endif
+
 static int symbols_cmp(const void *a, const void *b)
 {
 	const char **str_a = (const char **) a;



* [PATCH v2 3/7] tracing: tprobe-events: Remove mod field from tprobe-event
  2025-03-31 15:35 [PATCH v2 0/7] tracing: fprobe-events: Register fprobe only when the event is enabled Masami Hiramatsu (Google)
  2025-03-31 15:35 ` [PATCH v2 1/7] tracing: fprobe events: Fix possible UAF on modules Masami Hiramatsu (Google)
  2025-03-31 15:35 ` [PATCH v2 2/7] tracing: fprobe: Cleanup fprobe hash when module unloading Masami Hiramatsu (Google)
@ 2025-03-31 15:35 ` Masami Hiramatsu (Google)
  2025-03-31 15:36 ` [PATCH v2 4/7] tracing: tprobe-events: Support multiple tprobes on the same tracepoint Masami Hiramatsu (Google)
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Masami Hiramatsu (Google) @ 2025-03-31 15:35 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Remove the unneeded 'mod' struct module pointer field from trace_fprobe
because this information does not need to be saved.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 kernel/trace/trace_fprobe.c |   23 +++++++++--------------
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
index 2cd9ff1049f1..14a1e4f07002 100644
--- a/kernel/trace/trace_fprobe.c
+++ b/kernel/trace/trace_fprobe.c
@@ -46,7 +46,6 @@ struct trace_fprobe {
 	struct fprobe		fp;
 	const char		*symbol;
 	struct tracepoint	*tpoint;
-	struct module		*mod;
 	struct trace_probe	tp;
 };
 
@@ -426,7 +425,6 @@ static struct trace_fprobe *alloc_trace_fprobe(const char *group,
 					       const char *event,
 					       const char *symbol,
 					       struct tracepoint *tpoint,
-					       struct module *mod,
 					       int nargs, bool is_return)
 {
 	struct trace_fprobe *tf __free(free_trace_fprobe) = NULL;
@@ -446,7 +444,6 @@ static struct trace_fprobe *alloc_trace_fprobe(const char *group,
 		tf->fp.entry_handler = fentry_dispatcher;
 
 	tf->tpoint = tpoint;
-	tf->mod = mod;
 
 	ret = trace_probe_init(&tf->tp, event, group, false, nargs);
 	if (ret < 0)
@@ -776,7 +773,6 @@ static void __unregister_trace_fprobe(struct trace_fprobe *tf)
 			tracepoint_probe_unregister(tf->tpoint,
 					tf->tpoint->probestub, NULL);
 			tf->tpoint = NULL;
-			tf->mod = NULL;
 		}
 	}
 }
@@ -1001,23 +997,23 @@ static int __tracepoint_probe_module_cb(struct notifier_block *self,
 
 	mutex_lock(&event_mutex);
 	for_each_trace_fprobe(tf, pos) {
+		if (!trace_fprobe_is_tracepoint(tf))
+			continue;
 		if (val == MODULE_STATE_COMING && tf->tpoint == TRACEPOINT_STUB) {
 			tpoint = find_tracepoint_in_module(tp_mod->mod, tf->symbol);
 			if (tpoint) {
 				tf->tpoint = tpoint;
-				tf->mod = tp_mod->mod;
 				if (!WARN_ON_ONCE(__regsiter_tracepoint_fprobe(tf)) &&
 				    trace_probe_is_enabled(&tf->tp))
 					reenable_trace_fprobe(tf);
 			}
-		} else if (val == MODULE_STATE_GOING && tp_mod->mod == tf->mod) {
+		} else if (val == MODULE_STATE_GOING &&
+			   tf->tpoint != TRACEPOINT_STUB &&
+			   within_module((unsigned long)tf->tpoint->probestub, tp_mod->mod)) {
 			unregister_fprobe(&tf->fp);
-			if (trace_fprobe_is_tracepoint(tf)) {
-				tracepoint_probe_unregister(tf->tpoint,
-					tf->tpoint->probestub, NULL);
-				tf->tpoint = TRACEPOINT_STUB;
-				tf->mod = NULL;
-			}
+			tracepoint_probe_unregister(tf->tpoint,
+				tf->tpoint->probestub, NULL);
+			tf->tpoint = TRACEPOINT_STUB;
 		}
 	}
 	mutex_unlock(&event_mutex);
@@ -1215,8 +1211,7 @@ static int trace_fprobe_create_internal(int argc, const char *argv[],
 		return ret;
 
 	/* setup a probe */
-	tf = alloc_trace_fprobe(group, event, symbol, tpoint, tp_mod,
-				argc, is_return);
+	tf = alloc_trace_fprobe(group, event, symbol, tpoint, argc, is_return);
 	if (IS_ERR(tf)) {
 		ret = PTR_ERR(tf);
 		/* This must return -ENOMEM, else there is a bug */



* [PATCH v2 4/7] tracing: tprobe-events: Support multiple tprobes on the same tracepoint
  2025-03-31 15:35 [PATCH v2 0/7] tracing: fprobe-events: Register fprobe only when the event is enabled Masami Hiramatsu (Google)
                   ` (2 preceding siblings ...)
  2025-03-31 15:35 ` [PATCH v2 3/7] tracing: tprobe-events: Remove mod field from tprobe-event Masami Hiramatsu (Google)
@ 2025-03-31 15:36 ` Masami Hiramatsu (Google)
  2025-03-31 15:36 ` [PATCH v2 5/7] tracing: fprobe-events: Register fprobe-events only when it is enabled Masami Hiramatsu (Google)
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Masami Hiramatsu (Google) @ 2025-03-31 15:36 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Allow users to set multiple tracepoint-probe (tprobe) events on the
same tracepoint. After the last tprobe event is removed, the
tracepoint callback is unregistered.
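
For example, something like the following should now be possible (a
rough sketch of the tracefs usage; sched_switch is just an illustrative
tracepoint):

  cd /sys/kernel/tracing
  echo "t:myprobe1 sched_switch" >> dynamic_events
  echo "t:myprobe2 sched_switch" >> dynamic_events  # second tprobe on the same tracepoint
  echo "-:myprobe1" >> dynamic_events               # tracepoint callback stays registered
  echo "-:myprobe2" >> dynamic_events               # last tprobe removed: callback unregistered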

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 Changes in v2:
  - Fix not to use unneeded __free().
  - Update comments to explain the code.
---
 include/linux/module.h      |    4 +
 kernel/trace/trace_fprobe.c |  251 ++++++++++++++++++++++++++++++++++---------
 2 files changed, 205 insertions(+), 50 deletions(-)

diff --git a/include/linux/module.h b/include/linux/module.h
index 30e5b19bafa9..01d5208cf473 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -14,6 +14,7 @@
 #include <linux/buildid.h>
 #include <linux/compiler.h>
 #include <linux/cache.h>
+#include <linux/cleanup.h>
 #include <linux/kmod.h>
 #include <linux/init.h>
 #include <linux/elf.h>
@@ -1027,4 +1028,7 @@ static inline unsigned long find_kallsyms_symbol_value(struct module *mod,
 
 #endif  /* CONFIG_MODULES && CONFIG_KALLSYMS */
 
+/* Define __free(module_put) macro for struct module *. */
+DEFINE_FREE(module_put, struct module *, if (_T) module_put(_T))
+
 #endif /* _LINUX_MODULE_H */
diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
index 14a1e4f07002..150975237a20 100644
--- a/kernel/trace/trace_fprobe.c
+++ b/kernel/trace/trace_fprobe.c
@@ -21,7 +21,6 @@
 #define FPROBE_EVENT_SYSTEM "fprobes"
 #define TRACEPOINT_EVENT_SYSTEM "tracepoints"
 #define RETHOOK_MAXACTIVE_MAX 4096
-#define TRACEPOINT_STUB ERR_PTR(-ENOENT)
 
 static int trace_fprobe_create(const char *raw_command);
 static int trace_fprobe_show(struct seq_file *m, struct dyn_event *ev);
@@ -38,6 +37,89 @@ static struct dyn_event_operations trace_fprobe_ops = {
 	.match = trace_fprobe_match,
 };
 
+struct tracepoint_user {
+	struct tracepoint	*tpoint;
+	unsigned int		refcount;
+};
+
+static bool tracepoint_user_is_registered(struct tracepoint_user *tuser)
+{
+	return tuser && tuser->tpoint;
+}
+
+static int tracepoint_user_register(struct tracepoint_user *tuser)
+{
+	struct tracepoint *tpoint = tuser->tpoint;
+
+	if (!tpoint)
+		return 0;
+
+	return tracepoint_probe_register_prio_may_exist(tpoint,
+					tpoint->probestub, NULL, 0);
+}
+
+static void tracepoint_user_unregister(struct tracepoint_user *tuser)
+{
+	if (!tuser->tpoint)
+		return;
+
+	WARN_ON_ONCE(tracepoint_probe_unregister(tuser->tpoint, tuser->tpoint->probestub, NULL));
+	tuser->tpoint = NULL;
+}
+
+static unsigned long tracepoint_user_ip(struct tracepoint_user *tuser)
+{
+	if (!tuser->tpoint)
+		return 0UL;
+
+	return (unsigned long)tuser->tpoint->probestub;
+}
+
+static bool tracepoint_user_within_module(struct tracepoint_user *tuser,
+					  struct module *mod)
+{
+	return within_module(tracepoint_user_ip(tuser), mod);
+}
+
+static struct tracepoint_user *tracepoint_user_allocate(struct tracepoint *tpoint)
+{
+	struct tracepoint_user *tuser;
+
+	tuser = kzalloc(sizeof(*tuser), GFP_KERNEL);
+	if (!tuser)
+		return NULL;
+	tuser->tpoint = tpoint;
+	tuser->refcount = 1;
+
+	return tuser;
+}
+
+/* These must be called with event_mutex */
+static void tracepoint_user_get(struct tracepoint_user *tuser)
+{
+	tuser->refcount++;
+}
+
+static void tracepoint_user_put(struct tracepoint_user *tuser)
+{
+	if (--tuser->refcount > 0)
+		return;
+
+	if (tracepoint_user_is_registered(tuser))
+		tracepoint_user_unregister(tuser);
+	kfree(tuser);
+}
+
+static const char *tracepoint_user_lookup(struct tracepoint_user *tuser, char *buf)
+{
+	struct tracepoint *tpoint = tuser->tpoint;
+
+	if (!tpoint)
+		return NULL;
+
+	return kallsyms_lookup((unsigned long)tpoint->probestub, NULL, NULL, NULL, buf);
+}
+
 /*
  * Fprobe event core functions
  */
@@ -45,7 +127,7 @@ struct trace_fprobe {
 	struct dyn_event	devent;
 	struct fprobe		fp;
 	const char		*symbol;
-	struct tracepoint	*tpoint;
+	struct tracepoint_user	*tuser;
 	struct trace_probe	tp;
 };
 
@@ -75,7 +157,7 @@ static bool trace_fprobe_is_return(struct trace_fprobe *tf)
 
 static bool trace_fprobe_is_tracepoint(struct trace_fprobe *tf)
 {
-	return tf->tpoint != NULL;
+	return tf->tuser != NULL;
 }
 
 static const char *trace_fprobe_symbol(struct trace_fprobe *tf)
@@ -125,6 +207,56 @@ static bool trace_fprobe_is_registered(struct trace_fprobe *tf)
 	return fprobe_is_registered(&tf->fp);
 }
 
+static struct tracepoint *find_tracepoint(const char *tp_name,
+	struct module **tp_mod);
+
+/*
+ * Get tracepoint_user if exist, or allocate new one. If tracepoint is on a
+ * module, get its refcounter.
+ */
+static struct tracepoint_user *
+trace_fprobe_get_tracepoint_user(const char *name, struct module **pmod)
+{
+	struct tracepoint_user *tuser __free(kfree) = NULL;
+	struct tracepoint *tpoint;
+	struct trace_fprobe *tf;
+	struct dyn_event *dpos;
+	struct module *mod __free(module_put) = NULL;
+	int ret;
+
+	/*
+	 * Find appropriate tracepoint and locking module.
+	 * Note: tpoint can be NULL if it is unloaded (or failed to get module.)
+	 */
+	tpoint = find_tracepoint(name, &mod);
+
+	/* Search existing tracepoint_user */
+	for_each_trace_fprobe(tf, dpos) {
+		if (!trace_fprobe_is_tracepoint(tf))
+			continue;
+		if (!strcmp(tf->symbol, name)) {
+			tracepoint_user_get(tf->tuser);
+			*pmod = no_free_ptr(mod);
+			return tf->tuser;
+		}
+	}
+
+	/* Not found, allocate and register new tracepoint_user. */
+	tuser = tracepoint_user_allocate(tpoint);
+	if (!tuser)
+		return NULL;
+
+	if (tpoint) {
+		/* If the tracepoint is not loaded, tpoint can be NULL. */
+		ret = tracepoint_user_register(tuser);
+		if (ret)
+			return ERR_PTR(ret);
+	}
+
+	*pmod = no_free_ptr(mod);
+	return_ptr(tuser);
+}
+
 /*
  * Note that we don't verify the fetch_insn code, since it does not come
  * from user space.
@@ -410,6 +542,8 @@ static void free_trace_fprobe(struct trace_fprobe *tf)
 {
 	if (tf) {
 		trace_probe_cleanup(&tf->tp);
+		if (tf->tuser)
+			tracepoint_user_put(tf->tuser);
 		kfree(tf->symbol);
 		kfree(tf);
 	}
@@ -424,7 +558,7 @@ DEFINE_FREE(free_trace_fprobe, struct trace_fprobe *, if (!IS_ERR_OR_NULL(_T)) f
 static struct trace_fprobe *alloc_trace_fprobe(const char *group,
 					       const char *event,
 					       const char *symbol,
-					       struct tracepoint *tpoint,
+					       struct tracepoint_user *tuser,
 					       int nargs, bool is_return)
 {
 	struct trace_fprobe *tf __free(free_trace_fprobe) = NULL;
@@ -443,7 +577,7 @@ static struct trace_fprobe *alloc_trace_fprobe(const char *group,
 	else
 		tf->fp.entry_handler = fentry_dispatcher;
 
-	tf->tpoint = tpoint;
+	tf->tuser = tuser;
 
 	ret = trace_probe_init(&tf->tp, event, group, false, nargs);
 	if (ret < 0)
@@ -709,19 +843,11 @@ static int unregister_fprobe_event(struct trace_fprobe *tf)
 
 static int __regsiter_tracepoint_fprobe(struct trace_fprobe *tf)
 {
-	struct tracepoint *tpoint = tf->tpoint;
-	unsigned long ip = (unsigned long)tpoint->probestub;
-	int ret;
+	unsigned long ip = tracepoint_user_ip(tf->tuser);
+
+	if (!ip)
+		return -ENOENT;
 
-	/*
-	 * Here, we do 2 steps to enable fprobe on a tracepoint.
-	 * At first, put __probestub_##TP function on the tracepoint
-	 * and put a fprobe on the stub function.
-	 */
-	ret = tracepoint_probe_register_prio_may_exist(tpoint,
-				tpoint->probestub, NULL, 0);
-	if (ret < 0)
-		return ret;
 	return register_fprobe_ips(&tf->fp, &ip, 1);
 }
 
@@ -753,7 +879,7 @@ static int __register_trace_fprobe(struct trace_fprobe *tf)
 	if (trace_fprobe_is_tracepoint(tf)) {
 
 		/* This tracepoint is not loaded yet */
-		if (tf->tpoint == TRACEPOINT_STUB)
+		if (!tracepoint_user_is_registered(tf->tuser))
 			return 0;
 
 		return __regsiter_tracepoint_fprobe(tf);
@@ -770,9 +896,8 @@ static void __unregister_trace_fprobe(struct trace_fprobe *tf)
 		unregister_fprobe(&tf->fp);
 		memset(&tf->fp, 0, sizeof(tf->fp));
 		if (trace_fprobe_is_tracepoint(tf)) {
-			tracepoint_probe_unregister(tf->tpoint,
-					tf->tpoint->probestub, NULL);
-			tf->tpoint = NULL;
+			tracepoint_user_put(tf->tuser);
+			tf->tuser = NULL;
 		}
 	}
 }
@@ -988,7 +1113,7 @@ static int __tracepoint_probe_module_cb(struct notifier_block *self,
 					unsigned long val, void *data)
 {
 	struct tp_module *tp_mod = data;
-	struct tracepoint *tpoint;
+	struct tracepoint_user *tuser;
 	struct trace_fprobe *tf;
 	struct dyn_event *pos;
 
@@ -999,21 +1124,46 @@ static int __tracepoint_probe_module_cb(struct notifier_block *self,
 	for_each_trace_fprobe(tf, pos) {
 		if (!trace_fprobe_is_tracepoint(tf))
 			continue;
-		if (val == MODULE_STATE_COMING && tf->tpoint == TRACEPOINT_STUB) {
+
+		if (val == MODULE_STATE_COMING) {
+			/*
+			 * If any tracepoint used by tprobe is in the module,
+			 * register the stub.
+			 */
+			struct tracepoint *tpoint;
+
 			tpoint = find_tracepoint_in_module(tp_mod->mod, tf->symbol);
-			if (tpoint) {
-				tf->tpoint = tpoint;
-				if (!WARN_ON_ONCE(__regsiter_tracepoint_fprobe(tf)) &&
-				    trace_probe_is_enabled(&tf->tp))
-					reenable_trace_fprobe(tf);
+			/* This is not a tracepoint in this module. Skip it. */
+			if (!tpoint)
+				continue;
+
+			tuser = tf->tuser;
+			/* If the tracepoint is not registered yet, register it. */
+			if (!tracepoint_user_is_registered(tuser)) {
+				tuser->tpoint = tpoint;
+				if (WARN_ON_ONCE(tracepoint_user_register(tuser)))
+					continue;
 			}
-		} else if (val == MODULE_STATE_GOING &&
-			   tf->tpoint != TRACEPOINT_STUB &&
-			   within_module((unsigned long)tf->tpoint->probestub, tp_mod->mod)) {
-			unregister_fprobe(&tf->fp);
-			tracepoint_probe_unregister(tf->tpoint,
-				tf->tpoint->probestub, NULL);
-			tf->tpoint = TRACEPOINT_STUB;
+
+			/* Finally enable fprobe on this module. */
+			if (!WARN_ON_ONCE(__regsiter_tracepoint_fprobe(tf)) &&
+			    trace_probe_is_enabled(&tf->tp))
+				reenable_trace_fprobe(tf);
+		} else if (val == MODULE_STATE_GOING) {
+			tuser = tf->tuser;
+			/* Unregister all tracepoint_user in this module. */
+			if (tracepoint_user_is_registered(tuser) &&
+			    tracepoint_user_within_module(tuser, tp_mod->mod))
+				tracepoint_user_unregister(tuser);
+
+			/*
+			 * Here we need to handle shared tracepoint_user case.
+			 * Such tuser is unregistered, but trace_fprobe itself
+			 * is registered. (Note this only handles tprobes.)
+			 */
+			if (!tracepoint_user_is_registered(tuser) &&
+			    trace_fprobe_is_registered(tf))
+				unregister_fprobe(&tf->fp);
 		}
 	}
 	mutex_unlock(&event_mutex);
@@ -1082,7 +1232,9 @@ static int parse_symbol_and_return(int argc, const char *argv[],
 	return 0;
 }
 
-DEFINE_FREE(module_put, struct module *, if (_T) module_put(_T))
+DEFINE_FREE(tuser_put, struct tracepoint_user *,
+	if (!IS_ERR_OR_NULL(_T))
+		tracepoint_user_put(_T))
 
 static int trace_fprobe_create_internal(int argc, const char *argv[],
 					struct traceprobe_parse_context *ctx)
@@ -1112,6 +1264,8 @@ static int trace_fprobe_create_internal(int argc, const char *argv[],
 	 *  FETCHARG:TYPE : use TYPE instead of unsigned long.
 	 */
 	struct trace_fprobe *tf __free(free_trace_fprobe) = NULL;
+	struct tracepoint_user *tuser __free(tuser_put) = NULL;
+	struct module *mod __free(module_put) = NULL;
 	int i, new_argc = 0, ret = 0;
 	bool is_return = false;
 	char *symbol __free(kfree) = NULL;
@@ -1123,8 +1277,6 @@ static int trace_fprobe_create_internal(int argc, const char *argv[],
 	char abuf[MAX_BTF_ARGS_LEN];
 	char *dbuf __free(kfree) = NULL;
 	bool is_tracepoint = false;
-	struct module *tp_mod __free(module_put) = NULL;
-	struct tracepoint *tpoint = NULL;
 
 	if ((argv[0][0] != 'f' && argv[0][0] != 't') || argc < 2)
 		return -ECANCELED;
@@ -1177,20 +1329,18 @@ static int trace_fprobe_create_internal(int argc, const char *argv[],
 
 	if (is_tracepoint) {
 		ctx->flags |= TPARG_FL_TPOINT;
-		tpoint = find_tracepoint(symbol, &tp_mod);
-		if (tpoint) {
-			ctx->funcname = kallsyms_lookup(
-				(unsigned long)tpoint->probestub,
-				NULL, NULL, NULL, sbuf);
-		} else if (IS_ENABLED(CONFIG_MODULES)) {
-				/* This *may* be loaded afterwards */
-				tpoint = TRACEPOINT_STUB;
-				ctx->funcname = symbol;
-		} else {
+		tuser = trace_fprobe_get_tracepoint_user(symbol, &mod);
+		if (!tuser)
+			return -ENOMEM;
+		if (IS_ERR(tuser)) {
 			trace_probe_log_set_index(1);
 			trace_probe_log_err(0, NO_TRACEPOINT);
-			return -EINVAL;
+			return PTR_ERR(tuser);
 		}
+		ctx->funcname = tracepoint_user_lookup(tuser, sbuf);
+		/* If tracepoint is not loaded yet, use symbol name as funcname. */
+		if (!ctx->funcname)
+			ctx->funcname = symbol;
 	} else
 		ctx->funcname = symbol;
 
@@ -1211,13 +1361,14 @@ static int trace_fprobe_create_internal(int argc, const char *argv[],
 		return ret;
 
 	/* setup a probe */
-	tf = alloc_trace_fprobe(group, event, symbol, tpoint, argc, is_return);
+	tf = alloc_trace_fprobe(group, event, symbol, tuser, argc, is_return);
 	if (IS_ERR(tf)) {
 		ret = PTR_ERR(tf);
 		/* This must return -ENOMEM, else there is a bug */
 		WARN_ON_ONCE(ret != -ENOMEM);
 		return ret;
 	}
+	tuser = NULL; /* Move tuser to tf. */
 
 	/* parse arguments */
 	for (i = 0; i < argc; i++) {



* [PATCH v2 5/7] tracing: fprobe-events: Register fprobe-events only when it is enabled
  2025-03-31 15:35 [PATCH v2 0/7] tracing: fprobe-events: Register fprobe only when the event is enabled Masami Hiramatsu (Google)
                   ` (3 preceding siblings ...)
  2025-03-31 15:36 ` [PATCH v2 4/7] tracing: tprobe-events: Support multiple tprobes on the same tracepoint Masami Hiramatsu (Google)
@ 2025-03-31 15:36 ` Masami Hiramatsu (Google)
  2025-03-31 15:36 ` [PATCH v2 6/7] selftests: tracing: Enable fprobe events before checking enable_functions Masami Hiramatsu (Google)
  2025-03-31 15:36 ` [PATCH v2 7/7] tracing: tprobe-events: Register tracepoint when enable tprobe event Masami Hiramatsu (Google)
  6 siblings, 0 replies; 10+ messages in thread
From: Masami Hiramatsu (Google) @ 2025-03-31 15:36 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Currently, an fprobe event is registered when it is defined, so it
adds some overhead even while it is disabled. Change this to register
the fprobe only when the event is enabled.
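
A rough sketch of the expected effect, using the same tracefs files as
the selftest in [6/7] (vfs_read is just an illustrative target):

  cd /sys/kernel/tracing
  echo "f:myevent vfs_read" >> dynamic_events
  grep vfs_read enabled_functions          # no hit: defining does not register the fprobe
  echo 1 > events/fprobes/myevent/enable
  grep vfs_read enabled_functions          # hit: registered at enable time
  echo 0 > events/fprobes/myevent/enable   # unregistered again at disable time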

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 Changes in v2:
  - export fprobe_count_ips_from_filter() to just count addresses instead
    of allocating address list.
---
 include/linux/fprobe.h      |    5 +
 kernel/trace/fprobe.c       |    5 +
 kernel/trace/trace_fprobe.c |  237 +++++++++++++++++++++----------------------
 3 files changed, 126 insertions(+), 121 deletions(-)

diff --git a/include/linux/fprobe.h b/include/linux/fprobe.h
index 702099f08929..7964db96e41a 100644
--- a/include/linux/fprobe.h
+++ b/include/linux/fprobe.h
@@ -94,6 +94,7 @@ int register_fprobe_ips(struct fprobe *fp, unsigned long *addrs, int num);
 int register_fprobe_syms(struct fprobe *fp, const char **syms, int num);
 int unregister_fprobe(struct fprobe *fp);
 bool fprobe_is_registered(struct fprobe *fp);
+int fprobe_count_ips_from_filter(const char *filter, const char *notfilter);
 #else
 static inline int register_fprobe(struct fprobe *fp, const char *filter, const char *notfilter)
 {
@@ -115,6 +116,10 @@ static inline bool fprobe_is_registered(struct fprobe *fp)
 {
 	return false;
 }
+static inline int fprobe_count_ips_from_filter(const char *filter, const char *notfilter)
+{
+	return -EOPNOTSUPP;
+}
 #endif
 
 /**
diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index 95c6e3473a76..50eb7b718c4d 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -647,6 +647,11 @@ static int fprobe_init(struct fprobe *fp, unsigned long *addrs, int num)
 
 #define FPROBE_IPS_MAX	INT_MAX
 
+int fprobe_count_ips_from_filter(const char *filter, const char *notfilter)
+{
+	return get_ips_from_filter(filter, notfilter, NULL, NULL, FPROBE_IPS_MAX);
+}
+
 /**
  * register_fprobe() - Register fprobe to ftrace by pattern.
  * @fp: A fprobe data structure to be registered.
diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
index 150975237a20..cee5153da7ec 100644
--- a/kernel/trace/trace_fprobe.c
+++ b/kernel/trace/trace_fprobe.c
@@ -600,98 +600,6 @@ static struct trace_fprobe *find_trace_fprobe(const char *event,
 	return NULL;
 }
 
-static inline int __enable_trace_fprobe(struct trace_fprobe *tf)
-{
-	if (trace_fprobe_is_registered(tf))
-		enable_fprobe(&tf->fp);
-
-	return 0;
-}
-
-static void __disable_trace_fprobe(struct trace_probe *tp)
-{
-	struct trace_fprobe *tf;
-
-	list_for_each_entry(tf, trace_probe_probe_list(tp), tp.list) {
-		if (!trace_fprobe_is_registered(tf))
-			continue;
-		disable_fprobe(&tf->fp);
-	}
-}
-
-/*
- * Enable trace_probe
- * if the file is NULL, enable "perf" handler, or enable "trace" handler.
- */
-static int enable_trace_fprobe(struct trace_event_call *call,
-			       struct trace_event_file *file)
-{
-	struct trace_probe *tp;
-	struct trace_fprobe *tf;
-	bool enabled;
-	int ret = 0;
-
-	tp = trace_probe_primary_from_call(call);
-	if (WARN_ON_ONCE(!tp))
-		return -ENODEV;
-	enabled = trace_probe_is_enabled(tp);
-
-	/* This also changes "enabled" state */
-	if (file) {
-		ret = trace_probe_add_file(tp, file);
-		if (ret)
-			return ret;
-	} else
-		trace_probe_set_flag(tp, TP_FLAG_PROFILE);
-
-	if (!enabled) {
-		list_for_each_entry(tf, trace_probe_probe_list(tp), tp.list) {
-			/* TODO: check the fprobe is gone */
-			__enable_trace_fprobe(tf);
-		}
-	}
-
-	return 0;
-}
-
-/*
- * Disable trace_probe
- * if the file is NULL, disable "perf" handler, or disable "trace" handler.
- */
-static int disable_trace_fprobe(struct trace_event_call *call,
-				struct trace_event_file *file)
-{
-	struct trace_probe *tp;
-
-	tp = trace_probe_primary_from_call(call);
-	if (WARN_ON_ONCE(!tp))
-		return -ENODEV;
-
-	if (file) {
-		if (!trace_probe_get_file_link(tp, file))
-			return -ENOENT;
-		if (!trace_probe_has_single_file(tp))
-			goto out;
-		trace_probe_clear_flag(tp, TP_FLAG_TRACE);
-	} else
-		trace_probe_clear_flag(tp, TP_FLAG_PROFILE);
-
-	if (!trace_probe_is_enabled(tp))
-		__disable_trace_fprobe(tp);
-
- out:
-	if (file)
-		/*
-		 * Synchronization is done in below function. For perf event,
-		 * file == NULL and perf_trace_event_unreg() calls
-		 * tracepoint_synchronize_unregister() to ensure synchronize
-		 * event. We don't need to care about it.
-		 */
-		trace_probe_remove_file(tp, file);
-
-	return 0;
-}
-
 /* Event entry printers */
 static enum print_line_t
 print_fentry_event(struct trace_iterator *iter, int flags,
@@ -851,6 +759,29 @@ static int __regsiter_tracepoint_fprobe(struct trace_fprobe *tf)
 	return register_fprobe_ips(&tf->fp, &ip, 1);
 }
 
+/* Returns an error if the target function is not available, or 0 */
+static int trace_fprobe_verify_target(struct trace_fprobe *tf)
+{
+	int ret;
+
+	if (trace_fprobe_is_tracepoint(tf)) {
+
+		/* This tracepoint is not loaded yet */
+		if (!tracepoint_user_is_registered(tf->tuser))
+			return 0;
+
+		/* We assume all stab function is tracable. */
+		return tracepoint_user_ip(tf->tuser) ? 0 : -ENOENT;
+	}
+
+	/*
+	 * Note: since we don't lock the module, even if this succeeded,
+	 * register_fprobe() later can fail.
+	 */
+	ret = fprobe_count_ips_from_filter(tf->symbol, NULL);
+	return (ret < 0) ? ret : 0;
+}
+
 /* Internal register function - just handle fprobe and flags */
 static int __register_trace_fprobe(struct trace_fprobe *tf)
 {
@@ -870,11 +801,7 @@ static int __register_trace_fprobe(struct trace_fprobe *tf)
 			return ret;
 	}
 
-	/* Set/clear disabled flag according to tp->flag */
-	if (trace_probe_is_enabled(&tf->tp))
-		tf->fp.flags &= ~FPROBE_FL_DISABLED;
-	else
-		tf->fp.flags |= FPROBE_FL_DISABLED;
+	tf->fp.flags &= ~FPROBE_FL_DISABLED;
 
 	if (trace_fprobe_is_tracepoint(tf)) {
 
@@ -895,10 +822,10 @@ static void __unregister_trace_fprobe(struct trace_fprobe *tf)
 	if (trace_fprobe_is_registered(tf)) {
 		unregister_fprobe(&tf->fp);
 		memset(&tf->fp, 0, sizeof(tf->fp));
-		if (trace_fprobe_is_tracepoint(tf)) {
-			tracepoint_user_put(tf->tuser);
-			tf->tuser = NULL;
-		}
+	}
+	if (trace_fprobe_is_tracepoint(tf)) {
+		tracepoint_user_put(tf->tuser);
+		tf->tuser = NULL;
 	}
 }
 
@@ -958,7 +885,7 @@ static bool trace_fprobe_has_same_fprobe(struct trace_fprobe *orig,
 	return false;
 }
 
-static int append_trace_fprobe(struct trace_fprobe *tf, struct trace_fprobe *to)
+static int append_trace_fprobe_event(struct trace_fprobe *tf, struct trace_fprobe *to)
 {
 	int ret;
 
@@ -986,7 +913,7 @@ static int append_trace_fprobe(struct trace_fprobe *tf, struct trace_fprobe *to)
 	if (ret)
 		return ret;
 
-	ret = __register_trace_fprobe(tf);
+	ret = trace_fprobe_verify_target(tf);
 	if (ret)
 		trace_probe_unlink(&tf->tp);
 	else
@@ -995,8 +922,8 @@ static int append_trace_fprobe(struct trace_fprobe *tf, struct trace_fprobe *to)
 	return ret;
 }
 
-/* Register a trace_probe and probe_event */
-static int register_trace_fprobe(struct trace_fprobe *tf)
+/* Register a trace_probe and probe_event, and check the fprobe is available. */
+static int register_trace_fprobe_event(struct trace_fprobe *tf)
 {
 	struct trace_fprobe *old_tf;
 	int ret;
@@ -1006,7 +933,7 @@ static int register_trace_fprobe(struct trace_fprobe *tf)
 	old_tf = find_trace_fprobe(trace_probe_name(&tf->tp),
 				   trace_probe_group_name(&tf->tp));
 	if (old_tf)
-		return append_trace_fprobe(tf, old_tf);
+		return append_trace_fprobe_event(tf, old_tf);
 
 	/* Register new event */
 	ret = register_fprobe_event(tf);
@@ -1019,8 +946,8 @@ static int register_trace_fprobe(struct trace_fprobe *tf)
 		return ret;
 	}
 
-	/* Register fprobe */
-	ret = __register_trace_fprobe(tf);
+	/* Verify fprobe is sane. */
+	ret = trace_fprobe_verify_target(tf);
 	if (ret < 0)
 		unregister_fprobe_event(tf);
 	else
@@ -1084,15 +1011,6 @@ static struct tracepoint *find_tracepoint(const char *tp_name,
 }
 
 #ifdef CONFIG_MODULES
-static void reenable_trace_fprobe(struct trace_fprobe *tf)
-{
-	struct trace_probe *tp = &tf->tp;
-
-	list_for_each_entry(tf, trace_probe_probe_list(tp), tp.list) {
-		__enable_trace_fprobe(tf);
-	}
-}
-
 /*
  * Find a tracepoint from specified module. In this case, this does not get the
  * module's refcount. The caller must ensure the module is not freed.
@@ -1146,9 +1064,8 @@ static int __tracepoint_probe_module_cb(struct notifier_block *self,
 			}
 
 			/* Finally enable fprobe on this module. */
-			if (!WARN_ON_ONCE(__regsiter_tracepoint_fprobe(tf)) &&
-			    trace_probe_is_enabled(&tf->tp))
-				reenable_trace_fprobe(tf);
+			if (trace_probe_is_enabled(&tf->tp) && !trace_fprobe_is_registered(tf))
+				WARN_ON_ONCE(__regsiter_tracepoint_fprobe(tf));
 		} else if (val == MODULE_STATE_GOING) {
 			tuser = tf->tuser;
 			/* Unregister all tracepoint_user in this module. */
@@ -1394,7 +1311,7 @@ static int trace_fprobe_create_internal(int argc, const char *argv[],
 	if (ret < 0)
 		return ret;
 
-	ret = register_trace_fprobe(tf);
+	ret = register_trace_fprobe_event(tf);
 	if (ret) {
 		trace_probe_log_set_index(1);
 		if (ret == -EILSEQ)
@@ -1463,6 +1380,84 @@ static int trace_fprobe_show(struct seq_file *m, struct dyn_event *ev)
 	return 0;
 }
 
+/*
+ * Enable trace_probe
+ * if the file is NULL, enable "perf" handler, or enable "trace" handler.
+ */
+static int enable_trace_fprobe(struct trace_event_call *call,
+			       struct trace_event_file *file)
+{
+	struct trace_probe *tp;
+	struct trace_fprobe *tf;
+	bool enabled;
+	int ret = 0;
+
+	tp = trace_probe_primary_from_call(call);
+	if (WARN_ON_ONCE(!tp))
+		return -ENODEV;
+	enabled = trace_probe_is_enabled(tp);
+
+	/* This also changes "enabled" state */
+	if (file) {
+		ret = trace_probe_add_file(tp, file);
+		if (ret)
+			return ret;
+	} else
+		trace_probe_set_flag(tp, TP_FLAG_PROFILE);
+
+	if (!enabled) {
+		list_for_each_entry(tf, trace_probe_probe_list(tp), tp.list) {
+			ret = __register_trace_fprobe(tf);
+			if (ret < 0)
+				return ret;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * Disable trace_probe
+ * if the file is NULL, disable "perf" handler, or disable "trace" handler.
+ */
+static int disable_trace_fprobe(struct trace_event_call *call,
+				struct trace_event_file *file)
+{
+	struct trace_fprobe *tf;
+	struct trace_probe *tp;
+
+	tp = trace_probe_primary_from_call(call);
+	if (WARN_ON_ONCE(!tp))
+		return -ENODEV;
+
+	if (file) {
+		if (!trace_probe_get_file_link(tp, file))
+			return -ENOENT;
+		if (!trace_probe_has_single_file(tp))
+			goto out;
+		trace_probe_clear_flag(tp, TP_FLAG_TRACE);
+	} else
+		trace_probe_clear_flag(tp, TP_FLAG_PROFILE);
+
+	if (!trace_probe_is_enabled(tp)) {
+		list_for_each_entry(tf, trace_probe_probe_list(tp), tp.list) {
+			unregister_fprobe(&tf->fp);
+		}
+	}
+
+ out:
+	if (file)
+		/*
+		 * Synchronization is done in below function. For perf event,
+		 * file == NULL and perf_trace_event_unreg() calls
+		 * tracepoint_synchronize_unregister() to ensure synchronize
+		 * event. We don't need to care about it.
+		 */
+		trace_probe_remove_file(tp, file);
+
+	return 0;
+}
+
 /*
  * called by perf_trace_init() or __ftrace_set_clr_event() under event_mutex.
  */



* [PATCH v2 6/7] selftests: tracing: Enable fprobe events before checking enable_functions
  2025-03-31 15:35 [PATCH v2 0/7] tracing: fprobe-events: Register fprobe only when the event is enabled Masami Hiramatsu (Google)
                   ` (4 preceding siblings ...)
  2025-03-31 15:36 ` [PATCH v2 5/7] tracing: fprobe-events: Register fprobe-events only when it is enabled Masami Hiramatsu (Google)
@ 2025-03-31 15:36 ` Masami Hiramatsu (Google)
  2025-03-31 15:36 ` [PATCH v2 7/7] tracing: tprobe-events: Register tracepoint when enable tprobe event Masami Hiramatsu (Google)
  6 siblings, 0 replies; 10+ messages in thread
From: Masami Hiramatsu (Google) @ 2025-03-31 15:36 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Since the fprobe is not registered until the fprobe event is enabled,
enabled_functions is also empty before the event is enabled. Thus the
tests that check enabled_functions must ensure the event is enabled
before checking it.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 .../ftrace/test.d/dynevent/add_remove_fprobe.tc    |   30 +++++++++++++-------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/tools/testing/selftests/ftrace/test.d/dynevent/add_remove_fprobe.tc b/tools/testing/selftests/ftrace/test.d/dynevent/add_remove_fprobe.tc
index 73f6c6fcecab..2506f464811b 100644
--- a/tools/testing/selftests/ftrace/test.d/dynevent/add_remove_fprobe.tc
+++ b/tools/testing/selftests/ftrace/test.d/dynevent/add_remove_fprobe.tc
@@ -16,6 +16,18 @@ ocnt=`cat enabled_functions | wc -l`
 
 echo "f:myevent1 $PLACE" >> dynamic_events
 
+echo "f:myevent2 $PLACE%return" >> dynamic_events
+
+# add another event
+echo "f:myevent3 $PLACE2" >> dynamic_events
+
+grep -q myevent1 dynamic_events
+grep -q myevent2 dynamic_events
+grep -q myevent3 dynamic_events
+test -d events/fprobes/myevent1
+test -d events/fprobes/myevent2
+
+echo 1 > events/fprobes/myevent1/enable
 # Make sure the event is attached and is the only one
 grep -q $PLACE enabled_functions
 cnt=`cat enabled_functions | wc -l`
@@ -23,29 +35,22 @@ if [ $cnt -ne $((ocnt + 1)) ]; then
 	exit_fail
 fi
 
-echo "f:myevent2 $PLACE%return" >> dynamic_events
-
+echo 1 > events/fprobes/myevent2/enable
 # It should till be the only attached function
 cnt=`cat enabled_functions | wc -l`
 if [ $cnt -ne $((ocnt + 1)) ]; then
 	exit_fail
 fi
 
-# add another event
-echo "f:myevent3 $PLACE2" >> dynamic_events
-
+echo 1 > events/fprobes/myevent3/enable
+# If the function is different, the attached function should be increased
 grep -q $PLACE2 enabled_functions
 cnt=`cat enabled_functions | wc -l`
 if [ $cnt -ne $((ocnt + 2)) ]; then
 	exit_fail
 fi
 
-grep -q myevent1 dynamic_events
-grep -q myevent2 dynamic_events
-grep -q myevent3 dynamic_events
-test -d events/fprobes/myevent1
-test -d events/fprobes/myevent2
-
+echo 0 > events/fprobes/myevent2/enable
 echo "-:myevent2" >> dynamic_events
 
 grep -q myevent1 dynamic_events
@@ -57,6 +62,7 @@ if [ $cnt -ne $((ocnt + 2)) ]; then
 	exit_fail
 fi
 
+echo 0 > events/fprobes/enable
 echo > dynamic_events
 
 # Should have none left
@@ -67,12 +73,14 @@ fi
 
 echo "f:myevent4 $PLACE" >> dynamic_events
 
+echo 1 > events/fprobes/myevent4/enable
 # Should only have one enabled
 cnt=`cat enabled_functions | wc -l`
 if [ $cnt -ne $((ocnt + 1)) ]; then
 	exit_fail
 fi
 
+echo 0 > events/fprobes/enable
 echo > dynamic_events
 
 # Should have none left



* [PATCH v2 7/7] tracing: tprobe-events: Register tracepoint when enable tprobe event
  2025-03-31 15:35 [PATCH v2 0/7] tracing: fprobe-events: Register fprobe only when the event is enabled Masami Hiramatsu (Google)
                   ` (5 preceding siblings ...)
  2025-03-31 15:36 ` [PATCH v2 6/7] selftests: tracing: Enable fprobe events before checking enable_functions Masami Hiramatsu (Google)
@ 2025-03-31 15:36 ` Masami Hiramatsu (Google)
  6 siblings, 0 replies; 10+ messages in thread
From: Masami Hiramatsu (Google) @ 2025-03-31 15:36 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

As with fprobe, register the tracepoint stub function only when
enabling tprobe events. The major changes are introducing a list of
tracepoint_user with its own lock, and tprobe_event_module_nb, another
module notifier for module loading/unloading. By splitting the lock
from event_mutex and using a separate module notifier for
trace_fprobe, this solves the AB-BA lock dependency issue between
event_mutex and tracepoint_module_list_mutex.
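
A rough sketch of the intended flow from the tracefs side (the module
and tracepoint names below are hypothetical):

  cd /sys/kernel/tracing
  echo "t:myprobe some_module_tracepoint" >> dynamic_events  # module not loaded yet
  echo 1 > events/tracepoints/myprobe/enable  # no stub registered yet; waits for the module
  modprobe some_module                        # notifier registers the tracepoint stub and fprobe
  rmmod some_module                           # notifier unregisters them again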

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 Changes in v2:
  - Add some input error checkings.
---
 kernel/trace/trace_fprobe.c |  382 ++++++++++++++++++++++++-------------------
 1 file changed, 217 insertions(+), 165 deletions(-)

diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
index cee5153da7ec..fb94f935c5d3 100644
--- a/kernel/trace/trace_fprobe.c
+++ b/kernel/trace/trace_fprobe.c
@@ -7,7 +7,9 @@
 #include <asm/ptrace.h>
 
 #include <linux/fprobe.h>
+#include <linux/list.h>
 #include <linux/module.h>
+#include <linux/mutex.h>
 #include <linux/rculist.h>
 #include <linux/security.h>
 #include <linux/tracepoint.h>
@@ -37,15 +39,21 @@ static struct dyn_event_operations trace_fprobe_ops = {
 	.match = trace_fprobe_match,
 };
 
+/* List of tracepoint_user */
+static LIST_HEAD(tracepoint_user_list);
+static DEFINE_MUTEX(tracepoint_user_mutex);
+
+/* While a tracepoint_user is alive, @tpoint can be NULL but @refcount != 0. */
 struct tracepoint_user {
+	struct list_head	list;
+	const char		*name;
 	struct tracepoint	*tpoint;
 	unsigned int		refcount;
 };
 
-static bool tracepoint_user_is_registered(struct tracepoint_user *tuser)
-{
-	return tuser && tuser->tpoint;
-}
+/* NOTE: you must lock tracepoint_user_mutex. */
+#define for_each_tracepoint_user(tuser)		\
+	list_for_each_entry(tuser, &tracepoint_user_list, list)
 
 static int tracepoint_user_register(struct tracepoint_user *tuser)
 {
@@ -75,58 +83,111 @@ static unsigned long tracepoint_user_ip(struct tracepoint_user *tuser)
 	return (unsigned long)tuser->tpoint->probestub;
 }
 
-static bool tracepoint_user_within_module(struct tracepoint_user *tuser,
-					  struct module *mod)
+static void __tracepoint_user_free(struct tracepoint_user *tuser)
 {
-	return within_module(tracepoint_user_ip(tuser), mod);
+	if (!tuser)
+		return;
+	kfree(tuser->name);
+	kfree(tuser);
 }
 
-static struct tracepoint_user *tracepoint_user_allocate(struct tracepoint *tpoint)
+DEFINE_FREE(tuser_free, struct tracepoint_user *, __tracepoint_user_free(_T))
+
+static struct tracepoint_user *__tracepoint_user_init(const char *name, struct tracepoint *tpoint)
 {
-	struct tracepoint_user *tuser;
+	struct tracepoint_user *tuser __free(tuser_free) = NULL;
+	int ret;
 
 	tuser = kzalloc(sizeof(*tuser), GFP_KERNEL);
 	if (!tuser)
 		return NULL;
+	tuser->name = kstrdup(name, GFP_KERNEL);
+	if (!tuser->name)
+		return NULL;
+
+	if (tpoint) {
+		ret = tracepoint_user_register(tuser);
+		if (ret)
+			return ERR_PTR(ret);
+	}
+
 	tuser->tpoint = tpoint;
 	tuser->refcount = 1;
+	INIT_LIST_HEAD(&tuser->list);
+	list_add(&tuser->list, &tracepoint_user_list);
 
-	return tuser;
+	return_ptr(tuser);
 }
 
-/* These must be called with event_mutex */
-static void tracepoint_user_get(struct tracepoint_user *tuser)
-{
-	tuser->refcount++;
-}
+static struct tracepoint *find_tracepoint(const char *tp_name,
+	struct module **tp_mod);
 
-static void tracepoint_user_put(struct tracepoint_user *tuser)
+/*
+ * Get tracepoint_user if exist, or allocate new one and register it.
+ * If tracepoint is on a module, get its refcounter too.
+ * This returns errno or NULL (not loaded yet) or tracepoint_user.
+ */
+static struct tracepoint_user *tracepoint_user_find_get(const char *name, struct module **pmod)
 {
-	if (--tuser->refcount > 0)
-		return;
+	struct module *mod __free(module_put) = NULL;
+	struct tracepoint_user *tuser;
+	struct tracepoint *tpoint;
 
-	if (tracepoint_user_is_registered(tuser))
-		tracepoint_user_unregister(tuser);
-	kfree(tuser);
+	if (!name || !pmod)
+		return ERR_PTR(-EINVAL);
+
+	/* Get and lock the module which has tracepoint. */
+	tpoint = find_tracepoint(name, &mod);
+
+	guard(mutex)(&tracepoint_user_mutex);
+	/* Search existing tracepoint_user */
+	for_each_tracepoint_user(tuser) {
+		if (!strcmp(tuser->name, name)) {
+			tuser->refcount++;
+			*pmod = no_free_ptr(mod);
+			return tuser;
+		}
+	}
+
+	/* The corresponding tracepoint_user is not found. */
+	tuser = __tracepoint_user_init(name, tpoint);
+	if (!IS_ERR_OR_NULL(tuser))
+		*pmod = no_free_ptr(mod);
+
+	return tuser;
 }
 
-static const char *tracepoint_user_lookup(struct tracepoint_user *tuser, char *buf)
+static void tracepoint_user_put(struct tracepoint_user *tuser)
 {
-	struct tracepoint *tpoint = tuser->tpoint;
+	scoped_guard(mutex, &tracepoint_user_mutex) {
+		if (--tuser->refcount > 0)
+			return;
 
-	if (!tpoint)
-		return NULL;
+		list_del(&tuser->list);
+		tracepoint_user_unregister(tuser);
+	}
 
-	return kallsyms_lookup((unsigned long)tpoint->probestub, NULL, NULL, NULL, buf);
+	__tracepoint_user_free(tuser);
 }
 
+DEFINE_FREE(tuser_put, struct tracepoint_user *,
+	if (!IS_ERR_OR_NULL(_T))
+		tracepoint_user_put(_T))
+
 /*
  * Fprobe event core functions
  */
+
+/*
+ * @tprobe is true for tracepoint probe.
+ * @tuser can be NULL if the trace_fprobe is disabled or the tracepoint is not
+ * loaded with a module. If @tuser != NULL, this trace_fprobe is enabled.
+ */
 struct trace_fprobe {
 	struct dyn_event	devent;
 	struct fprobe		fp;
 	const char		*symbol;
+	bool			tprobe;
 	struct tracepoint_user	*tuser;
 	struct trace_probe	tp;
 };
@@ -157,7 +218,7 @@ static bool trace_fprobe_is_return(struct trace_fprobe *tf)
 
 static bool trace_fprobe_is_tracepoint(struct trace_fprobe *tf)
 {
-	return tf->tuser != NULL;
+	return tf->tprobe;
 }
 
 static const char *trace_fprobe_symbol(struct trace_fprobe *tf)
@@ -207,56 +268,6 @@ static bool trace_fprobe_is_registered(struct trace_fprobe *tf)
 	return fprobe_is_registered(&tf->fp);
 }
 
-static struct tracepoint *find_tracepoint(const char *tp_name,
-	struct module **tp_mod);
-
-/*
- * Get tracepoint_user if exist, or allocate new one. If tracepoint is on a
- * module, get its refcounter.
- */
-static struct tracepoint_user *
-trace_fprobe_get_tracepoint_user(const char *name, struct module **pmod)
-{
-	struct tracepoint_user *tuser __free(kfree) = NULL;
-	struct tracepoint *tpoint;
-	struct trace_fprobe *tf;
-	struct dyn_event *dpos;
-	struct module *mod __free(module_put) = NULL;
-	int ret;
-
-	/*
-	 * Find appropriate tracepoint and locking module.
-	 * Note: tpoint can be NULL if it is unloaded (or failed to get module.)
-	 */
-	tpoint = find_tracepoint(name, &mod);
-
-	/* Search existing tracepoint_user */
-	for_each_trace_fprobe(tf, dpos) {
-		if (!trace_fprobe_is_tracepoint(tf))
-			continue;
-		if (!strcmp(tf->symbol, name)) {
-			tracepoint_user_get(tf->tuser);
-			*pmod = no_free_ptr(mod);
-			return tf->tuser;
-		}
-	}
-
-	/* Not found, allocate and register new tracepoint_user. */
-	tuser = tracepoint_user_allocate(tpoint);
-	if (!tuser)
-		return NULL;
-
-	if (tpoint) {
-		/* If the tracepoint is not loaded, tpoint can be NULL. */
-		ret = tracepoint_user_register(tuser);
-		if (ret)
-			return ERR_PTR(ret);
-	}
-
-	*pmod = no_free_ptr(mod);
-	return_ptr(tuser);
-}
-
 /*
  * Note that we don't verify the fetch_insn code, since it does not come
  * from user space.
@@ -558,8 +569,8 @@ DEFINE_FREE(free_trace_fprobe, struct trace_fprobe *, if (!IS_ERR_OR_NULL(_T)) f
 static struct trace_fprobe *alloc_trace_fprobe(const char *group,
 					       const char *event,
 					       const char *symbol,
-					       struct tracepoint_user *tuser,
-					       int nargs, bool is_return)
+					       int nargs, bool is_return,
+					       bool is_tracepoint)
 {
 	struct trace_fprobe *tf __free(free_trace_fprobe) = NULL;
 	int ret = -ENOMEM;
@@ -577,7 +588,7 @@ static struct trace_fprobe *alloc_trace_fprobe(const char *group,
 	else
 		tf->fp.entry_handler = fentry_dispatcher;
 
-	tf->tuser = tuser;
+	tf->tprobe = is_tracepoint;
 
 	ret = trace_probe_init(&tf->tp, event, group, false, nargs);
 	if (ret < 0)
@@ -751,12 +762,35 @@ static int unregister_fprobe_event(struct trace_fprobe *tf)
 
 static int __regsiter_tracepoint_fprobe(struct trace_fprobe *tf)
 {
-	unsigned long ip = tracepoint_user_ip(tf->tuser);
+	struct tracepoint_user *tuser __free(tuser_put) = NULL;
+	struct module *mod __free(module_put) = NULL;
+	unsigned long ip;
+	int ret;
 
-	if (!ip)
-		return -ENOENT;
+	if (WARN_ON_ONCE(tf->tuser))
+		return -EINVAL;
+
+	/* If the tracepoint is in a module, it must be locked in this function. */
+	tuser = tracepoint_user_find_get(tf->symbol, &mod);
+	/* This tracepoint is not loaded yet */
+	if (IS_ERR(tuser))
+		return PTR_ERR(tuser);
+	if (!tuser)
+		return -ENOMEM;
 
-	return register_fprobe_ips(&tf->fp, &ip, 1);
+	/* Register fprobe only if the tracepoint is loaded. */
+	if (tuser->tpoint) {
+		ip = tracepoint_user_ip(tuser);
+		if (WARN_ON_ONCE(!ip))
+			return -ENOENT;
+
+		ret = register_fprobe_ips(&tf->fp, &ip, 1);
+		if (ret < 0)
+			return ret;
+	}
+
+	tf->tuser = no_free_ptr(tuser);
+	return 0;
 }
 
 /* Returns an error if the target function is not available, or 0 */
@@ -764,15 +798,9 @@ static int trace_fprobe_verify_target(struct trace_fprobe *tf)
 {
 	int ret;
 
-	if (trace_fprobe_is_tracepoint(tf)) {
-
-		/* This tracepoint is not loaded yet */
-		if (!tracepoint_user_is_registered(tf->tuser))
-			return 0;
-
-		/* We assume all stab function is tracable. */
-		return tracepoint_user_ip(tf->tuser) ? 0 : -ENOENT;
-	}
+	/* Tracepoint should have a stub function. */
+	if (trace_fprobe_is_tracepoint(tf))
+		return 0;
 
 	/*
 	 * Note: since we don't lock the module, even if this succeeded,
@@ -803,14 +831,8 @@ static int __register_trace_fprobe(struct trace_fprobe *tf)
 
 	tf->fp.flags &= ~FPROBE_FL_DISABLED;
 
-	if (trace_fprobe_is_tracepoint(tf)) {
-
-		/* This tracepoint is not loaded yet */
-		if (!tracepoint_user_is_registered(tf->tuser))
-			return 0;
-
+	if (trace_fprobe_is_tracepoint(tf))
 		return __regsiter_tracepoint_fprobe(tf);
-	}
 
 	/* TODO: handle filter, nofilter or symbol list */
 	return register_fprobe(&tf->fp, tf->symbol, NULL);
@@ -819,11 +841,9 @@ static int __register_trace_fprobe(struct trace_fprobe *tf)
 /* Internal unregister function - just handle fprobe and flags */
 static void __unregister_trace_fprobe(struct trace_fprobe *tf)
 {
-	if (trace_fprobe_is_registered(tf)) {
+	if (trace_fprobe_is_registered(tf))
 		unregister_fprobe(&tf->fp);
-		memset(&tf->fp, 0, sizeof(tf->fp));
-	}
-	if (trace_fprobe_is_tracepoint(tf)) {
+	if (tf->tuser) {
 		tracepoint_user_put(tf->tuser);
 		tf->tuser = NULL;
 	}
@@ -1027,63 +1047,52 @@ static struct tracepoint *find_tracepoint_in_module(struct module *mod,
 	return data.tpoint;
 }
 
+/* These are CONFIG_MODULES=y specific functions. */
+static bool tracepoint_user_within_module(struct tracepoint_user *tuser,
+					  struct module *mod)
+{
+	return within_module(tracepoint_user_ip(tuser), mod);
+}
+
+static int tracepoint_user_register_again(struct tracepoint_user *tuser,
+					  struct tracepoint *tpoint)
+{
+	tuser->tpoint = tpoint;
+	return tracepoint_user_register(tuser);
+}
+
+static void tracepoint_user_unregister_clear(struct tracepoint_user *tuser)
+{
+	tracepoint_user_unregister(tuser);
+	tuser->tpoint = NULL;
+}
+
+/* module callback for tracepoint_user */
 static int __tracepoint_probe_module_cb(struct notifier_block *self,
 					unsigned long val, void *data)
 {
 	struct tp_module *tp_mod = data;
 	struct tracepoint_user *tuser;
-	struct trace_fprobe *tf;
-	struct dyn_event *pos;
+	struct tracepoint *tpoint;
 
 	if (val != MODULE_STATE_GOING && val != MODULE_STATE_COMING)
 		return NOTIFY_DONE;
 
-	mutex_lock(&event_mutex);
-	for_each_trace_fprobe(tf, pos) {
-		if (!trace_fprobe_is_tracepoint(tf))
-			continue;
-
+	mutex_lock(&tracepoint_user_mutex);
+	for_each_tracepoint_user(tuser) {
 		if (val == MODULE_STATE_COMING) {
-			/*
-			 * If any tracepoint used by tprobe is in the module,
-			 * register the stub.
-			 */
-			struct tracepoint *tpoint;
-
-			tpoint = find_tracepoint_in_module(tp_mod->mod, tf->symbol);
 			/* This is not a tracepoint in this module. Skip it. */
+			tpoint = find_tracepoint_in_module(tp_mod->mod, tuser->name);
 			if (!tpoint)
 				continue;
-
-			tuser = tf->tuser;
-			/* If the tracepoint is not registered yet, register it. */
-			if (!tracepoint_user_is_registered(tuser)) {
-				tuser->tpoint = tpoint;
-				if (WARN_ON_ONCE(tracepoint_user_register(tuser)))
-					continue;
-			}
-
-			/* Finally enable fprobe on this module. */
-			if (trace_probe_is_enabled(&tf->tp) && !trace_fprobe_is_registered(tf))
-				WARN_ON_ONCE(__regsiter_tracepoint_fprobe(tf));
-		} else if (val == MODULE_STATE_GOING) {
-			tuser = tf->tuser;
+			WARN_ON_ONCE(tracepoint_user_register_again(tuser, tpoint));
+		} else if (val == MODULE_STATE_GOING &&
+			  tracepoint_user_within_module(tuser, tp_mod->mod)) {
 			/* Unregister all tracepoint_user in this module. */
-			if (tracepoint_user_is_registered(tuser) &&
-			    tracepoint_user_within_module(tuser, tp_mod->mod))
-				tracepoint_user_unregister(tuser);
-
-			/*
-			 * Here we need to handle shared tracepoint_user case.
-			 * Such tuser is unregistered, but trace_fprobe itself
-			 * is registered. (Note this only handles tprobes.)
-			 */
-			if (!tracepoint_user_is_registered(tuser) &&
-			    trace_fprobe_is_registered(tf))
-				unregister_fprobe(&tf->fp);
+			tracepoint_user_unregister_clear(tuser);
 		}
 	}
-	mutex_unlock(&event_mutex);
+	mutex_unlock(&tracepoint_user_mutex);
 
 	return NOTIFY_DONE;
 }
@@ -1091,6 +1100,54 @@ static int __tracepoint_probe_module_cb(struct notifier_block *self,
 static struct notifier_block tracepoint_module_nb = {
 	.notifier_call = __tracepoint_probe_module_cb,
 };
+
+/* module callback for tprobe events */
+static int __tprobe_event_module_cb(struct notifier_block *self,
+				     unsigned long val, void *data)
+{
+	struct trace_fprobe *tf;
+	struct dyn_event *pos;
+	struct module *mod = data;
+
+	if (val != MODULE_STATE_GOING && val != MODULE_STATE_COMING)
+		return NOTIFY_DONE;
+
+	mutex_lock(&event_mutex);
+	for_each_trace_fprobe(tf, pos) {
+		/* Skip fprobe and disabled tprobe events. */
+		if (!trace_fprobe_is_tracepoint(tf) || !tf->tuser)
+			continue;
+
+		/* The tracepoint module notifier has already run before this notification. */
+		if (val == MODULE_STATE_COMING &&
+		    tracepoint_user_within_module(tf->tuser, mod)) {
+			unsigned long ip = tracepoint_user_ip(tf->tuser);
+
+			WARN_ON_ONCE(register_fprobe_ips(&tf->fp, &ip, 1));
+		} else if (val == MODULE_STATE_GOING &&
+			   /*
+			    * tracepoint_user_within_module() does not work here because
+			    * the tracepoint_user has already been unregistered and its
+			    * tpoint cleared. Instead, check whether the fprobe is still
+			    * registered while the tpoint is cleared (unregistered).
+			    * Such unbalanced probes must be adjusted anyway.
+			    */
+			    trace_fprobe_is_registered(tf) &&
+			    !tf->tuser->tpoint) {
+			unregister_fprobe(&tf->fp);
+		}
+	}
+	mutex_unlock(&event_mutex);
+
+	return NOTIFY_DONE;
+}
+
+/* NOTE: this notifier must run after the tracepoint module notifier callback */
+static struct notifier_block tprobe_event_module_nb = {
+	.notifier_call = __tprobe_event_module_cb,
+	/* Make sure this is later than tracepoint module notifier. */
+	.priority = -10,
+};
 #endif /* CONFIG_MODULES */
 
 static int parse_symbol_and_return(int argc, const char *argv[],
@@ -1149,10 +1206,6 @@ static int parse_symbol_and_return(int argc, const char *argv[],
 	return 0;
 }
 
-DEFINE_FREE(tuser_put, struct tracepoint_user *,
-	if (!IS_ERR_OR_NULL(_T))
-		tracepoint_user_put(_T))
-
 static int trace_fprobe_create_internal(int argc, const char *argv[],
 					struct traceprobe_parse_context *ctx)
 {
@@ -1181,7 +1234,6 @@ static int trace_fprobe_create_internal(int argc, const char *argv[],
 	 *  FETCHARG:TYPE : use TYPE instead of unsigned long.
 	 */
 	struct trace_fprobe *tf __free(free_trace_fprobe) = NULL;
-	struct tracepoint_user *tuser __free(tuser_put) = NULL;
 	struct module *mod __free(module_put) = NULL;
 	int i, new_argc = 0, ret = 0;
 	bool is_return = false;
@@ -1244,21 +1296,19 @@ static int trace_fprobe_create_internal(int argc, const char *argv[],
 	else
 		ctx->flags |= TPARG_FL_FENTRY;
 
+	ctx->funcname = NULL;
 	if (is_tracepoint) {
+		/* Get tracepoint and lock its module until the end of the registration. */
+		struct tracepoint *tpoint;
+
 		ctx->flags |= TPARG_FL_TPOINT;
-		tuser = trace_fprobe_get_tracepoint_user(symbol, &mod);
-		if (!tuser)
-			return -ENOMEM;
-		if (IS_ERR(tuser)) {
-			trace_probe_log_set_index(1);
-			trace_probe_log_err(0, NO_TRACEPOINT);
-			return PTR_ERR(tuser);
-		}
-		ctx->funcname = tracepoint_user_lookup(tuser, sbuf);
-		/* If tracepoint is not loaded yet, use symbol name as funcname. */
-		if (!ctx->funcname)
-			ctx->funcname = symbol;
-	} else
+		mod = NULL;
+		tpoint = find_tracepoint(symbol, &mod);
+		if (tpoint)
+			ctx->funcname = kallsyms_lookup((unsigned long)tpoint->probestub,
+							NULL, NULL, NULL, sbuf);
+	}
+	if (!ctx->funcname)
 		ctx->funcname = symbol;
 
 	argc -= 2; argv += 2;
@@ -1278,14 +1328,13 @@ static int trace_fprobe_create_internal(int argc, const char *argv[],
 		return ret;
 
 	/* setup a probe */
-	tf = alloc_trace_fprobe(group, event, symbol, tuser, argc, is_return);
+	tf = alloc_trace_fprobe(group, event, symbol, argc, is_return, is_tracepoint);
 	if (IS_ERR(tf)) {
 		ret = PTR_ERR(tf);
 		/* This must return -ENOMEM, else there is a bug */
 		WARN_ON_ONCE(ret != -ENOMEM);
 		return ret;
 	}
-	tuser = NULL; /* Move tuser to tf. */
 
 	/* parse arguments */
 	for (i = 0; i < argc; i++) {
@@ -1503,6 +1552,9 @@ static __init int init_fprobe_trace_early(void)
 	ret = register_tracepoint_module_notifier(&tracepoint_module_nb);
 	if (ret)
 		return ret;
+	ret = register_module_notifier(&tprobe_event_module_nb);
+	if (ret)
+		return ret;
 #endif
 
 	return 0;
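
As a usage sketch (an illustration only, not part of this patch): with this
series, defining a tprobe event via tracefs should no longer register the
fprobe on the tracepoint stub; registration should happen when the event is
enabled. Assuming the standard tracefs mount point and the documented "t:"
dynamic_events syntax, with example group/event names:

  cd /sys/kernel/tracing
  # Define a tprobe on the sched_switch tracepoint. With this series,
  # defining it alone does not register the fprobe.
  echo 't:mygrp/sw sched_switch' >> dynamic_events
  # Enabling the event is what triggers the fprobe registration.
  echo 1 > events/mygrp/sw/enable
  # Disable and remove the event again.
  echo 0 > events/mygrp/sw/enable
  echo '-:mygrp/sw' >> dynamic_events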


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 1/7] tracing: fprobe events: Fix possible UAF on modules
  2025-03-31 15:35 ` [PATCH v2 1/7] tracing: fprobe events: Fix possible UAF on modules Masami Hiramatsu (Google)
@ 2025-03-31 15:38   ` Masami Hiramatsu
  0 siblings, 0 replies; 10+ messages in thread
From: Masami Hiramatsu @ 2025-03-31 15:38 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Steven Rostedt, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel

Oops, I have already sent this as;

https://lore.kernel.org/all/174342990779.781946.9138388479067729366.stgit@devnote2/

please ignore this.

On Tue,  1 Apr 2025 00:35:36 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> Commit ac91052f0ae5 ("tracing: tprobe-events: Fix leakage of module
> refcount") moved try_module_get() from __find_tracepoint_module_cb()
> to find_tracepoint() caller, but that introduced a possible UAF
> because the module can be unloaded before try_module_get(). In this
> case, the module object should be freed too. Thus, try_module_get()
> does not only fail but may access to the freed object.
> 
> To avoid that, try_module_get() in __find_tracepoint_module_cb()
> again.
> 
> Fixes: ac91052f0ae5 ("tracing: tprobe-events: Fix leakage of module refcount")
> Cc: stable@vger.kernel.org
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> ---
>  kernel/trace/trace_fprobe.c |   26 +++++++++++++++++---------
>  1 file changed, 17 insertions(+), 9 deletions(-)
> 
> diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
> index 985ff98272da..2cd9ff1049f1 100644
> --- a/kernel/trace/trace_fprobe.c
> +++ b/kernel/trace/trace_fprobe.c
> @@ -919,9 +919,15 @@ static void __find_tracepoint_module_cb(struct tracepoint *tp, struct module *mo
>  	struct __find_tracepoint_cb_data *data = priv;
>  
>  	if (!data->tpoint && !strcmp(data->tp_name, tp->name)) {
> -		data->tpoint = tp;
> -		if (!data->mod)
> +		/* If module is not specified, try getting module refcount. */
> +		if (!data->mod && mod) {
> +			/* If failed to get refcount, ignore this tracepoint. */
> +			if (!try_module_get(mod))
> +				return;
> +
>  			data->mod = mod;
> +		}
> +		data->tpoint = tp;
>  	}
>  }
>  
> @@ -933,7 +939,11 @@ static void __find_tracepoint_cb(struct tracepoint *tp, void *priv)
>  		data->tpoint = tp;
>  }
>  
> -/* Find a tracepoint from kernel and module. */
> +/*
> + * Find a tracepoint from the kernel and modules. If the tracepoint is in a module,
> + * the module's refcount is incremented and returned as *@tp_mod. Thus, if it is
> + * not NULL, the caller must call module_put(*tp_mod) after using the tracepoint.
> + */
>  static struct tracepoint *find_tracepoint(const char *tp_name,
>  					  struct module **tp_mod)
>  {
> @@ -962,7 +972,10 @@ static void reenable_trace_fprobe(struct trace_fprobe *tf)
>  	}
>  }
>  
> -/* Find a tracepoint from specified module. */
> +/*
> + * Find a tracepoint from specified module. In this case, this does not get the
> + * module's refcount. The caller must ensure the module is not freed.
> + */
>  static struct tracepoint *find_tracepoint_in_module(struct module *mod,
>  						    const char *tp_name)
>  {
> @@ -1169,11 +1182,6 @@ static int trace_fprobe_create_internal(int argc, const char *argv[],
>  	if (is_tracepoint) {
>  		ctx->flags |= TPARG_FL_TPOINT;
>  		tpoint = find_tracepoint(symbol, &tp_mod);
> -		/* lock module until register this tprobe. */
> -		if (tp_mod && !try_module_get(tp_mod)) {
> -			tpoint = NULL;
> -			tp_mod = NULL;
> -		}
>  		if (tpoint) {
>  			ctx->funcname = kallsyms_lookup(
>  				(unsigned long)tpoint->probestub,
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/7] tracing: fprobe: Cleanup fprobe hash when module unloading
  2025-03-31 15:35 ` [PATCH v2 2/7] tracing: fprobe: Cleanup fprobe hash when module unloading Masami Hiramatsu (Google)
@ 2025-04-07 23:52   ` Masami Hiramatsu
  0 siblings, 0 replies; 10+ messages in thread
From: Masami Hiramatsu @ 2025-04-07 23:52 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Steven Rostedt, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel

On Tue,  1 Apr 2025 00:35:44 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> Clean up the fprobe address hash table on module unloading, because the
> target symbols will disappear when the module is unloaded and there is
> no guarantee that the same symbol is mapped at the same address again.
> 
> Note that this at least disables the fprobes if any of their target
> symbols are in the unloaded module. Unlike kprobes, fprobe does not
> re-enable the probe point by itself. To do that, the caller should take
> care of registering/unregistering the fprobe when loading/unloading
> modules. This simplifies the fprobe state management related to module
> loading/unloading.
> 

I think this is a kind of fix for commit 4346ba160409 ("fprobe: Rewrite
 fprobe on function-graph tracer") which introduced the fprobe_hlist_node.

Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer")
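
As an illustration of the scenario this addresses (a sketch only; the module
and symbol names below are hypothetical): an fprobe targeting a function in a
module leaves that address in the fprobe hash table after the module is
unloaded, unless it is cleaned up as this patch does.

  cd /sys/kernel/tracing
  # Hypothetical module "my_mod" providing my_mod_read(); the fprobe event
  # records that function's address in the fprobe hash table.
  echo 'f:mygrp/mod_read my_mod_read' >> dynamic_events
  echo 1 > events/mygrp/mod_read/enable
  # Unloading the module is the point where stale hash entries would be
  # left behind without this cleanup.
  rmmod my_mod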

Thanks,

> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> ---
>  kernel/trace/fprobe.c |  103 ++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 101 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> index cb86f90d4b1e..95c6e3473a76 100644
> --- a/kernel/trace/fprobe.c
> +++ b/kernel/trace/fprobe.c
> @@ -89,8 +89,11 @@ static bool delete_fprobe_node(struct fprobe_hlist_node *node)
>  {
>  	lockdep_assert_held(&fprobe_mutex);
>  
> -	WRITE_ONCE(node->fp, NULL);
> -	hlist_del_rcu(&node->hlist);
> +	/* Avoid double deleting */
> +	if (READ_ONCE(node->fp) != NULL) {
> +		WRITE_ONCE(node->fp, NULL);
> +		hlist_del_rcu(&node->hlist);
> +	}
>  	return !!find_first_fprobe_node(node->addr);
>  }
>  
> @@ -411,6 +414,102 @@ static void fprobe_graph_remove_ips(unsigned long *addrs, int num)
>  		ftrace_set_filter_ips(&fprobe_graph_ops.ops, addrs, num, 1, 0);
>  }
>  
> +#ifdef CONFIG_MODULES
> +
> +#define FPROBE_IPS_BATCH_INIT 8
> +/* instruction pointer address list */
> +struct fprobe_addr_list {
> +	int index;
> +	int size;
> +	unsigned long *addrs;
> +};
> +
> +static int fprobe_addr_list_add(struct fprobe_addr_list *alist, unsigned long addr)
> +{
> +	unsigned long *addrs;
> +
> +	if (alist->index >= alist->size)
> +		return -ENOMEM;
> +
> +	alist->addrs[alist->index++] = addr;
> +	if (alist->index < alist->size)
> +		return 0;
> +
> +	/* Expand the address list */
> +	addrs = kcalloc(alist->size * 2, sizeof(*addrs), GFP_KERNEL);
> +	if (!addrs)
> +		return -ENOMEM;
> +
> +	memcpy(addrs, alist->addrs, alist->size * sizeof(*addrs));
> +	alist->size *= 2;
> +	kfree(alist->addrs);
> +	alist->addrs = addrs;
> +
> +	return 0;
> +}
> +
> +static void fprobe_remove_node_in_module(struct module *mod, struct hlist_head *head,
> +					struct fprobe_addr_list *alist)
> +{
> +	struct fprobe_hlist_node *node;
> +	int ret = 0;
> +
> +	hlist_for_each_entry_rcu(node, head, hlist) {
> +		if (!within_module(node->addr, mod))
> +			continue;
> +		if (delete_fprobe_node(node))
> +			continue;
> +		/*
> +		 * If updating alist fails, just continue updating the hlist.
> +		 * Therefore, at least the user handler will not be hit anymore.
> +		 */
> +		if (!ret)
> +			ret = fprobe_addr_list_add(alist, node->addr);
> +	}
> +}
> +
> +/* Handle module unloading to manage fprobe_ip_table. */
> +static int fprobe_module_callback(struct notifier_block *nb,
> +				  unsigned long val, void *data)
> +{
> +	struct fprobe_addr_list alist = {.size = FPROBE_IPS_BATCH_INIT};
> +	struct module *mod = data;
> +	int i;
> +
> +	if (val != MODULE_STATE_GOING)
> +		return NOTIFY_DONE;
> +
> +	alist.addrs = kcalloc(alist.size, sizeof(*alist.addrs), GFP_KERNEL);
> +	/* If failed to alloc memory, we can not remove ips from hash. */
> +	if (!alist.addrs)
> +		return NOTIFY_DONE;
> +
> +	mutex_lock(&fprobe_mutex);
> +	for (i = 0; i < FPROBE_IP_TABLE_SIZE; i++)
> +		fprobe_remove_node_in_module(mod, &fprobe_ip_table[i], &alist);
> +
> +	if (alist.index < alist.size && alist.index > 0)
> +		ftrace_set_filter_ips(&fprobe_graph_ops.ops,
> +				      alist.addrs, alist.index, 1, 0);
> +	mutex_unlock(&fprobe_mutex);
> +
> +	kfree(alist.addrs);
> +
> +	return NOTIFY_DONE;
> +}
> +
> +static struct notifier_block fprobe_module_nb = {
> +	.notifier_call = fprobe_module_callback,
> +	.priority = 0,
> +};
> +
> +static int __init init_fprobe_module(void)
> +{
> +	return register_module_notifier(&fprobe_module_nb);
> +}
> +early_initcall(init_fprobe_module);
> +#endif
> +
>  static int symbols_cmp(const void *a, const void *b)
>  {
>  	const char **str_a = (const char **) a;
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2025-04-07 23:52 UTC | newest]

Thread overview: 10+ messages
2025-03-31 15:35 [PATCH v2 0/7] tracing: fprobe-events: Register fprobe only when the event is enabled Masami Hiramatsu (Google)
2025-03-31 15:35 ` [PATCH v2 1/7] tracing: fprobe events: Fix possible UAF on modules Masami Hiramatsu (Google)
2025-03-31 15:38   ` Masami Hiramatsu
2025-03-31 15:35 ` [PATCH v2 2/7] tracing: fprobe: Cleanup fprobe hash when module unloading Masami Hiramatsu (Google)
2025-04-07 23:52   ` Masami Hiramatsu
2025-03-31 15:35 ` [PATCH v2 3/7] tracing: tprobe-events: Remove mod field from tprobe-event Masami Hiramatsu (Google)
2025-03-31 15:36 ` [PATCH v2 4/7] tracing: tprobe-events: Support multiple tprobes on the same tracepoint Masami Hiramatsu (Google)
2025-03-31 15:36 ` [PATCH v2 5/7] tracing: fprobe-events: Register fprobe-events only when it is enabled Masami Hiramatsu (Google)
2025-03-31 15:36 ` [PATCH v2 6/7] selftests: tracing: Enable fprobe events before checking enable_functions Masami Hiramatsu (Google)
2025-03-31 15:36 ` [PATCH v2 7/7] tracing: tprobe-events: Register tracepoint when enable tprobe event Masami Hiramatsu (Google)
