linux-trace-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/5] ftrace: Fix fprobe with function graph accounting
@ 2025-02-19 22:04 Steven Rostedt
  2025-02-19 22:04 ` [PATCH v2 1/5] ftrace: Fix accounting of adding subops to a manager ops Steven Rostedt
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Steven Rostedt @ 2025-02-19 22:04 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
	Heiko Carstens, Sven Schnelle, Vasily Gorbik, Alexander Gordeev

Heiko Carstens reported[1] a bug when running the ftrace selftests.
After running them, he noticed that the enabled_functions file had
all functions enabled in it. That means something was registered to
ftrace that shouldn't be.

The issue was with the accounting of the new fprobe logic which was
built on top of the function graph infrastructure. Patch 3 of this
series is the fix for that bug, but while debugging that, 3 other
accounting bugs were discovered.

The bug fix for the reported bug was that fprobes would update its counter
every time a new fprobe was added, even if a new fprobe was attached to a
function that was already attached to another fprobe. But when removing the
fprobe, it would not decrement the counter of an fprobe with a duplicate.
This left the fprobe ops attached to function graph and when it removed the
last probe from the hash, it would create an empty filter hash which would
tell function graph that it wanted to trace all functions. The solution was
to always decrement the counter when a fprobe was removed.

The first of the other bugs was that when enabling a second subops to the
function graph infrastructure, it would add all functions to be called back
and not just the functions for the two subops for tracing. This was due to
the creation of the filter hash for the manager ops that controls the
subops. The first iteration where the manage ops hash was NULL was mistaken
as an "empty hash" which means to trace all functions.

The second bug was when adding a function to the hash where the hash already
had that function, it would allocate and create a new entry for it.  This
leaves duplicates and causes unnecessary overhead and memory wastage.

The third bug was when the last fprobe was removed, it would just unregister
from function graph but it did not remove the function from its own ops
hash. When adding a new fprobe, it would not only enable the function for
that new fprobe, but it would also enable the function of the last fprobe
that was removed.

Finally, a test was added to check and fail if any of the bugs were
introduced, with the exception of the duplicate hash entries, as that
was more of a memory waste and not something visible by user space.

[1] https://lore.kernel.org/all/20250217114918.10397-A-hca@linux.ibm.com/

Changes since v1: https://lore.kernel.org/all/20250218193033.338823920@goodmis.org/

- Moved the "Always unregister fgraph function from ops" to patch 3

- Change the "Fix accounting" patch to do the update in
  fprobe_graph_remove_ips() to make it match fprobe_graph_add_ips().

Steven Rostedt (5):
      ftrace: Fix accounting of adding subops to a manager ops
      ftrace: Do not add duplicate entries in subops manager ops
      fprobe: Always unregister fgraph function from ops
      fprobe: Fix accounting of when to unregister from function graph
      selftests/ftrace: Update fprobe test to check enabled_functions file

----
 kernel/trace/fprobe.c                              | 12 ++---
 kernel/trace/ftrace.c                              | 33 +++++++++----
 .../ftrace/test.d/dynevent/add_remove_fprobe.tc    | 54 ++++++++++++++++++++++
 3 files changed, 82 insertions(+), 17 deletions(-)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/5] ftrace: Fix accounting of adding subops to a manager ops
  2025-02-19 22:04 [PATCH v2 0/5] ftrace: Fix fprobe with function graph accounting Steven Rostedt
@ 2025-02-19 22:04 ` Steven Rostedt
  2025-02-20  6:58   ` Masami Hiramatsu
  2025-02-19 22:04 ` [PATCH v2 2/5] ftrace: Do not add duplicate entries in subops " Steven Rostedt
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2025-02-19 22:04 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
	Heiko Carstens, Sven Schnelle, Vasily Gorbik, Alexander Gordeev,
	stable

From: Steven Rostedt <rostedt@goodmis.org>

Function graph uses a subops and manager ops mechanism to attach to
ftrace.  The manager ops connects to ftrace and the functions it connects
to is defined by a list of subops that it manages.

The function hash that defines what the above ops attaches to limits the
functions to attach if the hash has any content. If the hash is empty, it
means to trace all functions.

The creation of the manager ops hash is done by iterating over all the
subops hashes. If any of the subops hashes is empty, it means that the
manager ops hash must trace all functions as well.

The issue is in the creation of the manager ops. When a second subops is
attached, a new hash is created by starting it as NULL and adding the
subops one at a time. But the NULL ops is mistaken as an empty hash, and
once an empty hash is found, it stops the loop of subops and just enables
all functions.

  # echo "f:myevent1 kernel_clone" >> /sys/kernel/tracing/dynamic_events
  # cat /sys/kernel/tracing/enabled_functions
kernel_clone (1)           	tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60

  # echo "f:myevent2 schedule_timeout" >> /sys/kernel/tracing/dynamic_events
  # cat /sys/kernel/tracing/enabled_functions
trace_initcall_start_cb (1)             tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
run_init_process (1)            tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
try_to_run_init_process (1)             tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
x86_pmu_show_pmu_cap (1)                tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
cleanup_rapl_pmus (1)                   tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
uncore_free_pcibus_map (1)              tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
uncore_types_exit (1)                   tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
uncore_pci_exit.part.0 (1)              tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
kvm_shutdown (1)                tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
vmx_dump_msrs (1)               tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
vmx_cleanup_l1d_flush (1)               tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
[..]

Fix this by initializing the new hash to NULL and if the hash is NULL do
not treat it as an empty hash but instead allocate by copying the content
of the first sub ops. Then on subsequent iterations, the new hash will not
be NULL, but the content of the previous subops. If that first subops
attached to all functions, then new hash may assume that the manager ops
also needs to attach to all functions.

Cc: stable@vger.kernel.org
Fixes: 5fccc7552ccbc ("ftrace: Add subops logic to allow one ops to manage many")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c | 30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 728ecda6e8d4..03b35a05808c 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3220,15 +3220,22 @@ static struct ftrace_hash *copy_hash(struct ftrace_hash *src)
  *  The filter_hash updates uses just the append_hash() function
  *  and the notrace_hash does not.
  */
-static int append_hash(struct ftrace_hash **hash, struct ftrace_hash *new_hash)
+static int append_hash(struct ftrace_hash **hash, struct ftrace_hash *new_hash,
+		       int size_bits)
 {
 	struct ftrace_func_entry *entry;
 	int size;
 	int i;
 
-	/* An empty hash does everything */
-	if (ftrace_hash_empty(*hash))
-		return 0;
+	if (*hash) {
+		/* An empty hash does everything */
+		if (ftrace_hash_empty(*hash))
+			return 0;
+	} else {
+		*hash = alloc_ftrace_hash(size_bits);
+		if (!*hash)
+			return -ENOMEM;
+	}
 
 	/* If new_hash has everything make hash have everything */
 	if (ftrace_hash_empty(new_hash)) {
@@ -3292,16 +3299,18 @@ static int intersect_hash(struct ftrace_hash **hash, struct ftrace_hash *new_has
 /* Return a new hash that has a union of all @ops->filter_hash entries */
 static struct ftrace_hash *append_hashes(struct ftrace_ops *ops)
 {
-	struct ftrace_hash *new_hash;
+	struct ftrace_hash *new_hash = NULL;
 	struct ftrace_ops *subops;
+	int size_bits;
 	int ret;
 
-	new_hash = alloc_ftrace_hash(ops->func_hash->filter_hash->size_bits);
-	if (!new_hash)
-		return NULL;
+	if (ops->func_hash->filter_hash)
+		size_bits = ops->func_hash->filter_hash->size_bits;
+	else
+		size_bits = FTRACE_HASH_DEFAULT_BITS;
 
 	list_for_each_entry(subops, &ops->subop_list, list) {
-		ret = append_hash(&new_hash, subops->func_hash->filter_hash);
+		ret = append_hash(&new_hash, subops->func_hash->filter_hash, size_bits);
 		if (ret < 0) {
 			free_ftrace_hash(new_hash);
 			return NULL;
@@ -3505,7 +3514,8 @@ int ftrace_startup_subops(struct ftrace_ops *ops, struct ftrace_ops *subops, int
 		filter_hash = alloc_and_copy_ftrace_hash(size_bits, ops->func_hash->filter_hash);
 		if (!filter_hash)
 			return -ENOMEM;
-		ret = append_hash(&filter_hash, subops->func_hash->filter_hash);
+		ret = append_hash(&filter_hash, subops->func_hash->filter_hash,
+				  size_bits);
 		if (ret < 0) {
 			free_ftrace_hash(filter_hash);
 			return ret;
-- 
2.47.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 2/5] ftrace: Do not add duplicate entries in subops manager ops
  2025-02-19 22:04 [PATCH v2 0/5] ftrace: Fix fprobe with function graph accounting Steven Rostedt
  2025-02-19 22:04 ` [PATCH v2 1/5] ftrace: Fix accounting of adding subops to a manager ops Steven Rostedt
@ 2025-02-19 22:04 ` Steven Rostedt
  2025-02-20  6:59   ` Masami Hiramatsu
  2025-02-19 22:04 ` [PATCH v2 3/5] fprobe: Always unregister fgraph function from ops Steven Rostedt
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2025-02-19 22:04 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
	Heiko Carstens, Sven Schnelle, Vasily Gorbik, Alexander Gordeev,
	stable

From: Steven Rostedt <rostedt@goodmis.org>

Check if a function is already in the manager ops of a subops. A manager
ops contains multiple subops, and if two or more subops are tracing the
same function, the manager ops only needs a single entry in its hash.

Cc: stable@vger.kernel.org
Fixes: 4f554e955614f ("ftrace: Add ftrace_set_filter_ips function")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 03b35a05808c..189eb0a12f4b 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -5717,6 +5717,9 @@ __ftrace_match_addr(struct ftrace_hash *hash, unsigned long ip, int remove)
 			return -ENOENT;
 		free_hash_entry(hash, entry);
 		return 0;
+	} else if (__ftrace_lookup_ip(hash, ip) != NULL) {
+		/* Already exists */
+		return 0;
 	}
 
 	entry = add_hash_entry(hash, ip);
-- 
2.47.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 3/5] fprobe: Always unregister fgraph function from ops
  2025-02-19 22:04 [PATCH v2 0/5] ftrace: Fix fprobe with function graph accounting Steven Rostedt
  2025-02-19 22:04 ` [PATCH v2 1/5] ftrace: Fix accounting of adding subops to a manager ops Steven Rostedt
  2025-02-19 22:04 ` [PATCH v2 2/5] ftrace: Do not add duplicate entries in subops " Steven Rostedt
@ 2025-02-19 22:04 ` Steven Rostedt
  2025-02-20  4:41   ` Masami Hiramatsu
  2025-02-19 22:04 ` [PATCH v2 4/5] fprobe: Fix accounting of when to unregister from function graph Steven Rostedt
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2025-02-19 22:04 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
	Heiko Carstens, Sven Schnelle, Vasily Gorbik, Alexander Gordeev,
	stable

From: Steven Rostedt <rostedt@goodmis.org>

When the last fprobe is removed, it calls unregister_ftrace_graph() to
remove the graph_ops from function graph. The issue is when it does so, it
calls return before removing the function from its graph ops via
ftrace_set_filter_ips(). This leaves the last function lingering in the
fprobe's fgraph ops and if a probe is added it also enables that last
function (even though the callback will just drop it, it does add unneeded
overhead to make that call).

  # echo "f:myevent1 kernel_clone" >> /sys/kernel/tracing/dynamic_events
  # cat /sys/kernel/tracing/enabled_functions
kernel_clone (1)           	tramp: 0xffffffffc02f3000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60

  # echo "f:myevent2 schedule_timeout" >> /sys/kernel/tracing/dynamic_events
  # cat /sys/kernel/tracing/enabled_functions
kernel_clone (1)           	tramp: 0xffffffffc02f3000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
schedule_timeout (1)           	tramp: 0xffffffffc02f3000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60

  # > /sys/kernel/tracing/dynamic_events
  # cat /sys/kernel/tracing/enabled_functions

  # echo "f:myevent3 kmem_cache_free" >> /sys/kernel/tracing/dynamic_events
  # cat /sys/kernel/tracing/enabled_functions
kmem_cache_free (1)           	tramp: 0xffffffffc0219000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
schedule_timeout (1)           	tramp: 0xffffffffc0219000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60

The above enabled a fprobe on kernel_clone, and then on schedule_timeout.
The content of the enabled_functions shows the functions that have a
callback attached to them. The fprobe attached to those functions
properly. Then the fprobes were cleared, and enabled_functions was empty
after that. But after adding a fprobe on kmem_cache_free, the
enabled_functions shows that the schedule_timeout was attached again. This
is because it was still left in the fprobe ops that is used to tell
function graph what functions it wants callbacks from.

Cc: stable@vger.kernel.org
Fixes: 4346ba1604093 ("fprobe: Rewrite fprobe on function-graph tracer")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/fprobe.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index 2560b312ad57..62e8f7d56602 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -403,11 +403,9 @@ static void fprobe_graph_remove_ips(unsigned long *addrs, int num)
 	lockdep_assert_held(&fprobe_mutex);
 
 	fprobe_graph_active--;
-	if (!fprobe_graph_active) {
-		/* Q: should we unregister it ? */
+	/* Q: should we unregister it ? */
+	if (!fprobe_graph_active)
 		unregister_ftrace_graph(&fprobe_graph_ops);
-		return;
-	}
 
 	ftrace_set_filter_ips(&fprobe_graph_ops.ops, addrs, num, 1, 0);
 }
-- 
2.47.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 4/5] fprobe: Fix accounting of when to unregister from function graph
  2025-02-19 22:04 [PATCH v2 0/5] ftrace: Fix fprobe with function graph accounting Steven Rostedt
                   ` (2 preceding siblings ...)
  2025-02-19 22:04 ` [PATCH v2 3/5] fprobe: Always unregister fgraph function from ops Steven Rostedt
@ 2025-02-19 22:04 ` Steven Rostedt
  2025-02-20  4:42   ` Masami Hiramatsu
  2025-02-19 22:04 ` [PATCH v2 5/5] selftests/ftrace: Update fprobe test to check enabled_functions file Steven Rostedt
  2025-02-20 11:52 ` [PATCH v2 0/5] ftrace: Fix fprobe with function graph accounting Heiko Carstens
  5 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2025-02-19 22:04 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
	Heiko Carstens, Sven Schnelle, Vasily Gorbik, Alexander Gordeev,
	stable

From: Steven Rostedt <rostedt@goodmis.org>

When adding a new fprobe, it will update the function hash to the
functions the fprobe is attached to and register with function graph to
have it call the registered functions. The fprobe_graph_active variable
keeps track of the number of fprobes that are using function graph.

If two fprobes attach to the same function, it increments the
fprobe_graph_active for each of them. But when they are removed, the first
fprobe to be removed will see that the function it is attached to is also
used by another fprobe and it will not remove that function from
function_graph. The logic will skip decrementing the fprobe_graph_active
variable.

This causes the fprobe_graph_active variable to not go to zero when all
fprobes are removed, and in doing so it does not unregister from
function graph. As the fgraph ops hash will now be empty, and an empty
filter hash means all functions are enabled, this triggers function graph
to add a callback to the fprobe infrastructure for every function!

 # echo "f:myevent1 kernel_clone" >> /sys/kernel/tracing/dynamic_events
 # echo "f:myevent2 kernel_clone%return" >> /sys/kernel/tracing/dynamic_events
 # cat /sys/kernel/tracing/enabled_functions
kernel_clone (1)           	tramp: 0xffffffffc0024000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60

 # > /sys/kernel/tracing/dynamic_events
 # cat /sys/kernel/tracing/enabled_functions
trace_initcall_start_cb (1)             tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
run_init_process (1)            tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
try_to_run_init_process (1)             tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
x86_pmu_show_pmu_cap (1)                tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
cleanup_rapl_pmus (1)                   tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
uncore_free_pcibus_map (1)              tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
uncore_types_exit (1)                   tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
uncore_pci_exit.part.0 (1)              tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
kvm_shutdown (1)                tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
vmx_dump_msrs (1)               tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
[..]

 # cat /sys/kernel/tracing/enabled_functions | wc -l
54702

If a fprobe is being removed and all its functions are also traced by
other fprobes, still decrement the fprobe_graph_active counter.

Cc: stable@vger.kernel.org
Fixes: 4346ba1604093 ("fprobe: Rewrite fprobe on function-graph tracer")
Closes: https://lore.kernel.org/all/20250217114918.10397-A-hca@linux.ibm.com/
Reported-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
Changes since v1: https://lore.kernel.org/20250218193126.619197190@goodmis.org

- Move the check into fprobe_graph_remove_ips() to keep it matching
  with fprobe_graph_add_ips() (Masami Hiramatsu)

 kernel/trace/fprobe.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index 62e8f7d56602..33082c4e8154 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -407,7 +407,8 @@ static void fprobe_graph_remove_ips(unsigned long *addrs, int num)
 	if (!fprobe_graph_active)
 		unregister_ftrace_graph(&fprobe_graph_ops);
 
-	ftrace_set_filter_ips(&fprobe_graph_ops.ops, addrs, num, 1, 0);
+	if (num)
+		ftrace_set_filter_ips(&fprobe_graph_ops.ops, addrs, num, 1, 0);
 }
 
 static int symbols_cmp(const void *a, const void *b)
@@ -677,8 +678,7 @@ int unregister_fprobe(struct fprobe *fp)
 	}
 	del_fprobe_hash(fp);
 
-	if (count)
-		fprobe_graph_remove_ips(addrs, count);
+	fprobe_graph_remove_ips(addrs, count);
 
 	kfree_rcu(hlist_array, rcu);
 	fp->hlist_array = NULL;
-- 
2.47.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 5/5] selftests/ftrace: Update fprobe test to check enabled_functions file
  2025-02-19 22:04 [PATCH v2 0/5] ftrace: Fix fprobe with function graph accounting Steven Rostedt
                   ` (3 preceding siblings ...)
  2025-02-19 22:04 ` [PATCH v2 4/5] fprobe: Fix accounting of when to unregister from function graph Steven Rostedt
@ 2025-02-19 22:04 ` Steven Rostedt
  2025-02-20 11:52 ` [PATCH v2 0/5] ftrace: Fix fprobe with function graph accounting Heiko Carstens
  5 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2025-02-19 22:04 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
	Heiko Carstens, Sven Schnelle, Vasily Gorbik, Alexander Gordeev

From: Steven Rostedt <rostedt@goodmis.org>

A few bugs were found in the fprobe accounting logic along with it using
the function graph infrastructure. Update the fprobe selftest to catch
those bugs in case they or something similar shows up in the future.

The test now checks the enabled_functions file which shows all the
functions attached to ftrace or fgraph. When enabling a fprobe, make sure
that its corresponding function is also added to that file. Also add two
more fprobes to enable to make sure that the fprobe logic works properly
with multiple probes.

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 .../test.d/dynevent/add_remove_fprobe.tc      | 54 +++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/tools/testing/selftests/ftrace/test.d/dynevent/add_remove_fprobe.tc b/tools/testing/selftests/ftrace/test.d/dynevent/add_remove_fprobe.tc
index dc25bcf4f9e2..449f9d8be746 100644
--- a/tools/testing/selftests/ftrace/test.d/dynevent/add_remove_fprobe.tc
+++ b/tools/testing/selftests/ftrace/test.d/dynevent/add_remove_fprobe.tc
@@ -7,12 +7,38 @@ echo 0 > events/enable
 echo > dynamic_events
 
 PLACE=$FUNCTION_FORK
+PLACE2="kmem_cache_free"
+PLACE3="schedule_timeout"
 
 echo "f:myevent1 $PLACE" >> dynamic_events
+
+# Make sure the event is attached and is the only one
+grep -q $PLACE enabled_functions
+cnt=`cat enabled_functions | wc -l`
+if [ $cnt -ne 1 ]; then
+	exit_fail
+fi
+
 echo "f:myevent2 $PLACE%return" >> dynamic_events
 
+# It should till be the only attached function
+cnt=`cat enabled_functions | wc -l`
+if [ $cnt -ne 1 ]; then
+	exit_fail
+fi
+
+# add another event
+echo "f:myevent3 $PLACE2" >> dynamic_events
+
+grep -q $PLACE2 enabled_functions
+cnt=`cat enabled_functions | wc -l`
+if [ $cnt -ne 2 ]; then
+	exit_fail
+fi
+
 grep -q myevent1 dynamic_events
 grep -q myevent2 dynamic_events
+grep -q myevent3 dynamic_events
 test -d events/fprobes/myevent1
 test -d events/fprobes/myevent2
 
@@ -21,6 +47,34 @@ echo "-:myevent2" >> dynamic_events
 grep -q myevent1 dynamic_events
 ! grep -q myevent2 dynamic_events
 
+# should still have 2 left
+cnt=`cat enabled_functions | wc -l`
+if [ $cnt -ne 2 ]; then
+	exit_fail
+fi
+
 echo > dynamic_events
 
+# Should have none left
+cnt=`cat enabled_functions | wc -l`
+if [ $cnt -ne 0 ]; then
+	exit_fail
+fi
+
+echo "f:myevent4 $PLACE" >> dynamic_events
+
+# Should only have one enabled
+cnt=`cat enabled_functions | wc -l`
+if [ $cnt -ne 1 ]; then
+	exit_fail
+fi
+
+echo > dynamic_events
+
+# Should have none left
+cnt=`cat enabled_functions | wc -l`
+if [ $cnt -ne 0 ]; then
+	exit_fail
+fi
+
 clear_trace
-- 
2.47.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 3/5] fprobe: Always unregister fgraph function from ops
  2025-02-19 22:04 ` [PATCH v2 3/5] fprobe: Always unregister fgraph function from ops Steven Rostedt
@ 2025-02-20  4:41   ` Masami Hiramatsu
  0 siblings, 0 replies; 12+ messages in thread
From: Masami Hiramatsu @ 2025-02-20  4:41 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-trace-kernel, Masami Hiramatsu, Mark Rutland,
	Mathieu Desnoyers, Andrew Morton, Heiko Carstens, Sven Schnelle,
	Vasily Gorbik, Alexander Gordeev, stable

On Wed, 19 Feb 2025 17:04:39 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: Steven Rostedt <rostedt@goodmis.org>
> 
> When the last fprobe is removed, it calls unregister_ftrace_graph() to
> remove the graph_ops from function graph. The issue is when it does so, it
> calls return before removing the function from its graph ops via
> ftrace_set_filter_ips(). This leaves the last function lingering in the
> fprobe's fgraph ops and if a probe is added it also enables that last
> function (even though the callback will just drop it, it does add unneeded
> overhead to make that call).
> 
>   # echo "f:myevent1 kernel_clone" >> /sys/kernel/tracing/dynamic_events
>   # cat /sys/kernel/tracing/enabled_functions
> kernel_clone (1)           	tramp: 0xffffffffc02f3000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
> 
>   # echo "f:myevent2 schedule_timeout" >> /sys/kernel/tracing/dynamic_events
>   # cat /sys/kernel/tracing/enabled_functions
> kernel_clone (1)           	tramp: 0xffffffffc02f3000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
> schedule_timeout (1)           	tramp: 0xffffffffc02f3000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
> 
>   # > /sys/kernel/tracing/dynamic_events
>   # cat /sys/kernel/tracing/enabled_functions
> 
>   # echo "f:myevent3 kmem_cache_free" >> /sys/kernel/tracing/dynamic_events
>   # cat /sys/kernel/tracing/enabled_functions
> kmem_cache_free (1)           	tramp: 0xffffffffc0219000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
> schedule_timeout (1)           	tramp: 0xffffffffc0219000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
> 
> The above enabled a fprobe on kernel_clone, and then on schedule_timeout.
> The content of the enabled_functions shows the functions that have a
> callback attached to them. The fprobe attached to those functions
> properly. Then the fprobes were cleared, and enabled_functions was empty
> after that. But after adding a fprobe on kmem_cache_free, the
> enabled_functions shows that the schedule_timeout was attached again. This
> is because it was still left in the fprobe ops that is used to tell
> function graph what functions it wants callbacks from.
> 

This looks good to me.

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Thanks!


> Cc: stable@vger.kernel.org
> Fixes: 4346ba1604093 ("fprobe: Rewrite fprobe on function-graph tracer")
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---
>  kernel/trace/fprobe.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> index 2560b312ad57..62e8f7d56602 100644
> --- a/kernel/trace/fprobe.c
> +++ b/kernel/trace/fprobe.c
> @@ -403,11 +403,9 @@ static void fprobe_graph_remove_ips(unsigned long *addrs, int num)
>  	lockdep_assert_held(&fprobe_mutex);
>  
>  	fprobe_graph_active--;
> -	if (!fprobe_graph_active) {
> -		/* Q: should we unregister it ? */
> +	/* Q: should we unregister it ? */
> +	if (!fprobe_graph_active)
>  		unregister_ftrace_graph(&fprobe_graph_ops);
> -		return;
> -	}
>  
>  	ftrace_set_filter_ips(&fprobe_graph_ops.ops, addrs, num, 1, 0);
>  }
> -- 
> 2.47.2
> 
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 4/5] fprobe: Fix accounting of when to unregister from function graph
  2025-02-19 22:04 ` [PATCH v2 4/5] fprobe: Fix accounting of when to unregister from function graph Steven Rostedt
@ 2025-02-20  4:42   ` Masami Hiramatsu
  0 siblings, 0 replies; 12+ messages in thread
From: Masami Hiramatsu @ 2025-02-20  4:42 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-trace-kernel, Masami Hiramatsu, Mark Rutland,
	Mathieu Desnoyers, Andrew Morton, Heiko Carstens, Sven Schnelle,
	Vasily Gorbik, Alexander Gordeev, stable

On Wed, 19 Feb 2025 17:04:40 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: Steven Rostedt <rostedt@goodmis.org>
> 
> When adding a new fprobe, it will update the function hash to the
> functions the fprobe is attached to and register with function graph to
> have it call the registered functions. The fprobe_graph_active variable
> keeps track of the number of fprobes that are using function graph.
> 
> If two fprobes attach to the same function, it increments the
> fprobe_graph_active for each of them. But when they are removed, the first
> fprobe to be removed will see that the function it is attached to is also
> used by another fprobe and it will not remove that function from
> function_graph. The logic will skip decrementing the fprobe_graph_active
> variable.
> 
> This causes the fprobe_graph_active variable to not go to zero when all
> fprobes are removed, and in doing so it does not unregister from
> function graph. As the fgraph ops hash will now be empty, and an empty
> filter hash means all functions are enabled, this triggers function graph
> to add a callback to the fprobe infrastructure for every function!
> 
>  # echo "f:myevent1 kernel_clone" >> /sys/kernel/tracing/dynamic_events
>  # echo "f:myevent2 kernel_clone%return" >> /sys/kernel/tracing/dynamic_events
>  # cat /sys/kernel/tracing/enabled_functions
> kernel_clone (1)           	tramp: 0xffffffffc0024000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
> 
>  # > /sys/kernel/tracing/dynamic_events
>  # cat /sys/kernel/tracing/enabled_functions
> trace_initcall_start_cb (1)             tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
> run_init_process (1)            tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
> try_to_run_init_process (1)             tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
> x86_pmu_show_pmu_cap (1)                tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
> cleanup_rapl_pmus (1)                   tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
> uncore_free_pcibus_map (1)              tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
> uncore_types_exit (1)                   tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
> uncore_pci_exit.part.0 (1)              tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
> kvm_shutdown (1)                tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
> vmx_dump_msrs (1)               tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
> [..]
> 
>  # cat /sys/kernel/tracing/enabled_functions | wc -l
> 54702
> 
> If a fprobe is being removed and all its functions are also traced by
> other fprobes, still decrement the fprobe_graph_active counter.
> 

This looks good to me.

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

And please push this series via your branch because the selftest
updates should cover the series.

Thanks!

> Cc: stable@vger.kernel.org
> Fixes: 4346ba1604093 ("fprobe: Rewrite fprobe on function-graph tracer")
> Closes: https://lore.kernel.org/all/20250217114918.10397-A-hca@linux.ibm.com/
> Reported-by: Heiko Carstens <hca@linux.ibm.com>
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---
> Changes since v1: https://lore.kernel.org/20250218193126.619197190@goodmis.org
> 
> - Move the check into fprobe_graph_remove_ips() to keep it matching
>   with fprobe_graph_add_ips() (Masami Hiramatsu)
> 
>  kernel/trace/fprobe.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> index 62e8f7d56602..33082c4e8154 100644
> --- a/kernel/trace/fprobe.c
> +++ b/kernel/trace/fprobe.c
> @@ -407,7 +407,8 @@ static void fprobe_graph_remove_ips(unsigned long *addrs, int num)
>  	if (!fprobe_graph_active)
>  		unregister_ftrace_graph(&fprobe_graph_ops);
>  
> -	ftrace_set_filter_ips(&fprobe_graph_ops.ops, addrs, num, 1, 0);
> +	if (num)
> +		ftrace_set_filter_ips(&fprobe_graph_ops.ops, addrs, num, 1, 0);
>  }
>  
>  static int symbols_cmp(const void *a, const void *b)
> @@ -677,8 +678,7 @@ int unregister_fprobe(struct fprobe *fp)
>  	}
>  	del_fprobe_hash(fp);
>  
> -	if (count)
> -		fprobe_graph_remove_ips(addrs, count);
> +	fprobe_graph_remove_ips(addrs, count);
>  
>  	kfree_rcu(hlist_array, rcu);
>  	fp->hlist_array = NULL;
> -- 
> 2.47.2
> 
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/5] ftrace: Fix accounting of adding subops to a manager ops
  2025-02-19 22:04 ` [PATCH v2 1/5] ftrace: Fix accounting of adding subops to a manager ops Steven Rostedt
@ 2025-02-20  6:58   ` Masami Hiramatsu
  2025-02-20 20:06     ` Steven Rostedt
  0 siblings, 1 reply; 12+ messages in thread
From: Masami Hiramatsu @ 2025-02-20  6:58 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-trace-kernel, Masami Hiramatsu, Mark Rutland,
	Mathieu Desnoyers, Andrew Morton, Heiko Carstens, Sven Schnelle,
	Vasily Gorbik, Alexander Gordeev, stable

On Wed, 19 Feb 2025 17:04:37 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> @@ -3292,16 +3299,18 @@ static int intersect_hash(struct ftrace_hash **hash, struct ftrace_hash *new_has
>  /* Return a new hash that has a union of all @ops->filter_hash entries */
>  static struct ftrace_hash *append_hashes(struct ftrace_ops *ops)
>  {
> -	struct ftrace_hash *new_hash;
> +	struct ftrace_hash *new_hash = NULL;

Isn't this "= EMPTY_HASH"?

Thank you,


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/5] ftrace: Do not add duplicate entries in subops manager ops
  2025-02-19 22:04 ` [PATCH v2 2/5] ftrace: Do not add duplicate entries in subops " Steven Rostedt
@ 2025-02-20  6:59   ` Masami Hiramatsu
  0 siblings, 0 replies; 12+ messages in thread
From: Masami Hiramatsu @ 2025-02-20  6:59 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-trace-kernel, Masami Hiramatsu, Mark Rutland,
	Mathieu Desnoyers, Andrew Morton, Heiko Carstens, Sven Schnelle,
	Vasily Gorbik, Alexander Gordeev, stable

On Wed, 19 Feb 2025 17:04:38 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: Steven Rostedt <rostedt@goodmis.org>
> 
> Check if a function is already in the manager ops of a subops. A manager
> ops contains multiple subops, and if two or more subops are tracing the
> same function, the manager ops only needs a single entry in its hash.
> 
> Cc: stable@vger.kernel.org
> Fixes: 4f554e955614f ("ftrace: Add ftrace_set_filter_ips function")
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Looks good to me.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Thank you,

> ---
>  kernel/trace/ftrace.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 03b35a05808c..189eb0a12f4b 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -5717,6 +5717,9 @@ __ftrace_match_addr(struct ftrace_hash *hash, unsigned long ip, int remove)
>  			return -ENOENT;
>  		free_hash_entry(hash, entry);
>  		return 0;
> +	} else if (__ftrace_lookup_ip(hash, ip) != NULL) {
> +		/* Already exists */
> +		return 0;
>  	}
>  
>  	entry = add_hash_entry(hash, ip);
> -- 
> 2.47.2
> 
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/5] ftrace: Fix fprobe with function graph accounting
  2025-02-19 22:04 [PATCH v2 0/5] ftrace: Fix fprobe with function graph accounting Steven Rostedt
                   ` (4 preceding siblings ...)
  2025-02-19 22:04 ` [PATCH v2 5/5] selftests/ftrace: Update fprobe test to check enabled_functions file Steven Rostedt
@ 2025-02-20 11:52 ` Heiko Carstens
  5 siblings, 0 replies; 12+ messages in thread
From: Heiko Carstens @ 2025-02-20 11:52 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-trace-kernel, Masami Hiramatsu, Mark Rutland,
	Mathieu Desnoyers, Andrew Morton, Sven Schnelle, Vasily Gorbik,
	Alexander Gordeev

On Wed, Feb 19, 2025 at 05:04:36PM -0500, Steven Rostedt wrote:
> Heiko Carstens reported[1] a bug when running the ftrace selftests.
> After running them, he noticed that the enabled_functions file had
> all functions enabled in it. That means something was registered to
> ftrace that shouldn't be.
> 
> The issue was with the accounting of the new fprobe logic which was
> built on top of the function graph infrastructure. Patch 3 of this
> series is the fix for that bug, but while debugging that, 3 other
> accounting bugs were discovered.
...
> Steven Rostedt (5):
>       ftrace: Fix accounting of adding subops to a manager ops
>       ftrace: Do not add duplicate entries in subops manager ops
>       fprobe: Always unregister fgraph function from ops
>       fprobe: Fix accounting of when to unregister from function graph
>       selftests/ftrace: Update fprobe test to check enabled_functions file
> 
> ----
>  kernel/trace/fprobe.c                              | 12 ++---
>  kernel/trace/ftrace.c                              | 33 +++++++++----
>  .../ftrace/test.d/dynevent/add_remove_fprobe.tc    | 54 ++++++++++++++++++++++
>  3 files changed, 82 insertions(+), 17 deletions(-)

FWIW, I can confirm that this fixes the bug I reported.
Feel free to add

Tested-by: Heiko Carstens <hca@linux.ibm.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/5] ftrace: Fix accounting of adding subops to a manager ops
  2025-02-20  6:58   ` Masami Hiramatsu
@ 2025-02-20 20:06     ` Steven Rostedt
  0 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2025-02-20 20:06 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: linux-kernel, linux-trace-kernel, Mark Rutland, Mathieu Desnoyers,
	Andrew Morton, Heiko Carstens, Sven Schnelle, Vasily Gorbik,
	Alexander Gordeev, stable

On Thu, 20 Feb 2025 15:58:58 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

> On Wed, 19 Feb 2025 17:04:37 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > @@ -3292,16 +3299,18 @@ static int intersect_hash(struct ftrace_hash **hash, struct ftrace_hash *new_has
> >  /* Return a new hash that has a union of all @ops->filter_hash entries */
> >  static struct ftrace_hash *append_hashes(struct ftrace_ops *ops)
> >  {
> > -	struct ftrace_hash *new_hash;
> > +	struct ftrace_hash *new_hash = NULL;  
> 
> Isn't this "= EMPTY_HASH"?
> 

No it has to be NULL. As the change log stated:

   Fix this by initializing the new hash to NULL and if the hash is NULL do
   not treat it as an empty hash but instead allocate by copying the content
   of the first sub ops. Then on subsequent iterations, the new hash will not
   be NULL, but the content of the previous subops. If that first subops
   attached to all functions, then new hash may assume that the manager ops
   also needs to attach to all functions.


Hmm, but we should return EMPTY_HASH if new_hash is still NULL after the
update. Otherwise the caller may confuse this as a failed allocation.

I'll send a v3.

Thanks,

-- Steve

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2025-02-20 20:05 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-19 22:04 [PATCH v2 0/5] ftrace: Fix fprobe with function graph accounting Steven Rostedt
2025-02-19 22:04 ` [PATCH v2 1/5] ftrace: Fix accounting of adding subops to a manager ops Steven Rostedt
2025-02-20  6:58   ` Masami Hiramatsu
2025-02-20 20:06     ` Steven Rostedt
2025-02-19 22:04 ` [PATCH v2 2/5] ftrace: Do not add duplicate entries in subops " Steven Rostedt
2025-02-20  6:59   ` Masami Hiramatsu
2025-02-19 22:04 ` [PATCH v2 3/5] fprobe: Always unregister fgraph function from ops Steven Rostedt
2025-02-20  4:41   ` Masami Hiramatsu
2025-02-19 22:04 ` [PATCH v2 4/5] fprobe: Fix accounting of when to unregister from function graph Steven Rostedt
2025-02-20  4:42   ` Masami Hiramatsu
2025-02-19 22:04 ` [PATCH v2 5/5] selftests/ftrace: Update fprobe test to check enabled_functions file Steven Rostedt
2025-02-20 11:52 ` [PATCH v2 0/5] ftrace: Fix fprobe with function graph accounting Heiko Carstens

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).