* [PATCH v2 0/4] Fix user stack traces captured from uprobes
@ 2024-05-22 1:38 Andrii Nakryiko
2024-05-22 1:38 ` [PATCH v2 1/4] uprobes: rename get_trampoline_vaddr() and make it global Andrii Nakryiko
` (4 more replies)
0 siblings, 5 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2024-05-22 1:38 UTC (permalink / raw)
To: linux-trace-kernel, rostedt, mhiramat
Cc: x86, peterz, mingo, tglx, bpf, rihams, linux-perf-users,
Andrii Nakryiko
This patch set addresses two issues with captured user stack traces.
The first issue, fixed in patch #2, deals with fixing up uretprobe
trampoline addresses in captured stack traces. It happens when there are
pending return probes, for which the kernel hijacks some of the return
addresses on user stacks. The fix matches those special uretprobe
trampoline addresses against the list of pending return probe instances
and replaces them with the actual return addresses. This is the same
fixup logic that fprobe/kretprobe has for kernel stack traces.
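In code form, the post-processing added in patch #2 boils down to
roughly the following walk over the captured entries (a simplified
excerpt of that patch, not a complete function):

    /* Walk the captured user IPs and the list of pending
     * return_instances in parallel; every trampoline address found in
     * the trace is replaced with the original return address that was
     * saved when that stack slot was hijacked.
     */
    while (ri && cur_ip <= last_ip) {
            if (*cur_ip == tramp_addr) {
                    *cur_ip = ri->orig_ret_vaddr;
                    ri = ri->next;
            }
            cur_ip++;
    }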
The second issue, which patch #3 fixes with the help of a heuristic, has
to do with capturing user stack traces in entry uprobes. At the very
entry to a user function, the frame pointer in the rbp register is not
yet set up, so the actual caller return address is still pointed to by
rsp. The patch uses a simple heuristic, looking for the `push %rbp`
instruction, to fetch this extra direct caller return address before
proceeding to unwind the stack using rbp.
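For illustration, this is what the top of a traced function looks like
when built with frame pointers (the heuristic assumes a build along the
lines of -fno-omit-frame-pointer; the function name and body below are
made up for the example):

    /* hypothetical traced function, compiled with frame pointers */
    __attribute__((noinline)) int traced_func(int x)
    {
            /* generated x86-64 prologue:
             *   push %rbp        <- uprobe fires here (single 0x55 byte);
             *                       the caller's return address is still
             *                       at (%rsp), and %rbp still describes
             *                       the caller's frame
             *   mov  %rsp, %rbp  <- only after this does %rbp point to
             *                       this function's own frame
             */
            return x + 1;
    }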
Patch #4 adds tests to BPF selftests that validate that stack traces
captured at various points are what we expect to get. This patch, while
residing in BPF selftests, is isolated from any other BPF selftests
changes and can go in through a non-BPF tree without the risk of merge
conflicts.
Patches are based on latest linux-trace/probes/for-next.
v1->v2:
- fixed GCC aggressively inlining test_uretprobe_stack() function (BPF CI);
- fixed comments (Peter).
Andrii Nakryiko (4):
uprobes: rename get_trampoline_vaddr() and make it global
perf,uprobes: fix user stack traces in the presence of pending
uretprobes
perf,x86: avoid missing caller address in stack traces captured in
uprobe
selftests/bpf: add test validating uprobe/uretprobe stack traces
arch/x86/events/core.c | 20 ++
include/linux/uprobes.h | 3 +
kernel/events/callchain.c | 43 +++-
kernel/events/uprobes.c | 17 +-
.../bpf/prog_tests/uretprobe_stack.c | 186 ++++++++++++++++++
.../selftests/bpf/progs/uretprobe_stack.c | 96 +++++++++
6 files changed, 361 insertions(+), 4 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
create mode 100644 tools/testing/selftests/bpf/progs/uretprobe_stack.c
--
2.43.0
* [PATCH v2 1/4] uprobes: rename get_trampoline_vaddr() and make it global
2024-05-22 1:38 [PATCH v2 0/4] Fix user stack traces captured from uprobes Andrii Nakryiko
@ 2024-05-22 1:38 ` Andrii Nakryiko
2024-06-25 0:28 ` Masami Hiramatsu
2024-06-25 1:02 ` Masami Hiramatsu
2024-05-22 1:38 ` [PATCH v2 2/4] perf,uprobes: fix user stack traces in the presence of pending uretprobes Andrii Nakryiko
` (3 subsequent siblings)
4 siblings, 2 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2024-05-22 1:38 UTC (permalink / raw)
To: linux-trace-kernel, rostedt, mhiramat
Cc: x86, peterz, mingo, tglx, bpf, rihams, linux-perf-users,
Andrii Nakryiko
This helper is needed in another file, so make it a bit more uniquely
named and expose it internally.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
include/linux/uprobes.h | 1 +
kernel/events/uprobes.c | 6 +++---
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index f46e0ca0169c..0c57eec85339 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -138,6 +138,7 @@ extern bool arch_uretprobe_is_alive(struct return_instance *ret, enum rp_check c
extern bool arch_uprobe_ignore(struct arch_uprobe *aup, struct pt_regs *regs);
extern void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
void *src, unsigned long len);
+extern unsigned long uprobe_get_trampoline_vaddr(void);
#else /* !CONFIG_UPROBES */
struct uprobes_state {
};
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 8ae0eefc3a34..d60d24f0f2f4 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1827,7 +1827,7 @@ void uprobe_copy_process(struct task_struct *t, unsigned long flags)
*
* Returns -1 in case the xol_area is not allocated.
*/
-static unsigned long get_trampoline_vaddr(void)
+unsigned long uprobe_get_trampoline_vaddr(void)
{
struct xol_area *area;
unsigned long trampoline_vaddr = -1;
@@ -1878,7 +1878,7 @@ static void prepare_uretprobe(struct uprobe *uprobe, struct pt_regs *regs)
if (!ri)
return;
- trampoline_vaddr = get_trampoline_vaddr();
+ trampoline_vaddr = uprobe_get_trampoline_vaddr();
orig_ret_vaddr = arch_uretprobe_hijack_return_addr(trampoline_vaddr, regs);
if (orig_ret_vaddr == -1)
goto fail;
@@ -2187,7 +2187,7 @@ static void handle_swbp(struct pt_regs *regs)
int is_swbp;
bp_vaddr = uprobe_get_swbp_addr(regs);
- if (bp_vaddr == get_trampoline_vaddr())
+ if (bp_vaddr == uprobe_get_trampoline_vaddr())
return handle_trampoline(regs);
uprobe = find_active_uprobe(bp_vaddr, &is_swbp);
--
2.43.0
* [PATCH v2 2/4] perf,uprobes: fix user stack traces in the presence of pending uretprobes
2024-05-22 1:38 [PATCH v2 0/4] Fix user stack traces captured from uprobes Andrii Nakryiko
2024-05-22 1:38 ` [PATCH v2 1/4] uprobes: rename get_trampoline_vaddr() and make it global Andrii Nakryiko
@ 2024-05-22 1:38 ` Andrii Nakryiko
2024-06-04 14:13 ` Masami Hiramatsu
2024-05-22 1:38 ` [PATCH v2 3/4] perf,x86: avoid missing caller address in stack traces captured in uprobe Andrii Nakryiko
` (2 subsequent siblings)
4 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2024-05-22 1:38 UTC (permalink / raw)
To: linux-trace-kernel, rostedt, mhiramat
Cc: x86, peterz, mingo, tglx, bpf, rihams, linux-perf-users,
Andrii Nakryiko, Riham Selim
When kernel has pending uretprobes installed, it hijacks original user
function return address on the stack with a uretprobe trampoline
address. There could be multiple such pending uretprobes (either on
different user functions or on the same recursive one) at any given
time within the same task.
This approach interferes with the user stack trace capture logic, which
would report surprising addresses (like 0x7fffffffe000) that correspond
to the special "[uprobes]" section that the kernel installs in the
target process address space for uretprobe trampoline code, while
logically each such entry should be an address somewhere within the
calling function of the traced user function.
This is easy to correct for, though. The uprobes subsystem keeps track
of pending uretprobes and records the original return addresses. This
patch uses that to add a post-processing step that restores each
trampoline address entry to the correct original return address. This is
done only if there are pending uretprobes for the current task.
This is a similar approach to what fprobe/kretprobe infrastructure is
doing when capturing kernel stack traces in the presence of pending
return probes.
Reported-by: Riham Selim <rihams@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
kernel/events/callchain.c | 43 ++++++++++++++++++++++++++++++++++++++-
kernel/events/uprobes.c | 9 ++++++++
2 files changed, 51 insertions(+), 1 deletion(-)
diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
index 1273be84392c..b17e3323f7f6 100644
--- a/kernel/events/callchain.c
+++ b/kernel/events/callchain.c
@@ -11,6 +11,7 @@
#include <linux/perf_event.h>
#include <linux/slab.h>
#include <linux/sched/task_stack.h>
+#include <linux/uprobes.h>
#include "internal.h"
@@ -176,13 +177,51 @@ put_callchain_entry(int rctx)
put_recursion_context(this_cpu_ptr(callchain_recursion), rctx);
}
+static void fixup_uretprobe_trampoline_entries(struct perf_callchain_entry *entry,
+ int start_entry_idx)
+{
+#ifdef CONFIG_UPROBES
+ struct uprobe_task *utask = current->utask;
+ struct return_instance *ri;
+ __u64 *cur_ip, *last_ip, tramp_addr;
+
+ if (likely(!utask || !utask->return_instances))
+ return;
+
+ cur_ip = &entry->ip[start_entry_idx];
+ last_ip = &entry->ip[entry->nr - 1];
+ ri = utask->return_instances;
+ tramp_addr = uprobe_get_trampoline_vaddr();
+
+ /*
+ * If there are pending uretprobes for the current thread, they are
+ * recorded in a list inside utask->return_instances; each such
+ * pending uretprobe replaces traced user function's return address on
+ * the stack, so when stack trace is captured, instead of seeing
+ * actual function's return address, we'll have one or many uretprobe
+ * trampoline addresses in the stack trace, which are not helpful and
+ * misleading to users.
+ * So here we go over the pending list of uretprobes, and each
+ * encountered trampoline address is replaced with actual return
+ * address.
+ */
+ while (ri && cur_ip <= last_ip) {
+ if (*cur_ip == tramp_addr) {
+ *cur_ip = ri->orig_ret_vaddr;
+ ri = ri->next;
+ }
+ cur_ip++;
+ }
+#endif
+}
+
struct perf_callchain_entry *
get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
u32 max_stack, bool crosstask, bool add_mark)
{
struct perf_callchain_entry *entry;
struct perf_callchain_entry_ctx ctx;
- int rctx;
+ int rctx, start_entry_idx;
entry = get_callchain_entry(&rctx);
if (!entry)
@@ -215,7 +254,9 @@ get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
if (add_mark)
perf_callchain_store_context(&ctx, PERF_CONTEXT_USER);
+ start_entry_idx = entry->nr;
perf_callchain_user(&ctx, regs);
+ fixup_uretprobe_trampoline_entries(entry, start_entry_idx);
}
}
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index d60d24f0f2f4..1c99380dc89d 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -2149,6 +2149,15 @@ static void handle_trampoline(struct pt_regs *regs)
instruction_pointer_set(regs, ri->orig_ret_vaddr);
do {
+ /* pop current instance from the stack of pending return instances,
+ * as it's not pending anymore: we just fixed up original
+ * instruction pointer in regs and are about to call handlers;
+ * this allows fixup_uretprobe_trampoline_entries() to properly fix up
+ * captured stack traces from uretprobe handlers, in which pending
+ * trampoline addresses on the stack are replaced with correct
+ * original return addresses
+ */
+ utask->return_instances = ri->next;
if (valid)
handle_uretprobe_chain(ri, regs);
ri = free_ret_instance(ri);
--
2.43.0
* [PATCH v2 3/4] perf,x86: avoid missing caller address in stack traces captured in uprobe
2024-05-22 1:38 [PATCH v2 0/4] Fix user stack traces captured from uprobes Andrii Nakryiko
2024-05-22 1:38 ` [PATCH v2 1/4] uprobes: rename get_trampoline_vaddr() and make it global Andrii Nakryiko
2024-05-22 1:38 ` [PATCH v2 2/4] perf,uprobes: fix user stack traces in the presence of pending uretprobes Andrii Nakryiko
@ 2024-05-22 1:38 ` Andrii Nakryiko
2024-06-04 9:24 ` Jiri Olsa
2024-06-04 14:06 ` Masami Hiramatsu
2024-05-22 1:38 ` [PATCH v2 4/4] selftests/bpf: add test validating uprobe/uretprobe stack traces Andrii Nakryiko
2024-06-03 21:09 ` [PATCH v2 0/4] Fix user stack traces captured from uprobes Andrii Nakryiko
4 siblings, 2 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2024-05-22 1:38 UTC (permalink / raw)
To: linux-trace-kernel, rostedt, mhiramat
Cc: x86, peterz, mingo, tglx, bpf, rihams, linux-perf-users,
Andrii Nakryiko
When tracing user functions with uprobe functionality, it's common to
install the probe (e.g., a BPF program) at the first instruction of the
function. This is often going to be the `push %rbp` instruction in the
function prologue, which means that within that function the frame
pointer hasn't been established yet. This leads to consistently missing
the actual caller of the traced function, because perf_callchain_user()
only records the current IP (capturing the traced function) and then
follows the frame pointer chain, which at that point is still the
caller's frame and therefore yields the address of the caller's caller.
So when we have a target_1 -> target_2 -> target_3 call chain and we are
tracing an entry to target_3, the captured stack trace will report a
target_1 -> target_3 call chain, which is wrong and confusing.
This patch proposes an x86-64-specific heuristic to detect the
`push %rbp` instruction being traced. If that's the case, under the
assumption that the application is compiled with frame pointers, this
instruction is a strong indicator that we are at the entry to the
function. In that case, the return address is still pointed to by %rsp,
so we fetch it and add it to the stack trace before proceeding to unwind
the rest using the frame pointer-based logic.
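To make the control flow concrete, below is a rough user-space-style
sketch of the resulting unwind logic. Helper names such as
read_user_u64() and store_entry() are illustrative only and are not
kernel APIs; the actual kernel change is the small perf_callchain_user()
hunk further down:

    /* sketch only: read_user_u64() and store_entry() stand in for the
     * kernel's __get_user() and perf_callchain_store()
     */
    static void unwind_user_stack(u64 ip, u64 sp, u64 fp, int at_func_entry)
    {
            u64 ret_addr, next_fp;

            store_entry(ip);               /* the traced function itself */

            /* at `push %rbp` the new frame is not set up yet, so the
             * immediate caller's return address is still at the top of
             * the stack and has to be fetched explicitly
             */
            if (at_func_entry && !read_user_u64(sp, &ret_addr))
                    store_entry(ret_addr);

            /* regular frame pointer walk: each frame holds the previous
             * frame pointer followed by the return address
             */
            while (fp &&
                   !read_user_u64(fp, &next_fp) &&
                   !read_user_u64(fp + 8, &ret_addr)) {
                    store_entry(ret_addr);
                    fp = next_fp;
            }
    }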
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
arch/x86/events/core.c | 20 ++++++++++++++++++++
include/linux/uprobes.h | 2 ++
kernel/events/uprobes.c | 2 ++
3 files changed, 24 insertions(+)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 5b0dd07b1ef1..82d5570b58ff 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2884,6 +2884,26 @@ perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs
return;
pagefault_disable();
+
+#ifdef CONFIG_UPROBES
+ /*
+ * If we are called from uprobe handler, and we are indeed at the very
+ * entry to user function (which is normally a `push %rbp` instruction,
+ * under assumption of application being compiled with frame pointers),
+ * we should read return address from *regs->sp before proceeding
+ * to follow frame pointers, otherwise we'll skip immediate caller
+ * as %rbp is not yet setup.
+ */
+ if (current->utask) {
+ struct arch_uprobe *auprobe = current->utask->auprobe;
+ u64 ret_addr;
+
+ if (auprobe && auprobe->insn[0] == 0x55 /* push %rbp */ &&
+ !__get_user(ret_addr, (const u64 __user *)regs->sp))
+ perf_callchain_store(entry, ret_addr);
+ }
+#endif
+
while (entry->nr < entry->max_stack) {
if (!valid_user_frame(fp, sizeof(frame)))
break;
diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index 0c57eec85339..7b785cd30d86 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -76,6 +76,8 @@ struct uprobe_task {
struct uprobe *active_uprobe;
unsigned long xol_vaddr;
+ struct arch_uprobe *auprobe;
+
struct return_instance *return_instances;
unsigned int depth;
};
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 1c99380dc89d..504693845187 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -2072,6 +2072,7 @@ static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
bool need_prep = false; /* prepare return uprobe, when needed */
down_read(&uprobe->register_rwsem);
+ current->utask->auprobe = &uprobe->arch;
for (uc = uprobe->consumers; uc; uc = uc->next) {
int rc = 0;
@@ -2086,6 +2087,7 @@ static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
remove &= rc;
}
+ current->utask->auprobe = NULL;
if (need_prep && !remove)
prepare_uretprobe(uprobe, regs); /* put bp at return */
--
2.43.0
* [PATCH v2 4/4] selftests/bpf: add test validating uprobe/uretprobe stack traces
2024-05-22 1:38 [PATCH v2 0/4] Fix user stack traces captured from uprobes Andrii Nakryiko
` (2 preceding siblings ...)
2024-05-22 1:38 ` [PATCH v2 3/4] perf,x86: avoid missing caller address in stack traces captured in uprobe Andrii Nakryiko
@ 2024-05-22 1:38 ` Andrii Nakryiko
2024-06-04 9:24 ` Jiri Olsa
` (2 more replies)
2024-06-03 21:09 ` [PATCH v2 0/4] Fix user stack traces captured from uprobes Andrii Nakryiko
4 siblings, 3 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2024-05-22 1:38 UTC (permalink / raw)
To: linux-trace-kernel, rostedt, mhiramat
Cc: x86, peterz, mingo, tglx, bpf, rihams, linux-perf-users,
Andrii Nakryiko
Add a set of tests to validate that stack traces captured from or in the
presence of active uprobes and uretprobes are valid and complete.
For this we use BPF programs that are installed either on entry to or
exit from user functions, plus a deeply nested USDT. One of the target
functions (target_1) is recursive, to generate two different entries in
the stack trace for the same uprobe/uretprobe, testing potential edge
conditions.
Without fixes in this patch set, we get something like this for one of
the scenarios:
caller: 0x758fff - 0x7595ab
target_1: 0x758fd5 - 0x758fff
target_2: 0x758fca - 0x758fd5
target_3: 0x758fbf - 0x758fca
target_4: 0x758fb3 - 0x758fbf
ENTRY #0: 0x758fb3 (in target_4)
ENTRY #1: 0x758fd3 (in target_2)
ENTRY #2: 0x758ffd (in target_1)
ENTRY #3: 0x7fffffffe000
ENTRY #4: 0x7fffffffe000
ENTRY #5: 0x6f8f39
ENTRY #6: 0x6fa6f0
ENTRY #7: 0x7f403f229590
Entries #3 and #4 (0x7fffffffe000) are uretprobe trampoline addresses
which obscure the two actual target_1 invocations. Also note that
between entry #0 and entry #1 we are missing an entry for target_3;
that gap is the missing-caller problem fixed by the heuristic in
patch #3.
With all the fixes, we get the desired full stack traces:
caller: 0x758fff - 0x7595ab
target_1: 0x758fd5 - 0x758fff
target_2: 0x758fca - 0x758fd5
target_3: 0x758fbf - 0x758fca
target_4: 0x758fb3 - 0x758fbf
ENTRY #0: 0x758fb7 (in target_4)
ENTRY #1: 0x758fc8 (in target_3)
ENTRY #2: 0x758fd3 (in target_2)
ENTRY #3: 0x758ffd (in target_1)
ENTRY #4: 0x758ff3 (in target_1)
ENTRY #5: 0x75922c (in caller)
ENTRY #6: 0x6f8f39
ENTRY #7: 0x6fa6f0
ENTRY #8: 0x7f986adc4cd0
Now there is a logical and complete sequence of function calls.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
.../bpf/prog_tests/uretprobe_stack.c | 186 ++++++++++++++++++
.../selftests/bpf/progs/uretprobe_stack.c | 96 +++++++++
2 files changed, 282 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
create mode 100644 tools/testing/selftests/bpf/progs/uretprobe_stack.c
diff --git a/tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c b/tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
new file mode 100644
index 000000000000..6deb8d560ddd
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
@@ -0,0 +1,186 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+
+#include <test_progs.h>
+#include "uretprobe_stack.skel.h"
+#include "../sdt.h"
+
+/* We set up target_1() -> target_2() -> target_3() -> target_4() -> USDT()
+ * call chain, each being traced by our BPF program. On entry or return from
+ * each target_*() we are capturing user stack trace and recording it in
+ * global variable, so that user space part of the test can validate it.
+ *
+ * Note, we put each target function into a custom section to get those
+ * __start_XXX/__stop_XXX symbols, generated by linker for us, which allow us
+ * to know address range of those functions
+ */
+__attribute__((section("uprobe__target_4")))
+__weak int target_4(void)
+{
+ STAP_PROBE1(uretprobe_stack, target, 42);
+ return 42;
+}
+
+extern const void *__start_uprobe__target_4;
+extern const void *__stop_uprobe__target_4;
+
+__attribute__((section("uprobe__target_3")))
+__weak int target_3(void)
+{
+ return target_4();
+}
+
+extern const void *__start_uprobe__target_3;
+extern const void *__stop_uprobe__target_3;
+
+__attribute__((section("uprobe__target_2")))
+__weak int target_2(void)
+{
+ return target_3();
+}
+
+extern const void *__start_uprobe__target_2;
+extern const void *__stop_uprobe__target_2;
+
+__attribute__((section("uprobe__target_1")))
+__weak int target_1(int depth)
+{
+ if (depth < 1)
+ return 1 + target_1(depth + 1);
+ else
+ return target_2();
+}
+
+extern const void *__start_uprobe__target_1;
+extern const void *__stop_uprobe__target_1;
+
+extern const void *__start_uretprobe_stack_sec;
+extern const void *__stop_uretprobe_stack_sec;
+
+struct range {
+ long start;
+ long stop;
+};
+
+static struct range targets[] = {
+ {}, /* we want target_1 to map to target[1], so need 1-based indexing */
+ { (long)&__start_uprobe__target_1, (long)&__stop_uprobe__target_1 },
+ { (long)&__start_uprobe__target_2, (long)&__stop_uprobe__target_2 },
+ { (long)&__start_uprobe__target_3, (long)&__stop_uprobe__target_3 },
+ { (long)&__start_uprobe__target_4, (long)&__stop_uprobe__target_4 },
+};
+
+static struct range caller = {
+ (long)&__start_uretprobe_stack_sec,
+ (long)&__stop_uretprobe_stack_sec,
+};
+
+static void validate_stack(__u64 *ips, int stack_len, int cnt, ...)
+{
+ int i, j;
+ va_list args;
+
+ if (!ASSERT_GT(stack_len, 0, "stack_len"))
+ return;
+
+ stack_len /= 8;
+
+ /* check if we have enough entries to satisfy test expectations */
+ if (!ASSERT_GE(stack_len, cnt, "stack_len2"))
+ return;
+
+ if (env.verbosity >= VERBOSE_NORMAL) {
+ printf("caller: %#lx - %#lx\n", caller.start, caller.stop);
+ for (i = 1; i < ARRAY_SIZE(targets); i++)
+ printf("target_%d: %#lx - %#lx\n", i, targets[i].start, targets[i].stop);
+ for (i = 0; i < stack_len; i++) {
+ for (j = 1; j < ARRAY_SIZE(targets); j++) {
+ if (ips[i] >= targets[j].start && ips[i] < targets[j].stop)
+ break;
+ }
+ if (j < ARRAY_SIZE(targets)) { /* found target match */
+ printf("ENTRY #%d: %#lx (in target_%d)\n", i, (long)ips[i], j);
+ } else if (ips[i] >= caller.start && ips[i] < caller.stop) {
+ printf("ENTRY #%d: %#lx (in caller)\n", i, (long)ips[i]);
+ } else {
+ printf("ENTRY #%d: %#lx\n", i, (long)ips[i]);
+ }
+ }
+ }
+
+ va_start(args, cnt);
+
+ for (i = cnt - 1; i >= 0; i--) {
+ /* most recent entry is the deepest target function */
+ const struct range *t = va_arg(args, const struct range *);
+
+ ASSERT_GE(ips[i], t->start, "addr_start");
+ ASSERT_LT(ips[i], t->stop, "addr_stop");
+ }
+
+ va_end(args);
+}
+
+/* __weak prevents inlining */
+__attribute__((section("uretprobe_stack_sec")))
+__weak void test_uretprobe_stack(void)
+{
+ LIBBPF_OPTS(bpf_uprobe_opts, uprobe_opts);
+ struct uretprobe_stack *skel;
+ int err;
+
+ skel = uretprobe_stack__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "skel_open"))
+ return;
+
+ err = uretprobe_stack__attach(skel);
+ if (!ASSERT_OK(err, "skel_attach"))
+ goto cleanup;
+
+ /* trigger */
+ ASSERT_EQ(target_1(0), 42 + 1, "trigger_return");
+
+ /*
+ * Stacks captured on ENTRY uprobes
+ */
+
+ /* (uprobe 1) target_1 in stack trace*/
+ validate_stack(skel->bss->entry_stack1, skel->bss->entry1_len,
+ 2, &caller, &targets[1]);
+ /* (uprobe 1, recursed) */
+ validate_stack(skel->bss->entry_stack1_recur, skel->bss->entry1_recur_len,
+ 3, &caller, &targets[1], &targets[1]);
+ /* (uprobe 2) caller -> target_1 -> target_1 -> target_2 */
+ validate_stack(skel->bss->entry_stack2, skel->bss->entry2_len,
+ 4, &caller, &targets[1], &targets[1], &targets[2]);
+ /* (uprobe 3) */
+ validate_stack(skel->bss->entry_stack3, skel->bss->entry3_len,
+ 5, &caller, &targets[1], &targets[1], &targets[2], &targets[3]);
+ /* (uprobe 4) caller -> target_1 -> target_1 -> target_2 -> target_3 -> target_4 */
+ validate_stack(skel->bss->entry_stack4, skel->bss->entry4_len,
+ 6, &caller, &targets[1], &targets[1], &targets[2], &targets[3], &targets[4]);
+
+ /* (USDT): full caller -> target_1 -> target_1 -> target_2 (uretprobed)
+ * -> target_3 -> target_4 (uretprobes) chain
+ */
+ validate_stack(skel->bss->usdt_stack, skel->bss->usdt_len,
+ 6, &caller, &targets[1], &targets[1], &targets[2], &targets[3], &targets[4]);
+
+ /*
+ * Now stacks captured on the way out in EXIT uprobes
+ */
+
+ /* (uretprobe 4) everything up to target_4, but excluding it */
+ validate_stack(skel->bss->exit_stack4, skel->bss->exit4_len,
+ 5, &caller, &targets[1], &targets[1], &targets[2], &targets[3]);
+ /* we didn't install uretprobes on target_2 and target_3 */
+ /* (uretprobe 1, recur) first target_1 call only */
+ validate_stack(skel->bss->exit_stack1_recur, skel->bss->exit1_recur_len,
+ 2, &caller, &targets[1]);
+ /* (uretprobe 1) just a caller in the stack trace */
+ validate_stack(skel->bss->exit_stack1, skel->bss->exit1_len,
+ 1, &caller);
+
+cleanup:
+ uretprobe_stack__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/uretprobe_stack.c b/tools/testing/selftests/bpf/progs/uretprobe_stack.c
new file mode 100644
index 000000000000..9fdcf396b8f4
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/uretprobe_stack.c
@@ -0,0 +1,96 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/usdt.bpf.h>
+
+char _license[] SEC("license") = "GPL";
+
+__u64 entry_stack1[32], exit_stack1[32];
+__u64 entry_stack1_recur[32], exit_stack1_recur[32];
+__u64 entry_stack2[32];
+__u64 entry_stack3[32];
+__u64 entry_stack4[32], exit_stack4[32];
+__u64 usdt_stack[32];
+
+int entry1_len, exit1_len;
+int entry1_recur_len, exit1_recur_len;
+int entry2_len, exit2_len;
+int entry3_len, exit3_len;
+int entry4_len, exit4_len;
+int usdt_len;
+
+#define SZ sizeof(usdt_stack)
+
+SEC("uprobe//proc/self/exe:target_1")
+int BPF_UPROBE(uprobe_1)
+{
+ /* target_1 is recursive with depth of 2, so we capture two separate
+ * stack traces, depending on which occurrence it is
+ */
+ static bool recur = false;
+
+ if (!recur)
+ entry1_len = bpf_get_stack(ctx, &entry_stack1, SZ, BPF_F_USER_STACK);
+ else
+ entry1_recur_len = bpf_get_stack(ctx, &entry_stack1_recur, SZ, BPF_F_USER_STACK);
+
+ recur = true;
+ return 0;
+}
+
+SEC("uretprobe//proc/self/exe:target_1")
+int BPF_URETPROBE(uretprobe_1)
+{
+ /* see above, target_1 is recursive */
+ static bool recur = false;
+
+ /* NOTE: order of returns is reversed to order of entries */
+ if (!recur)
+ exit1_recur_len = bpf_get_stack(ctx, &exit_stack1_recur, SZ, BPF_F_USER_STACK);
+ else
+ exit1_len = bpf_get_stack(ctx, &exit_stack1, SZ, BPF_F_USER_STACK);
+
+ recur = true;
+ return 0;
+}
+
+SEC("uprobe//proc/self/exe:target_2")
+int BPF_UPROBE(uprobe_2)
+{
+ entry2_len = bpf_get_stack(ctx, &entry_stack2, SZ, BPF_F_USER_STACK);
+ return 0;
+}
+
+/* no uretprobe for target_2 */
+
+SEC("uprobe//proc/self/exe:target_3")
+int BPF_UPROBE(uprobe_3)
+{
+ entry3_len = bpf_get_stack(ctx, &entry_stack3, SZ, BPF_F_USER_STACK);
+ return 0;
+}
+
+/* no uretprobe for target_3 */
+
+SEC("uprobe//proc/self/exe:target_4")
+int BPF_UPROBE(uprobe_4)
+{
+ entry4_len = bpf_get_stack(ctx, &entry_stack4, SZ, BPF_F_USER_STACK);
+ return 0;
+}
+
+SEC("uretprobe//proc/self/exe:target_4")
+int BPF_URETPROBE(uretprobe_4)
+{
+ exit4_len = bpf_get_stack(ctx, &exit_stack4, SZ, BPF_F_USER_STACK);
+ return 0;
+}
+
+SEC("usdt//proc/self/exe:uretprobe_stack:target")
+int BPF_USDT(usdt_probe)
+{
+ usdt_len = bpf_get_stack(ctx, &usdt_stack, SZ, BPF_F_USER_STACK);
+ return 0;
+}
--
2.43.0
* Re: [PATCH v2 0/4] Fix user stack traces captured from uprobes
2024-05-22 1:38 [PATCH v2 0/4] Fix user stack traces captured from uprobes Andrii Nakryiko
` (3 preceding siblings ...)
2024-05-22 1:38 ` [PATCH v2 4/4] selftests/bpf: add test validating uprobe/uretprobe stack traces Andrii Nakryiko
@ 2024-06-03 21:09 ` Andrii Nakryiko
4 siblings, 0 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2024-06-03 21:09 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: linux-trace-kernel, rostedt, mhiramat, x86, peterz, mingo, tglx,
bpf, rihams, linux-perf-users
On Tue, May 21, 2024 at 6:38 PM Andrii Nakryiko <andrii@kernel.org> wrote:
>
> This patch set reports two issues with captured stack traces.
>
> First issue, fixed in patch #2, deals with fixing up uretprobe trampoline
> addresses in captured stack trace. This issue happens when there are pending
> return probes, for which kernel hijacks some of the return addresses on user
> stacks. The code is matching those special uretprobe trampoline addresses with
> the list of pending return probe instances and replaces them with actual
> return addresses. This is the same fixup logic that fprobe/kretprobe has for
> kernel stack traces.
>
> Second issue, which patch #3 is fixing with the help of heuristic, is having
> to do with capturing user stack traces in entry uprobes. At the very entrance
> to user function, frame pointer in rbp register is not yet setup, so actual
> caller return address is still pointed to by rsp. Patch is using a simple
> heuristic, looking for `push %rbp` instruction, to fetch this extra direct
> caller return address, before proceeding to unwind the stack using rbp.
>
> Patch #4 adds tests into BPF selftests, that validate that captured stack
> traces at various points is what we expect to get. This patch, while being BPF
> selftests, is isolated from any other BPF selftests changes and can go in
> through non-BPF tree without the risk of merge conflicts.
>
> Patches are based on latest linux-trace/probes/for-next.
>
> v1->v2:
> - fixed GCC aggressively inlining test_uretprobe_stack() function (BPF CI);
> - fixed comments (Peter).
>
> Andrii Nakryiko (4):
> uprobes: rename get_trampoline_vaddr() and make it global
> perf,uprobes: fix user stack traces in the presence of pending
> uretprobes
> perf,x86: avoid missing caller address in stack traces captured in
> uprobe
> selftests/bpf: add test validating uprobe/uretprobe stack traces
>
> arch/x86/events/core.c | 20 ++
> include/linux/uprobes.h | 3 +
> kernel/events/callchain.c | 43 +++-
> kernel/events/uprobes.c | 17 +-
> .../bpf/prog_tests/uretprobe_stack.c | 186 ++++++++++++++++++
> .../selftests/bpf/progs/uretprobe_stack.c | 96 +++++++++
> 6 files changed, 361 insertions(+), 4 deletions(-)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
> create mode 100644 tools/testing/selftests/bpf/progs/uretprobe_stack.c
>
> --
> 2.43.0
>
Friendly ping. This is a real issue in practice that our production
users are eager to see fixed; please help get it upstream. Thanks!
* Re: [PATCH v2 3/4] perf,x86: avoid missing caller address in stack traces captured in uprobe
2024-05-22 1:38 ` [PATCH v2 3/4] perf,x86: avoid missing caller address in stack traces captured in uprobe Andrii Nakryiko
@ 2024-06-04 9:24 ` Jiri Olsa
2024-06-04 14:06 ` Masami Hiramatsu
1 sibling, 0 replies; 21+ messages in thread
From: Jiri Olsa @ 2024-06-04 9:24 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: linux-trace-kernel, rostedt, mhiramat, x86, peterz, mingo, tglx,
bpf, rihams, linux-perf-users
On Tue, May 21, 2024 at 06:38:44PM -0700, Andrii Nakryiko wrote:
> When tracing user functions with uprobe functionality, it's common to
> install the probe (e.g., a BPF program) at the first instruction of the
> function. This is often going to be `push %rbp` instruction in function
> preamble, which means that within that function frame pointer hasn't
> been established yet. This leads to consistently missing an actual
> caller of the traced function, because perf_callchain_user() only
> records current IP (capturing traced function) and then following frame
> pointer chain (which would be caller's frame, containing the address of
> caller's caller).
>
> So when we have target_1 -> target_2 -> target_3 call chain and we are
> tracing an entry to target_3, captured stack trace will report
> target_1 -> target_3 call chain, which is wrong and confusing.
>
> This patch proposes a x86-64-specific heuristic to detect `push %rbp`
> instruction being traced. If that's the case, with the assumption that
> applicatoin is compiled with frame pointers, this instruction would be
> a strong indicator that this is the entry to the function. In that case,
> return address is still pointed to by %rsp, so we fetch it and add to
> stack trace before proceeding to unwind the rest using frame
> pointer-based logic.
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
> arch/x86/events/core.c | 20 ++++++++++++++++++++
> include/linux/uprobes.h | 2 ++
> kernel/events/uprobes.c | 2 ++
> 3 files changed, 24 insertions(+)
>
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 5b0dd07b1ef1..82d5570b58ff 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -2884,6 +2884,26 @@ perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs
> return;
>
> pagefault_disable();
> +
> +#ifdef CONFIG_UPROBES
> + /*
> + * If we are called from uprobe handler, and we are indeed at the very
> + * entry to user function (which is normally a `push %rbp` instruction,
> + * under assumption of application being compiled with frame pointers),
> + * we should read return address from *regs->sp before proceeding
> + * to follow frame pointers, otherwise we'll skip immediate caller
> + * as %rbp is not yet setup.
> + */
> + if (current->utask) {
> + struct arch_uprobe *auprobe = current->utask->auprobe;
> + u64 ret_addr;
> +
> + if (auprobe && auprobe->insn[0] == 0x55 /* push %rbp */ &&
> + !__get_user(ret_addr, (const u64 __user *)regs->sp))
> + perf_callchain_store(entry, ret_addr);
> + }
> +#endif
> +
> while (entry->nr < entry->max_stack) {
> if (!valid_user_frame(fp, sizeof(frame)))
> break;
> diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
> index 0c57eec85339..7b785cd30d86 100644
> --- a/include/linux/uprobes.h
> +++ b/include/linux/uprobes.h
> @@ -76,6 +76,8 @@ struct uprobe_task {
> struct uprobe *active_uprobe;
> unsigned long xol_vaddr;
>
> + struct arch_uprobe *auprobe;
I wonder if we could use active_uprobe for this?
jirka
> +
> struct return_instance *return_instances;
> unsigned int depth;
> };
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 1c99380dc89d..504693845187 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -2072,6 +2072,7 @@ static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
> bool need_prep = false; /* prepare return uprobe, when needed */
>
> down_read(&uprobe->register_rwsem);
> + current->utask->auprobe = &uprobe->arch;
> for (uc = uprobe->consumers; uc; uc = uc->next) {
> int rc = 0;
>
> @@ -2086,6 +2087,7 @@ static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
>
> remove &= rc;
> }
> + current->utask->auprobe = NULL;
>
> if (need_prep && !remove)
> prepare_uretprobe(uprobe, regs); /* put bp at return */
> --
> 2.43.0
>
>
* Re: [PATCH v2 4/4] selftests/bpf: add test validating uprobe/uretprobe stack traces
2024-05-22 1:38 ` [PATCH v2 4/4] selftests/bpf: add test validating uprobe/uretprobe stack traces Andrii Nakryiko
@ 2024-06-04 9:24 ` Jiri Olsa
2024-06-25 1:14 ` Masami Hiramatsu
2024-06-25 1:22 ` Masami Hiramatsu
2 siblings, 0 replies; 21+ messages in thread
From: Jiri Olsa @ 2024-06-04 9:24 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: linux-trace-kernel, rostedt, mhiramat, x86, peterz, mingo, tglx,
bpf, rihams, linux-perf-users
On Tue, May 21, 2024 at 06:38:45PM -0700, Andrii Nakryiko wrote:
> Add a set of tests to validate that stack traces captured from or in the
> presence of active uprobes and uretprobes are valid and complete.
>
> For this we use BPF program that are installed either on entry or exit
> of user function, plus deep-nested USDT. One of target funtions
> (target_1) is recursive to generate two different entries in the stack
> trace for the same uprobe/uretprobe, testing potential edge conditions.
>
> Without fixes in this patch set, we get something like this for one of
> the scenarios:
>
> caller: 0x758fff - 0x7595ab
> target_1: 0x758fd5 - 0x758fff
> target_2: 0x758fca - 0x758fd5
> target_3: 0x758fbf - 0x758fca
> target_4: 0x758fb3 - 0x758fbf
> ENTRY #0: 0x758fb3 (in target_4)
> ENTRY #1: 0x758fd3 (in target_2)
> ENTRY #2: 0x758ffd (in target_1)
> ENTRY #3: 0x7fffffffe000
> ENTRY #4: 0x7fffffffe000
> ENTRY #5: 0x6f8f39
> ENTRY #6: 0x6fa6f0
> ENTRY #7: 0x7f403f229590
>
> Entry #3 and #4 (0x7fffffffe000) are uretprobe trampoline addresses
> which obscure actual target_1 and another target_1 invocations. Also
> note that between entry #0 and entry #1 we are missing an entry for
> target_3, which is fixed in patch #2.
>
> With all the fixes, we get desired full stack traces:
>
> caller: 0x758fff - 0x7595ab
> target_1: 0x758fd5 - 0x758fff
> target_2: 0x758fca - 0x758fd5
> target_3: 0x758fbf - 0x758fca
> target_4: 0x758fb3 - 0x758fbf
> ENTRY #0: 0x758fb7 (in target_4)
> ENTRY #1: 0x758fc8 (in target_3)
> ENTRY #2: 0x758fd3 (in target_2)
> ENTRY #3: 0x758ffd (in target_1)
> ENTRY #4: 0x758ff3 (in target_1)
> ENTRY #5: 0x75922c (in caller)
> ENTRY #6: 0x6f8f39
> ENTRY #7: 0x6fa6f0
> ENTRY #8: 0x7f986adc4cd0
>
> Now there is a logical and complete sequence of function calls.
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
jirka
> ---
> .../bpf/prog_tests/uretprobe_stack.c | 186 ++++++++++++++++++
> .../selftests/bpf/progs/uretprobe_stack.c | 96 +++++++++
> 2 files changed, 282 insertions(+)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
> create mode 100644 tools/testing/selftests/bpf/progs/uretprobe_stack.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c b/tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
> new file mode 100644
> index 000000000000..6deb8d560ddd
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
> @@ -0,0 +1,186 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
> +
> +#include <test_progs.h>
> +#include "uretprobe_stack.skel.h"
> +#include "../sdt.h"
> +
> +/* We set up target_1() -> target_2() -> target_3() -> target_4() -> USDT()
> + * call chain, each being traced by our BPF program. On entry or return from
> + * each target_*() we are capturing user stack trace and recording it in
> + * global variable, so that user space part of the test can validate it.
> + *
> + * Note, we put each target function into a custom section to get those
> + * __start_XXX/__stop_XXX symbols, generated by linker for us, which allow us
> + * to know address range of those functions
> + */
> +__attribute__((section("uprobe__target_4")))
> +__weak int target_4(void)
> +{
> + STAP_PROBE1(uretprobe_stack, target, 42);
> + return 42;
> +}
> +
> +extern const void *__start_uprobe__target_4;
> +extern const void *__stop_uprobe__target_4;
> +
> +__attribute__((section("uprobe__target_3")))
> +__weak int target_3(void)
> +{
> + return target_4();
> +}
> +
> +extern const void *__start_uprobe__target_3;
> +extern const void *__stop_uprobe__target_3;
> +
> +__attribute__((section("uprobe__target_2")))
> +__weak int target_2(void)
> +{
> + return target_3();
> +}
> +
> +extern const void *__start_uprobe__target_2;
> +extern const void *__stop_uprobe__target_2;
> +
> +__attribute__((section("uprobe__target_1")))
> +__weak int target_1(int depth)
> +{
> + if (depth < 1)
> + return 1 + target_1(depth + 1);
> + else
> + return target_2();
> +}
> +
> +extern const void *__start_uprobe__target_1;
> +extern const void *__stop_uprobe__target_1;
> +
> +extern const void *__start_uretprobe_stack_sec;
> +extern const void *__stop_uretprobe_stack_sec;
> +
> +struct range {
> + long start;
> + long stop;
> +};
> +
> +static struct range targets[] = {
> + {}, /* we want target_1 to map to target[1], so need 1-based indexing */
> + { (long)&__start_uprobe__target_1, (long)&__stop_uprobe__target_1 },
> + { (long)&__start_uprobe__target_2, (long)&__stop_uprobe__target_2 },
> + { (long)&__start_uprobe__target_3, (long)&__stop_uprobe__target_3 },
> + { (long)&__start_uprobe__target_4, (long)&__stop_uprobe__target_4 },
> +};
> +
> +static struct range caller = {
> + (long)&__start_uretprobe_stack_sec,
> + (long)&__stop_uretprobe_stack_sec,
> +};
> +
> +static void validate_stack(__u64 *ips, int stack_len, int cnt, ...)
> +{
> + int i, j;
> + va_list args;
> +
> + if (!ASSERT_GT(stack_len, 0, "stack_len"))
> + return;
> +
> + stack_len /= 8;
> +
> + /* check if we have enough entries to satisfy test expectations */
> + if (!ASSERT_GE(stack_len, cnt, "stack_len2"))
> + return;
> +
> + if (env.verbosity >= VERBOSE_NORMAL) {
> + printf("caller: %#lx - %#lx\n", caller.start, caller.stop);
> + for (i = 1; i < ARRAY_SIZE(targets); i++)
> + printf("target_%d: %#lx - %#lx\n", i, targets[i].start, targets[i].stop);
> + for (i = 0; i < stack_len; i++) {
> + for (j = 1; j < ARRAY_SIZE(targets); j++) {
> + if (ips[i] >= targets[j].start && ips[i] < targets[j].stop)
> + break;
> + }
> + if (j < ARRAY_SIZE(targets)) { /* found target match */
> + printf("ENTRY #%d: %#lx (in target_%d)\n", i, (long)ips[i], j);
> + } else if (ips[i] >= caller.start && ips[i] < caller.stop) {
> + printf("ENTRY #%d: %#lx (in caller)\n", i, (long)ips[i]);
> + } else {
> + printf("ENTRY #%d: %#lx\n", i, (long)ips[i]);
> + }
> + }
> + }
> +
> + va_start(args, cnt);
> +
> + for (i = cnt - 1; i >= 0; i--) {
> + /* most recent entry is the deepest target function */
> + const struct range *t = va_arg(args, const struct range *);
> +
> + ASSERT_GE(ips[i], t->start, "addr_start");
> + ASSERT_LT(ips[i], t->stop, "addr_stop");
> + }
> +
> + va_end(args);
> +}
> +
> +/* __weak prevents inlining */
> +__attribute__((section("uretprobe_stack_sec")))
> +__weak void test_uretprobe_stack(void)
> +{
> + LIBBPF_OPTS(bpf_uprobe_opts, uprobe_opts);
> + struct uretprobe_stack *skel;
> + int err;
> +
> + skel = uretprobe_stack__open_and_load();
> + if (!ASSERT_OK_PTR(skel, "skel_open"))
> + return;
> +
> + err = uretprobe_stack__attach(skel);
> + if (!ASSERT_OK(err, "skel_attach"))
> + goto cleanup;
> +
> + /* trigger */
> + ASSERT_EQ(target_1(0), 42 + 1, "trigger_return");
> +
> + /*
> + * Stacks captured on ENTRY uprobes
> + */
> +
> + /* (uprobe 1) target_1 in stack trace*/
> + validate_stack(skel->bss->entry_stack1, skel->bss->entry1_len,
> + 2, &caller, &targets[1]);
> + /* (uprobe 1, recursed) */
> + validate_stack(skel->bss->entry_stack1_recur, skel->bss->entry1_recur_len,
> + 3, &caller, &targets[1], &targets[1]);
> + /* (uprobe 2) caller -> target_1 -> target_1 -> target_2 */
> + validate_stack(skel->bss->entry_stack2, skel->bss->entry2_len,
> + 4, &caller, &targets[1], &targets[1], &targets[2]);
> + /* (uprobe 3) */
> + validate_stack(skel->bss->entry_stack3, skel->bss->entry3_len,
> + 5, &caller, &targets[1], &targets[1], &targets[2], &targets[3]);
> + /* (uprobe 4) caller -> target_1 -> target_1 -> target_2 -> target_3 -> target_4 */
> + validate_stack(skel->bss->entry_stack4, skel->bss->entry4_len,
> + 6, &caller, &targets[1], &targets[1], &targets[2], &targets[3], &targets[4]);
> +
> + /* (USDT): full caller -> target_1 -> target_1 -> target_2 (uretprobed)
> + * -> target_3 -> target_4 (uretprobes) chain
> + */
> + validate_stack(skel->bss->usdt_stack, skel->bss->usdt_len,
> + 6, &caller, &targets[1], &targets[1], &targets[2], &targets[3], &targets[4]);
> +
> + /*
> + * Now stacks captured on the way out in EXIT uprobes
> + */
> +
> + /* (uretprobe 4) everything up to target_4, but excluding it */
> + validate_stack(skel->bss->exit_stack4, skel->bss->exit4_len,
> + 5, &caller, &targets[1], &targets[1], &targets[2], &targets[3]);
> + /* we didn't install uretprobes on target_2 and target_3 */
> + /* (uretprobe 1, recur) first target_1 call only */
> + validate_stack(skel->bss->exit_stack1_recur, skel->bss->exit1_recur_len,
> + 2, &caller, &targets[1]);
> + /* (uretprobe 1) just a caller in the stack trace */
> + validate_stack(skel->bss->exit_stack1, skel->bss->exit1_len,
> + 1, &caller);
> +
> +cleanup:
> + uretprobe_stack__destroy(skel);
> +}
> diff --git a/tools/testing/selftests/bpf/progs/uretprobe_stack.c b/tools/testing/selftests/bpf/progs/uretprobe_stack.c
> new file mode 100644
> index 000000000000..9fdcf396b8f4
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/uretprobe_stack.c
> @@ -0,0 +1,96 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
> +#include <vmlinux.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +#include <bpf/usdt.bpf.h>
> +
> +char _license[] SEC("license") = "GPL";
> +
> +__u64 entry_stack1[32], exit_stack1[32];
> +__u64 entry_stack1_recur[32], exit_stack1_recur[32];
> +__u64 entry_stack2[32];
> +__u64 entry_stack3[32];
> +__u64 entry_stack4[32], exit_stack4[32];
> +__u64 usdt_stack[32];
> +
> +int entry1_len, exit1_len;
> +int entry1_recur_len, exit1_recur_len;
> +int entry2_len, exit2_len;
> +int entry3_len, exit3_len;
> +int entry4_len, exit4_len;
> +int usdt_len;
> +
> +#define SZ sizeof(usdt_stack)
> +
> +SEC("uprobe//proc/self/exe:target_1")
> +int BPF_UPROBE(uprobe_1)
> +{
> + /* target_1 is recursive wit depth of 2, so we capture two separate
> + * stack traces, depending on which occurence it is
> + */
> + static bool recur = false;
> +
> + if (!recur)
> + entry1_len = bpf_get_stack(ctx, &entry_stack1, SZ, BPF_F_USER_STACK);
> + else
> + entry1_recur_len = bpf_get_stack(ctx, &entry_stack1_recur, SZ, BPF_F_USER_STACK);
> +
> + recur = true;
> + return 0;
> +}
> +
> +SEC("uretprobe//proc/self/exe:target_1")
> +int BPF_URETPROBE(uretprobe_1)
> +{
> + /* see above, target_1 is recursive */
> + static bool recur = false;
> +
> + /* NOTE: order of returns is reversed to order of entries */
> + if (!recur)
> + exit1_recur_len = bpf_get_stack(ctx, &exit_stack1_recur, SZ, BPF_F_USER_STACK);
> + else
> + exit1_len = bpf_get_stack(ctx, &exit_stack1, SZ, BPF_F_USER_STACK);
> +
> + recur = true;
> + return 0;
> +}
> +
> +SEC("uprobe//proc/self/exe:target_2")
> +int BPF_UPROBE(uprobe_2)
> +{
> + entry2_len = bpf_get_stack(ctx, &entry_stack2, SZ, BPF_F_USER_STACK);
> + return 0;
> +}
> +
> +/* no uretprobe for target_2 */
> +
> +SEC("uprobe//proc/self/exe:target_3")
> +int BPF_UPROBE(uprobe_3)
> +{
> + entry3_len = bpf_get_stack(ctx, &entry_stack3, SZ, BPF_F_USER_STACK);
> + return 0;
> +}
> +
> +/* no uretprobe for target_3 */
> +
> +SEC("uprobe//proc/self/exe:target_4")
> +int BPF_UPROBE(uprobe_4)
> +{
> + entry4_len = bpf_get_stack(ctx, &entry_stack4, SZ, BPF_F_USER_STACK);
> + return 0;
> +}
> +
> +SEC("uretprobe//proc/self/exe:target_4")
> +int BPF_URETPROBE(uretprobe_4)
> +{
> + exit4_len = bpf_get_stack(ctx, &exit_stack4, SZ, BPF_F_USER_STACK);
> + return 0;
> +}
> +
> +SEC("usdt//proc/self/exe:uretprobe_stack:target")
> +int BPF_USDT(usdt_probe)
> +{
> + usdt_len = bpf_get_stack(ctx, &usdt_stack, SZ, BPF_F_USER_STACK);
> + return 0;
> +}
> --
> 2.43.0
>
>
* Re: [PATCH v2 3/4] perf,x86: avoid missing caller address in stack traces captured in uprobe
2024-05-22 1:38 ` [PATCH v2 3/4] perf,x86: avoid missing caller address in stack traces captured in uprobe Andrii Nakryiko
2024-06-04 9:24 ` Jiri Olsa
@ 2024-06-04 14:06 ` Masami Hiramatsu
2024-06-04 17:13 ` Andrii Nakryiko
1 sibling, 1 reply; 21+ messages in thread
From: Masami Hiramatsu @ 2024-06-04 14:06 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: linux-trace-kernel, rostedt, x86, peterz, mingo, tglx, bpf,
rihams, linux-perf-users
On Tue, 21 May 2024 18:38:44 -0700
Andrii Nakryiko <andrii@kernel.org> wrote:
> When tracing user functions with uprobe functionality, it's common to
> install the probe (e.g., a BPF program) at the first instruction of the
> function. This is often going to be `push %rbp` instruction in function
> preamble, which means that within that function frame pointer hasn't
> been established yet. This leads to consistently missing an actual
> caller of the traced function, because perf_callchain_user() only
> records current IP (capturing traced function) and then following frame
> pointer chain (which would be caller's frame, containing the address of
> caller's caller).
I thought this problem might be solved by sframe.
>
> So when we have target_1 -> target_2 -> target_3 call chain and we are
> tracing an entry to target_3, captured stack trace will report
> target_1 -> target_3 call chain, which is wrong and confusing.
>
> This patch proposes a x86-64-specific heuristic to detect `push %rbp`
> instruction being traced.
I like this kind of idea :) But I think this should be done in user
space, not in the kernel, because it is not always certain that the user
program uses stack frames.
> If that's the case, with the assumption that
> applicatoin is compiled with frame pointers, this instruction would be
> a strong indicator that this is the entry to the function. In that case,
> return address is still pointed to by %rsp, so we fetch it and add to
> stack trace before proceeding to unwind the rest using frame
> pointer-based logic.
Why don't we do this in the user-space BPF program? If it is done in
user space, like perf-probe, I'm OK with it. But I'm doubtful about
doing this in the kernel; that is not flexible.
More than anything, without a user-space helper to find function
symbols, uprobes do not know where the function entry is. So I'm curious
why you don't do this in user space.
At least, this should be done in the user of uprobes, like trace_uprobe
or bpf.
Thank you,
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
> arch/x86/events/core.c | 20 ++++++++++++++++++++
> include/linux/uprobes.h | 2 ++
> kernel/events/uprobes.c | 2 ++
> 3 files changed, 24 insertions(+)
>
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 5b0dd07b1ef1..82d5570b58ff 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -2884,6 +2884,26 @@ perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs
> return;
>
> pagefault_disable();
> +
> +#ifdef CONFIG_UPROBES
> + /*
> + * If we are called from uprobe handler, and we are indeed at the very
> + * entry to user function (which is normally a `push %rbp` instruction,
> + * under assumption of application being compiled with frame pointers),
> + * we should read return address from *regs->sp before proceeding
> + * to follow frame pointers, otherwise we'll skip immediate caller
> + * as %rbp is not yet setup.
> + */
> + if (current->utask) {
> + struct arch_uprobe *auprobe = current->utask->auprobe;
> + u64 ret_addr;
> +
> + if (auprobe && auprobe->insn[0] == 0x55 /* push %rbp */ &&
> + !__get_user(ret_addr, (const u64 __user *)regs->sp))
> + perf_callchain_store(entry, ret_addr);
> + }
> +#endif
> +
> while (entry->nr < entry->max_stack) {
> if (!valid_user_frame(fp, sizeof(frame)))
> break;
> diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
> index 0c57eec85339..7b785cd30d86 100644
> --- a/include/linux/uprobes.h
> +++ b/include/linux/uprobes.h
> @@ -76,6 +76,8 @@ struct uprobe_task {
> struct uprobe *active_uprobe;
> unsigned long xol_vaddr;
>
> + struct arch_uprobe *auprobe;
> +
> struct return_instance *return_instances;
> unsigned int depth;
> };
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 1c99380dc89d..504693845187 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -2072,6 +2072,7 @@ static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
> bool need_prep = false; /* prepare return uprobe, when needed */
>
> down_read(&uprobe->register_rwsem);
> + current->utask->auprobe = &uprobe->arch;
> for (uc = uprobe->consumers; uc; uc = uc->next) {
> int rc = 0;
>
> @@ -2086,6 +2087,7 @@ static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
>
> remove &= rc;
> }
> + current->utask->auprobe = NULL;
>
> if (need_prep && !remove)
> prepare_uretprobe(uprobe, regs); /* put bp at return */
> --
> 2.43.0
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
* Re: [PATCH v2 2/4] perf,uprobes: fix user stack traces in the presence of pending uretprobes
2024-05-22 1:38 ` [PATCH v2 2/4] perf,uprobes: fix user stack traces in the presence of pending uretprobes Andrii Nakryiko
@ 2024-06-04 14:13 ` Masami Hiramatsu
2024-06-04 17:16 ` Andrii Nakryiko
0 siblings, 1 reply; 21+ messages in thread
From: Masami Hiramatsu @ 2024-06-04 14:13 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: linux-trace-kernel, rostedt, x86, peterz, mingo, tglx, bpf,
rihams, linux-perf-users, Riham Selim
On Tue, 21 May 2024 18:38:43 -0700
Andrii Nakryiko <andrii@kernel.org> wrote:
> When kernel has pending uretprobes installed, it hijacks original user
> function return address on the stack with a uretprobe trampoline
> address. There could be multiple such pending uretprobes (either on
> different user functions or on the same recursive one) at any given
> time within the same task.
>
> This approach interferes with the user stack trace capture logic, which
> would report suprising addresses (like 0x7fffffffe000) that correspond
> to a special "[uprobes]" section that kernel installs in the target
> process address space for uretprobe trampoline code, while logically it
> should be an address somewhere within the calling function of another
> traced user function.
>
> This is easy to correct for, though. Uprobes subsystem keeps track of
> pending uretprobes and records original return addresses. This patch is
> using this to do a post-processing step and restore each trampoline
> address entries with correct original return address. This is done only
> if there are pending uretprobes for current task.
>
> This is a similar approach to what fprobe/kretprobe infrastructure is
> doing when capturing kernel stack traces in the presence of pending
> return probes.
>
This looks good to me because this trampoline information is only
managed in uprobes. And it should be provided when unwinding user
stack.
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Thank you!
> Reported-by: Riham Selim <rihams@meta.com>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
> kernel/events/callchain.c | 43 ++++++++++++++++++++++++++++++++++++++-
> kernel/events/uprobes.c | 9 ++++++++
> 2 files changed, 51 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
> index 1273be84392c..b17e3323f7f6 100644
> --- a/kernel/events/callchain.c
> +++ b/kernel/events/callchain.c
> @@ -11,6 +11,7 @@
> #include <linux/perf_event.h>
> #include <linux/slab.h>
> #include <linux/sched/task_stack.h>
> +#include <linux/uprobes.h>
>
> #include "internal.h"
>
> @@ -176,13 +177,51 @@ put_callchain_entry(int rctx)
> put_recursion_context(this_cpu_ptr(callchain_recursion), rctx);
> }
>
> +static void fixup_uretprobe_trampoline_entries(struct perf_callchain_entry *entry,
> + int start_entry_idx)
> +{
> +#ifdef CONFIG_UPROBES
> + struct uprobe_task *utask = current->utask;
> + struct return_instance *ri;
> + __u64 *cur_ip, *last_ip, tramp_addr;
> +
> + if (likely(!utask || !utask->return_instances))
> + return;
> +
> + cur_ip = &entry->ip[start_entry_idx];
> + last_ip = &entry->ip[entry->nr - 1];
> + ri = utask->return_instances;
> + tramp_addr = uprobe_get_trampoline_vaddr();
> +
> + /*
> + * If there are pending uretprobes for the current thread, they are
> + * recorded in a list inside utask->return_instances; each such
> + * pending uretprobe replaces traced user function's return address on
> + * the stack, so when stack trace is captured, instead of seeing
> + * actual function's return address, we'll have one or many uretprobe
> + * trampoline addresses in the stack trace, which are not helpful and
> + * misleading to users.
> + * So here we go over the pending list of uretprobes, and each
> + * encountered trampoline address is replaced with actual return
> + * address.
> + */
> + while (ri && cur_ip <= last_ip) {
> + if (*cur_ip == tramp_addr) {
> + *cur_ip = ri->orig_ret_vaddr;
> + ri = ri->next;
> + }
> + cur_ip++;
> + }
> +#endif
> +}
> +
> struct perf_callchain_entry *
> get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
> u32 max_stack, bool crosstask, bool add_mark)
> {
> struct perf_callchain_entry *entry;
> struct perf_callchain_entry_ctx ctx;
> - int rctx;
> + int rctx, start_entry_idx;
>
> entry = get_callchain_entry(&rctx);
> if (!entry)
> @@ -215,7 +254,9 @@ get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
> if (add_mark)
> perf_callchain_store_context(&ctx, PERF_CONTEXT_USER);
>
> + start_entry_idx = entry->nr;
> perf_callchain_user(&ctx, regs);
> + fixup_uretprobe_trampoline_entries(entry, start_entry_idx);
> }
> }
>
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index d60d24f0f2f4..1c99380dc89d 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -2149,6 +2149,15 @@ static void handle_trampoline(struct pt_regs *regs)
>
> instruction_pointer_set(regs, ri->orig_ret_vaddr);
> do {
> + /* pop current instance from the stack of pending return instances,
> + * as it's not pending anymore: we just fixed up original
> + * instruction pointer in regs and are about to call handlers;
> + * this allows fixup_uretprobe_trampoline_entries() to properly fix up
> + * captured stack traces from uretprobe handlers, in which pending
> + * trampoline addresses on the stack are replaced with correct
> + * original return addresses
> + */
> + utask->return_instances = ri->next;
> if (valid)
> handle_uretprobe_chain(ri, regs);
> ri = free_ret_instance(ri);
> --
> 2.43.0
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/4] perf,x86: avoid missing caller address in stack traces captured in uprobe
2024-06-04 14:06 ` Masami Hiramatsu
@ 2024-06-04 17:13 ` Andrii Nakryiko
0 siblings, 0 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2024-06-04 17:13 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Andrii Nakryiko, linux-trace-kernel, rostedt, x86, peterz, mingo,
tglx, bpf, rihams, linux-perf-users
On Tue, Jun 4, 2024 at 7:06 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Tue, 21 May 2024 18:38:44 -0700
> Andrii Nakryiko <andrii@kernel.org> wrote:
>
> > When tracing user functions with uprobe functionality, it's common to
> > install the probe (e.g., a BPF program) at the first instruction of the
> > function. This is often going to be the `push %rbp` instruction in the
> > function preamble, which means that within that function the frame
> > pointer hasn't been established yet. This leads to consistently missing
> > the actual caller of the traced function, because perf_callchain_user()
> > only records the current IP (capturing the traced function) and then
> > follows the frame pointer chain (which would be the caller's frame,
> > containing the address of the caller's caller).
>
> I thought this problem might be solved by sframe.
Eventually, yes, when real-world applications switch to sframe and we
get proper support for it in the kernel. But right now there are tons
of applications relying on kernel capturing stack traces based on
frame pointers, so it would be good to improve that as well.
>
> >
> > So when we have a target_1 -> target_2 -> target_3 call chain and we are
> > tracing an entry to target_3, the captured stack trace will report a
> > target_1 -> target_3 call chain, which is wrong and confusing.
> >
> > This patch proposes an x86-64-specific heuristic to detect that the
> > `push %rbp` instruction is being traced.
>
> I like this kind of idea :) But I think this should be done in
> the user-space, not in the kernel because it is not always sure
> that the user program uses stack frames.
Existing kernel code that captures user space stack trace already
assumes that code was compiled with a frame pointer (unconditionally),
because that's the best kernel can do. So under that assumption this
heuristic is valid and not harmful, IMO.
User space can do nothing about this, because it is the kernel that
captures stack trace (e.g., from BPF program), and we will lose the
calling frame if we don't do it here.
>
> > If that's the case, with the assumption that
> > the application is compiled with frame pointers, this instruction would be
> > a strong indicator that this is the entry to the function. In that case,
> > the return address is still pointed to by %rsp, so we fetch it and add it
> > to the stack trace before proceeding to unwind the rest using frame
> > pointer-based logic.
>
> Why don't we make it in the userspace BPF program? If it is done
> in the user space, like perf-probe, I'm OK. But I doubt to do this in
> kernel. That means it is not flexible.
>
You mean for the BPF program to capture the entire stack trace by
itself, without asking the kernel for help? It's doable, but:
a) it's inconvenient for all users to have to reimplement this
low-level "primitive" operation, that we already promise is provided
by kernel (through bpf_get_stack() API, and kernel has internal
perf_callchain API for this)
b) it's faster for kernel to do this, as kernel code disables page
faults once and unwinds the stack, while BPF program would have to do
multiple bpf_probe_read_user() calls, each individually disabling page
faults.
But really, there is an already existing API, which in some cases
returns partially invalid stack traces (skipping function caller's
frame), so this is an attempt to fix this issue.
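(For illustration only, not part of the patches: a BPF program doing its own
frame-pointer walk, as discussed in point b) above, might look roughly like
the sketch below. The program name, array size, and loop bound are made up,
and it assumes an x86-64 target built with frame pointers.)
/* Hypothetical sketch of a BPF-side user stack unwinder; every frame costs
 * a separate bpf_probe_read_user() call, whereas bpf_get_stack() does the
 * whole walk in one kernel-side pass with page faults disabled once.
 */
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
char _license[] SEC("license") = "GPL";
__u64 ips[16]; /* collected return addresses */
SEC("uprobe")
int BPF_UPROBE(manual_fp_unwind)
{
	/* frame[0] = saved %rbp, frame[1] = return address */
	__u64 frame[2];
	__u64 fp = PT_REGS_FP(ctx); /* %rbp; assumes __TARGET_ARCH_x86 */
	int i;
	for (i = 0; i < 16; i++) { /* bounded loop for the verifier */
		if (bpf_probe_read_user(frame, sizeof(frame), (void *)fp))
			break;
		ips[i] = frame[1]; /* caller's return address */
		fp = frame[0];     /* follow saved %rbp to the next frame */
		if (!fp)
			break;
	}
	return 0;
}
Note that, just like the in-kernel unwinder, such a walk would still miss the
direct caller when attached at the first instruction of a function, which is
exactly the case this patch is trying to handle.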
> More than anything, without user-space helper to find function
> symbols, uprobe does not know the function entry. Then I'm curious
> why don't you do this in the user space.
You are mixing stack *capture* (in which we get memory addresses
representing where a function call or currently running instruction
pointer is) with stack *symbolization* (where user space needs ELF
symbols and/or DWARF information to translate those addresses into
something human-readable).
This heuristic improves the former as performed by the kernel. Stack
symbolization is completely orthogonal to all of this.
>
> At least, this should be done in the user of uprobes, like trace_uprobe
> or bpf.
>
This is a really miserable user experience, if they have to implement
their own stack trace capture for uprobes, but use built-in
bpf_get_stack() API for any other type of program.
>
> Thank you,
>
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > ---
> > arch/x86/events/core.c | 20 ++++++++++++++++++++
> > include/linux/uprobes.h | 2 ++
> > kernel/events/uprobes.c | 2 ++
> > 3 files changed, 24 insertions(+)
> >
[...]
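(Again purely as an illustration, and not the diff elided above: on x86-64 the
kind of entry check described in the commit message could be sketched as
follows. The function name is made up; copy_from_user_nmi() is an existing x86
helper for safely reading user memory from perf/NMI-like context, and 0x55 is
the one-byte opcode of `push %rbp`.)
/* Hypothetical sketch, imagined next to perf_callchain_user() in
 * arch/x86/events/core.c: check whether the instruction at the current user
 * IP is `push %rbp`, which the commit message uses as a heuristic for being
 * at the entry of a frame-pointer-using function.
 */
static bool probed_insn_is_push_rbp(struct pt_regs *regs)
{
	u8 insn;
	if (copy_from_user_nmi(&insn, (void __user *)regs->ip, 1))
		return false; /* could not read user memory, assume "no" */
	return insn == 0x55; /* opcode of `push %rbp` */
}
If that check succeeds, the unwinder would record the value at
(u64 __user *)regs->sp as one extra callchain entry (the caller's return
address, still at the top of the stack) before walking the usual %rbp chain.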
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/4] perf,uprobes: fix user stack traces in the presence of pending uretprobes
2024-06-04 14:13 ` Masami Hiramatsu
@ 2024-06-04 17:16 ` Andrii Nakryiko
2024-06-17 22:37 ` Andrii Nakryiko
0 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2024-06-04 17:16 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Andrii Nakryiko, linux-trace-kernel, rostedt, x86, peterz, mingo,
tglx, bpf, rihams, linux-perf-users, Riham Selim
On Tue, Jun 4, 2024 at 7:13 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Tue, 21 May 2024 18:38:43 -0700
> Andrii Nakryiko <andrii@kernel.org> wrote:
>
> > When kernel has pending uretprobes installed, it hijacks original user
> > function return address on the stack with a uretprobe trampoline
> > address. There could be multiple such pending uretprobes (either on
> > different user functions or on the same recursive one) at any given
> > time within the same task.
> >
> > This approach interferes with the user stack trace capture logic, which
> > would report surprising addresses (like 0x7fffffffe000) that correspond
> > to a special "[uprobes]" section that kernel installs in the target
> > process address space for uretprobe trampoline code, while logically it
> > should be an address somewhere within the calling function of another
> > traced user function.
> >
> > This is easy to correct for, though. Uprobes subsystem keeps track of
> > pending uretprobes and records original return addresses. This patch is
> > using this to do a post-processing step and restore each trampoline
> > address entry to the correct original return address. This is done only
> > if there are pending uretprobes for current task.
> >
> > This is a similar approach to what fprobe/kretprobe infrastructure is
> > doing when capturing kernel stack traces in the presence of pending
> > return probes.
> >
>
> This looks good to me because this trampoline information is only
> managed in uprobes. And it should be provided when unwinding user
> stack.
>
> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> Thank you!
Great, thanks for reviewing, Masami!
Would you take this fix through your tree, or where should it be routed to?
>
> > Reported-by: Riham Selim <rihams@meta.com>
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > ---
> > kernel/events/callchain.c | 43 ++++++++++++++++++++++++++++++++++++++-
> > kernel/events/uprobes.c | 9 ++++++++
> > 2 files changed, 51 insertions(+), 1 deletion(-)
> >
[...]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/4] perf,uprobes: fix user stack traces in the presence of pending uretprobes
2024-06-04 17:16 ` Andrii Nakryiko
@ 2024-06-17 22:37 ` Andrii Nakryiko
2024-06-24 20:32 ` Andrii Nakryiko
0 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2024-06-17 22:37 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Andrii Nakryiko, linux-trace-kernel, rostedt, x86, peterz, mingo,
tglx, bpf, rihams, linux-perf-users, Riham Selim
On Tue, Jun 4, 2024 at 10:16 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Jun 4, 2024 at 7:13 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Tue, 21 May 2024 18:38:43 -0700
> > Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > > When kernel has pending uretprobes installed, it hijacks original user
> > > function return address on the stack with a uretprobe trampoline
> > > address. There could be multiple such pending uretprobes (either on
> > > different user functions or on the same recursive one) at any given
> > > time within the same task.
> > >
> > > This approach interferes with the user stack trace capture logic, which
> > > would report surprising addresses (like 0x7fffffffe000) that correspond
> > > to a special "[uprobes]" section that kernel installs in the target
> > > process address space for uretprobe trampoline code, while logically it
> > > should be an address somewhere within the calling function of another
> > > traced user function.
> > >
> > > This is easy to correct for, though. Uprobes subsystem keeps track of
> > > pending uretprobes and records original return addresses. This patch is
> > > using this to do a post-processing step and restore each trampoline
> > > address entry to the correct original return address. This is done only
> > > if there are pending uretprobes for current task.
> > >
> > > This is a similar approach to what fprobe/kretprobe infrastructure is
> > > doing when capturing kernel stack traces in the presence of pending
> > > return probes.
> > >
> >
> > This looks good to me because this trampoline information is only
> > managed in uprobes. And it should be provided when unwinding user
> > stack.
> >
> > Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> >
> > Thank you!
>
> Great, thanks for reviewing, Masami!
>
> Would you take this fix through your tree, or where should it be routed to?
>
Ping! What would you like me to do with this patch set? Should I
resend it without patch 3 (the one that tries to guess whether we are
at the entry to the function?), or did I manage to convince you that
this heuristic is OK, given perf's stack trace capturing logic already
makes heavy assumption of rbp register being used for frame pointer?
Please let me know your preference, I could drop patch 3 and send it
separately, if that helps move the main fix forward. Thanks!
> >
> > > Reported-by: Riham Selim <rihams@meta.com>
> > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > > ---
> > > kernel/events/callchain.c | 43 ++++++++++++++++++++++++++++++++++++++-
> > > kernel/events/uprobes.c | 9 ++++++++
> > > 2 files changed, 51 insertions(+), 1 deletion(-)
> > >
>
> [...]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/4] perf,uprobes: fix user stack traces in the presence of pending uretprobes
2024-06-17 22:37 ` Andrii Nakryiko
@ 2024-06-24 20:32 ` Andrii Nakryiko
2024-06-25 0:39 ` Masami Hiramatsu
0 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2024-06-24 20:32 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Andrii Nakryiko, linux-trace-kernel, rostedt, x86, peterz, mingo,
tglx, bpf, rihams, linux-perf-users, Riham Selim
On Mon, Jun 17, 2024 at 3:37 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Jun 4, 2024 at 10:16 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Tue, Jun 4, 2024 at 7:13 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > >
> > > On Tue, 21 May 2024 18:38:43 -0700
> > > Andrii Nakryiko <andrii@kernel.org> wrote:
> > >
> > > > When kernel has pending uretprobes installed, it hijacks original user
> > > > function return address on the stack with a uretprobe trampoline
> > > > address. There could be multiple such pending uretprobes (either on
> > > > different user functions or on the same recursive one) at any given
> > > > time within the same task.
> > > >
> > > > This approach interferes with the user stack trace capture logic, which
> > > > would report surprising addresses (like 0x7fffffffe000) that correspond
> > > > to a special "[uprobes]" section that kernel installs in the target
> > > > process address space for uretprobe trampoline code, while logically it
> > > > should be an address somewhere within the calling function of another
> > > > traced user function.
> > > >
> > > > This is easy to correct for, though. Uprobes subsystem keeps track of
> > > > pending uretprobes and records original return addresses. This patch is
> > > > using this to do a post-processing step and restore each trampoline
> > > > address entry to the correct original return address. This is done only
> > > > if there are pending uretprobes for current task.
> > > >
> > > > This is a similar approach to what fprobe/kretprobe infrastructure is
> > > > doing when capturing kernel stack traces in the presence of pending
> > > > return probes.
> > > >
> > >
> > > This looks good to me because this trampoline information is only
> > > managed in uprobes. And it should be provided when unwinding user
> > > stack.
> > >
> > > Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > >
> > > Thank you!
> >
> > Great, thanks for reviewing, Masami!
> >
> > Would you take this fix through your tree, or where should it be routed to?
> >
>
> Ping! What would you like me to do with this patch set? Should I
> resend it without patch 3 (the one that tries to guess whether we are
> at the entry to the function?), or did I manage to convince you that
> this heuristic is OK, given perf's stack trace capturing logic already
> makes heavy assumption of rbp register being used for frame pointer?
>
> Please let me know your preference, I could drop patch 3 and send it
> separately, if that helps move the main fix forward. Thanks!
Masami,
Another week went by with absolutely no action or reaction from you.
Is there any way I can help improve the collaboration here?
I'm preparing more patches for uprobes and about to submit them. If
each reviewed and ready to be applied patch set has to sit idle for
multiple weeks for no good reason, we all will soon be lost just plain
forgetting the context in which the patch was prepared.
Please, prioritize handling patches that are meant to be routed
through your tree in a more timely fashion. Or propose some
alternative acceptable arrangement.
Thank you.
>
> > >
> > > > Reported-by: Riham Selim <rihams@meta.com>
> > > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > > > ---
> > > > kernel/events/callchain.c | 43 ++++++++++++++++++++++++++++++++++++++-
> > > > kernel/events/uprobes.c | 9 ++++++++
> > > > 2 files changed, 51 insertions(+), 1 deletion(-)
> > > >
> >
> > [...]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/4] uprobes: rename get_trampoline_vaddr() and make it global
2024-05-22 1:38 ` [PATCH v2 1/4] uprobes: rename get_trampoline_vaddr() and make it global Andrii Nakryiko
@ 2024-06-25 0:28 ` Masami Hiramatsu
2024-06-25 1:02 ` Masami Hiramatsu
1 sibling, 0 replies; 21+ messages in thread
From: Masami Hiramatsu @ 2024-06-25 0:28 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: linux-trace-kernel, rostedt, x86, peterz, mingo, tglx, bpf,
rihams, linux-perf-users
On Tue, 21 May 2024 18:38:42 -0700
Andrii Nakryiko <andrii@kernel.org> wrote:
> This helper is needed in another file, so make it a bit more uniquely
> named and expose it internally.
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Looks good to me.
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Thank you
> ---
> include/linux/uprobes.h | 1 +
> kernel/events/uprobes.c | 6 +++---
> 2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
> index f46e0ca0169c..0c57eec85339 100644
> --- a/include/linux/uprobes.h
> +++ b/include/linux/uprobes.h
> @@ -138,6 +138,7 @@ extern bool arch_uretprobe_is_alive(struct return_instance *ret, enum rp_check c
> extern bool arch_uprobe_ignore(struct arch_uprobe *aup, struct pt_regs *regs);
> extern void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
> void *src, unsigned long len);
> +extern unsigned long uprobe_get_trampoline_vaddr(void);
> #else /* !CONFIG_UPROBES */
> struct uprobes_state {
> };
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 8ae0eefc3a34..d60d24f0f2f4 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -1827,7 +1827,7 @@ void uprobe_copy_process(struct task_struct *t, unsigned long flags)
> *
> * Returns -1 in case the xol_area is not allocated.
> */
> -static unsigned long get_trampoline_vaddr(void)
> +unsigned long uprobe_get_trampoline_vaddr(void)
> {
> struct xol_area *area;
> unsigned long trampoline_vaddr = -1;
> @@ -1878,7 +1878,7 @@ static void prepare_uretprobe(struct uprobe *uprobe, struct pt_regs *regs)
> if (!ri)
> return;
>
> - trampoline_vaddr = get_trampoline_vaddr();
> + trampoline_vaddr = uprobe_get_trampoline_vaddr();
> orig_ret_vaddr = arch_uretprobe_hijack_return_addr(trampoline_vaddr, regs);
> if (orig_ret_vaddr == -1)
> goto fail;
> @@ -2187,7 +2187,7 @@ static void handle_swbp(struct pt_regs *regs)
> int is_swbp;
>
> bp_vaddr = uprobe_get_swbp_addr(regs);
> - if (bp_vaddr == get_trampoline_vaddr())
> + if (bp_vaddr == uprobe_get_trampoline_vaddr())
> return handle_trampoline(regs);
>
> uprobe = find_active_uprobe(bp_vaddr, &is_swbp);
> --
> 2.43.0
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/4] perf,uprobes: fix user stack traces in the presence of pending uretprobes
2024-06-24 20:32 ` Andrii Nakryiko
@ 2024-06-25 0:39 ` Masami Hiramatsu
2024-06-25 2:51 ` Andrii Nakryiko
0 siblings, 1 reply; 21+ messages in thread
From: Masami Hiramatsu @ 2024-06-25 0:39 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Andrii Nakryiko, linux-trace-kernel, rostedt, x86, peterz, mingo,
tglx, bpf, rihams, linux-perf-users, Riham Selim
On Mon, 24 Jun 2024 13:32:35 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> On Mon, Jun 17, 2024 at 3:37 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Tue, Jun 4, 2024 at 10:16 AM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Tue, Jun 4, 2024 at 7:13 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > >
> > > > On Tue, 21 May 2024 18:38:43 -0700
> > > > Andrii Nakryiko <andrii@kernel.org> wrote:
> > > >
> > > > > When kernel has pending uretprobes installed, it hijacks original user
> > > > > function return address on the stack with a uretprobe trampoline
> > > > > address. There could be multiple such pending uretprobes (either on
> > > > > different user functions or on the same recursive one) at any given
> > > > > time within the same task.
> > > > >
> > > > > This approach interferes with the user stack trace capture logic, which
> > > > > would report surprising addresses (like 0x7fffffffe000) that correspond
> > > > > to a special "[uprobes]" section that kernel installs in the target
> > > > > process address space for uretprobe trampoline code, while logically it
> > > > > should be an address somewhere within the calling function of another
> > > > > traced user function.
> > > > >
> > > > > This is easy to correct for, though. Uprobes subsystem keeps track of
> > > > > pending uretprobes and records original return addresses. This patch is
> > > > > using this to do a post-processing step and restore each trampoline
> > > > > address entry to the correct original return address. This is done only
> > > > > if there are pending uretprobes for current task.
> > > > >
> > > > > This is a similar approach to what fprobe/kretprobe infrastructure is
> > > > > doing when capturing kernel stack traces in the presence of pending
> > > > > return probes.
> > > > >
> > > >
> > > > This looks good to me because this trampoline information is only
> > > > managed in uprobes. And it should be provided when unwinding user
> > > > stack.
> > > >
> > > > Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > > >
> > > > Thank you!
> > >
> > > Great, thanks for reviewing, Masami!
> > >
> > > Would you take this fix through your tree, or where should it be routed to?
> > >
> >
> > Ping! What would you like me to do with this patch set? Should I
> > resend it without patch 3 (the one that tries to guess whether we are
> > at the entry to the function?), or did I manage to convince you that
> > this heuristic is OK, given perf's stack trace capturing logic already
> > makes heavy assumption of rbp register being used for frame pointer?
> >
> > Please let me know your preference, I could drop patch 3 and send it
> > separately, if that helps move the main fix forward. Thanks!
>
> Masami,
>
> Another week went by with absolutely no action or reaction from you.
> Is there any way I can help improve the collaboration here?
OK, if nothing else changes without [3/4], let me pick the others up on
probes/for-next directly. For [3/4] I need other x86 maintainers'
comments, and it should be handled by the PMU maintainers.
Thanks,
>
> I'm preparing more patches for uprobes and about to submit them. If
> each reviewed and ready to be applied patch set has to sit idle for
> multiple weeks for no good reason, we all will soon be lost just plain
> forgetting the context in which the patch was prepared.
>
> Please, prioritize handling patches that are meant to be routed
> through your tree in a more timely fashion. Or propose some
> alternative acceptable arrangement.
>
> Thank you.
>
> >
> > > >
> > > > > Reported-by: Riham Selim <rihams@meta.com>
> > > > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > > > > ---
> > > > > kernel/events/callchain.c | 43 ++++++++++++++++++++++++++++++++++++++-
> > > > > kernel/events/uprobes.c | 9 ++++++++
> > > > > 2 files changed, 51 insertions(+), 1 deletion(-)
> > > > >
> > >
> > > [...]
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/4] uprobes: rename get_trampoline_vaddr() and make it global
2024-05-22 1:38 ` [PATCH v2 1/4] uprobes: rename get_trampoline_vaddr() and make it global Andrii Nakryiko
2024-06-25 0:28 ` Masami Hiramatsu
@ 2024-06-25 1:02 ` Masami Hiramatsu
1 sibling, 0 replies; 21+ messages in thread
From: Masami Hiramatsu @ 2024-06-25 1:02 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: linux-trace-kernel, rostedt, x86, peterz, mingo, tglx, bpf,
rihams, linux-perf-users
On Tue, 21 May 2024 18:38:42 -0700
Andrii Nakryiko <andrii@kernel.org> wrote:
> This helper is needed in another file, so make it a bit more uniquely
> named and expose it internally.
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Sorry, I think this conflicts with
https://lore.kernel.org/all/20240611112158.40795-4-jolsa@kernel.org/
And that is already picked.
Thanks,
> ---
> include/linux/uprobes.h | 1 +
> kernel/events/uprobes.c | 6 +++---
> 2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
> index f46e0ca0169c..0c57eec85339 100644
> --- a/include/linux/uprobes.h
> +++ b/include/linux/uprobes.h
> @@ -138,6 +138,7 @@ extern bool arch_uretprobe_is_alive(struct return_instance *ret, enum rp_check c
> extern bool arch_uprobe_ignore(struct arch_uprobe *aup, struct pt_regs *regs);
> extern void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
> void *src, unsigned long len);
> +extern unsigned long uprobe_get_trampoline_vaddr(void);
> #else /* !CONFIG_UPROBES */
> struct uprobes_state {
> };
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 8ae0eefc3a34..d60d24f0f2f4 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -1827,7 +1827,7 @@ void uprobe_copy_process(struct task_struct *t, unsigned long flags)
> *
> * Returns -1 in case the xol_area is not allocated.
> */
> -static unsigned long get_trampoline_vaddr(void)
> +unsigned long uprobe_get_trampoline_vaddr(void)
> {
> struct xol_area *area;
> unsigned long trampoline_vaddr = -1;
> @@ -1878,7 +1878,7 @@ static void prepare_uretprobe(struct uprobe *uprobe, struct pt_regs *regs)
> if (!ri)
> return;
>
> - trampoline_vaddr = get_trampoline_vaddr();
> + trampoline_vaddr = uprobe_get_trampoline_vaddr();
> orig_ret_vaddr = arch_uretprobe_hijack_return_addr(trampoline_vaddr, regs);
> if (orig_ret_vaddr == -1)
> goto fail;
> @@ -2187,7 +2187,7 @@ static void handle_swbp(struct pt_regs *regs)
> int is_swbp;
>
> bp_vaddr = uprobe_get_swbp_addr(regs);
> - if (bp_vaddr == get_trampoline_vaddr())
> + if (bp_vaddr == uprobe_get_trampoline_vaddr())
> return handle_trampoline(regs);
>
> uprobe = find_active_uprobe(bp_vaddr, &is_swbp);
> --
> 2.43.0
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 4/4] selftests/bpf: add test validating uprobe/uretprobe stack traces
2024-05-22 1:38 ` [PATCH v2 4/4] selftests/bpf: add test validating uprobe/uretprobe stack traces Andrii Nakryiko
2024-06-04 9:24 ` Jiri Olsa
@ 2024-06-25 1:14 ` Masami Hiramatsu
2024-06-25 2:53 ` Andrii Nakryiko
2024-06-25 1:22 ` Masami Hiramatsu
2 siblings, 1 reply; 21+ messages in thread
From: Masami Hiramatsu @ 2024-06-25 1:14 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: linux-trace-kernel, rostedt, x86, peterz, mingo, tglx, bpf,
rihams, linux-perf-users
On Tue, 21 May 2024 18:38:45 -0700
Andrii Nakryiko <andrii@kernel.org> wrote:
> Add a set of tests to validate that stack traces captured from or in the
> presence of active uprobes and uretprobes are valid and complete.
>
> For this we use BPF programs that are installed either on entry or exit
> of a user function, plus a deep-nested USDT. One of the target functions
> (target_1) is recursive to generate two different entries in the stack
> trace for the same uprobe/uretprobe, testing potential edge conditions.
>
> Without fixes in this patch set, we get something like this for one of
> the scenarios:
>
> caller: 0x758fff - 0x7595ab
> target_1: 0x758fd5 - 0x758fff
> target_2: 0x758fca - 0x758fd5
> target_3: 0x758fbf - 0x758fca
> target_4: 0x758fb3 - 0x758fbf
> ENTRY #0: 0x758fb3 (in target_4)
> ENTRY #1: 0x758fd3 (in target_2)
> ENTRY #2: 0x758ffd (in target_1)
> ENTRY #3: 0x7fffffffe000
> ENTRY #4: 0x7fffffffe000
> ENTRY #5: 0x6f8f39
> ENTRY #6: 0x6fa6f0
> ENTRY #7: 0x7f403f229590
>
> Entry #3 and #4 (0x7fffffffe000) are uretprobe trampoline addresses
> which obscure actual target_1 and another target_1 invocations. Also
> note that between entry #0 and entry #1 we are missing an entry for
> target_3, which is fixed in patch #2.
Please avoid using `patch #2` because after commit, this means nothing.
Thank you,
>
> With all the fixes, we get desired full stack traces:
>
> caller: 0x758fff - 0x7595ab
> target_1: 0x758fd5 - 0x758fff
> target_2: 0x758fca - 0x758fd5
> target_3: 0x758fbf - 0x758fca
> target_4: 0x758fb3 - 0x758fbf
> ENTRY #0: 0x758fb7 (in target_4)
> ENTRY #1: 0x758fc8 (in target_3)
> ENTRY #2: 0x758fd3 (in target_2)
> ENTRY #3: 0x758ffd (in target_1)
> ENTRY #4: 0x758ff3 (in target_1)
> ENTRY #5: 0x75922c (in caller)
> ENTRY #6: 0x6f8f39
> ENTRY #7: 0x6fa6f0
> ENTRY #8: 0x7f986adc4cd0
>
> Now there is a logical and complete sequence of function calls.
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
> .../bpf/prog_tests/uretprobe_stack.c | 186 ++++++++++++++++++
> .../selftests/bpf/progs/uretprobe_stack.c | 96 +++++++++
> 2 files changed, 282 insertions(+)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
> create mode 100644 tools/testing/selftests/bpf/progs/uretprobe_stack.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c b/tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
> new file mode 100644
> index 000000000000..6deb8d560ddd
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
> @@ -0,0 +1,186 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
> +
> +#include <test_progs.h>
> +#include "uretprobe_stack.skel.h"
> +#include "../sdt.h"
> +
> +/* We set up target_1() -> target_2() -> target_3() -> target_4() -> USDT()
> + * call chain, each being traced by our BPF program. On entry or return from
> + * each target_*() we are capturing user stack trace and recording it in
> + * global variable, so that user space part of the test can validate it.
> + *
> + * Note, we put each target function into a custom section to get those
> + * __start_XXX/__stop_XXX symbols, generated by linker for us, which allow us
> + * to know address range of those functions
> + */
> +__attribute__((section("uprobe__target_4")))
> +__weak int target_4(void)
> +{
> + STAP_PROBE1(uretprobe_stack, target, 42);
> + return 42;
> +}
> +
> +extern const void *__start_uprobe__target_4;
> +extern const void *__stop_uprobe__target_4;
> +
> +__attribute__((section("uprobe__target_3")))
> +__weak int target_3(void)
> +{
> + return target_4();
> +}
> +
> +extern const void *__start_uprobe__target_3;
> +extern const void *__stop_uprobe__target_3;
> +
> +__attribute__((section("uprobe__target_2")))
> +__weak int target_2(void)
> +{
> + return target_3();
> +}
> +
> +extern const void *__start_uprobe__target_2;
> +extern const void *__stop_uprobe__target_2;
> +
> +__attribute__((section("uprobe__target_1")))
> +__weak int target_1(int depth)
> +{
> + if (depth < 1)
> + return 1 + target_1(depth + 1);
> + else
> + return target_2();
> +}
> +
> +extern const void *__start_uprobe__target_1;
> +extern const void *__stop_uprobe__target_1;
> +
> +extern const void *__start_uretprobe_stack_sec;
> +extern const void *__stop_uretprobe_stack_sec;
> +
> +struct range {
> + long start;
> + long stop;
> +};
> +
> +static struct range targets[] = {
> + {}, /* we want target_1 to map to target[1], so need 1-based indexing */
> + { (long)&__start_uprobe__target_1, (long)&__stop_uprobe__target_1 },
> + { (long)&__start_uprobe__target_2, (long)&__stop_uprobe__target_2 },
> + { (long)&__start_uprobe__target_3, (long)&__stop_uprobe__target_3 },
> + { (long)&__start_uprobe__target_4, (long)&__stop_uprobe__target_4 },
> +};
> +
> +static struct range caller = {
> + (long)&__start_uretprobe_stack_sec,
> + (long)&__stop_uretprobe_stack_sec,
> +};
> +
> +static void validate_stack(__u64 *ips, int stack_len, int cnt, ...)
> +{
> + int i, j;
> + va_list args;
> +
> + if (!ASSERT_GT(stack_len, 0, "stack_len"))
> + return;
> +
> + stack_len /= 8;
> +
> + /* check if we have enough entries to satisfy test expectations */
> + if (!ASSERT_GE(stack_len, cnt, "stack_len2"))
> + return;
> +
> + if (env.verbosity >= VERBOSE_NORMAL) {
> + printf("caller: %#lx - %#lx\n", caller.start, caller.stop);
> + for (i = 1; i < ARRAY_SIZE(targets); i++)
> + printf("target_%d: %#lx - %#lx\n", i, targets[i].start, targets[i].stop);
> + for (i = 0; i < stack_len; i++) {
> + for (j = 1; j < ARRAY_SIZE(targets); j++) {
> + if (ips[i] >= targets[j].start && ips[i] < targets[j].stop)
> + break;
> + }
> + if (j < ARRAY_SIZE(targets)) { /* found target match */
> + printf("ENTRY #%d: %#lx (in target_%d)\n", i, (long)ips[i], j);
> + } else if (ips[i] >= caller.start && ips[i] < caller.stop) {
> + printf("ENTRY #%d: %#lx (in caller)\n", i, (long)ips[i]);
> + } else {
> + printf("ENTRY #%d: %#lx\n", i, (long)ips[i]);
> + }
> + }
> + }
> +
> + va_start(args, cnt);
> +
> + for (i = cnt - 1; i >= 0; i--) {
> + /* most recent entry is the deepest target function */
> + const struct range *t = va_arg(args, const struct range *);
> +
> + ASSERT_GE(ips[i], t->start, "addr_start");
> + ASSERT_LT(ips[i], t->stop, "addr_stop");
> + }
> +
> + va_end(args);
> +}
> +
> +/* __weak prevents inlining */
> +__attribute__((section("uretprobe_stack_sec")))
> +__weak void test_uretprobe_stack(void)
> +{
> + LIBBPF_OPTS(bpf_uprobe_opts, uprobe_opts);
> + struct uretprobe_stack *skel;
> + int err;
> +
> + skel = uretprobe_stack__open_and_load();
> + if (!ASSERT_OK_PTR(skel, "skel_open"))
> + return;
> +
> + err = uretprobe_stack__attach(skel);
> + if (!ASSERT_OK(err, "skel_attach"))
> + goto cleanup;
> +
> + /* trigger */
> + ASSERT_EQ(target_1(0), 42 + 1, "trigger_return");
> +
> + /*
> + * Stacks captured on ENTRY uprobes
> + */
> +
> + /* (uprobe 1) target_1 in stack trace */
> + validate_stack(skel->bss->entry_stack1, skel->bss->entry1_len,
> + 2, &caller, &targets[1]);
> + /* (uprobe 1, recursed) */
> + validate_stack(skel->bss->entry_stack1_recur, skel->bss->entry1_recur_len,
> + 3, &caller, &targets[1], &targets[1]);
> + /* (uprobe 2) caller -> target_1 -> target_1 -> target_2 */
> + validate_stack(skel->bss->entry_stack2, skel->bss->entry2_len,
> + 4, &caller, &targets[1], &targets[1], &targets[2]);
> + /* (uprobe 3) */
> + validate_stack(skel->bss->entry_stack3, skel->bss->entry3_len,
> + 5, &caller, &targets[1], &targets[1], &targets[2], &targets[3]);
> + /* (uprobe 4) caller -> target_1 -> target_1 -> target_2 -> target_3 -> target_4 */
> + validate_stack(skel->bss->entry_stack4, skel->bss->entry4_len,
> + 6, &caller, &targets[1], &targets[1], &targets[2], &targets[3], &targets[4]);
> +
> + /* (USDT): full caller -> target_1 -> target_1 -> target_2 (uretprobed)
> + * -> target_3 -> target_4 (uretprobes) chain
> + */
> + validate_stack(skel->bss->usdt_stack, skel->bss->usdt_len,
> + 6, &caller, &targets[1], &targets[1], &targets[2], &targets[3], &targets[4]);
> +
> + /*
> + * Now stacks captured on the way out in EXIT uprobes
> + */
> +
> + /* (uretprobe 4) everything up to target_4, but excluding it */
> + validate_stack(skel->bss->exit_stack4, skel->bss->exit4_len,
> + 5, &caller, &targets[1], &targets[1], &targets[2], &targets[3]);
> + /* we didn't install uretprobes on target_2 and target_3 */
> + /* (uretprobe 1, recur) first target_1 call only */
> + validate_stack(skel->bss->exit_stack1_recur, skel->bss->exit1_recur_len,
> + 2, &caller, &targets[1]);
> + /* (uretprobe 1) just a caller in the stack trace */
> + validate_stack(skel->bss->exit_stack1, skel->bss->exit1_len,
> + 1, &caller);
> +
> +cleanup:
> + uretprobe_stack__destroy(skel);
> +}
> diff --git a/tools/testing/selftests/bpf/progs/uretprobe_stack.c b/tools/testing/selftests/bpf/progs/uretprobe_stack.c
> new file mode 100644
> index 000000000000..9fdcf396b8f4
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/uretprobe_stack.c
> @@ -0,0 +1,96 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
> +#include <vmlinux.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +#include <bpf/usdt.bpf.h>
> +
> +char _license[] SEC("license") = "GPL";
> +
> +__u64 entry_stack1[32], exit_stack1[32];
> +__u64 entry_stack1_recur[32], exit_stack1_recur[32];
> +__u64 entry_stack2[32];
> +__u64 entry_stack3[32];
> +__u64 entry_stack4[32], exit_stack4[32];
> +__u64 usdt_stack[32];
> +
> +int entry1_len, exit1_len;
> +int entry1_recur_len, exit1_recur_len;
> +int entry2_len, exit2_len;
> +int entry3_len, exit3_len;
> +int entry4_len, exit4_len;
> +int usdt_len;
> +
> +#define SZ sizeof(usdt_stack)
> +
> +SEC("uprobe//proc/self/exe:target_1")
> +int BPF_UPROBE(uprobe_1)
> +{
> + /* target_1 is recursive with a depth of 2, so we capture two separate
> + * stack traces, depending on which occurrence it is
> + */
> + static bool recur = false;
> +
> + if (!recur)
> + entry1_len = bpf_get_stack(ctx, &entry_stack1, SZ, BPF_F_USER_STACK);
> + else
> + entry1_recur_len = bpf_get_stack(ctx, &entry_stack1_recur, SZ, BPF_F_USER_STACK);
> +
> + recur = true;
> + return 0;
> +}
> +
> +SEC("uretprobe//proc/self/exe:target_1")
> +int BPF_URETPROBE(uretprobe_1)
> +{
> + /* see above, target_1 is recursive */
> + static bool recur = false;
> +
> + /* NOTE: order of returns is reversed to order of entries */
> + if (!recur)
> + exit1_recur_len = bpf_get_stack(ctx, &exit_stack1_recur, SZ, BPF_F_USER_STACK);
> + else
> + exit1_len = bpf_get_stack(ctx, &exit_stack1, SZ, BPF_F_USER_STACK);
> +
> + recur = true;
> + return 0;
> +}
> +
> +SEC("uprobe//proc/self/exe:target_2")
> +int BPF_UPROBE(uprobe_2)
> +{
> + entry2_len = bpf_get_stack(ctx, &entry_stack2, SZ, BPF_F_USER_STACK);
> + return 0;
> +}
> +
> +/* no uretprobe for target_2 */
> +
> +SEC("uprobe//proc/self/exe:target_3")
> +int BPF_UPROBE(uprobe_3)
> +{
> + entry3_len = bpf_get_stack(ctx, &entry_stack3, SZ, BPF_F_USER_STACK);
> + return 0;
> +}
> +
> +/* no uretprobe for target_3 */
> +
> +SEC("uprobe//proc/self/exe:target_4")
> +int BPF_UPROBE(uprobe_4)
> +{
> + entry4_len = bpf_get_stack(ctx, &entry_stack4, SZ, BPF_F_USER_STACK);
> + return 0;
> +}
> +
> +SEC("uretprobe//proc/self/exe:target_4")
> +int BPF_URETPROBE(uretprobe_4)
> +{
> + exit4_len = bpf_get_stack(ctx, &exit_stack4, SZ, BPF_F_USER_STACK);
> + return 0;
> +}
> +
> +SEC("usdt//proc/self/exe:uretprobe_stack:target")
> +int BPF_USDT(usdt_probe)
> +{
> + usdt_len = bpf_get_stack(ctx, &usdt_stack, SZ, BPF_F_USER_STACK);
> + return 0;
> +}
> --
> 2.43.0
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 4/4] selftests/bpf: add test validating uprobe/uretprobe stack traces
2024-05-22 1:38 ` [PATCH v2 4/4] selftests/bpf: add test validating uprobe/uretprobe stack traces Andrii Nakryiko
2024-06-04 9:24 ` Jiri Olsa
2024-06-25 1:14 ` Masami Hiramatsu
@ 2024-06-25 1:22 ` Masami Hiramatsu
2 siblings, 0 replies; 21+ messages in thread
From: Masami Hiramatsu @ 2024-06-25 1:22 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: linux-trace-kernel, rostedt, x86, peterz, mingo, tglx, bpf,
rihams, linux-perf-users
On Tue, 21 May 2024 18:38:45 -0700
Andrii Nakryiko <andrii@kernel.org> wrote:
> Add a set of tests to validate that stack traces captured from or in the
> presence of active uprobes and uretprobes are valid and complete.
>
> > For this we use BPF programs that are installed either on entry or exit
> > of a user function, plus a deep-nested USDT. One of the target functions
> (target_1) is recursive to generate two different entries in the stack
> trace for the same uprobe/uretprobe, testing potential edge conditions.
>
> Without fixes in this patch set, we get something like this for one of
> the scenarios:
I changed it to:
If there are no fixes, we get something like this for one of the scenarios:
>
> caller: 0x758fff - 0x7595ab
> target_1: 0x758fd5 - 0x758fff
> target_2: 0x758fca - 0x758fd5
> target_3: 0x758fbf - 0x758fca
> target_4: 0x758fb3 - 0x758fbf
> ENTRY #0: 0x758fb3 (in target_4)
> ENTRY #1: 0x758fd3 (in target_2)
> ENTRY #2: 0x758ffd (in target_1)
> ENTRY #3: 0x7fffffffe000
> ENTRY #4: 0x7fffffffe000
> ENTRY #5: 0x6f8f39
> ENTRY #6: 0x6fa6f0
> ENTRY #7: 0x7f403f229590
>
> Entry #3 and #4 (0x7fffffffe000) are uretprobe trampoline addresses
> which obscure actual target_1 and another target_1 invocations. Also
> note that between entry #0 and entry #1 we are missing an entry for
> target_3, which is fixed in patch #2.
And remove ", which is fixed in patch #2".
Is that OK?
Thank you,
>
> With all the fixes, we get desired full stack traces:
>
> caller: 0x758fff - 0x7595ab
> target_1: 0x758fd5 - 0x758fff
> target_2: 0x758fca - 0x758fd5
> target_3: 0x758fbf - 0x758fca
> target_4: 0x758fb3 - 0x758fbf
> ENTRY #0: 0x758fb7 (in target_4)
> ENTRY #1: 0x758fc8 (in target_3)
> ENTRY #2: 0x758fd3 (in target_2)
> ENTRY #3: 0x758ffd (in target_1)
> ENTRY #4: 0x758ff3 (in target_1)
> ENTRY #5: 0x75922c (in caller)
> ENTRY #6: 0x6f8f39
> ENTRY #7: 0x6fa6f0
> ENTRY #8: 0x7f986adc4cd0
>
> Now there is a logical and complete sequence of function calls.
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
> .../bpf/prog_tests/uretprobe_stack.c | 186 ++++++++++++++++++
> .../selftests/bpf/progs/uretprobe_stack.c | 96 +++++++++
> 2 files changed, 282 insertions(+)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
> create mode 100644 tools/testing/selftests/bpf/progs/uretprobe_stack.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c b/tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
> new file mode 100644
> index 000000000000..6deb8d560ddd
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
> @@ -0,0 +1,186 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
> +
> +#include <test_progs.h>
> +#include "uretprobe_stack.skel.h"
> +#include "../sdt.h"
> +
> +/* We set up target_1() -> target_2() -> target_3() -> target_4() -> USDT()
> + * call chain, each being traced by our BPF program. On entry or return from
> + * each target_*() we are capturing user stack trace and recording it in
> + * global variable, so that user space part of the test can validate it.
> + *
> + * Note, we put each target function into a custom section to get those
> + * __start_XXX/__stop_XXX symbols, generated by linker for us, which allow us
> + * to know address range of those functions
> + */
> +__attribute__((section("uprobe__target_4")))
> +__weak int target_4(void)
> +{
> + STAP_PROBE1(uretprobe_stack, target, 42);
> + return 42;
> +}
> +
> +extern const void *__start_uprobe__target_4;
> +extern const void *__stop_uprobe__target_4;
> +
> +__attribute__((section("uprobe__target_3")))
> +__weak int target_3(void)
> +{
> + return target_4();
> +}
> +
> +extern const void *__start_uprobe__target_3;
> +extern const void *__stop_uprobe__target_3;
> +
> +__attribute__((section("uprobe__target_2")))
> +__weak int target_2(void)
> +{
> + return target_3();
> +}
> +
> +extern const void *__start_uprobe__target_2;
> +extern const void *__stop_uprobe__target_2;
> +
> +__attribute__((section("uprobe__target_1")))
> +__weak int target_1(int depth)
> +{
> + if (depth < 1)
> + return 1 + target_1(depth + 1);
> + else
> + return target_2();
> +}
> +
> +extern const void *__start_uprobe__target_1;
> +extern const void *__stop_uprobe__target_1;
> +
> +extern const void *__start_uretprobe_stack_sec;
> +extern const void *__stop_uretprobe_stack_sec;
> +
> +struct range {
> + long start;
> + long stop;
> +};
> +
> +static struct range targets[] = {
> + {}, /* we want target_1 to map to target[1], so need 1-based indexing */
> + { (long)&__start_uprobe__target_1, (long)&__stop_uprobe__target_1 },
> + { (long)&__start_uprobe__target_2, (long)&__stop_uprobe__target_2 },
> + { (long)&__start_uprobe__target_3, (long)&__stop_uprobe__target_3 },
> + { (long)&__start_uprobe__target_4, (long)&__stop_uprobe__target_4 },
> +};
> +
> +static struct range caller = {
> + (long)&__start_uretprobe_stack_sec,
> + (long)&__stop_uretprobe_stack_sec,
> +};
> +
> +static void validate_stack(__u64 *ips, int stack_len, int cnt, ...)
> +{
> + int i, j;
> + va_list args;
> +
> + if (!ASSERT_GT(stack_len, 0, "stack_len"))
> + return;
> +
> + stack_len /= 8;
> +
> + /* check if we have enough entries to satisfy test expectations */
> + if (!ASSERT_GE(stack_len, cnt, "stack_len2"))
> + return;
> +
> + if (env.verbosity >= VERBOSE_NORMAL) {
> + printf("caller: %#lx - %#lx\n", caller.start, caller.stop);
> + for (i = 1; i < ARRAY_SIZE(targets); i++)
> + printf("target_%d: %#lx - %#lx\n", i, targets[i].start, targets[i].stop);
> + for (i = 0; i < stack_len; i++) {
> + for (j = 1; j < ARRAY_SIZE(targets); j++) {
> + if (ips[i] >= targets[j].start && ips[i] < targets[j].stop)
> + break;
> + }
> + if (j < ARRAY_SIZE(targets)) { /* found target match */
> + printf("ENTRY #%d: %#lx (in target_%d)\n", i, (long)ips[i], j);
> + } else if (ips[i] >= caller.start && ips[i] < caller.stop) {
> + printf("ENTRY #%d: %#lx (in caller)\n", i, (long)ips[i]);
> + } else {
> + printf("ENTRY #%d: %#lx\n", i, (long)ips[i]);
> + }
> + }
> + }
> +
> + va_start(args, cnt);
> +
> + for (i = cnt - 1; i >= 0; i--) {
> + /* most recent entry is the deepest target function */
> + const struct range *t = va_arg(args, const struct range *);
> +
> + ASSERT_GE(ips[i], t->start, "addr_start");
> + ASSERT_LT(ips[i], t->stop, "addr_stop");
> + }
> +
> + va_end(args);
> +}
> +
> +/* __weak prevents inlining */
> +__attribute__((section("uretprobe_stack_sec")))
> +__weak void test_uretprobe_stack(void)
> +{
> + LIBBPF_OPTS(bpf_uprobe_opts, uprobe_opts);
> + struct uretprobe_stack *skel;
> + int err;
> +
> + skel = uretprobe_stack__open_and_load();
> + if (!ASSERT_OK_PTR(skel, "skel_open"))
> + return;
> +
> + err = uretprobe_stack__attach(skel);
> + if (!ASSERT_OK(err, "skel_attach"))
> + goto cleanup;
> +
> + /* trigger */
> + ASSERT_EQ(target_1(0), 42 + 1, "trigger_return");
> +
> + /*
> + * Stacks captured on ENTRY uprobes
> + */
> +
> + /* (uprobe 1) target_1 in stack trace */
> + validate_stack(skel->bss->entry_stack1, skel->bss->entry1_len,
> + 2, &caller, &targets[1]);
> + /* (uprobe 1, recursed) */
> + validate_stack(skel->bss->entry_stack1_recur, skel->bss->entry1_recur_len,
> + 3, &caller, &targets[1], &targets[1]);
> + /* (uprobe 2) caller -> target_1 -> target_1 -> target_2 */
> + validate_stack(skel->bss->entry_stack2, skel->bss->entry2_len,
> + 4, &caller, &targets[1], &targets[1], &targets[2]);
> + /* (uprobe 3) */
> + validate_stack(skel->bss->entry_stack3, skel->bss->entry3_len,
> + 5, &caller, &targets[1], &targets[1], &targets[2], &targets[3]);
> + /* (uprobe 4) caller -> target_1 -> target_1 -> target_2 -> target_3 -> target_4 */
> + validate_stack(skel->bss->entry_stack4, skel->bss->entry4_len,
> + 6, &caller, &targets[1], &targets[1], &targets[2], &targets[3], &targets[4]);
> +
> + /* (USDT): full caller -> target_1 -> target_1 -> target_2 (uretprobed)
> + * -> target_3 -> target_4 (uretprobes) chain
> + */
> + validate_stack(skel->bss->usdt_stack, skel->bss->usdt_len,
> + 6, &caller, &targets[1], &targets[1], &targets[2], &targets[3], &targets[4]);
> +
> + /*
> + * Now stacks captured on the way out in EXIT uprobes
> + */
> +
> + /* (uretprobe 4) everything up to target_4, but excluding it */
> + validate_stack(skel->bss->exit_stack4, skel->bss->exit4_len,
> + 5, &caller, &targets[1], &targets[1], &targets[2], &targets[3]);
> + /* we didn't install uretprobes on target_2 and target_3 */
> + /* (uretprobe 1, recur) first target_1 call only */
> + validate_stack(skel->bss->exit_stack1_recur, skel->bss->exit1_recur_len,
> + 2, &caller, &targets[1]);
> + /* (uretprobe 1) just a caller in the stack trace */
> + validate_stack(skel->bss->exit_stack1, skel->bss->exit1_len,
> + 1, &caller);
> +
> +cleanup:
> + uretprobe_stack__destroy(skel);
> +}
> diff --git a/tools/testing/selftests/bpf/progs/uretprobe_stack.c b/tools/testing/selftests/bpf/progs/uretprobe_stack.c
> new file mode 100644
> index 000000000000..9fdcf396b8f4
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/uretprobe_stack.c
> @@ -0,0 +1,96 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
> +#include <vmlinux.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +#include <bpf/usdt.bpf.h>
> +
> +char _license[] SEC("license") = "GPL";
> +
> +__u64 entry_stack1[32], exit_stack1[32];
> +__u64 entry_stack1_recur[32], exit_stack1_recur[32];
> +__u64 entry_stack2[32];
> +__u64 entry_stack3[32];
> +__u64 entry_stack4[32], exit_stack4[32];
> +__u64 usdt_stack[32];
> +
> +int entry1_len, exit1_len;
> +int entry1_recur_len, exit1_recur_len;
> +int entry2_len, exit2_len;
> +int entry3_len, exit3_len;
> +int entry4_len, exit4_len;
> +int usdt_len;
> +
> +#define SZ sizeof(usdt_stack)
> +
> +SEC("uprobe//proc/self/exe:target_1")
> +int BPF_UPROBE(uprobe_1)
> +{
> + /* target_1 is recursive with a depth of 2, so we capture two separate
> + * stack traces, depending on which occurrence it is
> + */
> + static bool recur = false;
> +
> + if (!recur)
> + entry1_len = bpf_get_stack(ctx, &entry_stack1, SZ, BPF_F_USER_STACK);
> + else
> + entry1_recur_len = bpf_get_stack(ctx, &entry_stack1_recur, SZ, BPF_F_USER_STACK);
> +
> + recur = true;
> + return 0;
> +}
> +
> +SEC("uretprobe//proc/self/exe:target_1")
> +int BPF_URETPROBE(uretprobe_1)
> +{
> + /* see above, target_1 is recursive */
> + static bool recur = false;
> +
> + /* NOTE: order of returns is reversed to order of entries */
> + if (!recur)
> + exit1_recur_len = bpf_get_stack(ctx, &exit_stack1_recur, SZ, BPF_F_USER_STACK);
> + else
> + exit1_len = bpf_get_stack(ctx, &exit_stack1, SZ, BPF_F_USER_STACK);
> +
> + recur = true;
> + return 0;
> +}
> +
> +SEC("uprobe//proc/self/exe:target_2")
> +int BPF_UPROBE(uprobe_2)
> +{
> + entry2_len = bpf_get_stack(ctx, &entry_stack2, SZ, BPF_F_USER_STACK);
> + return 0;
> +}
> +
> +/* no uretprobe for target_2 */
> +
> +SEC("uprobe//proc/self/exe:target_3")
> +int BPF_UPROBE(uprobe_3)
> +{
> + entry3_len = bpf_get_stack(ctx, &entry_stack3, SZ, BPF_F_USER_STACK);
> + return 0;
> +}
> +
> +/* no uretprobe for target_3 */
> +
> +SEC("uprobe//proc/self/exe:target_4")
> +int BPF_UPROBE(uprobe_4)
> +{
> + entry4_len = bpf_get_stack(ctx, &entry_stack4, SZ, BPF_F_USER_STACK);
> + return 0;
> +}
> +
> +SEC("uretprobe//proc/self/exe:target_4")
> +int BPF_URETPROBE(uretprobe_4)
> +{
> + exit4_len = bpf_get_stack(ctx, &exit_stack4, SZ, BPF_F_USER_STACK);
> + return 0;
> +}
> +
> +SEC("usdt//proc/self/exe:uretprobe_stack:target")
> +int BPF_USDT(usdt_probe)
> +{
> + usdt_len = bpf_get_stack(ctx, &usdt_stack, SZ, BPF_F_USER_STACK);
> + return 0;
> +}
> --
> 2.43.0
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/4] perf,uprobes: fix user stack traces in the presence of pending uretprobes
2024-06-25 0:39 ` Masami Hiramatsu
@ 2024-06-25 2:51 ` Andrii Nakryiko
0 siblings, 0 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2024-06-25 2:51 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Andrii Nakryiko, linux-trace-kernel, rostedt, x86, peterz, mingo,
tglx, bpf, rihams, linux-perf-users, Riham Selim
On Mon, Jun 24, 2024 at 5:39 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Mon, 24 Jun 2024 13:32:35 -0700
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > On Mon, Jun 17, 2024 at 3:37 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Tue, Jun 4, 2024 at 10:16 AM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > On Tue, Jun 4, 2024 at 7:13 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > > >
> > > > > On Tue, 21 May 2024 18:38:43 -0700
> > > > > Andrii Nakryiko <andrii@kernel.org> wrote:
> > > > >
> > > > > > When kernel has pending uretprobes installed, it hijacks original user
> > > > > > function return address on the stack with a uretprobe trampoline
> > > > > > address. There could be multiple such pending uretprobes (either on
> > > > > > different user functions or on the same recursive one) at any given
> > > > > > time within the same task.
> > > > > >
> > > > > > This approach interferes with the user stack trace capture logic, which
> > > > > > would report surprising addresses (like 0x7fffffffe000) that correspond
> > > > > > to a special "[uprobes]" section that kernel installs in the target
> > > > > > process address space for uretprobe trampoline code, while logically it
> > > > > > should be an address somewhere within the calling function of another
> > > > > > traced user function.
> > > > > >
> > > > > > This is easy to correct for, though. Uprobes subsystem keeps track of
> > > > > > pending uretprobes and records original return addresses. This patch is
> > > > > > using this to do a post-processing step and restore each trampoline
> > > > > > address entry to the correct original return address. This is done only
> > > > > > if there are pending uretprobes for current task.
> > > > > >
> > > > > > This is a similar approach to what fprobe/kretprobe infrastructure is
> > > > > > doing when capturing kernel stack traces in the presence of pending
> > > > > > return probes.
> > > > > >
> > > > >
> > > > > This looks good to me because this trampoline information is only
> > > > > managed in uprobes. And it should be provided when unwinding user
> > > > > stack.
> > > > >
> > > > > Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > > > >
> > > > > Thank you!
> > > >
> > > > Great, thanks for reviewing, Masami!
> > > >
> > > > Would you take this fix through your tree, or where should it be routed to?
> > > >
> > >
> > > Ping! What would you like me to do with this patch set? Should I
> > > resend it without patch 3 (the one that tries to guess whether we are
> > > at the entry to the function?), or did I manage to convince you that
> > > this heuristic is OK, given perf's stack trace capturing logic already
> > > makes heavy assumption of rbp register being used for frame pointer?
> > >
> > > Please let me know your preference, I could drop patch 3 and send it
> > > separately, if that helps move the main fix forward. Thanks!
> >
> > Masami,
> >
> > Another week went by with absolutely no action or reaction from you.
> > Is there any way I can help improve the collaboration here?
>
> OK, if nothing else changes without [3/4], let me pick the others up on
Thanks, Masami!
Selftest is probably failing (as it expects correct stack trace), but
that's ok, we can fix it up once linux-trace-kernel and bpf-next trees
converge.
> probes/for-next directly. For [3/4] I need other x86 maintainers'
> comments, and it should be handled by the PMU maintainers.
Sounds good, I'll repost it separately. Do I need to CC anyone else
besides people on this thread already?
>
> Thanks,
>
>
> >
> > I'm preparing more patches for uprobes and about to submit them. If
> > each reviewed and ready to be applied patch set has to sit idle for
> > multiple weeks for no good reason, we all will soon be lost just plain
> > forgetting the context in which the patch was prepared.
> >
> > Please, prioritize handling patches that are meant to be routed
> > through your tree in a more timely fashion. Or propose some
> > alternative acceptable arrangement.
> >
> > Thank you.
> >
> > >
> > > > >
> > > > > > Reported-by: Riham Selim <rihams@meta.com>
> > > > > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > > > > > ---
> > > > > > kernel/events/callchain.c | 43 ++++++++++++++++++++++++++++++++++++++-
> > > > > > kernel/events/uprobes.c | 9 ++++++++
> > > > > > 2 files changed, 51 insertions(+), 1 deletion(-)
> > > > > >
> > > >
> > > > [...]
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 4/4] selftests/bpf: add test validating uprobe/uretprobe stack traces
2024-06-25 1:14 ` Masami Hiramatsu
@ 2024-06-25 2:53 ` Andrii Nakryiko
0 siblings, 0 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2024-06-25 2:53 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Andrii Nakryiko, linux-trace-kernel, rostedt, x86, peterz, mingo,
tglx, bpf, rihams, linux-perf-users
On Mon, Jun 24, 2024 at 6:14 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Tue, 21 May 2024 18:38:45 -0700
> Andrii Nakryiko <andrii@kernel.org> wrote:
>
> > Add a set of tests to validate that stack traces captured from or in the
> > presence of active uprobes and uretprobes are valid and complete.
> >
> > For this we use BPF programs that are installed either on entry or exit
> > of a user function, plus a deep-nested USDT. One of the target functions
> > (target_1) is recursive to generate two different entries in the stack
> > trace for the same uprobe/uretprobe, testing potential edge conditions.
> >
> > Without fixes in this patch set, we get something like this for one of
> > the scenarios:
> >
> > caller: 0x758fff - 0x7595ab
> > target_1: 0x758fd5 - 0x758fff
> > target_2: 0x758fca - 0x758fd5
> > target_3: 0x758fbf - 0x758fca
> > target_4: 0x758fb3 - 0x758fbf
> > ENTRY #0: 0x758fb3 (in target_4)
> > ENTRY #1: 0x758fd3 (in target_2)
> > ENTRY #2: 0x758ffd (in target_1)
> > ENTRY #3: 0x7fffffffe000
> > ENTRY #4: 0x7fffffffe000
> > ENTRY #5: 0x6f8f39
> > ENTRY #6: 0x6fa6f0
> > ENTRY #7: 0x7f403f229590
> >
> > Entry #3 and #4 (0x7fffffffe000) are uretprobe trampoline addresses
> > which obscure actual target_1 and another target_1 invocations. Also
> > note that between entry #0 and entry #1 we are missing an entry for
> > target_3, which is fixed in patch #2.
>
> Please avoid using `patch #2` because after commit, this means nothing.
Yep, makes sense, sorry about that, will keep descriptions a bit more
general going forward.
>
> Thank you,
>
> >
> > With all the fixes, we get desired full stack traces:
> >
> > caller: 0x758fff - 0x7595ab
> > target_1: 0x758fd5 - 0x758fff
> > target_2: 0x758fca - 0x758fd5
> > target_3: 0x758fbf - 0x758fca
> > target_4: 0x758fb3 - 0x758fbf
> > ENTRY #0: 0x758fb7 (in target_4)
> > ENTRY #1: 0x758fc8 (in target_3)
> > ENTRY #2: 0x758fd3 (in target_2)
> > ENTRY #3: 0x758ffd (in target_1)
> > ENTRY #4: 0x758ff3 (in target_1)
> > ENTRY #5: 0x75922c (in caller)
> > ENTRY #6: 0x6f8f39
> > ENTRY #7: 0x6fa6f0
> > ENTRY #8: 0x7f986adc4cd0
> >
> > Now there is a logical and complete sequence of function calls.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > ---
> > .../bpf/prog_tests/uretprobe_stack.c | 186 ++++++++++++++++++
> > .../selftests/bpf/progs/uretprobe_stack.c | 96 +++++++++
> > 2 files changed, 282 insertions(+)
> > create mode 100644 tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
> > create mode 100644 tools/testing/selftests/bpf/progs/uretprobe_stack.c
> >
[...]
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread
Thread overview: 21+ messages
2024-05-22 1:38 [PATCH v2 0/4] Fix user stack traces captured from uprobes Andrii Nakryiko
2024-05-22 1:38 ` [PATCH v2 1/4] uprobes: rename get_trampoline_vaddr() and make it global Andrii Nakryiko
2024-06-25 0:28 ` Masami Hiramatsu
2024-06-25 1:02 ` Masami Hiramatsu
2024-05-22 1:38 ` [PATCH v2 2/4] perf,uprobes: fix user stack traces in the presence of pending uretprobes Andrii Nakryiko
2024-06-04 14:13 ` Masami Hiramatsu
2024-06-04 17:16 ` Andrii Nakryiko
2024-06-17 22:37 ` Andrii Nakryiko
2024-06-24 20:32 ` Andrii Nakryiko
2024-06-25 0:39 ` Masami Hiramatsu
2024-06-25 2:51 ` Andrii Nakryiko
2024-05-22 1:38 ` [PATCH v2 3/4] perf,x86: avoid missing caller address in stack traces captured in uprobe Andrii Nakryiko
2024-06-04 9:24 ` Jiri Olsa
2024-06-04 14:06 ` Masami Hiramatsu
2024-06-04 17:13 ` Andrii Nakryiko
2024-05-22 1:38 ` [PATCH v2 4/4] selftests/bpf: add test validating uprobe/uretprobe stack traces Andrii Nakryiko
2024-06-04 9:24 ` Jiri Olsa
2024-06-25 1:14 ` Masami Hiramatsu
2024-06-25 2:53 ` Andrii Nakryiko
2024-06-25 1:22 ` Masami Hiramatsu
2024-06-03 21:09 ` [PATCH v2 0/4] Fix user stack traces captured from uprobes Andrii Nakryiko