linux-perf-users.vger.kernel.org archive mirror
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Maksymilian Graczyk <maksymilian.graczyk@cern.ch>
Cc: Hao Luo <haoluo@google.com>, Namhyung Kim <namhyung@kernel.org>,
	Jiri Olsa <jolsa@kernel.org>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	linux-perf-users@vger.kernel.org,
	syclops-project <syclops-project@cern.ch>,
	Guilherme Amadio <guilherme.amadio@cern.ch>,
	Stephan Hageboeck <stephan.hageboeck@cern.ch>
Subject: long BPF stack traces Re: Broken stack traces with --call-graph=fp and a multi-threaded app due to page faults?
Date: Fri, 10 Nov 2023 14:40:44 -0300
Message-ID: <ZU5rHB4DCiBqlKtC@kernel.org>
In-Reply-To: <ZU5TZiXMUZ4VLOO+@kernel.org>

On Fri, Nov 10, 2023 at 12:59:34PM -0300, Arnaldo Carvalho de Melo wrote:
> On Wed, Nov 08, 2023 at 11:46:03AM +0100, Maksymilian Graczyk wrote:
> > Alongside sampling-based profiling, I run syscall profiling with a separate
> > "perf record" instance attached to the same PID.

> > When I debug the kernel using kgdb, I see more-or-less the following
> > behaviour happening in the stack traversal loop in perf_callchain_user() in
> > arch/x86/events/core.c for the same thread being profiled:

> > 1. The first sample goes fine, the entire stack is traversed.
> > 2. The second sample breaks at some point inside my program, with a page
> > fault due to page not present.
> > 3. The third sample breaks at another *earlier* point inside my program,
> > with a page fault due to page not present.
> > 4. The fourth sample breaks at another *later* point inside my program, with
> > a page fault due to page not present.
 
> Namhyung, Jiri: ideas? I have to stop this analysis now, will continue later.

Would https://lore.kernel.org/all/20220225234339.2386398-7-haoluo@google.com/
come into play here, i.e. would we need to use a sleepable tp_btf (see
below about __get_user(frame.next_frame...) page faulting) in
tools/perf/util/bpf_skel/off_cpu.bpf.c, here:

[acme@quaco perf-tools-next]$ grep tp_btf tools/perf/util/bpf_skel/off_cpu.bpf.c
SEC("tp_btf/task_newtask")
SEC("tp_btf/sched_switch")
[acme@quaco perf-tools-next]$

But Hao's patch (Hao is CCed here) doesn't seem to have made its way into
tools/lib/bpf/. Hao, why hasn't this made it into libbpf?

+	SEC_DEF("tp_btf.s/",            TRACING, BPF_TRACE_RAW_TP, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
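
[Editor's note: for reference, with that SEC_DEF in libbpf, a sleepable variant could look roughly like the sketch below. This is a hypothetical fragment, not the actual off_cpu.bpf.c change: it only illustrates that a sleepable program may use bpf_copy_from_user(), which can fault the page in, where a non-sleepable one is limited to bpf_probe_read_user(), which returns -EFAULT on a not-present page. Whether these tracepoints may legally sleep at all is precisely the open question in this thread.]

```c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char LICENSE[] SEC("license") = "GPL";

/* Hypothetical: the "tp_btf.s/" prefix follows Hao's proposed sleepable
 * SEC_DEF, which is not in libbpf as of this writing. */
SEC("tp_btf.s/task_newtask")
int BPF_PROG(on_newtask, struct task_struct *task, u64 clone_flags)
{
	struct stack_frame {
		u64 next_frame;
		u64 return_address;
	} frame = {};
	void *user_fp = NULL;  /* placeholder: the sampled task's frame pointer */

	/* Sleepable-only helper: may block to fault the stack page in,
	 * unlike bpf_probe_read_user(), which fails if it is not present. */
	if (bpf_copy_from_user(&frame, sizeof(frame), user_fp))
		return 0;  /* copy failed even after attempting the fault */
	return 0;
}
```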

- Arnaldo
  
> > The stack frame addresses do not change throughout profiling and all page
> > faults happen at __get_user(frame.next_frame, &fp->next_frame). The
> > behaviour above also occurs occasionally in a single-threaded variant of the
> > code (without pthread at all) with a very high sampling frequency (tens of
> > thousands of Hz).

> > This issue makes profiling results unreliable for my use case, as I usually
> > profile multi-threaded applications with deep stacks with hundreds of
entries (which is why my test program also produces a deep stack) and use flame
> > graphs for later analysis.

> > Could you help me diagnose the problem? For example, what may be the cause
> > of my page faults? I also ran tests (without debugging, though) without
> > syscall profiling and the "--off-cpu" flag; broken stacks still appeared.

> > (I cannot use DWARF because it makes profiling too slow and perf.data size
> > too large in my tests. I also want to avoid using
> > non-portable/vendor-specific stack unwinding solutions like LBR, as we may
> > need to run profiling on non-Intel CPUs.)


Thread overview: 7+ messages
2023-11-08 10:46 Broken stack traces with --call-graph=fp and a multi-threaded app due to page faults? Maksymilian Graczyk
2023-11-10 10:45 ` Maksymilian Graczyk
2023-11-10 10:51   ` Maksymilian Graczyk
2023-11-10 15:59 ` Arnaldo Carvalho de Melo
2023-11-10 17:40   ` Arnaldo Carvalho de Melo [this message]
2023-11-10 23:01     ` long BPF stack traces " Namhyung Kim
2023-11-11 13:37       ` Maksymilian Graczyk
