From: Andrii Nakryiko <andrii@kernel.org>
To: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
	martin.lau@kernel.org
Cc: peterz@infradead.org, song@kernel.org,
	Andrii Nakryiko <andrii@kernel.org>
Subject: [PATCH bpf-next 0/3] Inline two LBR-related helpers
Date: Thu, 21 Mar 2024 11:04:58 -0700
Message-ID: <20240321180501.734779-1-andrii@kernel.org>

Implement inlining of the bpf_get_branch_snapshot() BPF helper using a generic
BPF assembly approach.

Also inline the bpf_get_smp_processor_id() BPF helper, but using
architecture-specific assembly code in the x86-64 JIT compiler, given that
getting the CPU ID is highly architecture-specific.

These two helpers are on the critical direct path to grabbing LBR records from
a BPF program, and inlining them helps save 3 LBR records in
PERF_SAMPLE_BRANCH_ANY mode.

To give a visual idea of the effect of these changes (and of inlining
kprobe_multi_link_prog_run(), posted as a separate patch), below is retsnoop's
LBR output (with the --lbr=any flag). I only show the "wasted" records that are
needed to get from the moment some event happened (a kernel function return in
this case) to the triggered BPF program capturing LBR as *the very first thing*
(after getting the CPU ID to get a temporary buffer).

There are still ways to reduce the number of "wasted" records further; this is
a problem that requires many small and rather independent steps.

fentry mode
===========

BEFORE
------
  [#10] __sys_bpf+0x270                          ->  __x64_sys_bpf+0x18
  [#09] __x64_sys_bpf+0x1a                       ->  bpf_trampoline_6442508684+0x7f
  [#08] bpf_trampoline_6442508684+0x9c           ->  __bpf_prog_enter_recur+0x0
  [#07] __bpf_prog_enter_recur+0x9               ->  migrate_disable+0x0
  [#06] migrate_disable+0x37                     ->  __bpf_prog_enter_recur+0xe
  [#05] __bpf_prog_enter_recur+0x43              ->  bpf_trampoline_6442508684+0xa1
  [#04] bpf_trampoline_6442508684+0xad           ->  bpf_prog_dc54a596b39d4177_fexit1+0x0
  [#03] bpf_prog_dc54a596b39d4177_fexit1+0x32    ->  bpf_get_smp_processor_id+0x0
  [#02] bpf_get_smp_processor_id+0xe             ->  bpf_prog_dc54a596b39d4177_fexit1+0x37
  [#01] bpf_prog_dc54a596b39d4177_fexit1+0xe0    ->  bpf_get_branch_snapshot+0x0
  [#00] bpf_get_branch_snapshot+0x13             ->  intel_pmu_snapshot_branch_stack+0x0

AFTER
-----
  [#07] __sys_bpf+0xdfc                          ->  __x64_sys_bpf+0x18
  [#06] __x64_sys_bpf+0x1a                       ->  bpf_trampoline_6442508829+0x7f
  [#05] bpf_trampoline_6442508829+0x9c           ->  __bpf_prog_enter_recur+0x0
  [#04] __bpf_prog_enter_recur+0x9               ->  migrate_disable+0x0
  [#03] migrate_disable+0x37                     ->  __bpf_prog_enter_recur+0xe
  [#02] __bpf_prog_enter_recur+0x43              ->  bpf_trampoline_6442508829+0xa1
  [#01] bpf_trampoline_6442508829+0xad           ->  bpf_prog_dc54a596b39d4177_fexit1+0x0
  [#00] bpf_prog_dc54a596b39d4177_fexit1+0x101   ->  intel_pmu_snapshot_branch_stack+0x0

multi-kprobe mode
=================

BEFORE
------
  [#14] __sys_bpf+0x270                          ->  arch_rethook_trampoline+0x0
  [#13] arch_rethook_trampoline+0x27             ->  arch_rethook_trampoline_callback+0x0
  [#12] arch_rethook_trampoline_callback+0x31    ->  rethook_trampoline_handler+0x0
  [#11] rethook_trampoline_handler+0x6f          ->  fprobe_exit_handler+0x0
  [#10] fprobe_exit_handler+0x3d                 ->  rcu_is_watching+0x0
  [#09] rcu_is_watching+0x17                     ->  fprobe_exit_handler+0x42
  [#08] fprobe_exit_handler+0xb4                 ->  kprobe_multi_link_exit_handler+0x0
  [#07] kprobe_multi_link_exit_handler+0x4       ->  kprobe_multi_link_prog_run+0x0
  [#06] kprobe_multi_link_prog_run+0x2d          ->  migrate_disable+0x0
  [#05] migrate_disable+0x37                     ->  kprobe_multi_link_prog_run+0x32
  [#04] kprobe_multi_link_prog_run+0x58          ->  bpf_prog_2b455b4f8a8d48c5_kexit+0x0
  [#03] bpf_prog_2b455b4f8a8d48c5_kexit+0x32     ->  bpf_get_smp_processor_id+0x0
  [#02] bpf_get_smp_processor_id+0xe             ->  bpf_prog_2b455b4f8a8d48c5_kexit+0x37
  [#01] bpf_prog_2b455b4f8a8d48c5_kexit+0x82     ->  bpf_get_branch_snapshot+0x0
  [#00] bpf_get_branch_snapshot+0x13             ->  intel_pmu_snapshot_branch_stack+0x0

AFTER
-----
  [#10] __sys_bpf+0xdfc                          ->  arch_rethook_trampoline+0x0
  [#09] arch_rethook_trampoline+0x27             ->  arch_rethook_trampoline_callback+0x0
  [#08] arch_rethook_trampoline_callback+0x31    ->  rethook_trampoline_handler+0x0
  [#07] rethook_trampoline_handler+0x6f          ->  fprobe_exit_handler+0x0
  [#06] fprobe_exit_handler+0x3d                 ->  rcu_is_watching+0x0
  [#05] rcu_is_watching+0x17                     ->  fprobe_exit_handler+0x42
  [#04] fprobe_exit_handler+0xb4                 ->  kprobe_multi_link_exit_handler+0x0
  [#03] kprobe_multi_link_exit_handler+0x31      ->  migrate_disable+0x0
  [#02] migrate_disable+0x37                     ->  kprobe_multi_link_exit_handler+0x36
  [#01] kprobe_multi_link_exit_handler+0x5c      ->  bpf_prog_2b455b4f8a8d48c5_kexit+0x0
  [#00] bpf_prog_2b455b4f8a8d48c5_kexit+0xa3     ->  intel_pmu_snapshot_branch_stack+0x0
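
As a quick cross-check of the listings above, the savings can be tallied from
the record counts (highest index plus one). This is only an illustrative
sketch: the struct and function names are made up for this note and are not
part of the patches. In fentry mode the three saved records are the call to and
return from bpf_get_smp_processor_id() plus the call to
bpf_get_branch_snapshot(). Multi-kprobe mode saves one more, the call to
kprobe_multi_link_prog_run(), which comes from the separately posted patch.

```c
#include <assert.h>

/* Number of "wasted" records in each --lbr=any listing above,
 * i.e. the highest [#NN] index plus one. */
struct lbr_counts {
	int before;
	int after;
};

const struct lbr_counts fentry_any = { 11, 8 };
const struct lbr_counts kprobe_multi_any = { 15, 11 };

/* Records no longer wasted once the inlining changes are applied. */
int lbr_records_saved(struct lbr_counts c)
{
	assert(c.before >= c.after);
	return c.before - c.after;
}
```

Calling lbr_records_saved(fentry_any) yields 3, matching the number quoted at
the top of this letter, and lbr_records_saved(kprobe_multi_any) yields 4.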


For default --lbr mode (PERF_SAMPLE_BRANCH_ANY_RETURN), interestingly enough,
multi-kprobe is *less* wasteful (by one function call):

fentry mode
===========

BEFORE
------
  [#04] __sys_bpf+0x270                          ->  __x64_sys_bpf+0x18
  [#03] __x64_sys_bpf+0x1a                       ->  bpf_trampoline_6442508684+0x7f
  [#02] migrate_disable+0x37                     ->  __bpf_prog_enter_recur+0xe
  [#01] __bpf_prog_enter_recur+0x43              ->  bpf_trampoline_6442508684+0xa1
  [#00] bpf_get_smp_processor_id+0xe             ->  bpf_prog_dc54a596b39d4177_fexit1+0x37

AFTER
-----
  [#03] __sys_bpf+0xdfc                          ->  __x64_sys_bpf+0x18
  [#02] __x64_sys_bpf+0x1a                       ->  bpf_trampoline_6442508829+0x7f
  [#01] migrate_disable+0x37                     ->  __bpf_prog_enter_recur+0xe
  [#00] __bpf_prog_enter_recur+0x43              ->  bpf_trampoline_6442508829+0xa1

multi-kprobe mode
=================

BEFORE
------
  [#03] __sys_bpf+0x270                          ->  arch_rethook_trampoline+0x0
  [#02] rcu_is_watching+0x17                     ->  fprobe_exit_handler+0x42
  [#01] migrate_disable+0x37                     ->  kprobe_multi_link_prog_run+0x32
  [#00] bpf_get_smp_processor_id+0xe             ->  bpf_prog_2b455b4f8a8d48c5_kexit+0x37

AFTER
-----
  [#02] __sys_bpf+0xdfc                          ->  arch_rethook_trampoline+0x0
  [#01] rcu_is_watching+0x17                     ->  fprobe_exit_handler+0x42
  [#00] migrate_disable+0x37                     ->  kprobe_multi_link_exit_handler+0x36
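
One way to relate the two sets of listings: in the --lbr=any output, calls land
at a function's +0x0 offset, while returns land mid-function. The sketch below
applies that observation as a rough heuristic to the branch-target offsets
copied from the listings above. The heuristic and all names in it are
assumptions made for illustration; this is not how the hardware or perf filters
branches. It reproduces the fentry ANY_RETURN counts (5 before, 4 after), but
undercounts multi-kprobe by one, because there the traced function returns to
arch_rethook_trampoline+0x0.

```c
#include <assert.h>
#include <stddef.h>

/* Branch-target offsets (the "+0x.." after "->") from the --lbr=any
 * listings above, in the order printed. */
const unsigned long fentry_before[] = {
	0x18, 0x7f, 0x0, 0x0, 0xe, 0xa1, 0x0, 0x0, 0x37, 0x0, 0x0,
};
const unsigned long fentry_after[] = {
	0x18, 0x7f, 0x0, 0x0, 0xe, 0xa1, 0x0, 0x0,
};
const unsigned long kprobe_multi_before[] = {
	0x0, 0x0, 0x0, 0x0, 0x0, 0x42, 0x0, 0x0,
	0x0, 0x32, 0x0, 0x0, 0x37, 0x0, 0x0,
};

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical heuristic: treat a non-zero target offset as a return.
 * A return into a function's first instruction (as with the rethook
 * trampoline) is misclassified as a call. */
size_t count_probable_returns(const unsigned long *to_off, size_t n)
{
	size_t i, cnt = 0;

	assert(to_off != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (to_off[i] != 0)
			cnt++;
	return cnt;
}
```

For kprobe_multi_before the function returns 3, whereas the ANY_RETURN listing
above has 4 records; the difference is the __sys_bpf+0x270 ->
arch_rethook_trampoline+0x0 record.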

Andrii Nakryiko (3):
  bpf: make bpf_get_branch_snapshot() architecture-agnostic
  bpf: inline bpf_get_branch_snapshot() helper
  bpf,x86: inline bpf_get_smp_processor_id() on x86-64

 arch/x86/net/bpf_jit_comp.c | 26 +++++++++++++++++++++++++-
 kernel/bpf/verifier.c       | 37 +++++++++++++++++++++++++++++++++++++
 kernel/trace/bpf_trace.c    |  4 ----
 3 files changed, 62 insertions(+), 5 deletions(-)

-- 
2.43.0



Thread overview: 23+ messages
2024-03-21 18:04 Andrii Nakryiko [this message]
2024-03-21 18:04 ` [PATCH bpf-next 1/3] bpf: make bpf_get_branch_snapshot() architecture-agnostic Andrii Nakryiko
2024-03-21 21:08   ` Jiri Olsa
2024-03-21 18:05 ` [PATCH bpf-next 2/3] bpf: inline bpf_get_branch_snapshot() helper Andrii Nakryiko
2024-03-21 21:08   ` Jiri Olsa
2024-03-21 21:27     ` Andrii Nakryiko
2024-03-21 18:05 ` [PATCH bpf-next 3/3] bpf,x86: inline bpf_get_smp_processor_id() on x86-64 Andrii Nakryiko
2024-03-21 21:08   ` Jiri Olsa
2024-03-21 21:09     ` Andrii Nakryiko
2024-03-21 22:57       ` Jiri Olsa
2024-03-21 23:38         ` Andrii Nakryiko
2024-03-21 23:49   ` Alexei Starovoitov
2024-03-22 16:45     ` Andrii Nakryiko
2024-03-25  3:28       ` Alexei Starovoitov
2024-03-25 17:01         ` Andrii Nakryiko
2024-03-21 23:46 ` [PATCH bpf-next 0/3] Inline two LBR-related helpers Alexei Starovoitov
2024-03-22 16:45   ` Andrii Nakryiko
2024-03-25  2:05     ` Alexei Starovoitov
2024-03-25 17:20       ` Andrii Nakryiko
2024-03-26  3:13         ` Alexei Starovoitov
2024-03-26 16:50           ` Andrii Nakryiko
2024-03-27 21:59             ` Alexei Starovoitov
2024-03-28 22:53               ` Andrii Nakryiko
