From: Viktor Malik <vmalik@redhat.com>
To: Alan Maguire <alan.maguire@oracle.com>, bpf@vger.kernel.org
Cc: Andrii Nakryiko <andrii@kernel.org>,
	Eduard Zingerman <eddyz87@gmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Song Liu <song@kernel.org>,
	Yonghong Song <yonghong.song@linux.dev>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@fomichev.me>, Hao Luo <haoluo@google.com>,
	Jiri Olsa <jolsa@kernel.org>
Subject: Re: [RFC bpf-next 0/3] libbpf: Add support for aliased BPF programs
Date: Tue, 3 Sep 2024 07:57:00 +0200
Message-ID: <3adea7f7-0e8d-4114-ba04-356cdf9d20d1@redhat.com>
In-Reply-To: <92146771-8756-4259-88f0-e0b61c11ad55@oracle.com>

On 9/2/24 19:01, Alan Maguire wrote:
> On 02/09/2024 07:58, Viktor Malik wrote:
>> TL;DR
>>
>> This adds libbpf support for creating multiple BPF programs having the
>> same instructions using symbol aliases.
>>
>> Context
>> =======
>>
>> bpftrace has so-called "wildcarded" probes which allow attaching the
>> same program to multiple different attach points. For k(u)probes, this
>> is easy to do as we can leverage k(u)probe_multi; however, other
>> program types (fentry/fexit, tracepoints) don't have such features.
>>
>> Currently, bpftrace creates a copy of the program for each attach
>> point. This naturally results in a lot of redundant code in the
>> produced BPF object.
>>
>> Proposal
>> ========
>>
>> One way to address this problem would be to use *symbol aliases*. In
>> short, they allow multiple symbol table entries to refer to the same
>> address. In bpftrace, we would create them using llvm::GlobalAlias. In
>> C, it can be achieved using the compiler's __attribute__((alias(...))):
>>
>>     int BPF_PROG(prog)
>>     {
>>         [...]
>>     }
>>     int prog_alias() __attribute__((alias("prog")));
>>
>> When calling bpf_object__open, libbpf is currently able to discover all
>> the programs and internally makes a separate copy of the instructions
>> for each aliased program. What libbpf cannot do is perform relocations,
>> because it assumes that each instruction belongs to a single program
>> only. The second patch of this series changes relocation collection
>> such that it records relocations for each aliased program. With that,
>> bpftrace can emit just one copy of the full program and an alias for
>> each target attach point.
>>
>> For example, considering the following bpftrace script collecting the
>> number of hits of each VFS function using fentry over a one second
>> period:
>>
>>     $ bpftrace -e 'kfunc:vfs_* { @[func] = count() } i:s:1 { exit() }'
>>     [...]
>>
>> this change reduces the size of the in-memory BPF object that bpftrace
>> generates from 60K to 9K.
>>
>> For reference, the bpftrace PoC is in [1].
>>
>> The advantage of this change is that for BPF objects without aliases, it
>> doesn't introduce any overhead.
>>
> 
> A few high-level questions - apologies in advance if I'm missing the
> point here.
> 
> Could bpftrace use program linking to solve this issue instead? So we'd
> have separate progs for the various attach points associated with vfs_*
> functions, but they would all call the same global function. That
> _should_ reduce the memory footprint of the object I think - or are
> there issues with doing that? 

That's a good suggestion, thanks! We added subprograms to bpftrace only
relatively recently so I didn't really think about this option. I'll
definitely give it a try as it could be even more efficient.

> I also wonder if aliasing helps memory
> footprint fully, especially if we end up with separate copies of the
> program for relocation purposes; won't we have separate copies in-kernel
> then too? So I _think_ the memory utilization you're concerned about is
> not what's running in the kernel, but the BPF object representation in
> bpftrace; is that right?

Yes, that is correct. libbpf will create a copy of the program for each
symbol that it discovers in a PROGBITS section (including aliases) and
the copies will be loaded into the kernel.
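
To make the aliasing concrete, here is a minimal user-space sketch (plain
C, not BPF code) of two symbol table entries sharing one address. It
assumes GCC or Clang on an ELF target; prog, prog_alias and same_address
are illustrative names only and are not part of libbpf.

```c
/* Hypothetical stand-in for a program body; emitted once in the object. */
int prog(int x)
{
    return x + 1;
}

/*
 * Second symbol table entry for the same address as prog.
 * No additional instructions are emitted for it.
 */
int prog_alias(int x) __attribute__((alias("prog")));

/* Returns 1 if both names resolve to the same address, 0 otherwise. */
int same_address(void)
{
    int (*a)(int) = prog;
    int (*b)(int) = prog_alias;

    return a == b;
}
```

On such a target same_address() should return 1, and the symbol table of
the object file should list prog and prog_alias with the same value. A
loader that copies the instructions once per symbol, as described above,
would still end up with two copies.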

It's mainly the footprint of the BPF object produced by bpftrace that I
was concerned about. (The reason is that we are working on ahead-of-time
compilation, so it will directly affect the size of the pre-compiled
binaries.) But the above solution using global subprograms should reduce
the in-kernel footprint, too, so I'll try to add it and see if it would
work for bpftrace.
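
As a rough illustration of the shape this would take, here is a
user-space analogue in plain C (not BPF code, and not what bpftrace
generates today). All names are hypothetical: count_hit plays the role
of the shared global function, the hits array stands in for a BPF map,
and on_vfs_read/on_vfs_write stand in for per-attach-point programs,
which in BPF C would each carry their own SEC("fentry/...") annotation.

```c
#include <stddef.h>

/* Hypothetical per-function hit counters; stand-in for a BPF map. */
static unsigned long hits[2];

/*
 * Shared body, emitted once. Non-static and noinline so that it stays a
 * separate function instead of being duplicated into every caller.
 */
__attribute__((noinline)) int count_hit(size_t idx)
{
    if (idx >= sizeof(hits) / sizeof(hits[0]))
        return -1;
    hits[idx]++;
    return 0;
}

/* Thin entry points, one per attach point; each only calls the shared body. */
int on_vfs_read(void)
{
    return count_hit(0);
}

int on_vfs_write(void)
{
    return count_hit(1);
}

/* Read back a counter; returns 0 for an out-of-range index. */
unsigned long hits_for(size_t idx)
{
    return idx < sizeof(hits) / sizeof(hits[0]) ? hits[idx] : 0;
}
```

The point of the layout is that only the thin wrappers are repeated per
attach point, while the bulk of the logic exists once.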

Thanks!
Viktor

> 
> Thanks!
> 
> Alan
> 