From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Farid Zakaria <farid.m.zakaria@gmail.com>,
Xdp <xdp-newbies@vger.kernel.org>
Subject: Re: bpf_helpers and you... some more...
Date: Thu, 31 Oct 2019 10:58:18 +0100
Message-ID: <87bltxp77p.fsf@toke.dk>
In-Reply-To: <CACCo2j=TJYZ68ur53vNYxaS2qQgPv6ouij3P=tmrno-SJFTw0Q@mail.gmail.com>
Farid Zakaria <farid.m.zakaria@gmail.com> writes:
> This is my attempt at a continuation of David's earlier e-mail:
> https://www.spinics.net/lists/xdp-newbies/msg00179.html
>
> I was curious how eBPF filters are wired up and how they work. The
> heavy use of C macros makes the source code difficult for me to
> comprehend (maybe there's an online pre-processed version?).
> I'm hoping others may find this exploratory dive insightful (hopefully
> it's accurate enough).
>
> Let's write a trivial eBPF filter (hello_world_kern.c) and have it
> print "hello world":
>
> #include <linux/bpf.h>
>
> #define __section(NAME) __attribute__((section(NAME), used))
>
> static char _license[] __section("license") = "GPL";
>
> /* helper functions called from eBPF programs written in C */
> static int (*bpf_trace_printk)(const char *fmt, int fmt_size, ...) =
>         (void *)BPF_FUNC_trace_printk;
>
> __section("hello_world") int hello_world_filter(struct __sk_buff *skb)
> {
>         char msg[] = "hello world";
>
>         /* call the helper declared above */
>         bpf_trace_printk(msg, sizeof(msg));
>         return 0;
> }
>
> Compiling the above with the following command lets us inspect the
> LLVM IR:
> clang -c -o hello_world_kern.ll -x c -S -emit-llvm hello_world_kern.c
>
> The few lines that stand out are:
>
> @bpf_trace_printk = internal global i32 (i8*, i32, ...)* inttoptr
> (i64 6 to i32 (i8*, i32, ...)*), align 8
> ....
> %6 = load i32 (i8*, i32, ...)*, i32 (i8*, i32, ...)**
> @bpf_trace_printk, align 8
> %7 = getelementptr inbounds [13 x i8], [13 x i8]* %3, i32 0, i32 0
> %8 = call i32 (i8*, i32, ...) %6(i8* %7, i32 13)
>
> The above demonstrates that the value of BPF_FUNC_trace_printk is
> simply the integer 6, which is cast to a function pointer.
> Sure enough, we can confirm that `bpf_trace_printk` has the value 6
> in the enumeration of known BPF helpers (BPF_FUNC_unspec is 0).
> (https://elixir.bootlin.com/linux/v5.3.7/source/include/uapi/linux/bpf.h#L2724)
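The enum is generated from a macro list, which is part of why it is hard to read in the source. A minimal user-space sketch of the same pattern, assuming only the first seven entries of the kernel list; every DEMO_/demo_ name here is made up for illustration and is not a kernel identifier:

```c
#include <assert.h>
#include <stdint.h>

/* Shortened stand-in for the kernel's __BPF_FUNC_MAPPER list: only the
 * first seven entries, in the order used by include/uapi/linux/bpf.h. */
#define DEMO_FUNC_MAPPER(FN) \
    FN(unspec),              \
    FN(map_lookup_elem),     \
    FN(map_update_elem),     \
    FN(map_delete_elem),     \
    FN(probe_read),          \
    FN(ktime_get_ns),        \
    FN(trace_printk),

/* Each list entry becomes one enumerator, numbered from 0. */
#define DEMO_ENUM_FN(x) DEMO_FUNC_##x
enum demo_func_id {
    DEMO_FUNC_MAPPER(DEMO_ENUM_FN)
    DEMO_MAX_FUNC_ID,
};

typedef int (*demo_printk_fn)(const char *fmt, int fmt_size, ...);

/* Same trick as the helper declaration in the program: the "function
 * pointer" holds nothing but the helper ID. It must never be called in
 * user space; it only exists so the ID can be read back out. */
static demo_printk_fn demo_trace_printk =
    (demo_printk_fn)(intptr_t)DEMO_FUNC_trace_printk;

/* Recover the helper ID hidden inside the pointer. */
static intptr_t demo_helper_id(demo_printk_fn fn)
{
    return (intptr_t)fn;
}
```

With this in place, demo_helper_id(demo_trace_printk) yields the same 6 that shows up in the IR.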
>
> We can go even further and turn this LLVM IR into human-readable
> eBPF assembly using `llc`:
>
> llc hello_world_kern.ll -march=bpf
>
> Depending on the optimization level of the earlier `clang` call, you
> may see different results. With `-O3`, we can see:
>
> call 6
>
> Great! So we know that the call to `bpf_trace_printk` gets translated
> into a call instruction with immediate value of 6.
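For reference, here is roughly what that instruction looks like as data. This is a sketch, assuming the 8-byte struct bpf_insn layout from include/uapi/linux/bpf.h and the BPF_JMP (0x05) and BPF_CALL (0x80) opcode bits; the insn_demo_/INSN_DEMO_ names are hypothetical and spelled with <stdint.h> types so it builds in user space:

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the field layout of struct bpf_insn. */
struct insn_demo {
    uint8_t code;        /* opcode */
    uint8_t dst_reg : 4; /* destination register */
    uint8_t src_reg : 4; /* source register */
    int16_t off;         /* signed offset */
    int32_t imm;         /* signed immediate */
};

#define INSN_DEMO_JMP  0x05 /* instruction class: jump */
#define INSN_DEMO_CALL 0x80 /* operation: call */

/* Build the instruction that "call <helper_id>" assembles to:
 * opcode JMP|CALL, registers and offset zero, helper ID in imm. */
static struct insn_demo insn_demo_call(int32_t helper_id)
{
    struct insn_demo insn = {
        .code = INSN_DEMO_JMP | INSN_DEMO_CALL,
        .imm = helper_id,
    };
    return insn;
}
```

So `call 6` is a single 8-byte instruction with opcode 0x85 and 6 in the imm field, and that imm field is what the verifier rewrites later.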
>
> How does it end up calling code within the kernel though?
> Once the verifier has accepted the bytecode, it calls `fixup_bpf_calls`
> (https://elixir.bootlin.com/linux/v5.3.8/source/kernel/bpf/verifier.c#L8869),
> which walks all the instructions and adjusts the immediate value of
> each helper call:
>
> fixup_bpf_calls(...) {
>         ...
> patch_call_imm:
>         fn = env->ops->get_func_proto(insn->imm, env->prog);
>         /* all functions that have prototype and verifier allowed
>          * programs to call them, must be real in-kernel functions
>          */
>         if (!fn->func) {
>                 verbose(env,
>                         "kernel subsystem misconfigured func %s#%d\n",
>                         func_id_name(insn->imm), insn->imm);
>                 return -EFAULT;
>         }
>         insn->imm = fn->func - __bpf_call_base;
>
> N.B. I haven't deciphered how __bpf_call_base is used / works
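On __bpf_call_base: as far as I can tell it is only a fixed anchor address. The helper's address is stored in the 32-bit imm field as an offset from that anchor, and whoever executes the call adds the anchor back to get the real function. A user-space sketch of that round trip, assuming this reading is right; all base_demo_ names are hypothetical, and it relies on function pointers converting to uintptr_t as they do on Linux:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t (*base_demo_fn)(uint64_t r1, uint64_t r2, uint64_t r3,
                                 uint64_t r4, uint64_t r5);

/* Stand-in for __bpf_call_base: only its address matters. */
static uint64_t base_demo_anchor(uint64_t r1, uint64_t r2, uint64_t r3,
                                 uint64_t r4, uint64_t r5)
{
    (void)r1; (void)r2; (void)r3; (void)r4; (void)r5;
    return 0;
}

/* Stand-in for a real in-kernel helper. */
static uint64_t base_demo_add(uint64_t r1, uint64_t r2, uint64_t r3,
                              uint64_t r4, uint64_t r5)
{
    (void)r3; (void)r4; (void)r5;
    return r1 + r2;
}

/* Fixup step: replace the helper ID with "helper address minus anchor".
 * The offset has to fit the signed 32-bit imm field. */
static int32_t base_demo_encode(base_demo_fn fn)
{
    intptr_t delta = (intptr_t)((uintptr_t)fn - (uintptr_t)base_demo_anchor);

    assert(delta >= INT32_MIN && delta <= INT32_MAX);
    return (int32_t)delta;
}

/* Execution step: add the anchor back and call the result. */
static uint64_t base_demo_call(int32_t imm, uint64_t r1, uint64_t r2)
{
    base_demo_fn fn = (base_demo_fn)((uintptr_t)base_demo_anchor +
                                     (uintptr_t)(intptr_t)imm);

    return fn(r1, r2, 0, 0, 0);
}
```

The point of the subtraction is that a full 64-bit kernel address does not fit in imm, while an offset between two functions in the kernel image normally does.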
>
> The `get_func_proto` callback returns the function prototype
> registered by the relevant subsystem, such as net.
> (https://elixir.bootlin.com/linux/v5.3.8/source/net/core/filter.c#L5991)
> At this point in the function, it is a simple switch statement that gets the
> matching function prototype given the numeric value.
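The shape of such a lookup is roughly the following. This is a sketch only: the proto_demo_ struct and functions are hypothetical and far simpler than the kernel's struct bpf_func_proto, which also describes argument and return types for the verifier. The IDs 5 and 6 match ktime_get_ns and trace_printk in the UAPI enum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*proto_demo_fn)(uint64_t, uint64_t, uint64_t,
                                  uint64_t, uint64_t);

/* Minimal prototype record: just the function and a name. */
struct proto_demo {
    proto_demo_fn func;
    const char *name;
};

enum { PROTO_DEMO_KTIME_GET_NS = 5, PROTO_DEMO_TRACE_PRINTK = 6 };

/* Dummy helper bodies so the lookup has something to point at. */
static uint64_t proto_demo_ktime(uint64_t a, uint64_t b, uint64_t c,
                                 uint64_t d, uint64_t e)
{
    (void)a; (void)b; (void)c; (void)d; (void)e;
    return 42;
}

static uint64_t proto_demo_printk(uint64_t a, uint64_t b, uint64_t c,
                                  uint64_t d, uint64_t e)
{
    (void)a; (void)b; (void)c; (void)d; (void)e;
    return 0;
}

static const struct proto_demo proto_demo_ktime_proto = {
    proto_demo_ktime, "ktime_get_ns"
};
static const struct proto_demo proto_demo_printk_proto = {
    proto_demo_printk, "trace_printk"
};

/* Map a helper ID to its prototype; NULL means this program type is
 * not allowed to call that helper. */
static const struct proto_demo *proto_demo_get(int func_id)
{
    switch (func_id) {
    case PROTO_DEMO_KTIME_GET_NS:
        return &proto_demo_ktime_proto;
    case PROTO_DEMO_TRACE_PRINTK:
        return &proto_demo_printk_proto;
    default:
        return NULL;
    }
}
```

Because each program type supplies its own lookup, the same helper ID can be allowed for one program type and rejected for another.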
>
> I'd love to see more on how the non-JIT vs JIT code paths are
> handled.
> For the net subsystem, I can see where the eBPF prog is invoked
> (https://elixir.bootlin.com/linux/v5.3.8/source/net/core/filter.c#L119),
> but it's difficult to work out how the choice between executing the
> function directly (in the case of JIT) and running it through the
> interpreter is made.
When a program is JIT'ed, the function pointer in struct
bpf_prog->bpf_func is replaced with a pointer to the machine code
generated by the JIT. For helper calls, the JIT does this:
https://elixir.bootlin.com/linux/v5.3.8/source/arch/x86/net/bpf_jit_comp.c#L828
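A rough user-space sketch of the pointer swap, to show why the call site does not need to know which mode is in use. The jit_demo_ names are hypothetical, and the signature is simplified (the kernel's bpf_func takes the context and the instruction array):

```c
#include <assert.h>

struct jit_demo_prog;
typedef unsigned int (*jit_demo_run_fn)(const void *ctx,
                                        const struct jit_demo_prog *prog);

struct jit_demo_prog {
    int jited;                /* set once native code is installed */
    jit_demo_run_fn bpf_func; /* interpreter or native entry point */
};

/* Stand-in for the interpreter: would walk the bytecode. */
static unsigned int jit_demo_interpret(const void *ctx,
                                       const struct jit_demo_prog *prog)
{
    (void)ctx; (void)prog;
    return 1;
}

/* Stand-in for the machine code a JIT would emit. */
static unsigned int jit_demo_native(const void *ctx,
                                    const struct jit_demo_prog *prog)
{
    (void)ctx; (void)prog;
    return 2;
}

/* "JIT compile": swap the entry point and record that it happened. */
static void jit_demo_compile(struct jit_demo_prog *prog)
{
    prog->bpf_func = jit_demo_native;
    prog->jited = 1;
}

/* The call site is the same in both cases: an indirect call. */
static unsigned int jit_demo_run(const struct jit_demo_prog *prog,
                                 const void *ctx)
{
    return prog->bpf_func(ctx, prog);
}

/* Build a program, optionally JIT it, run it once. */
static unsigned int jit_demo_selftest(int use_jit)
{
    struct jit_demo_prog prog = { 0, jit_demo_interpret };

    if (use_jit)
        jit_demo_compile(&prog);
    return jit_demo_run(&prog, 0);
}
```

The decision between interpreter and JIT is made once, when the program is prepared, rather than on every invocation.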
-Toke