Re: [PATCH bpf-next 6/7] libbpf: BPF Static Keys support

BPF List
 help / color / mirror / Atom feed

From: Yonghong Song <yonghong.song@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Anton Protopopov <aspsk@isovalent.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Jiri Olsa <jolsa@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Stanislav Fomichev <sdf@google.com>, bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next 6/7] libbpf: BPF Static Keys support
Date: Wed, 13 Dec 2023 19:04:33 -0800	[thread overview]
Message-ID: <b0530d96-ab92-47e7-94e9-61b961a73557@linux.dev> (raw)
In-Reply-To: <CAADnVQ+b3_5qzaR9pr6B23xDxCO10iz685tHfsakW3MnoVYMbg@mail.gmail.com>


On 12/13/23 6:15 PM, Alexei Starovoitov wrote:
> On Tue, Dec 12, 2023 at 2:28 AM Anton Protopopov <aspsk@isovalent.com> wrote:
>>
>> This seems to have a benefit that there is no back compatibility issue
>> (if we use r1, because r0/r11 will be rejected by old verifiers). We
>> can have
>>
>>      r1 = 64bit_const
>>      if r1 == r1 goto
>>
>> and
>>
>>      r1 = 64bit_const
>>      if r1 != r1 goto
>>
>> and translate it on prog load to new instruction as JUMP_OF_NOP and
>> NOP_OR_JUMP, correspondingly. On older kernels it will have the
>> default (key is off) behaviour.
> As Andrii pointed out any new insn either JA with extra bits
> or special meaning if rX == rX can be sanitized by libbpf
> into plain JA.
> There will be no backward compat issues.
>
>> Ok, from BPF arch perspective this can work with two bits (not for
>> practical purposes though, IMO, see my next e-mail).
> I read this email and I still don't understand why you need a 3rd bit.
>
>>> And the special map really doesn't fit.
>>> Whatever we do, let's keep text_poke-able insn logic separate
>>> from bookkeeping of addresses of those insns.
>>> I think a special prefixed section that is understood by libbpf
>>> (like what I proposed with "name.static_branch") will do fine.
>>> If it's not good enough we can add a "set" map type
>>> that will be a generic set of values.
>>> It can be a set of 8-byte addresses to keep locations of static_branches,
>>> but let's keep it generic.
>>> I think it's fine to add:
>>> __uint(type, BPF_MAP_TYPE_SET)
>>> and let libbpf populate it with addresses of insns,
>>> or address of variables, or other values
>>> when it prepares a program for loading.
>> What is the higher-level API in this case? The static_branch_set(branch,
>> bool on) is not enough because we want to distinguish between "normal"
>> and "inverse" branches (for unlikely/likely cases).
> What is "likely/unlikely cases" ?
> likely() is a hint to the compiler to order basic blocks in
> a certain way. There is no likely/unlikely bit in the binary code
> after compilation on x86 or other architectures.
>
> There used to be a special bit on sparc64 that would mean
> a default jmp|fallthrough action for a conditional jmp.
> But that was before sparc became out of order and gained proper
> branch predictor in HW.
>
>>   We can implement
>> this using something like this:
>>
>> static_key_set(key, bool new_value)
>> {
>>      /* true if we change key value */
>>      bool key_changed = key->old_value ^ new_value;
>>
>>      for_each_prog(prog, key)
>>          for_each_branch(branch, prog, key)
>>              static_branch_flip(prog, branch, key_changed)
>> }
>>
>> where static_branch_flip flips the second bit of SRC_REG.
> I don't understand why you keep bringing up 'flip' use case.
> The kernel doesn't have such an operation on static branches.
> Which makes me believe that it wasn't necessary.
> Why do we need one for the bpf static branch?
>
>> We need to
>> keep track of prog->branches and key->progs. How is this different
>> from what my patch implements?
> What I'm proposing is to have a generic map __uint(type, BPF_MAP_TYPE_SET)
> and by naming convention libbpf will populate it with addresses
> of JA_OR_NOP from all progs.
> In asm it could be:
> asm volatile ("r0 = %[set_A] ll; goto_or_nop ...");
> (and libbpf will remove ld_imm64 from the prog before loading.)
>
> or via
> asm volatile ("goto_or_nop ...; .pushsection set_A_name+suffix; .long");
> (and libbpf will copy from the special section into a set and remove
> special section).

This is one more alternative.

|asm volatile goto ("r0 = 0; \ static_key_loc_1: \ gotol_or_nop 
%l[label]; \ r2 = 2; \ r3 = 3; \ ":: : "r0", "r2", "r3" :label);|

User code can use libbpf API static_key_enable("|static_key_loc_1|")
to enable the above gotol_or_nop. static_key_disable("static_key_loc_1")
or static_key_enabled("static_key_loc_1") can be similary defined.

Inside the libbpf, it can look at ELF file, find label "static_key_loc_1",
and also find label's corresponding insn index and verify that
the insn with label "static_key_loc_1" must be a gotol_or_nop
or nop_or_gotol insn. Eventually libbpf can call a bpf syscall
e.g. sys_bpf(cmd=STATIC_KEY_ENABLE, prog_fd, insn_offset)
to actually change the "current" state to gotol or nop depending
on what ENABLE means.

For enable/disable static keys in the bpf program itself,
similary, bpf program can have bpf_static_key_enable("static_key_loc_1"),
the libbpf needs to change it to bpf_static_key_enable(insn_offset)
and kernel verifier should process it properly.

Slightly different from what Alexei proposed, but another approach
for consideration and discussion.

>
> It will be a libbpf convention and the kernel doesn't need
> to know about a special static branch map type or array of addresses
> in prog_load cmd.
> Only JA insn is relevant to the verifier and JITs.
>
> Ideally we don't need to introduce SET map type and
> libbpf wouldn't need to populate it.
> If we can make it work with an array of values that .pushsection + .long
> automatically populates and libbpf treats it as a normal global data array
> that would be ideal.
> Insn addresses from all progs will be in that array after loading.
> Sort of like ".kconfig" section that libbpf populates,
> but it's a normal array underneath.
>
>> If this is implemented in userspace, then how we prevent synchronous
>> updates of the key (and a relocation variant doesn't seem to work from
>> userspace)? Or is this a new kfunc? If yes, then how is it
>> executed,
> then user space can have small helper in libbpf that iterates
> over SET (or array) and
> calls sys_bpf(cmd=STATIC_BRANCH_ENABLE, one_value_from_set)
>
> Similar in the kernel. When bpf progs want to enable a key it does
> bpf_for_each(set) { // open coded iterator
>     bpf_static_branch_enable(addr); // kfunc call
> }

next prev parent reply	other threads:[~2023-12-14  3:04 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-06 14:10 [PATCH bpf-next 0/7] BPF Static Keys Anton Protopopov
2023-12-06 14:10 ` [PATCH bpf-next 1/7] bpf: extract bpf_prog_bind_map logic into an inline helper Anton Protopopov
2023-12-06 14:10 ` [PATCH bpf-next 2/7] bpf: rename and export a struct definition Anton Protopopov
2023-12-06 14:10 ` [PATCH bpf-next 3/7] bpf: adjust functions offsets when patching progs Anton Protopopov
2023-12-06 14:10 ` [PATCH bpf-next 4/7] bpf: implement BPF Static Keys support Anton Protopopov
2023-12-06 14:10 ` [PATCH bpf-next 5/7] bpf: x86: implement static keys support Anton Protopopov
2023-12-06 14:10 ` [PATCH bpf-next 6/7] libbpf: BPF Static Keys support Anton Protopopov
2023-12-08  3:45   ` Alexei Starovoitov
2023-12-08 16:19     ` Anton Protopopov
2023-12-08 22:04       ` Andrii Nakryiko
2023-12-08 23:07         ` Eduard Zingerman
2023-12-09  4:07           ` Alexei Starovoitov
2023-12-09  4:05         ` Alexei Starovoitov
2023-12-09  4:15           ` Yonghong Song
2023-12-09  4:25             ` Alexei Starovoitov
2023-12-09  5:04               ` Yonghong Song
2023-12-09 17:18                 ` Alexei Starovoitov
2023-12-10  6:32                   ` Yonghong Song
2023-12-10 10:30                     ` Eduard Zingerman
2023-12-11  3:33                       ` Alexei Starovoitov
2023-12-11 18:49                         ` Andrii Nakryiko
2023-12-12 10:25                         ` Anton Protopopov
2023-12-14  2:15                           ` Alexei Starovoitov
2023-12-14  3:04                             ` Yonghong Song [this message]
2023-12-14 16:56                               ` Eduard Zingerman
2023-12-14 16:33                             ` Anton Protopopov
2023-12-11 17:31                     ` Anton Protopopov
2023-12-11 19:08                       ` Alexei Starovoitov
2023-12-12  9:06                         ` Anton Protopopov
2023-12-11 21:51                       ` Yonghong Song
2023-12-11 22:52                         ` Yonghong Song
2023-12-06 14:10 ` [PATCH bpf-next 7/7] selftests/bpf: Add tests for BPF Static Keys Anton Protopopov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b0530d96-ab92-47e7-94e9-61b961a73557@linux.dev \
    --to=yonghong.song@linux.dev \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=aspsk@isovalent.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=sdf@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox