Re: [PATCH v1 bpf-next 3/9] bpf: expose how xlated insns map to jitted insns

BPF List
 help / color / mirror / Atom feed

From: Anton Protopopov <aspsk@isovalent.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Jiri Olsa <jolsa@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Stanislav Fomichev <sdf@google.com>,
	Yonghong Song <yonghong.song@linux.dev>,
	Eduard Zingerman <eddyz87@gmail.com>,
	Quentin Monnet <quentin@isovalent.com>, bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH v1 bpf-next 3/9] bpf: expose how xlated insns map to jitted insns
Date: Thu, 14 Mar 2024 09:03:31 +0000	[thread overview]
Message-ID: <ZfK9Y5c7RuOu0YiN@zh-lab-node-5> (raw)
In-Reply-To: <CAADnVQJnh+j9etPYkst7hhnepvsoBuJY-ds-AF0fn3VqqJNXNg@mail.gmail.com>

On Wed, Mar 13, 2024 at 06:56:34PM -0700, Alexei Starovoitov wrote:
> On Wed, Mar 6, 2024 at 2:51 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> >
> > On Tue, Feb 20, 2024 at 05:09:36PM -0800, Alexei Starovoitov wrote:
> > > On Fri, Feb 16, 2024 at 6:04 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> > > >
> > > > On Wed, Feb 14, 2024 at 10:48:26PM -0800, Alexei Starovoitov wrote:
> > > > > On Thu, Feb 8, 2024 at 3:11 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> > > > > >
> > > > > > On Tue, Feb 06, 2024 at 06:26:12PM -0800, Alexei Starovoitov wrote:
> > > > > > > On Tue, Feb 6, 2024 at 2:08 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Feb 05, 2024 at 05:09:51PM -0800, Alexei Starovoitov wrote:
> > > > > > > > > On Fri, Feb 2, 2024 at 8:34 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> > > > > > > > > >
> > > > > > > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > > > > > > > index 4def3dde35f6..bdd6be718e82 100644
> > > > > > > > > > --- a/include/linux/bpf.h
> > > > > > > > > > +++ b/include/linux/bpf.h
> > > > > > > > > > @@ -1524,6 +1524,13 @@ struct bpf_prog_aux {
> > > > > > > > > >         };
> > > > > > > > > >         /* an array of original indexes for all xlated instructions */
> > > > > > > > > >         u32 *orig_idx;
> > > > > > > > > > +       /* for every xlated instruction point to all generated jited
> > > > > > > > > > +        * instructions, if allocated
> > > > > > > > > > +        */
> > > > > > > > > > +       struct {
> > > > > > > > > > +               u32 off;        /* local offset in the jitted code */
> > > > > > > > > > +               u32 len;        /* the total len of generated jit code */
> > > > > > > > > > +       } *xlated_to_jit;
> > > > > > > > >
> > > > > > > > > Simply put Nack to this approach.
> > > > > > > > >
> > > > > > > > > Patches 2 and 3 add an extreme amount of memory overhead.
> > > > > > > > >
> > > > > > > > > As we discussed during office hours we need a "pointer to insn" concept
> > > > > > > > > aka "index on insn".
> > > > > > > > > The verifier would need to track that such things exist and adjust
> > > > > > > > > indices of insns when patching affects those indices.
> > > > > > > > >
> > > > > > > > > For every static branch there will be one such "pointer to insn".
> > > > > > > > > Different algorithms can be used to keep them correct.
> > > > > > > > > The simplest 'lets iterate over all such pointers and update them'
> > > > > > > > > during patch_insn() may even be ok to start.
> > > > > > > > >
> > > > > > > > > Such "pointer to insn" won't add any memory overhead.
> > > > > > > > > When patch+jit is done all such "pointer to insn" are fixed value.
> > > > > > > >
> > > > > > > > Ok, thanks for looking, this makes sense.
> > > > > > >
> > > > > > > Before jumping into coding I think it would be good to discuss
> > > > > > > the design first.
> > > > > > > I'm thinking such "address of insn" will be similar to
> > > > > > > existing "address of subprog",
> > > > > > > which is encoded in ld_imm64 as BPF_PSEUDO_FUNC.
> > > > > > > "address of insn" would be a bit more involved to track
> > > > > > > during JIT and likely trivial during insn patching,
> > > > > > > since we're already doing imm adjustment for pseudo_func.
> > > > > > > So that part of design is straightforward.
> > > > > > > Implementation in the kernel and libbpf can copy paste from pseudo_func too.
> > > > > >
> > > > > > To implement the "primitive version" of static branches, where the
> > > > > > only API is `static_branch_update(xlated off, on/off)` the only
> > > > > > requirement is to build `xlated -> jitted` mapping (which is done
> > > > > > in JIT, after the verification). This can be done in a simplified
> > > > > > version of this patch, without xlated->orig mapping and with
> > > > > > xlated->jit mapping only done to gotol_or_nop instructions.
> > > > >
> > > > > yes. The array of insn->jit_addr sized with as many goto_or_nop-s
> > > > > the prog will work for user space to flip them, but...
> > > > >
> > > > > > The "address of insn" appears when we want to provide a more
> > > > > > higher-level API when some object (in user-space or in kernel) keeps
> > > > > > track of one or more gotol_or_nop instructions so that after the
> > > > > > program load this controlling object has a list of xlated offsets.
> > > > > > But this would be a follow-up to the initial static branches patch.
> > > > >
> > > > > this won't work as a follow up,
> > > > > since such an array won't work for bpf prog that wants to flip branches.
> > > > > There is nothing that associates static_branch name/id with
> > > > > particular goto_or_nop.
> > > > > There could be a kfunc that bpf prog calls, but it can only
> > > > > flip all of such insns in the prog.
> > > > > Unless we start encoding a special id inside goto_or_nop or other hacks.
> > > > >
> > > > > > > The question is whether such "address of insn" should be allowed
> > > > > > > in the data section. If so, we need to brainstorm how to
> > > > > > > do it cleanly.
> > > > > > > We had various hacks for similar things in the past. Like prog_array.
> > > > > > > Let's not repeat such mistakes.
> > > > > >
> > > > > > So, data section is required for implementing jump tables? Like,
> > > > > > to add a new PTR_TO_LABEL or PTR_TO_INSN data type, and a
> > > > > > corresponding "ptr to insn" object for every occurence of &&label,
> > > > > > which will be adjusted during verification.
> > > > > > Looks to me like this one doesn't require any more API than specifying
> > > > > > a list of &&label occurencies on program load.
> > > > > >
> > > > > > For "static keys" though (a feature on top of this patch series) we
> > > > > > need to have access to the corresponding set of adjusted pointers.
> > > > > >
> > > > > > Isn't this enough to add something like an array of
> > > > > >
> > > > > >   struct insn_ptr {
> > > > > >       u32 type; /* LABEL, STATIC_BRANCH,... */
> > > > > >       u32 insn_off; /* original offset on load */
> > > > > >       union {
> > > > > >           struct label {...};
> > > > > >           struct st_branch { u32 key_id, ..};
> > > > > >       };
> > > > > >   };
> > > > >
> > > > > which I don't like because it hard codes static_branch needs into
> > > > > insn->jit_addr association.
> > > > > "address of insn" should be an individual building block without
> > > > > bolted on parts.
> > > > >
> > > > > A data section with a set of such "address of insn"
> > > > > can be a description of one static_branch.
> > > > > There will be different ways to combine such building blocks.
> > > > > For example:
> > > > > static_branch(foo) can emit goto_or_nop into bpf code
> > > > > and add "address of insn" into a section '.insn_addrs.foo".
> > > > > This section is what libbpf and bpf prog will recognize as a set
> > > > > of "address of insn" that can be passed into static_branch_update kfunc
> > > > > or static_branch_update sys_bpf command.
> > > > > The question is whether we need a new map type (array derivative)
> > > > > to hold a set of "address of insn" or it can be a part of an existing
> > > > > global data array.
> > > > > A new map type is easier to reason about.
> > > > > Notice how such a new map type is not a map type of static branches.
> > > > > It's not a map type of goto_or_nop instructions either.
> > > > >
> > > > > At load time libbpf can populate this array with indices of insns
> > > > > that the verifier and JIT need to track. Once JITed the array is readonly
> > > > > for bpf prog and for user space.
> > > >
> > > > So this will be a map per .insn_addrs.X section (where X is key or
> > > > a pre-defined suffix for jump tables or indirect calls). And to tell
> > > > the verifier about these maps we will need to pass an array of
> > > >
> > > >     struct {
> > > >             u32 map_fd;
> > > >             u32 type; /* static key, jump table, etc. */
> > > >     }
> > > >
> > > > on program load. Is this correct?
> > >
> > >
> > > Probably not.
> > > Since we're going with a new map type (at least for the sake of this
> > > discussion) it shouldn't need a new way to tell the verifier about it.
> > > If .insn_addrs.jmp_table_A was a section generated for switch() statement
> > > by llvm it will be created as a map by libbpf,
> > > and there will be an ld_imm64 insn generated by llvm that points
> > > to that map.
> > > libbpf will populate ld_imm64 insn with map_fd, just like it does
> > > for global data.
> >
> > I understand how this works for indirect jumps (and for the
> > bpf_static_branch_update(&foo) kfunc) where we have a ld_imm64 with a
> > map, however, I am still not sure how this will work for static
> > branches where we just have a 8 byte JA insn + an index in the
> > corresponding ".insn_addrs.foo" section. How kernel will know that the
> > program is using a corresponding map which we create from
> > ".insn_addrs.foo" without specifying this on program load?
> >
> > (Sorry for replying late, catching up after [simultaneous] pto &
> > covid.)
> 
> sorry. ctx switch takes time.
> libbpf can just bpf_prog_bind_map to associate this new map type
> of ptr_to_insn with a program.
> Or I misunderstood the question?

All ptr_to_insn maps are required during the verification. So
bpf_prog_bind_map can't be used as it requires an existing program.

What could work and what I am proposing is to pass a list of bound
maps in prog_load attributes. Then such maps can be used during the
verification. For normal maps

  prog = prog_load(attr={.bound_maps=maps})

will be semantically the same as

  prog = prog_load()
  bpf_prog_bind_map(prog, maps)

next prev parent reply	other threads:[~2024-03-14  9:11 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-02 16:28 [PATCH v1 bpf-next 0/9] BPF static branches Anton Protopopov
2024-02-02 16:28 ` [PATCH v1 bpf-next 1/9] bpf: fix potential error return Anton Protopopov
2024-02-02 16:28 ` [PATCH v1 bpf-next 2/9] bpf: keep track of and expose xlated insn offsets Anton Protopopov
2024-02-02 16:28 ` [PATCH v1 bpf-next 3/9] bpf: expose how xlated insns map to jitted insns Anton Protopopov
2024-02-06  1:09   ` Alexei Starovoitov
2024-02-06 10:02     ` Anton Protopopov
2024-02-07  2:26       ` Alexei Starovoitov
2024-02-08 11:05         ` Anton Protopopov
2024-02-15  6:48           ` Alexei Starovoitov
2024-02-16 13:57             ` Anton Protopopov
2024-02-21  1:09               ` Alexei Starovoitov
2024-03-06 10:44                 ` Anton Protopopov
2024-03-14  1:56                   ` Alexei Starovoitov
2024-03-14  9:03                     ` Anton Protopopov [this message]
2024-03-14 17:07                       ` Alexei Starovoitov
2024-03-14 20:06                         ` Andrii Nakryiko
2024-03-14 21:41                           ` Alexei Starovoitov
2024-03-15 13:11                             ` Anton Protopopov
2024-03-15 16:32                             ` Andrii Nakryiko
2024-03-15 17:22                               ` Alexei Starovoitov
2024-03-15 17:29                                 ` Andrii Nakryiko
2024-03-28 16:37                                   ` Anton Protopopov
2024-03-29 22:44                                     ` Andrii Nakryiko
2024-04-01  9:47                                       ` Anton Protopopov
2024-02-02 16:28 ` [PATCH v1 bpf-next 4/9] selftests/bpf: Add tests for instructions mappings Anton Protopopov
2024-02-02 16:28 ` [PATCH v1 bpf-next 5/9] bpftool: dump new fields of bpf prog info Anton Protopopov
2024-02-02 16:28 ` [PATCH v1 bpf-next 6/9] bpf: add support for an extended JA instruction Anton Protopopov
2024-02-02 16:28 ` [PATCH v1 bpf-next 7/9] bpf: Add kernel/bpftool asm support for new instructions Anton Protopopov
2024-02-02 16:28 ` [PATCH v1 bpf-next 8/9] bpf: add BPF_STATIC_BRANCH_UPDATE syscall Anton Protopopov
2024-02-02 16:28 ` [PATCH v1 bpf-next 9/9] selftests/bpf: Add tests for new ja* instructions Anton Protopopov
2024-02-02 22:39 ` [PATCH v1 bpf-next 0/9] BPF static branches Andrii Nakryiko
2024-02-04 16:05   ` Anton Protopopov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZfK9Y5c7RuOu0YiN@zh-lab-node-5 \
    --to=aspsk@isovalent.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=quentin@isovalent.com \
    --cc=sdf@google.com \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox