All of lore.kernel.org
 help / color / mirror / Atom feed
From: Anton Protopopov <aspsk@isovalent.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Jiri Olsa <jolsa@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Stanislav Fomichev <sdf@google.com>,
	Yonghong Song <yonghong.song@linux.dev>,
	Eduard Zingerman <eddyz87@gmail.com>,
	Quentin Monnet <quentin@isovalent.com>, bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH v1 bpf-next 3/9] bpf: expose how xlated insns map to jitted insns
Date: Fri, 16 Feb 2024 13:57:53 +0000	[thread overview]
Message-ID: <Zc9p4e1EccPkMnmY@zh-lab-node-5> (raw)
In-Reply-To: <CAADnVQKH1aitaADzCo__PfMJJA2SxAvYjTS8Z7A6Y01G2OWJgw@mail.gmail.com>

On Wed, Feb 14, 2024 at 10:48:26PM -0800, Alexei Starovoitov wrote:
> On Thu, Feb 8, 2024 at 3:11 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> >
> > On Tue, Feb 06, 2024 at 06:26:12PM -0800, Alexei Starovoitov wrote:
> > > On Tue, Feb 6, 2024 at 2:08 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> > > >
> > > > On Mon, Feb 05, 2024 at 05:09:51PM -0800, Alexei Starovoitov wrote:
> > > > > On Fri, Feb 2, 2024 at 8:34 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> > > > > >
> > > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > > > index 4def3dde35f6..bdd6be718e82 100644
> > > > > > --- a/include/linux/bpf.h
> > > > > > +++ b/include/linux/bpf.h
> > > > > > @@ -1524,6 +1524,13 @@ struct bpf_prog_aux {
> > > > > >         };
> > > > > >         /* an array of original indexes for all xlated instructions */
> > > > > >         u32 *orig_idx;
> > > > > > +       /* for every xlated instruction point to all generated jited
> > > > > > +        * instructions, if allocated
> > > > > > +        */
> > > > > > +       struct {
> > > > > > +               u32 off;        /* local offset in the jitted code */
> > > > > > +               u32 len;        /* the total len of generated jit code */
> > > > > > +       } *xlated_to_jit;
> > > > >
> > > > > Simply put Nack to this approach.
> > > > >
> > > > > Patches 2 and 3 add an extreme amount of memory overhead.
> > > > >
> > > > > As we discussed during office hours we need a "pointer to insn" concept
> > > > > aka "index on insn".
> > > > > The verifier would need to track that such things exist and adjust
> > > > > indices of insns when patching affects those indices.
> > > > >
> > > > > For every static branch there will be one such "pointer to insn".
> > > > > Different algorithms can be used to keep them correct.
> > > > > The simplest 'lets iterate over all such pointers and update them'
> > > > > during patch_insn() may even be ok to start.
> > > > >
> > > > > Such "pointer to insn" won't add any memory overhead.
> > > > > When patch+jit is done all such "pointer to insn" are fixed value.
> > > >
> > > > Ok, thanks for looking, this makes sense.
> > >
> > > Before jumping into coding I think it would be good to discuss
> > > the design first.
> > > I'm thinking such "address of insn" will be similar to
> > > existing "address of subprog",
> > > which is encoded in ld_imm64 as BPF_PSEUDO_FUNC.
> > > "address of insn" would be a bit more involved to track
> > > during JIT and likely trivial during insn patching,
> > > since we're already doing imm adjustment for pseudo_func.
> > > So that part of design is straightforward.
> > > Implementation in the kernel and libbpf can copy paste from pseudo_func too.
> >
> > To implement the "primitive version" of static branches, where the
> > only API is `static_branch_update(xlated off, on/off)` the only
> > requirement is to build `xlated -> jitted` mapping (which is done
> > in JIT, after the verification). This can be done in a simplified
> > version of this patch, without xlated->orig mapping and with
> > xlated->jit mapping only done to gotol_or_nop instructions.
> 
> yes. The array of insn->jit_addr sized with as many goto_or_nop-s
> the prog will work for user space to flip them, but...
> 
> > The "address of insn" appears when we want to provide a more
> > higher-level API when some object (in user-space or in kernel) keeps
> > track of one or more gotol_or_nop instructions so that after the
> > program load this controlling object has a list of xlated offsets.
> > But this would be a follow-up to the initial static branches patch.
> 
> this won't work as a follow up,
> since such an array won't work for bpf prog that wants to flip branches.
> There is nothing that associates static_branch name/id with
> particular goto_or_nop.
> There could be a kfunc that bpf prog calls, but it can only
> flip all of such insns in the prog.
> Unless we start encoding a special id inside goto_or_nop or other hacks.
> 
> > > The question is whether such "address of insn" should be allowed
> > > in the data section. If so, we need to brainstorm how to
> > > do it cleanly.
> > > We had various hacks for similar things in the past. Like prog_array.
> > > Let's not repeat such mistakes.
> >
> > So, data section is required for implementing jump tables? Like,
> > to add a new PTR_TO_LABEL or PTR_TO_INSN data type, and a
> > corresponding "ptr to insn" object for every occurence of &&label,
> > which will be adjusted during verification.
> > Looks to me like this one doesn't require any more API than specifying
> > a list of &&label occurencies on program load.
> >
> > For "static keys" though (a feature on top of this patch series) we
> > need to have access to the corresponding set of adjusted pointers.
> >
> > Isn't this enough to add something like an array of
> >
> >   struct insn_ptr {
> >       u32 type; /* LABEL, STATIC_BRANCH,... */
> >       u32 insn_off; /* original offset on load */
> >       union {
> >           struct label {...};
> >           struct st_branch { u32 key_id, ..};
> >       };
> >   };
> 
> which I don't like because it hard codes static_branch needs into
> insn->jit_addr association.
> "address of insn" should be an individual building block without
> bolted on parts.
> 
> A data section with a set of such "address of insn"
> can be a description of one static_branch.
> There will be different ways to combine such building blocks.
> For example:
> static_branch(foo) can emit goto_or_nop into bpf code
> and add "address of insn" into a section '.insn_addrs.foo".
> This section is what libbpf and bpf prog will recognize as a set
> of "address of insn" that can be passed into static_branch_update kfunc
> or static_branch_update sys_bpf command.
> The question is whether we need a new map type (array derivative)
> to hold a set of "address of insn" or it can be a part of an existing
> global data array.
> A new map type is easier to reason about.
> Notice how such a new map type is not a map type of static branches.
> It's not a map type of goto_or_nop instructions either.
> 
> At load time libbpf can populate this array with indices of insns
> that the verifier and JIT need to track. Once JITed the array is readonly
> for bpf prog and for user space.

So this will be a map per .insn_addrs.X section (where X is key or
a pre-defined suffix for jump tables or indirect calls). And to tell
the verifier about these maps we will need to pass an array of

    struct {
            u32 map_fd;
            u32 type; /* static key, jump table, etc. */
    }

on program load. Is this correct?

> With that mechanism compilers can generate a proper switch() jmp table.
> llvm work can be a follow up, of course, but the whole design needs
> to be thought through to cover all use cases.
> 
> To summarize, here's what I'm proposing:
> - PTR_TO_INSN verifier regtype that can be passed to static_branch_update kfunc

If we have a set of pointers to jump instructions, generated from
static_branch(foo) for same foo, then this makes more sense to
provide a

    static_branch_update(foo)

(where foo is substituted by libbpf with a map fd of .insn_addrs.foo
on load). The same for userspace:

    bpf(STATIC_BRANCH_UPDATE, .attrs={.map_fd=foo})

> - new map type (array) that holds objects that are PTR_TO_INSN for the verifier
> libbpf populates this array with indices of insn it wants to track.
> bpf prog needs to "use" this array, so prog/map association is built.
> - verifier/JIT update each PTR_TO_INSN during transformations.
> - static_branch(foo) macro emits goto_or_nop insn and adds 8 bytes
> into ".insn_addrs.foo" section with an ELF relocation that
> libbpf will convert into index.
> 
> When compilers implement jmptables for switch(key) they will generate
> ".insn_addrs.uniq_suffix" sections and emit
> rX = ld_imm64 that_section
> rX += switch_key
> rY = *(u64 *)rX
> jmpx rY

What are the types for rX and rY? I thought that we will need to do
smth like

  rX = .insn_addrs.uniq_suffix[switch_key] /* rX has type PTR_TO_INSN */
  ...
  jmpx rX

this can be done if for switch cases (or any other goto *label alike) we generate

  rX = map_lookup_elem(.insn_addrs.uniq_suffix, index)
  jmpx rX

> The verifier would need to do push_stack() for this indirect jmp insn
> as many times as there are elements in ".insn_addrs.uniq_suffix" array.
> 
> And similar for indirect calls.
> That section becomes an array of pointers to functions.
> We can make it more flexible for indirect callx by
> storing BTF func proto and allowing global subprogs with same proto
> to match as safe call targets.

  reply	other threads:[~2024-02-16 14:04 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-02 16:28 [PATCH v1 bpf-next 0/9] BPF static branches Anton Protopopov
2024-02-02 16:28 ` [PATCH v1 bpf-next 1/9] bpf: fix potential error return Anton Protopopov
2024-02-02 16:28 ` [PATCH v1 bpf-next 2/9] bpf: keep track of and expose xlated insn offsets Anton Protopopov
2024-02-02 16:28 ` [PATCH v1 bpf-next 3/9] bpf: expose how xlated insns map to jitted insns Anton Protopopov
2024-02-06  1:09   ` Alexei Starovoitov
2024-02-06 10:02     ` Anton Protopopov
2024-02-07  2:26       ` Alexei Starovoitov
2024-02-08 11:05         ` Anton Protopopov
2024-02-15  6:48           ` Alexei Starovoitov
2024-02-16 13:57             ` Anton Protopopov [this message]
2024-02-21  1:09               ` Alexei Starovoitov
2024-03-06 10:44                 ` Anton Protopopov
2024-03-14  1:56                   ` Alexei Starovoitov
2024-03-14  9:03                     ` Anton Protopopov
2024-03-14 17:07                       ` Alexei Starovoitov
2024-03-14 20:06                         ` Andrii Nakryiko
2024-03-14 21:41                           ` Alexei Starovoitov
2024-03-15 13:11                             ` Anton Protopopov
2024-03-15 16:32                             ` Andrii Nakryiko
2024-03-15 17:22                               ` Alexei Starovoitov
2024-03-15 17:29                                 ` Andrii Nakryiko
2024-03-28 16:37                                   ` Anton Protopopov
2024-03-29 22:44                                     ` Andrii Nakryiko
2024-04-01  9:47                                       ` Anton Protopopov
2024-02-02 16:28 ` [PATCH v1 bpf-next 4/9] selftests/bpf: Add tests for instructions mappings Anton Protopopov
2024-02-02 16:28 ` [PATCH v1 bpf-next 5/9] bpftool: dump new fields of bpf prog info Anton Protopopov
2024-02-02 16:28 ` [PATCH v1 bpf-next 6/9] bpf: add support for an extended JA instruction Anton Protopopov
2024-02-02 16:28 ` [PATCH v1 bpf-next 7/9] bpf: Add kernel/bpftool asm support for new instructions Anton Protopopov
2024-02-02 16:28 ` [PATCH v1 bpf-next 8/9] bpf: add BPF_STATIC_BRANCH_UPDATE syscall Anton Protopopov
2024-02-02 16:28 ` [PATCH v1 bpf-next 9/9] selftests/bpf: Add tests for new ja* instructions Anton Protopopov
2024-02-02 22:39 ` [PATCH v1 bpf-next 0/9] BPF static branches Andrii Nakryiko
2024-02-04 16:05   ` Anton Protopopov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zc9p4e1EccPkMnmY@zh-lab-node-5 \
    --to=aspsk@isovalent.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=quentin@isovalent.com \
    --cc=sdf@google.com \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.