BPF List
 help / color / mirror / Atom feed
From: Alan Maguire <alan.maguire@oracle.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: andrii@kernel.org, jolsa@kernel.org, acme@redhat.com,
	quentin@isovalent.com, eddyz87@gmail.com, mykolal@fb.com,
	ast@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev,
	song@kernel.org, yonghong.song@linux.dev,
	john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com,
	haoluo@google.com, houtao1@huawei.com, bpf@vger.kernel.org,
	masahiroy@kernel.org, mcgrof@kernel.org, nathan@kernel.org
Subject: Re: [RFC bpf-next 09/13] libbpf: split BTF reconciliation
Date: Fri, 5 Apr 2024 11:06:34 +0100	[thread overview]
Message-ID: <9be97b85-104b-4c33-a23a-ad68ba7f5838@oracle.com> (raw)
In-Reply-To: <CAEf4BzbQEYgiJ=Xc7qaf0Tm_geTd40Ld+623Gk0TaBjPtTsnKg@mail.gmail.com>

On 29/03/2024 22:01, Andrii Nakryiko wrote:
> On Fri, Mar 22, 2024 at 3:26 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>
>> Map base reference BTF type ids referenced in split BTF and their
>> references to the base BTF passed in, and if the mapping succeeds,
>> reparent the split BTF to the base BTF.
>>
>> Reconciliation rules are
>>
>> - base types must match exactly
>> - enum[64] types should match all value name/value pairs, but the
>>   to-be-reconciled enum[64] can also define additional name/value pairs
>> - named fwds match to the correspondingly-named struct/union/enum/enum64
> 
> yeah, but what about their size if they are embedded? Using FWD was a
> nice trick, but it's not flexible enough for recording (optionally)
> size... Probably emitting an empty (but named) struct/union/enum would
> be a bit better (and actually make split BTF using base ref more valid
> even without pre-processing).
> 
>> - anon struct/unions must have field names/offsets specified in base
>>   reference BTF matched by those in base BTF we are matching with
>>
>> Reconciliation can not recurse, since it will be used in-kernel also and
>> we do not want to blow up the kernel stack when carrying out type
>> compatibility checks.  Hence we use a stack for reference type
>> reconciliation rather then recursive function calls.
>>
>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>> ---
>>  tools/lib/bpf/Build             |   2 +-
>>  tools/lib/bpf/btf.c             |  58 ++++
>>  tools/lib/bpf/btf.h             |   8 +
>>  tools/lib/bpf/btf_reconcile.c   | 590 ++++++++++++++++++++++++++++++++
> 
> how wrong would it be to call this process "relocate" rather than "reconcile"?
>

seems fine to me.

>>  tools/lib/bpf/libbpf.map        |   1 +
>>  tools/lib/bpf/libbpf_internal.h |   2 +
>>  6 files changed, 660 insertions(+), 1 deletion(-)
>>  create mode 100644 tools/lib/bpf/btf_reconcile.c
>>
> 
> [...]
> 
>> +/* Find next type after *id in base BTF that matches kind of type t passed in
>> + * and name (if it is specified).  Match fwd kinds to appropriate kind also.
>> + */
>> +static int btf_reconcile_find_next(struct btf_reconcile *r, const struct btf_type *t,
>> +                                  __u32 *id, const struct btf_type **tp)
> 
> I haven't grokked the whole patch logic just yet, doing a first pass,
> so I might be asking stupid questions, sorry.
> 
> But it looks like we have these linear searches here to find matching
> types, is that right? Wouldn't it be better to build an index first to
> speed up search?
> 

It would, but the aim here was to keep things simple with an eye to
sharing the code with the kernel. A lot of the libbpf hash stuff would
be handy but then we'd have to have something on the kernel side. Given
that the size of the base BTF is so small, the linear searches aren't
much of a cost.

>> +{
>> +       const struct btf_type *nt;
>> +       int kind, tkind = btf_kind(t);
>> +       int tkflag = btf_kflag(t);
>> +       __u32 i;
>> +
> 
> [...]
> 
>> +static int btf_reconcile_int(struct btf_reconcile *r, const char *name,
>> +                            const struct btf_type *t, const struct btf_type *bt)
>> +{
>> +       __u32 *info = (__u32 *)(t + 1);
>> +       __u32 *binfo = (__u32 *)(bt + 1);
>> +
>> +       if (t->size != bt->size) {
>> +               pr_warn("INT types '%s' disagree on size; reference base BTF says %d; base BTF says %d\n",
>> +                       name, t->size, bt->size);
>> +               return -EINVAL;
>> +       }
>> +       if (BTF_INT_ENCODING(*info) != BTF_INT_ENCODING(*binfo)) {
>> +               pr_warn("INT types '%s' disagree on encoding; reference base BTF says '(%s/%s/%s); base BTF says '(%s/%s/%s)'\n",
>> +                       name,
>> +                       BTF_INT_ENCODING(*info) & BTF_INT_SIGNED ? "signed" : "unsigned",
>> +                       BTF_INT_ENCODING(*info) & BTF_INT_CHAR ? "char" : "nonchar",
>> +                       BTF_INT_ENCODING(*info) & BTF_INT_BOOL ? "bool" : "nonbool",
>> +                       BTF_INT_ENCODING(*binfo) & BTF_INT_SIGNED ? "signed" : "unsigned",
>> +                       BTF_INT_ENCODING(*binfo) & BTF_INT_CHAR ? "char" : "nonchar",
>> +                       BTF_INT_ENCODING(*binfo) & BTF_INT_BOOL ? "bool" : "nonbool");
> 
> there is btf_int_encoding() helper, please use it
>
will do, and for others below too.

>> +               return -EINVAL;
>> +       }
>> +       if (BTF_INT_BITS(*info) != BTF_INT_BITS(*binfo)) {
> 
> btf_int_bits()
> 
>> +               pr_warn("INT types '%s' disagree on bit size; reference base BTF says %d; base BTF says %d\n",
>> +                       name, BTF_INT_BITS(*info), BTF_INT_BITS(*binfo));
>> +               return -EINVAL;
>> +       }
>> +       return 0;
>> +}
>> +
>> +static int btf_reconcile_float(struct btf_reconcile *r, const char *name,
>> +                              const struct btf_type *t, const struct btf_type *bt)
>> +{
>> +
>> +       if (t->size != bt->size) {
>> +               pr_warn("float types '%s' disagree on size; reference base BTF says %d; base BTF says %d\n",
>> +                       name, t->size, bt->size);
>> +               return -EINVAL;
>> +       }
>> +       return 0;
>> +}
>> +
>> +/* Ensure each enum value in type t has equivalent in base BTF and that values (if any) match. */
>> +static int btf_reconcile_enum(struct btf_reconcile *r, const char *name,
>> +                             const struct btf_type *t, const struct btf_type *bt)
>> +{
> 
> should we care about compatibility between ENUM and ENUM64, they can
> both represent the same values of the same size?
> 

so do you mean if one representation uses an enum, another an enum64?
Yep that might well be the case, I'll add that too.

>> +       struct btf_enum *v = (struct btf_enum *)(t + 1);
>> +       struct btf_enum *bv = (struct btf_enum *)(bt + 1);
>> +       bool found, match;
>> +       int i, j;
>> +
>> +       for (i = 0; i < btf_vlen(t); i++, v++) {
>> +               found = match = false;
>> +
>> +               if (!v->name_off)
>> +                       continue;
>> +               for (j = 0; j < btf_vlen(bt); j++, bv++) {
>> +                       if (!bv->name_off)
>> +                               continue;
>> +                       if (strcmp(btf__name_by_offset(r->base_ref_btf, v->name_off),
>> +                                  btf__name_by_offset(r->base_btf, bv->name_off)) != 0)
> 
> nit: please use local vars to shorten these long if conditions, it
> will be easier to read and follow
>

will do.


>> +                               continue;
>> +                       found = true;
>> +                       match = (v->val == bv->val);
>> +                       break;
>> +               }
> 
> [...]
> 
>> +       while ((err = btf_reconcile_find_next(r, t, &base_id, &bt)) != -ENOENT) {
>> +               bt = btf_type_by_id(r->base_btf, base_id);
>> +               switch (btf_kind(t)) {
>> +               case BTF_KIND_INT:
>> +                       err = btf_reconcile_int(r, name, t, bt);
>> +                       break;
>> +               case BTF_KIND_ENUM:
>> +                       err = btf_reconcile_enum(r, name, t, bt);
>> +                       break;
>> +               case BTF_KIND_FLOAT:
>> +                       err = btf_reconcile_float(r, name, t, bt);
>> +                       break;
>> +               case BTF_KIND_ENUM64:
>> +                       err = btf_reconcile_enum64(r, name, t, bt);
>> +                       break;
>> +               case BTF_KIND_FWD:
> 
> well, FWD can be for enum vs struct, gotta check that
> 
>> +                       err = 0;
>> +                       break;
>> +               default:
>> +                       return 0;
>> +               }
>> +               if (!err) {
>> +                       r->map[id] = base_id;
>> +                       return 0;
>> +               }
>> +       }
>> +       return err;
>> +}
> 
> [...]
> 
>> +                               if (err) {
>> +                                       pr_warn("could not find base BTF type for base reference type[%u]\n",
>> +                                               id);
>> +                                       return err;
>> +                               }
>> +                       } else {
>> +                               if (btf_reconcile_push(r, id) < 0 ||
>> +                                   btf_reconcile_push(r, t->type) < 0)
>> +                                       return -ENOSPC;
> 
> I'm missing something, please help me understand. I don't get why we
> need a recursive algorithm at all.
> 
> In my mind, we have this small "base ref" set of types referenced from
> module's BTF (split BTF part), right? So all we should need is to map
> every type from base ref set to vmlinux BTF.
> 
> What I don't yet fully get is why CONST/VOLATILE or PTR need to
> postpone reconciliation via a queue. By the time we get to types in
> split BTF all base ref types should be mapped, so all you need is to
> remap t->type to resolved vmlinux BTF, no?
> 

It's possible to have multiple layers of reference in the distilled base
BTF though; for example here's a case from a module's distilled base BTF:

[41] PTR '(anon)' type_id=42
[42] CONST '(anon)' type_id=0

To resolve type id 41 we need to resolve type id 42, and since type id 0
already has a mapping, at that point we can look for CONSTs that refer
to type id 0 and then once we've established that mapping we can find
PTRs that have a t->type that refers to the const. So as described below
we keep pushing type ids onto the stack until we find one with a t->type
mapping to base BTF; once we hit that we can start looking in base BTF
for types that have the mapped t->type value.

> I suspect the answer might have something to do with those anonymous
> structs/unions which you copy verbatim into base ref BTF?
>

They do add a few more types, but we can get base BTF references that
don't come from that source too. The above case PTR CONST void for
example wasn't referred to via any other distilled base types.

> But on the latter topic, I wonder if we at all need this? Why not keep
> all those anon struct/union/enum in module's part of BTF? If they are
> unnamed, I doubt they will ever be referenced from kfuncs or anything
> like that, so their BTF ID isn't that important.
> 

Changing the module BTF would make things a bit more complicated.
Currently we just update type id references and string offsets. The anon
structs we end up with tend to be very small; from the same distilled
base BTF used above here are the instances of struct/union:

[35] STRUCT '(anon)' size=4 vlen=1
        'counter' type_id=6 bits_offset=0
[62] STRUCT '(anon)' size=8 vlen=1
        'raw_lock' type_id=48 bits_offset=0
[113] UNION '(anon)' size=8 vlen=2
        'kernel' type_id=13 bits_offset=0
        'user' type_id=13 bits_offset=0
[114] STRUCT '(anon)' size=16 vlen=2
        '(anon)' type_id=113 bits_offset=0
        'is_kernel' type_id=11 bits_offset=64 bitfield_size=1
[119] STRUCT '(anon)' size=8 vlen=1
        'net' type_id=85 bits_offset=0

These only added one type that wasn't referenced elsewhere - typedef
arch_rwlock_t.

> If base BTF is all named types, it would simplify the reconciliation
> process significantly, I think.
> 
> But again, I only skimmed the overall algorithm, sorry for my
> laziness, but I figured it would be good to discuss the above first
> anyways.
>

I'll try and walk through an example of how the algorithm proceeds; that
might help make the approach concrete and we can see if it can be
simplified.

Consider split BTF that uses the base BTF type "int *", i.e. a PTR to an
"int". It could do so in an anonymous struct/union, but also as an array
member type or a FUNC_PROTO or VAR. For such a reference type, we first
encounter the outer PTR. If its t->type has a mapping to base BTF "int"
(it will), we look through the base BTF for reference types that match
the kind (here a PTR) _and_ have the required t->type (int). Once we
find the reference type in base BTF, we can add the mapping for PTR to
int from distilled->base BTF id. So the reference type with one layer of
reference is pretty straightforward.

In the case where we don't have a mapping for a t->type - let's say PTR
to PTR to int (int **) - we push the type id for PTR to PTR to int onto
the stack along with the type id for t->type (PTR to int) after. So we
will first pop PTR to int and go through the above-described type
resolution. Next we will pop the PTR to PTR to int and because we've
resolved PTR to int now, it's t->type will have a mapping and we can go
through the search process to find a PTR that refers to a PTR to int
in base BTF, and we then add that mapping too.

For an array we push the member and index types, a func proto the return
type and parameter types, etc.

So once multiple layers of reference are part of the picture I _think_
we need something like this approach.

Alan

>> +                       }
>> +                       break;
> 
> [...]

  reply	other threads:[~2024-04-05 10:07 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-22 10:24 [RFC bpf-next 00/13] bpf: support resilient split BTF Alan Maguire
2024-03-22 10:24 ` [RFC dwarves] btf_encoder: add base_ref BTF feature to generate split BTF with base refs Alan Maguire
2024-03-22 10:24 ` [RFC bpf-next 01/13] libbpf: add support to btf__add_fwd() for ENUM64 Alan Maguire
2024-03-29 21:59   ` Andrii Nakryiko
2024-03-22 10:24 ` [RFC bpf-next 02/13] libbpf: add btf__new_split_base_ref() creating split BTF with reference base BTF Alan Maguire
2024-03-29 22:00   ` Andrii Nakryiko
2024-04-04 15:21     ` Alan Maguire
2024-04-04 22:49       ` Andrii Nakryiko
2024-03-22 10:24 ` [RFC bpf-next 03/13] selftests/bpf: test split base reference BTF generation Alan Maguire
2024-03-22 10:24 ` [RFC bpf-next 04/13] libbpf: add btf__parse_opts() API for flexible BTF parsing Alan Maguire
2024-03-29 22:00   ` Andrii Nakryiko
2024-03-22 10:24 ` [RFC bpf-next 05/13] bpftool: support displaying raw split BTF using base reference BTF as base Alan Maguire
2024-03-22 10:24 ` [RFC bpf-next 06/13] kbuild,bpf: switch to using --btf_features for pahole v1.26 and later Alan Maguire
2024-03-29 22:01   ` Andrii Nakryiko
2024-03-22 10:24 ` [RFC bpf-next 07/13] resolve_btfids: use .BTF.base_ref BTF as base BTF if -r option is used Alan Maguire
2024-03-22 10:24 ` [RFC bpf-next 08/13] kbuild, bpf: add module-specific pahole/resolve_btfids flags for base reference BTF Alan Maguire
2024-03-23  2:50   ` Alexei Starovoitov
2024-03-25  9:51     ` Alan Maguire
2024-03-25 16:41       ` Alexei Starovoitov
2024-03-22 10:24 ` [RFC bpf-next 09/13] libbpf: split BTF reconciliation Alan Maguire
2024-03-29 22:01   ` Andrii Nakryiko
2024-04-05 10:06     ` Alan Maguire [this message]
2024-04-05 19:58       ` Andrii Nakryiko
2024-03-22 10:24 ` [RFC bpf-next 10/13] module, bpf: store BTF base reference pointer in struct module Alan Maguire
2024-03-22 10:24 ` [RFC bpf-next 11/13] libbpf,bpf: share BTF reconcile-related code with kernel Alan Maguire
2024-03-29 22:04   ` Andrii Nakryiko
2024-04-01 15:58     ` Andrii Nakryiko
2024-03-22 10:24 ` [RFC bpf-next 12/13] selftests/bpf: extend base reference tests cover BTF reconciliation Alan Maguire
2024-03-22 10:24 ` [RFC bpf-next 13/13] bpftool: support displaying reconciled-with-base split BTF Alan Maguire

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9be97b85-104b-4c33-a23a-ad68ba7f5838@oracle.com \
    --to=alan.maguire@oracle.com \
    --cc=acme@redhat.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=haoluo@google.com \
    --cc=houtao1@huawei.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=masahiroy@kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=mykolal@fb.com \
    --cc=nathan@kernel.org \
    --cc=quentin@isovalent.com \
    --cc=sdf@google.com \
    --cc=song@kernel.org \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox