BPF List
From: Eduard Zingerman <eddyz87@gmail.com>
To: Alan Maguire <alan.maguire@oracle.com>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf@vger.kernel.org, ast@kernel.org, andrii@kernel.org,
	daniel@iogearbox.net, kernel-team@fb.com, yhs@fb.com,
	arnaldo.melo@gmail.com
Subject: Re: [RFC bpf-next 01/12] libbpf: Deduplicate unambigous standalone forward declarations
Date: Tue, 01 Nov 2022 19:37:56 +0200
Message-ID: <31c8edcbdda4b4d7ed05e0b25180c8ebf0d94f05.camel@gmail.com>
In-Reply-To: <c70549d8-f9c7-636b-7e4c-2b3e918978ec@oracle.com>

On Tue, 2022-11-01 at 17:08 +0000, Alan Maguire wrote:
> On 31/10/2022 15:49, Eduard Zingerman wrote:
> > On Thu, 2022-10-27 at 15:07 -0700, Andrii Nakryiko wrote:
> > > On Tue, Oct 25, 2022 at 3:28 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > [...] 
> > > > +
> > > > +/*
> > > > + * Collect a `name_off_map` that maps type names to type ids for all
> > > > + * canonical structs and unions. If the same name is shared by several
> > > > + * canonical types use a special value 0 to indicate this fact.
> > > > + */
> > > > +static int btf_dedup_fill_unique_names_map(struct btf_dedup *d, struct hashmap *names_map)
> > > > +{
> > > > +       int i, err = 0;
> > > > +       __u32 type_id, collision_id;
> > > > +       __u16 kind;
> > > > +       struct btf_type *t;
> > > > +
> > > > +       for (i = 0; i < d->btf->nr_types; i++) {
> > > > +               type_id = d->btf->start_id + i;
> > > > +               t = btf_type_by_id(d->btf, type_id);
> > > > +               kind = btf_kind(t);
> > > > +
> > > > +               if (kind != BTF_KIND_STRUCT && kind != BTF_KIND_UNION)
> > > > +                       continue;
> > > 
> > > let's also do ENUM FWD resolution. ENUM FWD is just ENUM with vlen=0
> > 
> > Interestingly this is necessary only for mixed enum / enum64 case.
> > Forward enum declarations are resolved by bpf/btf.c:btf_dedup_prim_type:
> > 
> 
> Ah, great catch! A forward can look like an enum to one CU but another CU can
> specify values that make it an enum64.
> 
> > 	case BTF_KIND_ENUM:
> > 		h = btf_hash_enum(t);
> > 		for_each_dedup_cand(d, hash_entry, h) {
> > 			cand_id = (__u32)(long)hash_entry->value;
> > 			cand = btf_type_by_id(d->btf, cand_id);
> > 			if (btf_equal_enum(t, cand)) {
> > 				new_id = cand_id;
> > 				break;
> > 			}
> > 			if (btf_compat_enum(t, cand)) {
> > 				if (btf_is_enum_fwd(t)) {
> > 					/* resolve fwd to full enum */
> > 					new_id = cand_id;
> > 					break;
> > 				}
> > 				/* resolve canonical enum fwd to full enum */
> > 				d->map[cand_id] = type_id;
> > 			}
> > 		}
> > 		break;
> >     // ... similar logic for ENUM64 ...
> > 
> > - btf_hash_enum ignores vlen when hashing;
> > - btf_compat_enum compares only names and sizes.
> > 
> > So, if forward and main declaration kinds match (either BTF_KIND_ENUM
> > or BTF_KIND_ENUM64) the forward declaration would be removed. But if
> > the kinds are different the forward declaration would remain. E.g.:
> > 
> > CU #1:
> > enum foo;
> > enum foo *a;
> > 
> > CU #2:
> > enum foo { x = 0xfffffffff };
> > enum foo *b;
> > 
> > BTF:
> > [1] ENUM64 'foo' encoding=UNSIGNED size=8 vlen=1
> > 	'x' val=68719476735ULL
> > [2] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64 encoding=(none)
> > [3] PTR '(anon)' type_id=1
> > [4] ENUM 'foo' encoding=UNSIGNED size=4 vlen=0
> > [5] PTR '(anon)' type_id=4
> > 
> > BTF_KIND_FWDs are unified during btf_dedup_struct_types but enum
> > forward declarations are not. So it would be incorrect to add enum
> > forward declaration unification logic to btf_dedup_resolve_fwds,
> > because the following case would not be covered:
> > 
> > CU #1:
> > enum foo;
> > struct s { enum foo *a; } *a;
> > 
> > CU #2:
> > enum foo { x = 0xfffffffff };
> > struct s { enum foo *a; } *b;
> > 
> > Currently STRUCTs 's' are not de-duplicated.
> > 
> 
> What if CU#1 is in base BTF and CU#2 in split module BTF? I think we'd explicitly
> want to avoid deduping "struct s" then since we can't be sure that it is the
> same enum they are pointing at.  That's the logic we employ for structs at 
> least, based upon the rationale that we can't feed back knowledge of types
> from module to kernel BTF since the latter is now fixed (Andrii, do correct me
> if I have this wrong). In such a case the enum is no longer standalone; it
> serves the purpose of allowing us to define a pointer to a module-specific
> type. We recently found some examples of this sort of thing with structs,
> where the struct was defined in module BTF, making dedup fail for some core
> kernel data types, but the problem was restricted to modules which _did_
> define the type so wasn't a major driver of dedup failures. Not sure if
> there's many (any?) enum cases of this in practice.

Hi Alan,

As far as I understand, the loop in `btf_dedup_prim_types` guarantees
that only IDs from the split module would be remapped:

	struct btf {
		...
		/* BTF type ID of the first type in this BTF instance:
		 *   - for base BTF it's equal to 1;
		 *   - for split BTF it's equal to biggest type ID of base BTF plus 1.
		 */
		int start_id;
		...
	};

	...
	for (i = 0; i < d->btf->nr_types; i++) {
		err = btf_dedup_prim_type(d, d->btf->start_id + i);
		if (err)
			return err;
	}

Thus CU1:foo won't be remapped to CU2:foo, and CU1:s will not be
treated as the same type as CU2:s. Is that right, or am I confused?
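
To make the ID range concrete, here is a minimal sketch of which type IDs
that loop passes to btf_dedup_prim_type. It is a toy model, not libbpf
code: `struct toy_btf`, `toy_id_is_visited` and `toy_visited_ids` are
hypothetical names, and the sketch only models the iteration range. It
says nothing about which candidate IDs btf_dedup_prim_type may touch
internally once it is called.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the two struct btf fields used by the loop (hypothetical). */
struct toy_btf {
	int start_id;	/* first type ID owned by this BTF instance */
	int nr_types;	/* number of types owned by this BTF instance */
};

/* True if the loop would pass type_id to the per-type dedup step. */
static bool toy_id_is_visited(const struct toy_btf *btf, int type_id)
{
	return type_id >= btf->start_id &&
	       type_id < btf->start_id + btf->nr_types;
}

/*
 * Mirror the iteration "start_id + i for i in [0, nr_types)":
 * store the visited IDs in out (at most cap of them), return the count.
 */
static int toy_visited_ids(const struct toy_btf *btf, int *out, int cap)
{
	int i, n = 0;

	/* Type ID 0 is reserved for void, so real IDs start at 1. */
	assert(btf->start_id >= 1);

	for (i = 0; i < btf->nr_types && n < cap; i++)
		out[n++] = btf->start_id + i;
	return n;
}
```

With made-up sizes, a base of { .start_id = 1, .nr_types = 4 } and a
split of { .start_id = 5, .nr_types = 3 }, toy_visited_ids() on the split
instance yields 5, 6, 7, and toy_id_is_visited(&split, 1) is false, so no
base ID is handed to the per-type step.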

Thanks,
Eduard

> 
> I suppose if we could guarantee the dedup happened within the same object
> (kernel or module) we could relax this constraint though?
> 
> Alan

