From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Artem Savkov <asavkov@redhat.com>
Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>,
Kumar Kartikeya Dwivedi <memxor@gmail.com>,
Andrii Nakryiko <andrii@kernel.org>,
bpf@vger.kernel.org, Jiri Benc <jbenc@redhat.com>,
Lorenzo Bianconi <lorenzo@kernel.org>,
Alexei Starovoitov <ast@kernel.org>
Subject: Re: Calling kfuncs in modules - BTF mismatch?
Date: Tue, 29 Nov 2022 21:12:42 +0100 [thread overview]
Message-ID: <877czdzfid.fsf@toke.dk> (raw)
In-Reply-To: <Y4Yeom8vSZtBM3o2@wtfbox.lan>
Artem Savkov <asavkov@redhat.com> writes:
> On Sun, Nov 13, 2022 at 07:04:43PM +0100, Toke Høiland-Jørgensen wrote:
>> Lorenzo Bianconi <lorenzo.bianconi@redhat.com> writes:
>>
>> >>
>> >> Hi everyone
>> >>
>> >> There seems to be some issue with BTF mismatch when trying to run the
>> >> bpf_ct_set_nat_info() kfunc from a module. I was under the impression
>> >> that this is supposed to work, so is there some kind of BTF dedup issue
>> >> here or something?
>> >>
>> >> Steps to reproduce:
>> >>
>> >> 1. Compile kernel with nf_conntrack built-in and run selftests;
>> >> './test_progs -a bpf_nf' works
>> >>
>> >> 2. Change the kernel config so nf_conntrack is build as a module
>> >>
>> >> 3. Start the test kernel and manually modprobe nf_conntrack and nf_nat
>> >>
>> >> 4. Run ./test_progs -a bpf_nf; this now fails with an error like:
>> >>
>> >> kernel function bpf_ct_set_nat_info args#0 expected pointer to STRUCT nf_conn___init but R1 has a pointer to STRUCT nf_conn___init
>> >
>> > This week Kumar and I took a look at this issue and we ended up
>> > identifying a duplication of nf_conn___init structure. In particular:
>> >
>> > [~/workspace/bpf-next]$ bpftool btf --base-btf vmlinux dump file
>> > net/netfilter/nf_conntrack.ko format raw | grep nf_conn__
>> > [110941] STRUCT 'nf_conn___init' size=248 vlen=1
>> > [~/workspace/bpf-next]$ bpftool btf --base-btf vmlinux dump file
>> > net/netfilter/nf_nat.ko format raw | grep nf_conn__
>> > [107488] STRUCT 'nf_conn___init' size=248 vlen=1
>> >
>> > Is it the root cause of the problem?
>>
>> It certainly seems to be related to it, at least. Amending the log
>> message to include the BTF object IDs of the two versions shows that the
>> register has a reference to nf_conn__init in nf_conntrack.ko, while the kernel
>> expects it to point to nf_nat.ko.
>>
>> Not sure what's the right fix for this? Should libbpf be smart enough to
>> pull the kfunc arg ID from the same BTF ID as the function itself? Or
>
> Verifier chose the ID based on where the variable was allocated, which
> is bpf_(skb|xdp)_ct_alloc() from nf_conntrack.ko. Assuming btf id based
> on usage location might be error prone.
Yeah, I think we need something more robust.
>> should the kernel compare structs and allow things if they're identical?
>> Andrii, WDYT?
>
> This works but might make verifier slower.
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 195d24316750..562d2c15906d 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -24,6 +24,7 @@
> #include <linux/bpf_lsm.h>
> #include <linux/btf_ids.h>
> #include <linux/poison.h>
> +#include "../tools/lib/bpf/relo_core.h"
>
> #include "disasm.h"
>
> @@ -8236,7 +8237,8 @@ static int process_kf_arg_ptr_to_btf_id(struct bpf_verifier_env *env,
>
> reg_ref_t = btf_type_skip_modifiers(reg_btf, reg_ref_id, ®_ref_id);
> reg_ref_tname = btf_name_by_offset(reg_btf, reg_ref_t->name_off);
> - if (!btf_struct_ids_match(&env->log, reg_btf, reg_ref_id, reg->off, meta->btf, ref_id, strict_type_match)) {
> + if (!btf_struct_ids_match(&env->log, reg_btf, reg_ref_id, reg->off, meta->btf, ref_id, strict_type_match) && \
> + !__bpf_core_types_match(reg_btf, reg_ref_id, meta->btf, ref_id, false, 10)) {
> verbose(env, "kernel function %s args#%d expected pointer to %s %s but R%d has a pointer to %s %s\n",
> meta->func_name, argno, btf_type_str(ref_t), ref_tname, argno + 1,
> btf_type_str(reg_ref_t), reg_ref_tname);
Ah, cool! This is a lot of code to important into the kernel, though; do
we really want to do that?
I guess an alternative could be to make sure that every data type used
by a kfunc is always referenced in a built-in module somewhere, so
modules will use the definition from vmlinux.
Andrii, Alexei, any opinions on which way to go with this?
-Toke
next prev parent reply other threads:[~2022-11-29 20:13 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-11 14:21 Calling kfuncs in modules - BTF mismatch? Toke Høiland-Jørgensen
2022-11-11 18:08 ` Lorenzo Bianconi
2022-11-13 18:04 ` Toke Høiland-Jørgensen
2022-11-29 15:00 ` Artem Savkov
2022-11-29 20:12 ` Toke Høiland-Jørgensen [this message]
2022-12-01 1:09 ` Andrii Nakryiko
2022-12-01 13:51 ` Artem Savkov
2022-11-29 16:21 ` Alan Maguire
2022-11-29 19:41 ` Arnaldo Carvalho de Melo
2022-11-29 20:12 ` Toke Høiland-Jørgensen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=877czdzfid.fsf@toke.dk \
--to=toke@redhat.com \
--cc=andrii@kernel.org \
--cc=asavkov@redhat.com \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=jbenc@redhat.com \
--cc=lorenzo.bianconi@redhat.com \
--cc=lorenzo@kernel.org \
--cc=memxor@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox