From: Yonghong Song <yhs@meta.com>
To: John Fastabend <john.fastabend@gmail.com>,
Yonghong Song <yhs@fb.com>,
bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
kernel-team@fb.com, Kumar Kartikeya Dwivedi <memxor@gmail.com>,
Martin KaFai Lau <martin.lau@kernel.org>
Subject: Re: [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs
Date: Mon, 21 Nov 2022 09:29:07 -0800
Message-ID: <2c4f8cac-6935-2c72-cc1b-34a34708e127@meta.com>
In-Reply-To: <637ade2851bc6_99c62086@john.notmuch>

On 11/20/22 6:10 PM, John Fastabend wrote:
> Yonghong Song wrote:
>> Currently, a non-tracing bpf program typically has a single 'context' argument
>> with a predefined uapi struct type. Following this uapi struct, the user is able
>> to access other fields defined in the uapi header. Inside the kernel, the
>> user-seen 'context' argument is replaced with 'kernel context' (or 'kctx'
>> in short) which can access more information than what uapi header provides.
>> To access other info not in uapi header, people typically do two things:
>> (1). extend uapi to access more fields rooted from 'context'.
>> (2). use bpf_probe_read_kernel() helper to read particular field based on
>> kctx.
>> Using (1) needs uapi change and using (2) makes code more complex since
>> direct memory access is not allowed.
>>
>> There are already a few instances trying to access more information from
>> kctx:
>> . trying to access some fields from perf_event kctx ([1]).
>> . trying to access some fields from xdp kctx ([2]).
>>
>> This patch set tried to allow direct memory access for kctx fields
>> by introducing bpf_cast_to_kern_ctx() kfunc.
>>
>> Martin mentioned a use case like type casting below:
>> #define skb_shinfo(SKB) ((struct skb_shared_info *)(skb_end_pointer(SKB)))
>> basically an 'unsigned char *' cast to 'struct skb_shared_info *'. This patch
>> set tries to support such a use case as well with bpf_rdonly_cast().
>>
>> For the patch series, Patch 1 added support for a kfunc available to all
>> prog types. Patch 2 added bpf_cast_to_kern_ctx() kfunc. Patch 3 added
>> bpf_rdonly_cast() kfunc. Patch 4 added a few positive and negative tests.
>>
>> [1] https://lore.kernel.org/bpf/ad15b398-9069-4a0e-48cb-4bb651ec3088@meta.com/
>> [2] https://lore.kernel.org/bpf/20221109215242.1279993-1-john.fastabend@gmail.com/
>>
>> Changelog:
>> v3 -> v4:
>> - remove unnecessary bpf_ctx_convert.t error checking
>> - add and use meta.ret_btf_id instead of meta.arg_constant.value for
>> bpf_cast_to_kern_ctx().
>> - add PTR_TRUSTED to the return PTR_TO_BTF_ID type for bpf_cast_to_kern_ctx().
>> v2 -> v3:
>> - rebase on top of bpf-next (for merging conflicts)
>> - add the selftest to s390x deny list
>> rfcv1 -> v2:
>> - break original one kfunc into two.
>> - add missing error checks and error logs.
>> - adapt to the new conventions in
>> https://lore.kernel.org/all/20221118015614.2013203-1-memxor@gmail.com/
>> for example, with __ign and __k suffix.
>> - added support in fixup_kfunc_call() to replace kfunc calls with a single mov.
>>
>> Yonghong Song (4):
>> bpf: Add support for kfunc set with common btf_ids
>> bpf: Add a kfunc to type cast from bpf uapi ctx to kernel ctx
>> bpf: Add a kfunc for generic type cast
>> bpf: Add type cast unit tests
>
> Thanks Yonghong! Ack for the series for me, but looks like Alexei is
> quick.
>
> From my side this allows us to pull in the dev info and from that get
> netns, so it fixes a gap we had to split into a kprobe + xdp.
>
> If we can get a pointer to the recv queue then with a few reads we
> get the hash, vlan, etc. (see timestamp thread)
Thanks, John. Glad to see it is useful.
>
> And then last bit is if we can get a ptr to the net ns list, plus
Unfortunately, vmlinux btf currently does not have non-percpu global
variables, so net_namespace_list is not available to bpf programs.
But I think we could do the following as a workaround, with a little
bit of initial user space involvement.
In the bpf program, we could have a global variable
  __u64 net_namespace_list;
and user space can look up net_namespace_list in /proc/kallsyms
and assign its address to the bpf program's 'net_namespace_list' before prog load.
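
The user space half of that workaround could look roughly like the
sketch below. It only assumes the usual "<hex address> <type> <name>"
line layout of /proc/kallsyms; the function name kallsyms_lookup() is
made up for illustration and is not an existing API. Note that
unprivileged readers may see all-zero addresses, depending on the
kptr_restrict setting.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan kallsyms-formatted text for an exact symbol
 * name. Returns 0 and stores the address in *addr on success, -1 if the
 * symbol is not found.
 */
static int kallsyms_lookup(FILE *f, const char *name, uint64_t *addr)
{
	char line[512], sym[256], type;
	uint64_t a;

	assert(f && name && addr);

	while (fgets(line, sizeof(line), f)) {
		/* Each line: "<hex address> <type> <name> [module]" */
		if (sscanf(line, "%" SCNx64 " %c %255s", &a, &type, sym) != 3)
			continue;
		if (strcmp(sym, name) == 0) {
			*addr = a;
			return 0;
		}
	}
	return -1;
}
```

A loader would call this on fopen("/proc/kallsyms", "r") and store the
result into the program's global before load, for example through the
data section of a libbpf skeleton.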
After that, you could implement an in-bpf-prog iterator with a bounded
loop to ensure it eventually terminates. You can use
  struct list_head *lh = bpf_rdonly_cast(net_namespace_list,
                                         struct_list_head_btf_id);
to cast to a struct list_head pointer. From there you can trace down
the list, using bpf_rdonly_cast() as needed to cast to the element type.
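
To make the bounded-loop idea concrete, here is a plain user space
model of the walk, not an actual bpf program. The struct fake_net
element type, the MAX_WALK bound and walk_bounded() are all invented
for this sketch; in a real bpf program the element would be struct net
and each pointer step would go through bpf_rdonly_cast() as described
above.

```c
#include <assert.h>
#include <stddef.h>

/* User space stand-in for the kernel's circular doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical element type standing in for struct net. */
struct fake_net {
	unsigned int ns_id;
	struct list_head list;
};

#define MAX_WALK 64	/* assumed upper bound on visited entries */

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Append node n just before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/*
 * Visit at most MAX_WALK entries and stop early once the walk wraps
 * around to head. Sums ns_id into *sum and returns the number of
 * entries visited.
 */
static int walk_bounded(struct list_head *head, unsigned long *sum)
{
	struct list_head *pos = head->next;
	int i;

	*sum = 0;
	for (i = 0; i < MAX_WALK; i++) {
		if (pos == head)
			break;
		*sum += container_of(pos, struct fake_net, list)->ns_id;
		pos = pos->next;
	}
	return i;
}
```

The constant trip count is what lets the loop end even if the list is
longer than expected or changes while it is being walked; the price is
that entries beyond MAX_WALK are silently skipped.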
> the rcu patch we can build the net ns iterator directly in BPF
I just posted the rcu patch:
https://lore.kernel.org/bpf/20221121170515.1193967-1-yhs@fb.com/
Please take a look and see whether it serves your needs.
> which seems stronger than an iterator IMO because we can kick it
> off on events anywhere in the kernel. Or, based on an event, kick off
> some specific iterator (e.g. walk net_devs in netns X with SR-IOV
> interfaces). Ideally we would also wire it up to timers so we
> can call it every N seconds without any user space intervention.
> Eventually, it's nice if the user space can crash, restart, and
> so on without impacting the logic in kernel.
>
> Thanks again.