From: Yonghong Song <yhs@meta.com>
To: John Fastabend <john.fastabend@gmail.com>,
	Yonghong Song <yhs@fb.com>,
	bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	kernel-team@fb.com, Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	Martin KaFai Lau <martin.lau@kernel.org>
Subject: Re: [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs
Date: Mon, 21 Nov 2022 09:29:07 -0800
Message-ID: <2c4f8cac-6935-2c72-cc1b-34a34708e127@meta.com>
In-Reply-To: <637ade2851bc6_99c62086@john.notmuch>



On 11/20/22 6:10 PM, John Fastabend wrote:
> Yonghong Song wrote:
>> Currently, a non-tracing bpf program typically has a single 'context' argument
>> with predefined uapi struct type. Following these uapi struct, user is able
>> to access other fields defined in uapi header. Inside the kernel, the
>> user-seen 'context' argument is replaced with 'kernel context' (or 'kctx'
>> in short) which can access more information than what uapi header provides.
>> To access other info not in uapi header, people typically do two things:
>>    (1). extend uapi to access more fields rooted from 'context'.
>>    (2). use bpf_probe_read_kernel() helper to read particular field based on
>>      kctx.
>> Using (1) needs uapi change and using (2) makes code more complex since
>> direct memory access is not allowed.
>>
>> There are already a few instances trying to access more information from
>> kctx:
>>    . trying to access some fields from perf_event kctx ([1]).
>>    . trying to access some fields from xdp kctx ([2]).
>>
>> This patch set tried to allow direct memory access for kctx fields
>> by introducing bpf_cast_to_kern_ctx() kfunc.
>>
>> Martin mentioned a use case like type casting below:
>>    #define skb_shinfo(SKB) ((struct skb_shared_info *)(skb_end_pointer(SKB)))
>> basically a 'unsigned char *' casted to 'struct skb_shared_info *'. This patch
>> set tries to support such a use case as well with bpf_rdonly_cast().
>>
>> For the patch series, Patch 1 added support for a kfunc available to all
>> prog types. Patch 2 added bpf_cast_to_kern_ctx() kfunc. Patch 3 added
>> bpf_rdonly_cast() kfunc. Patch 4 added a few positive and negative tests.
>>
>>    [1] https://lore.kernel.org/bpf/ad15b398-9069-4a0e-48cb-4bb651ec3088@meta.com/
>>    [2] https://lore.kernel.org/bpf/20221109215242.1279993-1-john.fastabend@gmail.com/
>>
>> Changelog:
>>    v3 -> v4:
>>      - remove unnecessary bpf_ctx_convert.t error checking
>>      - add and use meta.ret_btf_id instead of meta.arg_constant.value for
>>        bpf_cast_to_kern_ctx().
>>      - add PTR_TRUSTED to the return PTR_TO_BTF_ID type for bpf_cast_to_kern_ctx().
>>    v2 -> v3:
>>      - rebase on top of bpf-next (for merging conflicts)
>>      - add the selftest to s390x deny list
>>    rfcv1 -> v2:
>>      - break original one kfunc into two.
>>      - add missing error checks and error logs.
>>      - adapt to the new conventions in
>>        https://lore.kernel.org/all/20221118015614.2013203-1-memxor@gmail.com/
>>        for example, with __ign and __k suffix.
>>      - added support in fixup_kfunc_call() to replace kfunc calls with a single mov.
>>
>> Yonghong Song (4):
>>    bpf: Add support for kfunc set with common btf_ids
>>    bpf: Add a kfunc to type cast from bpf uapi ctx to kernel ctx
>>    bpf: Add a kfunc for generic type cast
>>    bpf: Add type cast unit tests
> 
> Thanks Yonghong! Ack for the series for me, but looks like Alexei is
> quick.
> 
> From my side this allows us to pull in the dev info and from that get
> netns so fixes a gap we had to split into a kprobe + xdp.
> 
> If we can get a pointer to the recv queue then with a few reads we
> get the hash, vlan, etc. (see timestamp thread)

Thanks, John. Glad to see it is useful.

> 
> And then last bit is if we can get a ptr to the net ns list, plus

Unfortunately, vmlinux BTF currently does not include non-percpu global
variables, so net_namespace_list is not available to bpf programs.
But I think we could do the following as a workaround, with a little
initial involvement from user space.

In the bpf program, we could have a global variable
   __u64 net_namespace_list;
and user space can look up net_namespace_list in /proc/kallsyms
and assign its address to the bpf program's 'net_namespace_list'
before prog load.
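
The user-space half of this could look roughly like the sketch below. It
is only an illustration: kallsyms_find() and kallsyms_lookup() are
hypothetical helper names, not existing library functions, and the code
assumes the usual "<hex address> <type> <name> [module]" line format of
/proc/kallsyms.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan a kallsyms-style stream ("<hex addr> <type> <name> [module]")
 * for `sym` and store its address in *addr.
 * Returns 0 on success, -1 if the symbol is not found.
 */
static int kallsyms_find(FILE *f, const char *sym, uint64_t *addr)
{
	char line[512], name[256], type;
	uint64_t a;

	while (fgets(line, sizeof(line), f)) {
		/* Skip lines that do not have the three expected fields. */
		if (sscanf(line, "%" SCNx64 " %c %255s", &a, &type, name) != 3)
			continue;
		if (strcmp(name, sym) == 0) {
			*addr = a;
			return 0;
		}
	}
	return -1;
}

/*
 * Convenience wrapper that reads the real /proc/kallsyms.
 * Note: readers without sufficient privilege see all addresses as 0
 * (kptr_restrict), so the caller should treat a zero address as failure.
 */
static int kallsyms_lookup(const char *sym, uint64_t *addr)
{
	FILE *f = fopen("/proc/kallsyms", "r");
	int err;

	if (!f)
		return -1;
	err = kallsyms_find(f, sym, addr);
	fclose(f);
	return err;
}
```

With a libbpf skeleton, the returned address would then be stored into
the program's global between open and load, for example
skel->bss->net_namespace_list = addr; (the skeleton name and layout
depend on your build).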

After that, you could implement an in-bpf-prog iterator with a bounded
loop to ensure it eventually terminates. You can use
   struct list_head *lh = bpf_rdonly_cast(net_namespace_list, struct_list_head_btf_id)
to cast it to a struct list_head pointer. From there you can trace down
the list, using bpf_rdonly_cast() as needed to cast to the element type.
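
To make the shape of that bounded walk concrete, here is a plain
user-space model of it. This is a sketch, not BPF code: struct net_model
is a hypothetical stand-in for struct net, MAX_NETNS is an assumed bound,
and in a real bpf program each pointer obtained from ->next would go
through bpf_rdonly_cast() before being dereferenced.

```c
#include <stddef.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical stand-in for struct net: an id plus the list linkage. */
struct net_model {
	unsigned int inum;
	struct list_head list;
};

/* Assumed upper bound on the number of elements we are willing to visit. */
#define MAX_NETNS 64

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Walk the circular list starting after `head`, visiting at most
 * MAX_NETNS elements and writing at most `out_len` ids to `out`.
 * The fixed iteration bound guarantees termination even if the list
 * never leads back to `head`. Returns the number of ids written.
 */
static int walk_netns(struct list_head *head, unsigned int *out, int out_len)
{
	struct list_head *pos = head->next;
	int n = 0;

	for (int i = 0; i < MAX_NETNS && pos != head && n < out_len; i++) {
		/* Recover the enclosing element from its embedded list node. */
		struct net_model *net = container_of(pos, struct net_model, list);

		out[n++] = net->inum;
		pos = pos->next;
	}
	return n;
}
```

The loop counter, rather than the list contents, is what bounds the
walk, which is the property the verifier needs in order to accept a loop.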

> the rcu patch we can build the net ns iterator directly in BPF

I just posted the rcu patch:
https://lore.kernel.org/bpf/20221121170515.1193967-1-yhs@fb.com/
Please take a look and see whether it serves your needs.

> which seems stronger than an iterator IMO because we can kick it
> off on events anywhere in the kernel. Or based on event kick off
> some specific iterator (e.g. walk net_devs in netns X with SR-IOV
> interfaces). Ideally we would also wire it up to timers so we
> can call it every N seconds without any user space intervention.
> Eventually, it's nice if the user space can crash, restart, and
> so on without impacting the logic in kernel.
> 
> Thanks again.

Thread overview: 18+ messages
2022-11-20 19:54 [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs Yonghong Song
2022-11-20 19:54 ` [PATCH bpf-next v4 1/4] bpf: Add support for kfunc set with common btf_ids Yonghong Song
2022-11-20 19:54 ` [PATCH bpf-next v4 2/4] bpf: Add a kfunc to type cast from bpf uapi ctx to kernel ctx Yonghong Song
2022-11-20 19:54 ` [PATCH bpf-next v4 3/4] bpf: Add a kfunc for generic type cast Yonghong Song
2022-11-20 20:16   ` Alexei Starovoitov
2022-11-20 20:49     ` Kumar Kartikeya Dwivedi
2022-11-20 22:34       ` Alexei Starovoitov
2022-11-20 23:32         ` Alexei Starovoitov
2022-11-20 23:47           ` Alexei Starovoitov
2022-11-20 19:54 ` [PATCH bpf-next v4 4/4] bpf: Add type cast unit tests Yonghong Song
2022-11-21  0:00 ` [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs patchwork-bot+netdevbpf
2022-11-21  2:10 ` John Fastabend
2022-11-21 17:29   ` Yonghong Song [this message]
2022-11-22  1:48     ` John Fastabend
2022-11-22  4:52       ` Alexei Starovoitov
2022-11-23  3:18         ` John Fastabend
2022-11-23 20:46           ` Alexei Starovoitov
2022-11-29 16:30             ` Alan Maguire
