From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: Amery Hung <ameryhung@gmail.com>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
yangpeihao@sjtu.edu.cn, daniel@iogearbox.net, andrii@kernel.org,
martin.lau@kernel.org, sinquersw@gmail.com, toke@redhat.com,
jhs@mojatatu.com, jiri@resnulli.us, sdf@google.com,
xiyou.wangcong@gmail.com, yepeilin.cs@gmail.com
Subject: Re: [RFC PATCH v8 01/20] bpf: Support passing referenced kptr to struct_ops programs
Date: Fri, 17 May 2024 01:59:03 +0200 [thread overview]
Message-ID: <CAP01T74iSVPnRsAbdNfzXYYS7GsdCSgp3QiaPSzex6d+3J5AAA@mail.gmail.com> (raw)
In-Reply-To: <20240510192412.3297104-2-amery.hung@bytedance.com>
On Fri, 10 May 2024 at 21:24, Amery Hung <ameryhung@gmail.com> wrote:
>
> This patch supports struct_ops programs that acqurie referenced kptrs
> throguh arguments. In Qdisc_ops, an skb is passed to ".enqueue" in the
> first argument. The qdisc becomes the sole owner of the skb and must
> enqueue or drop the skb. This matches the referenced kptr semantic
> in bpf. However, the existing practice of acquiring a referenced kptr via
> a kfunc with KF_ACQUIRE does not play well in this case. Calling kfuncs
> repeatedly allows the user to acquire multiple references, while there
> should be only one reference to a unique skb in a qdisc.
>
> The solutioin is to make a struct_ops program automatically acquire a
> referenced kptr through a tagged argument in the stub function. When
> tagged with "__ref_acquired" (suggestion for a better name?), an
> reference kptr (ref_obj_id > 0) will be acquired automatically when
> entering the program. In addition, only the first read to the arguement
> is allowed and it will yeild a referenced kptr.
>
> Signed-off-by: Amery Hung <amery.hung@bytedance.com>
> ---
> include/linux/bpf.h | 3 +++
> kernel/bpf/bpf_struct_ops.c | 17 +++++++++++++----
> kernel/bpf/btf.c | 10 +++++++++-
> kernel/bpf/verifier.c | 16 +++++++++++++---
> 4 files changed, 38 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 9c6a7b8ff963..6aabca1581fe 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -914,6 +914,7 @@ struct bpf_insn_access_aux {
> struct {
> struct btf *btf;
> u32 btf_id;
> + u32 ref_obj_id;
> };
> };
> struct bpf_verifier_log *log; /* for verbose logs */
> @@ -1416,6 +1417,8 @@ struct bpf_ctx_arg_aux {
> enum bpf_reg_type reg_type;
> struct btf *btf;
> u32 btf_id;
> + u32 ref_obj_id;
> + bool ref_acquired;
> };
>
> struct btf_mod_pair {
> diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
> index 86c7884abaf8..bca8e5936846 100644
> --- a/kernel/bpf/bpf_struct_ops.c
> +++ b/kernel/bpf/bpf_struct_ops.c
> @@ -143,6 +143,7 @@ void bpf_struct_ops_image_free(void *image)
> }
>
> #define MAYBE_NULL_SUFFIX "__nullable"
> +#define REF_ACQUIRED_SUFFIX "__ref_acquired"
> #define MAX_STUB_NAME 128
>
> /* Return the type info of a stub function, if it exists.
> @@ -204,6 +205,7 @@ static int prepare_arg_info(struct btf *btf,
> struct bpf_struct_ops_arg_info *arg_info)
> {
> const struct btf_type *stub_func_proto, *pointed_type;
> + bool is_nullable = false, is_ref_acquired = false;
> const struct btf_param *stub_args, *args;
> struct bpf_ctx_arg_aux *info, *info_buf;
> u32 nargs, arg_no, info_cnt = 0;
> @@ -240,8 +242,11 @@ static int prepare_arg_info(struct btf *btf,
> /* Skip arguments that is not suffixed with
> * "__nullable".
> */
> - if (!btf_param_match_suffix(btf, &stub_args[arg_no],
> - MAYBE_NULL_SUFFIX))
> + is_nullable = btf_param_match_suffix(btf, &stub_args[arg_no],
> + MAYBE_NULL_SUFFIX);
> + is_ref_acquired = btf_param_match_suffix(btf, &stub_args[arg_no],
> + REF_ACQUIRED_SUFFIX);
> + if (!(is_nullable || is_ref_acquired))
> continue;
>
> /* Should be a pointer to struct */
> @@ -269,11 +274,15 @@ static int prepare_arg_info(struct btf *btf,
> }
>
> /* Fill the information of the new argument */
> - info->reg_type =
> - PTR_TRUSTED | PTR_TO_BTF_ID | PTR_MAYBE_NULL;
> info->btf_id = arg_btf_id;
> info->btf = btf;
> info->offset = offset;
> + if (is_nullable) {
> + info->reg_type = PTR_TRUSTED | PTR_TO_BTF_ID | PTR_MAYBE_NULL;
> + } else if (is_ref_acquired) {
> + info->reg_type = PTR_TRUSTED | PTR_TO_BTF_ID;
> + info->ref_acquired = true;
> + }
>
> info++;
> info_cnt++;
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 8c95392214ed..e462fb4a4598 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -6316,7 +6316,8 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
>
> /* this is a pointer to another type */
> for (i = 0; i < prog->aux->ctx_arg_info_size; i++) {
> - const struct bpf_ctx_arg_aux *ctx_arg_info = &prog->aux->ctx_arg_info[i];
> + struct bpf_ctx_arg_aux *ctx_arg_info =
> + (struct bpf_ctx_arg_aux *)&prog->aux->ctx_arg_info[i];
>
> if (ctx_arg_info->offset == off) {
> if (!ctx_arg_info->btf_id) {
> @@ -6324,9 +6325,16 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
> return false;
> }
>
> + if (ctx_arg_info->ref_acquired && !ctx_arg_info->ref_obj_id) {
> + bpf_log(log, "cannot acquire a reference to context argument offset %u\n", off);
> + return false;
> + }
> +
> info->reg_type = ctx_arg_info->reg_type;
> info->btf = ctx_arg_info->btf ? : btf_vmlinux;
> info->btf_id = ctx_arg_info->btf_id;
> + info->ref_obj_id = ctx_arg_info->ref_obj_id;
> + ctx_arg_info->ref_obj_id = 0;
> return true;
I think this is fragile. What if the compiler produces two independent
paths in the program which read the skb pointer once?
Technically, the program is still reading the skb pointer once at runtime.
Then you will reset ref_obj_id to 0 when exploring one, and assign as
0 in the other one, causing errors.
ctx_arg_info appears to be global for the program.
I think the better way would be to check if ref_obj_id is still part
of the reference state.
If the ref_obj_id has already been dropped from reference_state, then
any loads should get ref_obj_id = 0.
That would happen when dropping or enqueueing the skb into qdisc,
which would (I presume) do release_reference_state(ref_obj_id).
Otherwise, all of them can share the same ref_obj_id. You won't have
to implement "can only read once" logic,
and when you enqueue stuff in the qdisc, all identical copies produced
from different load instructions will be invalidated.
Same ref_obj_id == unique ownership of the same object.
You can already have multiple copies through rX = rY, multiple ctx
loads of skb will produce a similar verifier state.
So, on entry, assign ctx_arg_info->ref_obj_id uniquely, then on each load:
if reference_state.find(ctx_arg_info->ref_obj_id) == true; then
info->ref_obj_id = ctx_arg_info->ref_obj_id; else info->ref_obj_id =
0;
Let me know if I missed something.
next prev parent reply other threads:[~2024-05-16 23:59 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-10 19:23 [RFC PATCH v8 00/20] bpf qdisc Amery Hung
2024-05-10 19:23 ` [RFC PATCH v8 01/20] bpf: Support passing referenced kptr to struct_ops programs Amery Hung
2024-05-16 23:59 ` Kumar Kartikeya Dwivedi [this message]
2024-05-17 0:17 ` Amery Hung
2024-05-17 0:23 ` Kumar Kartikeya Dwivedi
2024-05-17 1:22 ` Amery Hung
2024-05-17 2:00 ` Kumar Kartikeya Dwivedi
2024-05-10 19:23 ` [RFC PATCH v8 02/20] selftests/bpf: Test referenced kptr arguments of " Amery Hung
2024-05-10 21:33 ` Kui-Feng Lee
2024-05-10 22:16 ` Amery Hung
2024-05-16 23:14 ` Amery Hung
2024-05-16 23:43 ` Martin KaFai Lau
2024-05-17 0:54 ` Amery Hung
2024-05-17 1:07 ` Martin KaFai Lau
2024-05-10 19:23 ` [RFC PATCH v8 03/20] bpf: Allow struct_ops prog to return referenced kptr Amery Hung
2024-05-17 2:06 ` Amery Hung
2024-05-17 5:30 ` Martin KaFai Lau
2024-05-10 19:23 ` [RFC PATCH v8 04/20] selftests/bpf: Test returning kptr from struct_ops programs Amery Hung
2024-05-10 19:23 ` [RFC PATCH v8 05/20] bpf: Generate btf_struct_metas for kernel BTF Amery Hung
2024-05-10 19:23 ` [RFC PATCH v8 06/20] bpf: Recognize kernel types as graph values Amery Hung
2024-05-10 19:23 ` [RFC PATCH v8 07/20] bpf: Allow adding kernel objects to collections Amery Hung
2024-05-10 19:24 ` [RFC PATCH v8 08/20] selftests/bpf: Test adding kernel object to bpf graph Amery Hung
2024-05-10 19:24 ` [RFC PATCH v8 09/20] bpf: Find special BTF fields in union Amery Hung
2024-05-16 23:37 ` Amery Hung
2024-05-10 19:24 ` [RFC PATCH v8 10/20] bpf: Introduce exclusive-ownership list and rbtree nodes Amery Hung
2024-05-10 19:24 ` [RFC PATCH v8 11/20] bpf: Allow adding exclusive nodes to bpf list and rbtree Amery Hung
2024-05-10 19:24 ` [RFC PATCH v8 12/20] selftests/bpf: Modify linked_list tests to work with macro-ified removes Amery Hung
2024-05-10 19:24 ` [RFC PATCH v8 13/20] bpf: net_sched: Support implementation of Qdisc_ops in bpf Amery Hung
2024-05-10 19:24 ` [RFC PATCH v8 14/20] bpf: net_sched: Add bpf qdisc kfuncs Amery Hung
2024-05-22 23:55 ` Martin KaFai Lau
2024-05-23 1:06 ` Amery Hung
2024-05-10 19:24 ` [RFC PATCH v8 15/20] bpf: net_sched: Allow more optional methods in Qdisc_ops Amery Hung
2024-05-10 19:24 ` [RFC PATCH v8 16/20] libbpf: Support creating and destroying qdisc Amery Hung
2024-05-10 19:24 ` [RFC PATCH v8 17/20] selftests: Add a basic fifo qdisc test Amery Hung
2024-05-21 3:15 ` Stanislav Fomichev
2024-05-21 15:03 ` Amery Hung
2024-05-21 17:57 ` Stanislav Fomichev
2024-05-10 19:24 ` [RFC PATCH v8 18/20] selftests: Add a bpf fq qdisc to selftest Amery Hung
2024-05-24 6:24 ` Martin KaFai Lau
2024-05-24 7:40 ` Toke Høiland-Jørgensen
2024-05-26 1:08 ` Martin KaFai Lau
2024-05-27 10:09 ` Toke Høiland-Jørgensen
2024-05-24 19:33 ` Alexei Starovoitov
2024-05-24 20:54 ` Martin KaFai Lau
2024-05-10 19:24 ` [RFC PATCH v8 19/20] selftests: Add a bpf netem " Amery Hung
2024-05-10 19:24 ` [RFC PATCH v8 20/20] selftests: Add a prio bpf qdisc Amery Hung
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAP01T74iSVPnRsAbdNfzXYYS7GsdCSgp3QiaPSzex6d+3J5AAA@mail.gmail.com \
--to=memxor@gmail.com \
--cc=ameryhung@gmail.com \
--cc=andrii@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=jhs@mojatatu.com \
--cc=jiri@resnulli.us \
--cc=martin.lau@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=sdf@google.com \
--cc=sinquersw@gmail.com \
--cc=toke@redhat.com \
--cc=xiyou.wangcong@gmail.com \
--cc=yangpeihao@sjtu.edu.cn \
--cc=yepeilin.cs@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).