From: Kui-Feng Lee <sinquersw@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Kui-Feng Lee <thinker.li@gmail.com>, bpf <bpf@vger.kernel.org>,
Alexei Starovoitov <ast@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Song Liu <song@kernel.org>, Kernel Team <kernel-team@meta.com>,
Andrii Nakryiko <andrii@kernel.org>,
Kui-Feng Lee <kuifeng@meta.com>
Subject: Re: [PATCH bpf-next v2 00/11] Enable BPF programs to declare arrays of kptr, bpf_rb_root, and bpf_list_head.
Date: Mon, 22 Apr 2024 19:54:49 -0700 [thread overview]
Message-ID: <90652139-f541-4a99-837e-e5857c901f61@gmail.com> (raw)
In-Reply-To: <57b4d1ca-a444-4e28-9c22-9b81c352b4cb@gmail.com>
On 4/22/24 19:45, Kui-Feng Lee wrote:
>
>
> On 4/18/24 07:53, Alexei Starovoitov wrote:
>> On Wed, Apr 17, 2024 at 11:07 PM Kui-Feng Lee <sinquersw@gmail.com>
>> wrote:
>>>
>>>
>>>
>>> On 4/17/24 22:11, Alexei Starovoitov wrote:
>>>> On Wed, Apr 17, 2024 at 9:31 PM Kui-Feng Lee <sinquersw@gmail.com>
>>>> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 4/17/24 20:30, Alexei Starovoitov wrote:
>>>>>> On Fri, Apr 12, 2024 at 2:08 PM Kui-Feng Lee
>>>>>> <thinker.li@gmail.com> wrote:
>>>>>>>
>>>>>>> The arrays of kptr, bpf_rb_root, and bpf_list_head didn't work as
>>>>>>> global variables. This was due to these types being initialized and
>>>>>>> verified in a special manner in the kernel. This patchset allows BPF
>>>>>>> programs to declare arrays of kptr, bpf_rb_root, and
>>>>>>> bpf_list_head in
>>>>>>> the global namespace.
>>>>>>>
>>>>>>> The main change is to add "nelems" to btf_fields. The value of
>>>>>>> "nelems" represents the number of elements in the array if a
>>>>>>> btf_field
>>>>>>> represents an array. Otherwise, "nelems" will be 1. The verifier
>>>>>>> verifies these types based on the information provided by the
>>>>>>> btf_field.
>>>>>>>
>>>>>>> The value of "size" will be the size of the entire array if a
>>>>>>> btf_field represents an array. Dividing "size" by "nelems" gives the
>>>>>>> size of an element. The value of "offset" will be the offset of the
>>>>>>> beginning for an array. By putting this together, we can
>>>>>>> determine the
>>>>>>> offset of each element in an array. For example,
>>>>>>>
>>>>>>> struct bpf_cpumask __kptr * global_mask_array[2];
>>>>>>
>>>>>> Looks like this patch set enables arrays only.
>>>>>> Meaning the following is supported already:
>>>>>>
>>>>>> +private(C) struct bpf_spin_lock glock_c;
>>>>>> +private(C) struct bpf_list_head ghead_array1 __contains(foo, node2);
>>>>>> +private(C) struct bpf_list_head ghead_array2 __contains(foo, node2);
>>>>>>
>>>>>> while this support is added:
>>>>>>
>>>>>> +private(C) struct bpf_spin_lock glock_c;
>>>>>> +private(C) struct bpf_list_head ghead_array1[3] __contains(foo, node2);
>>>>>> +private(C) struct bpf_list_head ghead_array2[2] __contains(foo, node2);
>>>>>>
>>>>>> Am I right?
>>>>>>
>>>>>> What about the case when bpf_list_head is wrapped in a struct?
>>>>>> private(C) struct foo {
>>>>>> struct bpf_list_head ghead;
>>>>>> } ghead;
>>>>>>
>>>>>> that's not enabled in this patch. I think.
>>>>>>
>>>>>> And the following:
>>>>>> private(C) struct foo {
>>>>>> struct bpf_list_head ghead;
>>>>>> } ghead[2];
>>>>>>
>>>>>>
>>>>>> or
>>>>>>
>>>>>> private(C) struct foo {
>>>>>> struct bpf_list_head ghead[2];
>>>>>> } ghead;
>>>>>>
>>>>>> Won't work either.
>>>>>
>>>>> No, they don't work.
>>>>> We had a discussion about this the other day.
>>>>> I proposed to address struct types in a separate patch set.
>>>>> Do you prefer to handle them in this patch set?
>>>>>
>>>>>>
>>>>>> I think eventually we want to support all such combinations and
>>>>>> the approach proposed in this patch with 'nelems'
>>>>>> won't work for wrapper structs.
>>>>>>
>>>>>> I think it's better to unroll/flatten all structs and arrays
>>>>>> and represent them as individual elements in the flattened
>>>>>> structure. Then there will be no need to special case array with
>>>>>> 'nelems'.
>>>>>> All special BTF types will be individual elements with unique offset.
>>>>>>
>>>>>> Does this make sense?
>>>>>
>>>>> That means it will create 10 btf_field(s) for an array of 10
>>>>> elements. The purpose of adding "nelems" is to avoid that
>>>>> repetition. Do you prefer to expand them?
>>>>
>>>> It's not just expansion, but a common way to handle nested structs too.
>>>>
>>>> I suspect by delaying nested into another patchset this approach
>>>> will become useless.
>>>>
>>>> So try adding nested structs in all combinations as a follow-up, and
>>>> I suspect you'll realize that the "nelems" approach doesn't really help.
>>>> You'd need to flatten them all.
>>>> And once you do, there is no need for "nelems".
>>>
>>> For me, "nelems" is more a way of avoiding repetition of
>>> information than a necessity. Before adding "nelems", I had
>>> considered expanding them as well, but eventually chose "nelems".
>>>
>>> Since you think this repetition is not a problem, I will expand
>>> arrays as individual elements.
>>
>> You don't sound convinced :)
>> Please add support for nested structs on top of your "nelems" approach
>> and prototype the same without "nelems" and let's compare the two.
>
>
> The following is a prototype that flattens arrays and struct types.
> This approach is definitely simpler than the "nelems" one. However,
> it repeats the same information as many times as there are elements
> in an array, and for now we have a limit on the number of
> btf_fields (<= 10).
>
> The core part of the "nelems" approach is quite similar to this
> "flatten" version. However, the following functions have to be
> modified to handle "nelems" and fields of the "BPF_REPEAT_FIELDS" type.
>
> - bpf_obj_init_field() & bpf_obj_free_fields()
> - btf_record_find()
> - check_map_access()
> - bpf_obj_memcpy()
> - bpf_obj_memzero()
>
>
The following is the core part that I extracted from the patchset.
It doesn't include the changes to the functions mentioned above.
---
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index caea4e560eb3..bd9d56b9b6e4 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -202,6 +202,7 @@ enum btf_field_type {
BPF_GRAPH_NODE = BPF_RB_NODE | BPF_LIST_NODE,
BPF_GRAPH_ROOT = BPF_RB_ROOT | BPF_LIST_HEAD,
BPF_REFCOUNT = (1 << 9),
+ BPF_REPEAT_FIELDS = (1 << 10),
};
typedef void (*btf_dtor_kfunc_t)(void *);
@@ -226,10 +227,12 @@ struct btf_field_graph_root {
struct btf_field {
u32 offset;
u32 size;
+ u32 nelems;
enum btf_field_type type;
union {
struct btf_field_kptr kptr;
struct btf_field_graph_root graph_root;
+ u32 repeated_cnt;
};
};
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 3233832f064f..005e530bf7e5 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -3289,6 +3289,7 @@ enum {
struct btf_field_info {
enum btf_field_type type;
u32 off;
+ u32 nelems;
union {
struct {
u32 type_id;
@@ -3297,6 +3298,10 @@ struct btf_field_info {
const char *node_name;
u32 value_btf_id;
} graph_root;
+ struct {
+ u32 cnt;
+ u32 elem_size;
+ } repeat;
};
};
@@ -3484,6 +3489,43 @@ static int btf_get_field_type(const char *name, u32 field_mask, u32 *seen_mask,
#undef field_mask_test_name
+static int btf_find_struct_field(const struct btf *btf,
+ const struct btf_type *t, u32 field_mask,
+ struct btf_field_info *info, int info_cnt);
+
+static int btf_find_nested_struct(const struct btf *btf, const struct btf_type *t,
+ u32 off, u32 nelems,
+ u32 field_mask, struct btf_field_info *info,
+ int info_cnt)
+{
+ int ret, i;
+
+ ret = btf_find_struct_field(btf, t, field_mask, info, info_cnt);
+
+ if (ret <= 0)
+ return ret;
+
+ /* Shift the offsets of the nested struct fields to the offsets
+ * related to the container.
+ */
+ for (i = 0; i < ret; i++)
+ info[i].off += off;
+
+ if (nelems > 1) {
+ /* Repeat fields created for nested struct */
+ if (ret >= info_cnt)
+ return -E2BIG;
+ info[ret].type = BPF_REPEAT_FIELDS;
+ info[ret].off = off + t->size;
+ info[ret].nelems = nelems - 1;
+ info[ret].repeat.cnt = ret;
+ info[ret].repeat.elem_size = t->size;
+ ret += 1;
+ }
+
+ return ret;
+}
+
static int btf_find_struct_field(const struct btf *btf,
const struct btf_type *t, u32 field_mask,
struct btf_field_info *info, int info_cnt)
@@ -3496,9 +3538,26 @@ static int btf_find_struct_field(const struct btf *btf,
for_each_member(i, t, member) {
const struct btf_type *member_type = btf_type_by_id(btf, member->type);
+ const struct btf_array *array;
+ u32 j, nelems = 1;
+
+ /* Walk into array types to find the element type and the
+ * number of elements in the (flattened) array.
+ */
+ for (j = 0; j < MAX_RESOLVE_DEPTH && btf_type_is_array(member_type); j++) {
+ array = btf_array(member_type);
+ nelems *= array->nelems;
+ member_type = btf_type_by_id(btf, array->type);
+ }
+ if (nelems == 0)
+ continue;
field_type = btf_get_field_type(__btf_name_by_offset(btf, member_type->name_off),
- field_mask, &seen_mask, &align, &sz);
+ field_mask, &seen_mask, &align, &sz);
+ if ((field_type == BPF_KPTR_REF || !field_type) &&
+ __btf_type_is_struct(member_type))
+ field_type = BPF_REPEAT_FIELDS;
+
if (field_type == 0)
continue;
if (field_type < 0)
@@ -3540,6 +3599,13 @@ static int btf_find_struct_field(const struct btf *btf,
if (ret < 0)
return ret;
break;
+ case BPF_REPEAT_FIELDS:
+ ret = btf_find_nested_struct(btf, member_type, off, nelems, field_mask,
+ &info[idx], info_cnt - idx);
+ if (ret < 0)
+ return ret;
+ idx += ret;
+ continue;
default:
return -EFAULT;
}
@@ -3548,6 +3614,7 @@ static int btf_find_struct_field(const struct btf *btf,
continue;
if (idx >= info_cnt)
return -E2BIG;
+ info[idx].nelems = nelems;
++idx;
}
return idx;
@@ -3565,16 +3632,35 @@ static int btf_find_datasec_var(const struct btf *btf, const struct btf_type *t,
for_each_vsi(i, t, vsi) {
const struct btf_type *var = btf_type_by_id(btf, vsi->type);
const struct btf_type *var_type = btf_type_by_id(btf, var->type);
+ const struct btf_array *array;
+ u32 j, nelems = 1;
+
+ /* Walk into array types to find the element type and the
+ * number of elements in the (flattened) array.
+ */
+ for (j = 0; j < MAX_RESOLVE_DEPTH && btf_type_is_array(var_type); j++) {
+ array = btf_array(var_type);
+ nelems *= array->nelems;
+ var_type = btf_type_by_id(btf, array->type);
+ }
+ if (nelems == 0)
+ continue;
field_type = btf_get_field_type(__btf_name_by_offset(btf, var_type->name_off),
field_mask, &seen_mask, &align, &sz);
+ if ((field_type == BPF_KPTR_REF || !field_type) &&
+ __btf_type_is_struct(var_type)) {
+ field_type = BPF_REPEAT_FIELDS;
+ sz = var_type->size;
+ }
+
if (field_type == 0)
continue;
if (field_type < 0)
return field_type;
off = vsi->offset;
- if (vsi->size != sz)
+ if (vsi->size != sz * nelems)
continue;
if (off % align)
continue;
@@ -3582,9 +3668,11 @@ static int btf_find_datasec_var(const struct btf *btf, const struct btf_type *t,
switch (field_type) {
case BPF_SPIN_LOCK:
case BPF_TIMER:
+ case BPF_REFCOUNT:
case BPF_LIST_NODE:
case BPF_RB_NODE:
- case BPF_REFCOUNT:
+ if (nelems != 1)
+ continue;
ret = btf_find_struct(btf, var_type, off, sz, field_type,
idx < info_cnt ? &info[idx] : &tmp);
if (ret < 0)
@@ -3607,6 +3695,13 @@ static int btf_find_datasec_var(const struct btf *btf, const struct btf_type *t,
if (ret < 0)
return ret;
break;
+ case BPF_REPEAT_FIELDS:
+ ret = btf_find_nested_struct(btf, var_type, off, nelems, field_mask,
+ &info[idx], info_cnt - idx);
+ if (ret < 0)
+ return ret;
+ idx += ret;
+ continue;
default:
return -EFAULT;
}
@@ -3615,8 +3710,9 @@ static int btf_find_datasec_var(const struct btf *btf, const struct btf_type *t,
continue;
if (idx >= info_cnt)
return -E2BIG;
- ++idx;
+ info[idx++].nelems = nelems;
}
+
return idx;
}
@@ -3818,7 +3914,10 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type
rec->timer_off = -EINVAL;
rec->refcount_off = -EINVAL;
for (i = 0; i < cnt; i++) {
- field_type_size = btf_field_type_size(info_arr[i].type);
+ if (info_arr[i].type == BPF_REPEAT_FIELDS)
+ field_type_size = info_arr[i].repeat.elem_size * info_arr[i].nelems;
+ else
+ field_type_size = btf_field_type_size(info_arr[i].type) * info_arr[i].nelems;
if (info_arr[i].off + field_type_size > value_size) {
WARN_ONCE(1, "verifier bug off %d size %d", info_arr[i].off, value_size);
ret = -EFAULT;
@@ -3830,10 +3929,12 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type
}
next_off = info_arr[i].off + field_type_size;
- rec->field_mask |= info_arr[i].type;
+ if (info_arr[i].type != BPF_REPEAT_FIELDS)
+ rec->field_mask |= info_arr[i].type;
rec->fields[i].offset = info_arr[i].off;
rec->fields[i].type = info_arr[i].type;
rec->fields[i].size = field_type_size;
+ rec->fields[i].nelems = info_arr[i].nelems;
switch (info_arr[i].type) {
case BPF_SPIN_LOCK:
@@ -3871,6 +3972,10 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type
case BPF_LIST_NODE:
case BPF_RB_NODE:
break;
+
+ case BPF_REPEAT_FIELDS:
+ rec->fields[i].repeated_cnt = info_arr[i].repeat.cnt;
+ break;
default:
ret = -EFAULT;
goto end;