BPF List
 help / color / mirror / Atom feed
From: Kui-Feng Lee <sinquersw@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Kui-Feng Lee <thinker.li@gmail.com>, bpf <bpf@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Song Liu <song@kernel.org>, Kernel Team <kernel-team@meta.com>,
	Andrii Nakryiko <andrii@kernel.org>,
	Kui-Feng Lee <kuifeng@meta.com>
Subject: Re: [PATCH bpf-next v2 00/11] Enable BPF programs to declare arrays of kptr, bpf_rb_root, and bpf_list_head.
Date: Mon, 22 Apr 2024 19:54:49 -0700	[thread overview]
Message-ID: <90652139-f541-4a99-837e-e5857c901f61@gmail.com> (raw)
In-Reply-To: <57b4d1ca-a444-4e28-9c22-9b81c352b4cb@gmail.com>



On 4/22/24 19:45, Kui-Feng Lee wrote:
> 
> 
> On 4/18/24 07:53, Alexei Starovoitov wrote:
>> On Wed, Apr 17, 2024 at 11:07 PM Kui-Feng Lee <sinquersw@gmail.com> 
>> wrote:
>>>
>>>
>>>
>>> On 4/17/24 22:11, Alexei Starovoitov wrote:
>>>> On Wed, Apr 17, 2024 at 9:31 PM Kui-Feng Lee <sinquersw@gmail.com> 
>>>> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 4/17/24 20:30, Alexei Starovoitov wrote:
>>>>>> On Fri, Apr 12, 2024 at 2:08 PM Kui-Feng Lee 
>>>>>> <thinker.li@gmail.com> wrote:
>>>>>>>
>>>>>>> The arrays of kptr, bpf_rb_root, and bpf_list_head didn't work as
>>>>>>> global variables. This was due to these types being initialized and
>>>>>>> verified in a special manner in the kernel. This patchset allows BPF
>>>>>>> programs to declare arrays of kptr, bpf_rb_root, and 
>>>>>>> bpf_list_head in
>>>>>>> the global namespace.
>>>>>>>
>>>>>>> The main change is to add "nelems" to btf_fields. The value of
>>>>>>> "nelems" represents the number of elements in the array if a 
>>>>>>> btf_field
>>>>>>> represents an array. Otherwise, "nelem" will be 1. The verifier
>>>>>>> verifies these types based on the information provided by the
>>>>>>> btf_field.
>>>>>>>
>>>>>>> The value of "size" will be the size of the entire array if a
>>>>>>> btf_field represents an array. Dividing "size" by "nelems" gives the
>>>>>>> size of an element. The value of "offset" will be the offset of the
>>>>>>> beginning for an array. By putting this together, we can 
>>>>>>> determine the
>>>>>>> offset of each element in an array. For example,
>>>>>>>
>>>>>>>        struct bpf_cpumask __kptr * global_mask_array[2];
>>>>>>
>>>>>> Looks like this patch set enables arrays only.
>>>>>> Meaning the following is supported already:
>>>>>>
>>>>>> +private(C) struct bpf_spin_lock glock_c;
>>>>>> +private(C) struct bpf_list_head ghead_array1 __contains(foo, node2);
>>>>>> +private(C) struct bpf_list_head ghead_array2 __contains(foo, node2);
>>>>>>
>>>>>> while this support is added:
>>>>>>
>>>>>> +private(C) struct bpf_spin_lock glock_c;
>>>>>> +private(C) struct bpf_list_head ghead_array1[3] __contains(foo, 
>>>>>> node2);
>>>>>> +private(C) struct bpf_list_head ghead_array2[2] __contains(foo, 
>>>>>> node2);
>>>>>>
>>>>>> Am I right?
>>>>>>
>>>>>> What about the case when bpf_list_head is wrapped in a struct?
>>>>>> private(C) struct foo {
>>>>>>      struct bpf_list_head ghead;
>>>>>> } ghead;
>>>>>>
>>>>>> that's not enabled in this patch. I think.
>>>>>>
>>>>>> And the following:
>>>>>> private(C) struct foo {
>>>>>>      struct bpf_list_head ghead;
>>>>>> } ghead[2];
>>>>>>
>>>>>>
>>>>>> or
>>>>>>
>>>>>> private(C) struct foo {
>>>>>>      struct bpf_list_head ghead[2];
>>>>>> } ghead;
>>>>>>
>>>>>> Won't work either.
>>>>>
>>>>> No, they don't work.
>>>>> We had a discussion about this in the other day.
>>>>> I proposed to have another patch set to work on struct types.
>>>>> Do you prefer to handle it in this patch set?
>>>>>
>>>>>>
>>>>>> I think eventually we want to support all such combinations and
>>>>>> the approach proposed in this patch with 'nelems'
>>>>>> won't work for wrapper structs.
>>>>>>
>>>>>> I think it's better to unroll/flatten all structs and arrays
>>>>>> and represent them as individual elements in the flattened
>>>>>> structure. Then there will be no need to special case array with 
>>>>>> 'nelems'.
>>>>>> All special BTF types will be individual elements with unique offset.
>>>>>>
>>>>>> Does this make sense?
>>>>>
>>>>> That means it will creates 10 btf_field(s) for an array having 10
>>>>> elements. The purpose of adding "nelems" is to avoid the 
>>>>> repetition. Do
>>>>> you prefer to expand them?
>>>>
>>>> It's not just expansion, but a common way to handle nested structs too.
>>>>
>>>> I suspect by delaying nested into another patchset this approach
>>>> will become useless.
>>>>
>>>> So try adding nested structs in all combinations as a follow up and
>>>> I suspect you're realize that "nelems" approach doesn't really help.
>>>> You'd need to flatten them all.
>>>> And once you do there is no need for "nelems".
>>>
>>> For me, "nelems" is more like a choice of avoiding repetition of
>>> information, not a necessary. Before adding "nelems", I had considered
>>> to expand them as well. But, eventually, I chose to add "nelems".
>>>
>>> Since you think this repetition is not a problem, I will expand array as
>>> individual elements.
>>
>> You don't sound convinced :)
>> Please add support for nested structs on top of your "nelems" approach
>> and prototype the same without "nelems" and let's compare the two.
> 
> 
> The following is the prototype that flatten arrays and struct types.
> This approach is definitely simpler than "nelems" one.  However,
> it will repeat same information as many times as the size of an array.
> For now, we have a limitation on the number of btf_fields (<= 10).
> 
> The core part of the "nelems" approach is quiet similar to this
> "flatten" version.  However, the following function has to be modified
> to handle "nelem" and fields in "BPF_REPEAT_FIELDS" type.
> 
>   - bpf_obj_init_field() & bpf_obj_free_fields()
>   - btf_record_find()
>   - check_map_access()
>   - btf_record_find()
>   - check_map_access()
>   - bpf_obj_memcpy()
>   - bpf_obj_memzero()
> 
> 


The following is the core part that I extracted from the patchset.
It doesn't include the change on the functions mentioned above.


---
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index caea4e560eb3..bd9d56b9b6e4 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -202,6 +202,7 @@ enum btf_field_type {
  	BPF_GRAPH_NODE = BPF_RB_NODE | BPF_LIST_NODE,
  	BPF_GRAPH_ROOT = BPF_RB_ROOT | BPF_LIST_HEAD,
  	BPF_REFCOUNT   = (1 << 9),
+	BPF_REPEAT_FIELDS = (1 << 10),
  };

  typedef void (*btf_dtor_kfunc_t)(void *);
@@ -226,10 +227,12 @@ struct btf_field_graph_root {
  struct btf_field {
  	u32 offset;
  	u32 size;
+	u32 nelems;
  	enum btf_field_type type;
  	union {
  		struct btf_field_kptr kptr;
  		struct btf_field_graph_root graph_root;
+		u32 repeated_cnt;
  	};
  };

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 3233832f064f..005e530bf7e5 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -3289,6 +3289,7 @@ enum {
  struct btf_field_info {
  	enum btf_field_type type;
  	u32 off;
+	u32 nelems;
  	union {
  		struct {
  			u32 type_id;
@@ -3297,6 +3298,10 @@ struct btf_field_info {
  			const char *node_name;
  			u32 value_btf_id;
  		} graph_root;
+		struct {
+			u32 cnt;
+			u32 elem_size;
+		} repeat;
  	};
  };

@@ -3484,6 +3489,43 @@ static int btf_get_field_type(const char *name, 
u32 field_mask, u32 *seen_mask,

  #undef field_mask_test_name

+static int btf_find_struct_field(const struct btf *btf,
+				 const struct btf_type *t, u32 field_mask,
+				 struct btf_field_info *info, int info_cnt);
+
+static int btf_find_nested_struct(const struct btf *btf, const struct 
btf_type *t,
+				  u32 off, u32 nelems,
+				  u32 field_mask, struct btf_field_info *info,
+				  int info_cnt)
+{
+	int ret, i;
+
+	ret = btf_find_struct_field(btf, t, field_mask, info, info_cnt);
+
+	if (ret <= 0)
+		return ret;
+
+	/* Shift the offsets of the nested struct fields to the offsets
+	 * related to the container.
+	 */
+	for (i = 0; i < ret; i++)
+		info[i].off += off;
+
+	if (nelems > 1) {
+		/* Repeat fields created for nested struct */
+		if (ret >= info_cnt)
+			return -E2BIG;
+		info[ret].type = BPF_REPEAT_FIELDS;
+		info[ret].off = off + t->size;
+		info[ret].nelems = nelems - 1;
+		info[ret].repeat.cnt = ret;
+		info[ret].repeat.elem_size = t->size;
+		ret += 1;
+	}
+
+	return ret;
+}
+
  static int btf_find_struct_field(const struct btf *btf,
  				 const struct btf_type *t, u32 field_mask,
  				 struct btf_field_info *info, int info_cnt)
@@ -3496,9 +3538,26 @@ static int btf_find_struct_field(const struct btf 
*btf,
  	for_each_member(i, t, member) {
  		const struct btf_type *member_type = btf_type_by_id(btf,
  								    member->type);
+		const struct btf_array *array;
+		u32 j, nelems = 1;
+
+		/* Walk into array types to find the element type and the
+		 * number of elements in the (flattened) array.
+		 */
+		for (j = 0; j < MAX_RESOLVE_DEPTH && btf_type_is_array(member_type); 
j++) {
+			array = btf_array(member_type);
+			nelems *= array->nelems;
+			member_type = btf_type_by_id(btf, array->type);
+		}
+		if (nelems == 0)
+			continue;

  		field_type = btf_get_field_type(__btf_name_by_offset(btf, 
member_type->name_off),
-						field_mask, &seen_mask, &align, &sz);
+							field_mask, &seen_mask, &align, &sz);
+		if ((field_type == BPF_KPTR_REF || !field_type) &&
+		    __btf_type_is_struct(member_type))
+			field_type = BPF_REPEAT_FIELDS;
+
  		if (field_type == 0)
  			continue;
  		if (field_type < 0)
@@ -3540,6 +3599,13 @@ static int btf_find_struct_field(const struct btf 
*btf,
  			if (ret < 0)
  				return ret;
  			break;
+		case BPF_REPEAT_FIELDS:
+			ret = btf_find_nested_struct(btf, member_type, off, nelems, field_mask,
+						    &info[idx], info_cnt - idx);
+			if (ret < 0)
+				return ret;
+			idx += ret;
+			continue;
  		default:
  			return -EFAULT;
  		}
@@ -3548,6 +3614,7 @@ static int btf_find_struct_field(const struct btf 
*btf,
  			continue;
  		if (idx >= info_cnt)
  			return -E2BIG;
+		info[idx].nelems = nelems;
  		++idx;
  	}
  	return idx;
@@ -3565,16 +3632,35 @@ static int btf_find_datasec_var(const struct btf 
*btf, const struct btf_type *t,
  	for_each_vsi(i, t, vsi) {
  		const struct btf_type *var = btf_type_by_id(btf, vsi->type);
  		const struct btf_type *var_type = btf_type_by_id(btf, var->type);
+		const struct btf_array *array;
+		u32 j, nelems = 1;
+
+		/* Walk into array types to find the element type and the
+		 * number of elements in the (flattened) array.
+		 */
+		for (j = 0; j < MAX_RESOLVE_DEPTH && btf_type_is_array(var_type); j++) {
+			array = btf_array(var_type);
+			nelems *= array->nelems;
+			var_type = btf_type_by_id(btf, array->type);
+		}
+		if (nelems == 0)
+			continue;

  		field_type = btf_get_field_type(__btf_name_by_offset(btf, 
var_type->name_off),
  						field_mask, &seen_mask, &align, &sz);
+		if ((field_type == BPF_KPTR_REF || !field_type) &&
+		    __btf_type_is_struct(var_type)) {
+			field_type = BPF_REPEAT_FIELDS;
+			sz = var_type->size;
+		}
+
  		if (field_type == 0)
  			continue;
  		if (field_type < 0)
  			return field_type;

  		off = vsi->offset;
-		if (vsi->size != sz)
+		if (vsi->size != sz * nelems)
  			continue;
  		if (off % align)
  			continue;
@@ -3582,9 +3668,11 @@ static int btf_find_datasec_var(const struct btf 
*btf, const struct btf_type *t,
  		switch (field_type) {
  		case BPF_SPIN_LOCK:
  		case BPF_TIMER:
+		case BPF_REFCOUNT:
  		case BPF_LIST_NODE:
  		case BPF_RB_NODE:
-		case BPF_REFCOUNT:
+			if (nelems != 1)
+				continue;
  			ret = btf_find_struct(btf, var_type, off, sz, field_type,
  					      idx < info_cnt ? &info[idx] : &tmp);
  			if (ret < 0)
@@ -3607,6 +3695,13 @@ static int btf_find_datasec_var(const struct btf 
*btf, const struct btf_type *t,
  			if (ret < 0)
  				return ret;
  			break;
+		case BPF_REPEAT_FIELDS:
+			ret = btf_find_nested_struct(btf, var_type, off, nelems, field_mask,
+						     &info[idx], info_cnt - idx);
+			if (ret < 0)
+				return ret;
+			idx += ret;
+			continue;
  		default:
  			return -EFAULT;
  		}
@@ -3615,8 +3710,9 @@ static int btf_find_datasec_var(const struct btf 
*btf, const struct btf_type *t,
  			continue;
  		if (idx >= info_cnt)
  			return -E2BIG;
-		++idx;
+		info[idx++].nelems = nelems;
  	}
+
  	return idx;
  }

@@ -3818,7 +3914,10 @@ struct btf_record *btf_parse_fields(const struct 
btf *btf, const struct btf_type
  	rec->timer_off = -EINVAL;
  	rec->refcount_off = -EINVAL;
  	for (i = 0; i < cnt; i++) {
-		field_type_size = btf_field_type_size(info_arr[i].type);
+		if (info_arr[i].type == BPF_REPEAT_FIELDS)
+			field_type_size = info_arr[i].repeat.elem_size * info_arr[i].nelems;
+		else
+			field_type_size = btf_field_type_size(info_arr[i].type) * 
info_arr[i].nelems;
  		if (info_arr[i].off + field_type_size > value_size) {
  			WARN_ONCE(1, "verifier bug off %d size %d", info_arr[i].off, 
value_size);
  			ret = -EFAULT;
@@ -3830,10 +3929,12 @@ struct btf_record *btf_parse_fields(const struct 
btf *btf, const struct btf_type
  		}
  		next_off = info_arr[i].off + field_type_size;

-		rec->field_mask |= info_arr[i].type;
+		if (info_arr[i].type != BPF_REPEAT_FIELDS)
+			rec->field_mask |= info_arr[i].type;
  		rec->fields[i].offset = info_arr[i].off;
  		rec->fields[i].type = info_arr[i].type;
  		rec->fields[i].size = field_type_size;
+		rec->fields[i].nelems = info_arr[i].nelems;

  		switch (info_arr[i].type) {
  		case BPF_SPIN_LOCK:
@@ -3871,6 +3972,10 @@ struct btf_record *btf_parse_fields(const struct 
btf *btf, const struct btf_type
  		case BPF_LIST_NODE:
  		case BPF_RB_NODE:
  			break;
+
+		case BPF_REPEAT_FIELDS:
+			rec->fields[i].repeated_cnt = info_arr[i].repeat.cnt;
+			break;
  		default:
  			ret = -EFAULT;
  			goto end;

  reply	other threads:[~2024-04-23  2:54 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-12 21:08 [PATCH bpf-next v2 00/11] Enable BPF programs to declare arrays of kptr, bpf_rb_root, and bpf_list_head Kui-Feng Lee
2024-04-12 21:08 ` [PATCH bpf-next v2 01/11] bpf: Remove unnecessary checks on the offset of btf_field Kui-Feng Lee
2024-04-12 21:08 ` [PATCH bpf-next v2 02/11] bpf: Remove unnecessary call to btf_field_type_size() Kui-Feng Lee
2024-04-12 21:08 ` [PATCH bpf-next v2 03/11] bpf: Add nelems to struct btf_field_info and btf_field Kui-Feng Lee
2024-04-12 21:08 ` [PATCH bpf-next v2 04/11] bpf: initialize/free array of btf_field(s) Kui-Feng Lee
2024-04-12 21:08 ` [PATCH bpf-next v2 05/11] bpf: Find btf_field with the knowledge of arrays Kui-Feng Lee
2024-04-12 21:08 ` [PATCH bpf-next v2 06/11] bpf: check_map_access() " Kui-Feng Lee
2024-04-12 21:08 ` [PATCH bpf-next v2 07/11] bpf: check_map_kptr_access() compute the offset from the reg state Kui-Feng Lee
2024-04-12 21:08 ` [PATCH bpf-next v2 08/11] bpf: Enable and verify btf_field arrays Kui-Feng Lee
2024-04-12 21:08 ` [PATCH bpf-next v2 09/11] selftests/bpf: Test global kptr arrays Kui-Feng Lee
2024-04-12 21:08 ` [PATCH bpf-next v2 10/11] selftests/bpf: Test global bpf_rb_root arrays Kui-Feng Lee
2024-04-12 21:08 ` [PATCH bpf-next v2 11/11] selftests/bpf: Test global bpf_list_head arrays Kui-Feng Lee
2024-04-18  3:30 ` [PATCH bpf-next v2 00/11] Enable BPF programs to declare arrays of kptr, bpf_rb_root, and bpf_list_head Alexei Starovoitov
2024-04-18  4:31   ` Kui-Feng Lee
2024-04-18  5:11     ` Alexei Starovoitov
2024-04-18  6:07       ` Kui-Feng Lee
2024-04-18 14:53         ` Alexei Starovoitov
2024-04-18 18:27           ` Kui-Feng Lee
2024-04-19 18:36           ` Kui-Feng Lee
2024-04-23  2:45           ` Kui-Feng Lee
2024-04-23  2:54             ` Kui-Feng Lee [this message]
2024-04-24 20:09               ` Alexei Starovoitov
2024-04-24 22:32                 ` Kui-Feng Lee
2024-04-24 22:34                   ` Kui-Feng Lee
2024-04-24 22:36                     ` Kui-Feng Lee
2024-04-25  0:49                   ` Alexei Starovoitov
2024-04-25 17:08                     ` Kui-Feng Lee
2024-04-25  0:48                 ` Andrii Nakryiko
2024-04-25 17:09                   ` Kui-Feng Lee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=90652139-f541-4a99-837e-e5857c901f61@gmail.com \
    --to=sinquersw@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=kernel-team@meta.com \
    --cc=kuifeng@meta.com \
    --cc=martin.lau@linux.dev \
    --cc=song@kernel.org \
    --cc=thinker.li@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox