* [PATCH v7 bpf-next 0/2] bpf: Add a generic bits iterator
@ 2024-05-06 3:33 Yafang Shao
2024-05-06 3:33 ` [PATCH v7 bpf-next 1/2] bpf: Add " Yafang Shao
2024-05-06 3:33 ` [PATCH v7 bpf-next 2/2] selftests/bpf: Add selftest for bits iter Yafang Shao
0 siblings, 2 replies; 11+ messages in thread
From: Yafang Shao @ 2024-05-06 3:33 UTC
To: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
yonghong.song, kpsingh, sdf, haoluo, jolsa
Cc: bpf, Yafang Shao
Three new kfuncs, namely bpf_iter_bits_{new,next,destroy}, have been
added for the new bpf_iter_bits functionality. These kfuncs enable the
iteration of the bits from a given address and a given number of bits.
- bpf_iter_bits_new
Initialize a new bits iterator for a given memory area. Due to the
limitation of bpf memalloc, the max number of bits to be iterated
over is (4096 * 8).
- bpf_iter_bits_next
Get the next bit in a bpf_iter_bits
- bpf_iter_bits_destroy
Destroy a bpf_iter_bits
The bits iterator can be used in any context and on any address.
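For illustration, the new/next/destroy protocol described above can be
modelled in plain userspace C. This is only a sketch under stated
assumptions: the model_iter_bits_* names and sum_set_bit_indices() are
hypothetical, the model handles masks of at most 64 bits, and it copies
a value instead of probing kernel memory. It is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the iterator state. */
struct model_iter_bits {
	uint64_t bits_copy;	/* private copy of the mask (<= 64 bits) */
	unsigned int nr_bits;	/* 0 means exhausted or not initialized */
	int bit;		/* index of the last set bit returned, -1 at start */
};

/* Models the _new step: copy the mask and reset the cursor. */
static int model_iter_bits_new(struct model_iter_bits *it, uint64_t mask,
			       unsigned int nr_bits)
{
	assert(it);
	it->bits_copy = 0;
	it->nr_bits = 0;
	it->bit = -1;
	if (nr_bits == 0 || nr_bits > 64)
		return -1;
	it->bits_copy = mask;
	it->nr_bits = nr_bits;
	return 0;
}

/* Models the _next step: pointer to the next set bit's index, or NULL. */
static int *model_iter_bits_next(struct model_iter_bits *it)
{
	for (unsigned int i = (unsigned int)(it->bit + 1); i < it->nr_bits; i++) {
		if (it->bits_copy & (UINT64_C(1) << i)) {
			it->bit = (int)i;
			return &it->bit;
		}
	}
	it->nr_bits = 0;	/* stay exhausted on later calls */
	return NULL;
}

/* Models the _destroy step: nothing to free in this model. */
static void model_iter_bits_destroy(struct model_iter_bits *it)
{
	it->nr_bits = 0;
}

/* Sum the indices of the set bits among the first nr_bits bits of mask. */
static int sum_set_bit_indices(uint64_t mask, unsigned int nr_bits)
{
	struct model_iter_bits it;
	int sum = 0;
	int *bit;

	if (model_iter_bits_new(&it, mask, nr_bits) == 0) {
		while ((bit = model_iter_bits_next(&it)))
			sum += *bit;
	}
	model_iter_bits_destroy(&it);
	return sum;
}
```

Note that the value handed back by the next step is the index of a set
bit, not the bit's value, so a mask of 0x100 yields the single index 8.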
Changes:
- v6->v7:
- Fix endianness error for non-long-aligned data (Andrii)
- v5->v6:
- Add positive tests (Andrii)
- v4->v5:
- Simplify test cases (Andrii)
- v3->v4:
- Fix endianness error on s390x (Andrii)
- zero-initialize kit->bits_copy and zero out nr_bits (Andrii)
- v2->v3:
- Optimization for u64/u32 mask (Andrii)
- v1->v2:
- Simplify the CPU number verification code to avoid the failure on s390x
(Eduard)
- bpf: Add bpf_iter_cpumask
https://lwn.net/Articles/961104/
- bpf: Add new bpf helper bpf_for_each_cpu
https://lwn.net/Articles/939939/
Yafang Shao (2):
bpf: Add bits iterator
selftests/bpf: Add selftest for bits iter
kernel/bpf/helpers.c | 140 +++++++++++++++
.../selftests/bpf/prog_tests/verifier.c | 2 +
.../selftests/bpf/progs/verifier_bits_iter.c | 160 ++++++++++++++++++
3 files changed, 302 insertions(+)
create mode 100644 tools/testing/selftests/bpf/progs/verifier_bits_iter.c
--
2.30.1 (Apple Git-130)
* [PATCH v7 bpf-next 1/2] bpf: Add bits iterator
2024-05-06 3:33 [PATCH v7 bpf-next 0/2] bpf: Add a generic bits iterator Yafang Shao
@ 2024-05-06 3:33 ` Yafang Shao
2024-05-07 3:38 ` Andrii Nakryiko
2024-05-06 3:33 ` [PATCH v7 bpf-next 2/2] selftests/bpf: Add selftest for bits iter Yafang Shao
1 sibling, 1 reply; 11+ messages in thread
From: Yafang Shao @ 2024-05-06 3:33 UTC
To: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
yonghong.song, kpsingh, sdf, haoluo, jolsa
Cc: bpf, Yafang Shao
Add three new kfuncs for the bits iterator:
- bpf_iter_bits_new
Initialize a new bits iterator for a given memory area. Due to the
limitation of bpf memalloc, the max number of bits that can be iterated
over is limited to (4096 * 8).
- bpf_iter_bits_next
Get the next bit in a bpf_iter_bits
- bpf_iter_bits_destroy
Destroy a bpf_iter_bits
The bits iterator facilitates the iteration of the bits of a memory area,
such as cpumask. It can be used in any context and on any address.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
kernel/bpf/helpers.c | 140 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 140 insertions(+)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 2a69a9a36c0f..83b2a02f795f 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2744,6 +2744,143 @@ __bpf_kfunc void bpf_preempt_enable(void)
preempt_enable();
}
+struct bpf_iter_bits {
+ __u64 __opaque[2];
+} __aligned(8);
+
+struct bpf_iter_bits_kern {
+ union {
+ unsigned long *bits;
+ unsigned long bits_copy;
+ };
+ u32 nr_bits;
+ int bit;
+} __aligned(8);
+
+/**
+ * bpf_iter_bits_new() - Initialize a new bits iterator for a given memory area
+ * @it: The new bpf_iter_bits to be created
+ * @unsafe_ptr__ign: A ponter pointing to a memory area to be iterated over
+ * @nr_bits: The number of bits to be iterated over. Due to the limitation of
+ * memalloc, it can't greater than (4096 * 8).
+ *
+ * This function initializes a new bpf_iter_bits structure for iterating over
+ * a memory area which is specified by the @unsafe_ptr__ign and @nr_bits. It
+ * copy the data of the memory area to the newly created bpf_iter_bits @it for
+ * subsequent iteration operations.
+ *
+ * On success, 0 is returned. On failure, ERR is returned.
+ */
+__bpf_kfunc int
+bpf_iter_bits_new(struct bpf_iter_bits *it, const void *unsafe_ptr__ign, u32 nr_bits)
+{
+ struct bpf_iter_bits_kern *kit = (void *)it;
+ u32 words = BITS_TO_LONGS(nr_bits);
+ u32 size = BITS_TO_BYTES(nr_bits);
+ u32 left, offset;
+ int err;
+
+ BUILD_BUG_ON(sizeof(struct bpf_iter_bits_kern) != sizeof(struct bpf_iter_bits));
+ BUILD_BUG_ON(__alignof__(struct bpf_iter_bits_kern) !=
+ __alignof__(struct bpf_iter_bits));
+
+ if (!unsafe_ptr__ign || !nr_bits) {
+ kit->bits = NULL;
+ return -EINVAL;
+ }
+
+ kit->nr_bits = 0;
+ kit->bits_copy = 0;
+ /* Optimization for u64/u32 mask */
+ if (nr_bits <= 64) {
+ /* For big-endian, we must calculate the offset */
+ offset = IS_ENABLED(CONFIG_S390) ? sizeof(u64) - size : 0;
+
+ err = bpf_probe_read_kernel_common(((char *)&kit->bits_copy) + offset,
+ size, unsafe_ptr__ign);
+ if (err)
+ return -EFAULT;
+
+ kit->nr_bits = nr_bits;
+ kit->bit = -1;
+ return 0;
+ }
+
+ /* Fallback to memalloc */
+ kit->bits = bpf_mem_alloc(&bpf_global_ma, size);
+ if (!kit->bits)
+ return -ENOMEM;
+
+ err = bpf_probe_read_kernel_common(kit->bits, words * sizeof(u64), unsafe_ptr__ign);
+ if (err) {
+ bpf_mem_free(&bpf_global_ma, kit->bits);
+ return err;
+ }
+
+ /* long-aligned */
+ left = size & (sizeof(u64) - 1);
+ if (!left)
+ goto out;
+
+ offset = IS_ENABLED(CONFIG_S390) ? sizeof(u64) - left : 0;
+ err = bpf_probe_read_kernel_common((char *)(kit->bits + words - 1) + offset, left,
+ unsafe_ptr__ign + (words - 1) * sizeof(u64));
+ if (err) {
+ bpf_mem_free(&bpf_global_ma, kit->bits);
+ return err;
+ }
+
+out:
+ kit->nr_bits = nr_bits;
+ kit->bit = -1;
+ return 0;
+}
+
+/**
+ * bpf_iter_bits_next() - Get the next bit in a bpf_iter_bits
+ * @it: The bpf_iter_bits to be checked
+ *
+ * This function returns a pointer to a number representing the value of the
+ * next bit in the bits.
+ *
+ * If there are no further bit available, it returns NULL.
+ */
+__bpf_kfunc int *bpf_iter_bits_next(struct bpf_iter_bits *it)
+{
+ struct bpf_iter_bits_kern *kit = (void *)it;
+ u32 nr_bits = kit->nr_bits;
+ const unsigned long *bits;
+ int bit;
+
+ if (nr_bits == 0)
+ return NULL;
+
+ bits = nr_bits <= 64 ? &kit->bits_copy : kit->bits;
+ bit = find_next_bit(bits, nr_bits, kit->bit + 1);
+ if (bit >= nr_bits) {
+ kit->nr_bits = 0;
+ return NULL;
+ }
+
+ kit->bit = bit;
+ return &kit->bit;
+}
+
+/**
+ * bpf_iter_bits_destroy() - Destroy a bpf_iter_bits
+ * @it: The bpf_iter_bits to be destroyed
+ *
+ * Destroy the resource associated with the bpf_iter_bits.
+ */
+__bpf_kfunc void bpf_iter_bits_destroy(struct bpf_iter_bits *it)
+{
+ struct bpf_iter_bits_kern *kit = (void *)it;
+
+ if (kit->nr_bits <= 64)
+ return;
+ bpf_mem_free(&bpf_global_ma, kit->bits);
+}
+
__bpf_kfunc_end_defs();
BTF_KFUNCS_START(generic_btf_ids)
@@ -2826,6 +2963,9 @@ BTF_ID_FLAGS(func, bpf_wq_set_callback_impl)
BTF_ID_FLAGS(func, bpf_wq_start)
BTF_ID_FLAGS(func, bpf_preempt_disable)
BTF_ID_FLAGS(func, bpf_preempt_enable)
+BTF_ID_FLAGS(func, bpf_iter_bits_new, KF_ITER_NEW)
+BTF_ID_FLAGS(func, bpf_iter_bits_next, KF_ITER_NEXT | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_iter_bits_destroy, KF_ITER_DESTROY)
BTF_KFUNCS_END(common_btf_ids)
static const struct btf_kfunc_id_set common_kfunc_set = {
--
2.30.1 (Apple Git-130)
* [PATCH v7 bpf-next 2/2] selftests/bpf: Add selftest for bits iter
2024-05-06 3:33 [PATCH v7 bpf-next 0/2] bpf: Add a generic bits iterator Yafang Shao
2024-05-06 3:33 ` [PATCH v7 bpf-next 1/2] bpf: Add " Yafang Shao
@ 2024-05-06 3:33 ` Yafang Shao
2024-05-07 3:42 ` Andrii Nakryiko
1 sibling, 1 reply; 11+ messages in thread
From: Yafang Shao @ 2024-05-06 3:33 UTC
To: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
yonghong.song, kpsingh, sdf, haoluo, jolsa
Cc: bpf, Yafang Shao
Add test cases for the bits iter:
- positive case
- bit mask smaller than 8 bytes
- a typical case of having 8-byte bit mask
- another typical case where bit mask is > 8 bytes
- the index of set bit
- negative cases
- bpf_iter_bits_destroy() is required after calling
bpf_iter_bits_new()
- bpf_iter_bits_destroy() can only destroy an initialized iter
- bpf_iter_bits_next() must use an initialized iter
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
.../selftests/bpf/prog_tests/verifier.c | 2 +
.../selftests/bpf/progs/verifier_bits_iter.c | 160 ++++++++++++++++++
2 files changed, 162 insertions(+)
create mode 100644 tools/testing/selftests/bpf/progs/verifier_bits_iter.c
diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
index c4f9f306646e..7e04ecaaa20a 100644
--- a/tools/testing/selftests/bpf/prog_tests/verifier.c
+++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
@@ -84,6 +84,7 @@
#include "verifier_xadd.skel.h"
#include "verifier_xdp.skel.h"
#include "verifier_xdp_direct_packet_access.skel.h"
+#include "verifier_bits_iter.skel.h"
#define MAX_ENTRIES 11
@@ -198,6 +199,7 @@ void test_verifier_var_off(void) { RUN(verifier_var_off); }
void test_verifier_xadd(void) { RUN(verifier_xadd); }
void test_verifier_xdp(void) { RUN(verifier_xdp); }
void test_verifier_xdp_direct_packet_access(void) { RUN(verifier_xdp_direct_packet_access); }
+void test_verifier_bits_iter(void) { RUN(verifier_bits_iter); }
static int init_test_val_map(struct bpf_object *obj, char *map_name)
{
diff --git a/tools/testing/selftests/bpf/progs/verifier_bits_iter.c b/tools/testing/selftests/bpf/progs/verifier_bits_iter.c
new file mode 100644
index 000000000000..2f7b62b25638
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/verifier_bits_iter.c
@@ -0,0 +1,160 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2024 Yafang Shao <laoar.shao@gmail.com> */
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+#include "bpf_misc.h"
+#include "task_kfunc_common.h"
+
+char _license[] SEC("license") = "GPL";
+
+int bpf_iter_bits_new(struct bpf_iter_bits *it, const void *unsafe_ptr__ign,
+ u32 nr_bits) __ksym __weak;
+int *bpf_iter_bits_next(struct bpf_iter_bits *it) __ksym __weak;
+void bpf_iter_bits_destroy(struct bpf_iter_bits *it) __ksym __weak;
+
+SEC("iter.s/cgroup")
+__description("bits iter without destroy")
+__failure __msg("Unreleased reference")
+int BPF_PROG(no_destroy, struct bpf_iter_meta *meta, struct cgroup *cgrp)
+{
+ struct bpf_iter_bits it;
+ struct task_struct *p;
+
+ p = bpf_task_from_pid(1);
+ if (!p)
+ return 1;
+
+ bpf_iter_bits_new(&it, p->cpus_ptr, 8192);
+
+ bpf_iter_bits_next(&it);
+ bpf_task_release(p);
+ return 0;
+}
+
+SEC("iter/cgroup")
+__description("bits iter with uninitialized iter in ->next()")
+__failure __msg("expected an initialized iter_bits as arg #1")
+int BPF_PROG(next_uninit, struct bpf_iter_meta *meta, struct cgroup *cgrp)
+{
+ struct bpf_iter_bits *it = NULL;
+
+ bpf_iter_bits_next(it);
+ return 0;
+}
+
+SEC("iter/cgroup")
+__description("bits iter with uninitialized iter in ->destroy()")
+__failure __msg("expected an initialized iter_bits as arg #1")
+int BPF_PROG(destroy_uninit, struct bpf_iter_meta *meta, struct cgroup *cgrp)
+{
+ struct bpf_iter_bits it = {};
+
+ bpf_iter_bits_destroy(&it);
+ return 0;
+}
+
+SEC("syscall")
+__description("bits copy 32")
+__success __retval(10)
+int bits_copy32(void)
+{
+ /* 21 bits: --------------------- */
+ u32 data = 0b11111101111101111100001000100101U;
+ int nr = 0, offset = 0;
+ int *bit;
+
+#if defined(__TARGET_ARCH_s390)
+ offset = sizeof(u32) - (21 + 7) / 8;
+#endif
+ bpf_for_each(bits, bit, ((char *)&data) + offset, 21)
+ nr++;
+ return nr;
+}
+
+SEC("syscall")
+__description("bits copy 64")
+__success __retval(18)
+int bits_copy64(void)
+{
+ /* 34 bits: ~-------- */
+ u64 data = 0xffffefdf0f0f0f0fUL;
+ int nr = 0, offset = 0;
+ int *bit;
+
+#if defined(__TARGET_ARCH_s390)
+ offset = sizeof(u64) - (34 + 7) / 8;
+#endif
+
+ bpf_for_each(bits, bit, ((char *)&data) + offset, 34)
+ nr++;
+ return nr;
+}
+
+SEC("syscall")
+__description("bits memalloc long-aligned")
+__success __retval(32) /* 16 * 2 */
+int bits_memalloc(void)
+{
+ char data[16];
+ int nr = 0;
+ int *bit;
+
+ __builtin_memset(&data, 0x48, sizeof(data));
+ bpf_for_each(bits, bit, &data, sizeof(data) * 8)
+ nr++;
+ return nr;
+}
+
+SEC("syscall")
+__description("bits memalloc non-long-aligned")
+__success __retval(85) /* 17 * 5*/
+int bits_memalloc_non_aligned(void)
+{
+ char data[17];
+ int nr = 0;
+ int *bit;
+
+ __builtin_memset(&data, 0x1f, sizeof(data));
+ bpf_for_each(bits, bit, &data, sizeof(data) * 8)
+ nr++;
+ return nr;
+}
+
+SEC("syscall")
+__description("bits memalloc non-aligned-bits")
+__success __retval(27) /* 8 * 3 + 3 */
+int bits_memalloc_non_aligned_bits(void)
+{
+ char data[16];
+ int nr = 0;
+ int *bit;
+
+ __builtin_memset(&data, 0x31, sizeof(data));
+ /* Different with all other bytes */
+ data[8] = 0xf7;
+
+ bpf_for_each(bits, bit, &data, 68)
+ nr++;
+ return nr;
+}
+
+
+SEC("syscall")
+__description("bit index")
+__success __retval(8)
+int bit_index(void)
+{
+ u64 data = 0x100;
+ int bit_idx = 0;
+ int *bit;
+
+ bpf_for_each(bits, bit, &data, 64) {
+ if (*bit == 0)
+ continue;
+ bit_idx = *bit;
+ }
+ return bit_idx;
+}
--
2.30.1 (Apple Git-130)
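The return values expected by the positive selftests in this patch can
be cross-checked with a small userspace sketch. It assumes the
little-endian view of the data, where bit 0 is the least significant
bit of byte 0; the helper names and the byte arrays mask32_le and
mask64_le (the little-endian byte images of the u32 and u64 test
values) are introduced here for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the set bits among the first nr_bits bits of a byte array,
 * taking bit 0 to be the least significant bit of byte 0.
 */
static unsigned int count_set_bits(const unsigned char *data, size_t len,
				   unsigned int nr_bits)
{
	unsigned int count = 0;

	assert(nr_bits <= len * 8);
	for (unsigned int i = 0; i < nr_bits; i++)
		if (data[i / 8] & (1u << (i % 8)))
			count++;
	return count;
}

/* Mirrors the memset-based fixtures: len bytes all set to fill. */
static unsigned int count_filled(unsigned char fill, size_t len,
				 unsigned int nr_bits)
{
	unsigned char data[32];

	assert(len <= sizeof(data));
	memset(data, fill, len);
	return count_set_bits(data, len, nr_bits);
}

/* Mirrors the "non-aligned-bits" fixture: 0x31 everywhere, 0xf7 in byte 8. */
static unsigned int count_non_aligned_bits_fixture(void)
{
	unsigned char data[16];

	memset(data, 0x31, sizeof(data));
	data[8] = 0xf7;
	return count_set_bits(data, sizeof(data), 68);
}

/* Little-endian byte images of 0xfdf7c225 and 0xffffefdf0f0f0f0f. */
static const unsigned char mask32_le[4] = { 0x25, 0xc2, 0xf7, 0xfd };
static const unsigned char mask64_le[8] = {
	0x0f, 0x0f, 0x0f, 0x0f, 0xdf, 0xef, 0xff, 0xff
};
```

Under that assumption the arithmetic in the test comments works out:
0x48 has two set bits per byte (16 * 2), 0x1f has five (17 * 5), and in
the 68-bit case the first eight 0x31 bytes contribute 8 * 3 while the
low four bits of 0xf7 contribute the final 3.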
* Re: [PATCH v7 bpf-next 1/2] bpf: Add bits iterator
2024-05-06 3:33 ` [PATCH v7 bpf-next 1/2] bpf: Add " Yafang Shao
@ 2024-05-07 3:38 ` Andrii Nakryiko
2024-05-07 13:32 ` Yafang Shao
0 siblings, 1 reply; 11+ messages in thread
From: Andrii Nakryiko @ 2024-05-07 3:38 UTC
To: Yafang Shao, David Vernet
Cc: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
yonghong.song, kpsingh, sdf, haoluo, jolsa, bpf
On Sun, May 5, 2024 at 8:35 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> Add three new kfuncs for the bits iterator:
> - bpf_iter_bits_new
> Initialize a new bits iterator for a given memory area. Due to the
> limitation of bpf memalloc, the max number of bits that can be iterated
> over is limited to (4096 * 8).
> - bpf_iter_bits_next
> Get the next bit in a bpf_iter_bits
> - bpf_iter_bits_destroy
> Destroy a bpf_iter_bits
>
> The bits iterator facilitates the iteration of the bits of a memory area,
> such as cpumask. It can be used in any context and on any address.
>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> ---
> kernel/bpf/helpers.c | 140 +++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 140 insertions(+)
>
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 2a69a9a36c0f..83b2a02f795f 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -2744,6 +2744,143 @@ __bpf_kfunc void bpf_preempt_enable(void)
> preempt_enable();
> }
>
> +struct bpf_iter_bits {
> + __u64 __opaque[2];
> +} __aligned(8);
> +
> +struct bpf_iter_bits_kern {
> + union {
> + unsigned long *bits;
> + unsigned long bits_copy;
> + };
> + u32 nr_bits;
> + int bit;
> +} __aligned(8);
> +
> +/**
> + * bpf_iter_bits_new() - Initialize a new bits iterator for a given memory area
> + * @it: The new bpf_iter_bits to be created
> + * @unsafe_ptr__ign: A ponter pointing to a memory area to be iterated over
typo: pointer
> + * @nr_bits: The number of bits to be iterated over. Due to the limitation of
> + * memalloc, it can't greater than (4096 * 8).
typo: can't be greater
> + *
> + * This function initializes a new bpf_iter_bits structure for iterating over
> + * a memory area which is specified by the @unsafe_ptr__ign and @nr_bits. It
> + * copy the data of the memory area to the newly created bpf_iter_bits @it for
s/copy/copies/
> + * subsequent iteration operations.
> + *
> + * On success, 0 is returned. On failure, ERR is returned.
> + */
> +__bpf_kfunc int
> +bpf_iter_bits_new(struct bpf_iter_bits *it, const void *unsafe_ptr__ign, u32 nr_bits)
> +{
> + struct bpf_iter_bits_kern *kit = (void *)it;
> + u32 words = BITS_TO_LONGS(nr_bits);
> + u32 size = BITS_TO_BYTES(nr_bits);
> + u32 left, offset;
> + int err;
> +
> + BUILD_BUG_ON(sizeof(struct bpf_iter_bits_kern) != sizeof(struct bpf_iter_bits));
> + BUILD_BUG_ON(__alignof__(struct bpf_iter_bits_kern) !=
> + __alignof__(struct bpf_iter_bits));
> +
> + if (!unsafe_ptr__ign || !nr_bits) {
> + kit->bits = NULL;
> + return -EINVAL;
> + }
> +
> + kit->nr_bits = 0;
> + kit->bits_copy = 0;
> + /* Optimization for u64/u32 mask */
> + if (nr_bits <= 64) {
> + /* For big-endian, we must calculate the offset */
> + offset = IS_ENABLED(CONFIG_S390) ? sizeof(u64) - size : 0;
S390 isn't the only big-endian architecture, so it's wrong to hard-code
just S390. There is a __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ check used
throughout the kernel to do this detection.
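As a rough userspace illustration of that kind of detection (a sketch
only, assuming a GCC- or Clang-compatible compiler that predefines
__BYTE_ORDER__; HOST_IS_BIG_ENDIAN, runtime_is_big_endian() and
slot_offset() are hypothetical names, not kernel symbols):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Compile-time byte-order detection based on compiler-provided macros,
 * so that no architecture name has to be hard-coded.
 */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define HOST_IS_BIG_ENDIAN 1
#else
#define HOST_IS_BIG_ENDIAN 0
#endif

/* Runtime cross-check: inspect which byte of a 32-bit 1 is stored first. */
static int runtime_is_big_endian(void)
{
	const uint32_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, sizeof(first));
	return first == 0;
}

/* Byte offset at which a size-byte value sits inside a 64-bit slot. */
static unsigned int slot_offset(unsigned int size)
{
	assert(size >= 1 && size <= sizeof(uint64_t));
	return HOST_IS_BIG_ENDIAN ? (unsigned int)sizeof(uint64_t) - size : 0;
}
```

slot_offset() computes the same offset as the quoted line, but keyed on
byte order rather than on one architecture.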
> +
> + err = bpf_probe_read_kernel_common(((char *)&kit->bits_copy) + offset,
> + size, unsafe_ptr__ign);
> + if (err)
> + return -EFAULT;
I'd rewrite the above to something like (not tested, but should give
the right idea):

    long bits = 0;

    err = bpf_probe_read_kernel_common(&bits, size, unsafe_ptr__ign);
    if (err)
            return -EFAULT;

    #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    bits = __swab64(bits);
    #endif

    /* deal with bit mask of weird size, ensuring upper bits are zero */
    bits <<= 64 - nr_bits;
    bits >>= 64 - nr_bits;

    kit->bits_copy = bits;

This should take care of both big-endianness, and non-multiple-of-8
sized bitmasks (I think, we need tests).
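A userspace sketch of the same swap-and-mask idea follows, under some
assumptions: memcpy() stands in for the probe read, __builtin_bswap64()
(a GCC/Clang builtin) stands in for __swab64(), and load_mask() and
example_mask are hypothetical names. It uses an unsigned 64-bit type so
that the right shift is a logical shift; with a signed type the result
of shifting a negative value right is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Load the first nr_bits bits (1..64) of a byte array into a 64-bit
 * word so that bit 0 is the least significant bit of byte 0, with all
 * bits at positions >= nr_bits cleared.
 */
static uint64_t load_mask(const void *src, unsigned int nr_bits)
{
	unsigned int size = (nr_bits + 7) / 8;	/* bytes covering nr_bits */
	uint64_t bits = 0;

	assert(nr_bits >= 1 && nr_bits <= 64);
	memcpy(&bits, src, size);	/* stands in for the probe read */

#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	bits = __builtin_bswap64(bits);	/* builtin, not the kernel helper */
#endif

	/* Unsigned shifts: clear everything above bit nr_bits - 1. */
	bits <<= 64 - nr_bits;
	bits >>= 64 - nr_bits;
	return bits;
}

/* Example input: byte 0 holds bits 0..7, byte 1 holds bits 8..15, ... */
static const unsigned char example_mask[8] = {
	0x25, 0xc2, 0xf7, 0xfd, 0xff, 0xff, 0xff, 0xff
};
```

With 21 bits requested, only the low five bits of the third byte (0xf7)
survive the masking, which is the non-multiple-of-8 case mentioned
above.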
pw-bot: cr
> +
> + kit->nr_bits = nr_bits;
> + kit->bit = -1;
> + return 0;
> + }
> +
> + /* Fallback to memalloc */
> + kit->bits = bpf_mem_alloc(&bpf_global_ma, size);
> + if (!kit->bits)
> + return -ENOMEM;
> +
> + err = bpf_probe_read_kernel_common(kit->bits, words * sizeof(u64), unsafe_ptr__ign);
> + if (err) {
> + bpf_mem_free(&bpf_global_ma, kit->bits);
> + return err;
> + }
> +
> + /* long-aligned */
> + left = size & (sizeof(u64) - 1);
> + if (!left)
> + goto out;
> +
> + offset = IS_ENABLED(CONFIG_S390) ? sizeof(u64) - left : 0;
> + err = bpf_probe_read_kernel_common((char *)(kit->bits + words - 1) + offset, left,
> + unsafe_ptr__ign + (words - 1) * sizeof(u64));
> + if (err) {
> + bpf_mem_free(&bpf_global_ma, kit->bits);
> + return err;
> + }
tbh, I'm not sure what the desired behavior here is. David (cc'ed),
you were dealing with cpumasks, how is the bit mask specified there?
Is it considered to be a long[] array or a byte[] array? And how is
that working on big-endian, because I think it makes a difference?
Please take a look, thanks.
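To make the long[] versus byte[] difference concrete, here is a small
sketch that computes which byte of a long[] bitmap holds a given bit
number under each byte order. It assumes 64-bit longs, and
byte_holding_bit() is a hypothetical helper used only for illustration.

```c
#include <assert.h>

#define MODEL_BITS_PER_LONG 64u	/* assumption: 64-bit unsigned long */

/*
 * For a bitmap stored as an array of 64-bit words, return the offset
 * (from the start of the array) of the byte that holds bit number n.
 */
static unsigned int byte_holding_bit(unsigned int n, int big_endian)
{
	unsigned int word = n / MODEL_BITS_PER_LONG;
	unsigned int byte_in_word = (n % MODEL_BITS_PER_LONG) / 8;

	assert(big_endian == 0 || big_endian == 1);
	/* Big-endian stores the least significant byte of each word last. */
	if (big_endian)
		byte_in_word = 7 - byte_in_word;
	return word * 8 + byte_in_word;
}
```

On little-endian the long[] and byte[] views agree (bit n lives in byte
n / 8), while on big-endian bit 0 of the first word lives in byte 7, so
the two interpretations only coincide on little-endian machines.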
> +
> +out:
> + kit->nr_bits = nr_bits;
> + kit->bit = -1;
> + return 0;
> +}
> +
> +/**
> + * bpf_iter_bits_next() - Get the next bit in a bpf_iter_bits
> + * @it: The bpf_iter_bits to be checked
> + *
> + * This function returns a pointer to a number representing the value of the
> + * next bit in the bits.
> + *
> + * If there are no further bit available, it returns NULL.
> + */
> +__bpf_kfunc int *bpf_iter_bits_next(struct bpf_iter_bits *it)
> +{
> + struct bpf_iter_bits_kern *kit = (void *)it;
> + u32 nr_bits = kit->nr_bits;
> + const unsigned long *bits;
> + int bit;
> +
> + if (nr_bits == 0)
> + return NULL;
> +
> + bits = nr_bits <= 64 ? &kit->bits_copy : kit->bits;
> + bit = find_next_bit(bits, nr_bits, kit->bit + 1);
> + if (bit >= nr_bits) {
> + kit->nr_bits = 0;
> + return NULL;
> + }
> +
> + kit->bit = bit;
> + return &kit->bit;
> +}
> +
> +/**
> + * bpf_iter_bits_destroy() - Destroy a bpf_iter_bits
> + * @it: The bpf_iter_bits to be destroyed
> + *
> + * Destroy the resource associated with the bpf_iter_bits.
> + */
> +__bpf_kfunc void bpf_iter_bits_destroy(struct bpf_iter_bits *it)
> +{
> + struct bpf_iter_bits_kern *kit = (void *)it;
> +
> + if (kit->nr_bits <= 64)
> + return;
> + bpf_mem_free(&bpf_global_ma, kit->bits);
> +}
> +
> __bpf_kfunc_end_defs();
>
> BTF_KFUNCS_START(generic_btf_ids)
> @@ -2826,6 +2963,9 @@ BTF_ID_FLAGS(func, bpf_wq_set_callback_impl)
> BTF_ID_FLAGS(func, bpf_wq_start)
> BTF_ID_FLAGS(func, bpf_preempt_disable)
> BTF_ID_FLAGS(func, bpf_preempt_enable)
> +BTF_ID_FLAGS(func, bpf_iter_bits_new, KF_ITER_NEW)
> +BTF_ID_FLAGS(func, bpf_iter_bits_next, KF_ITER_NEXT | KF_RET_NULL)
> +BTF_ID_FLAGS(func, bpf_iter_bits_destroy, KF_ITER_DESTROY)
> BTF_KFUNCS_END(common_btf_ids)
>
> static const struct btf_kfunc_id_set common_kfunc_set = {
> --
> 2.30.1 (Apple Git-130)
>
* Re: [PATCH v7 bpf-next 2/2] selftests/bpf: Add selftest for bits iter
2024-05-06 3:33 ` [PATCH v7 bpf-next 2/2] selftests/bpf: Add selftest for bits iter Yafang Shao
@ 2024-05-07 3:42 ` Andrii Nakryiko
2024-05-07 13:38 ` Yafang Shao
0 siblings, 1 reply; 11+ messages in thread
From: Andrii Nakryiko @ 2024-05-07 3:42 UTC
To: Yafang Shao
Cc: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
yonghong.song, kpsingh, sdf, haoluo, jolsa, bpf
On Sun, May 5, 2024 at 8:35 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> Add test cases for the bits iter:
> - positive case
> - bit mask smaller than 8 bytes
> - a typical case of having 8-byte bit mask
> - another typical case where bit mask is > 8 bytes
> - the index of set bit
>
> - negative cases
> - bpf_iter_bits_destroy() is required after calling
> bpf_iter_bits_new()
> - bpf_iter_bits_destroy() can only destroy an initialized iter
> - bpf_iter_bits_next() must use an initialized iter
>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> ---
> .../selftests/bpf/prog_tests/verifier.c | 2 +
> .../selftests/bpf/progs/verifier_bits_iter.c | 160 ++++++++++++++++++
> 2 files changed, 162 insertions(+)
> create mode 100644 tools/testing/selftests/bpf/progs/verifier_bits_iter.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
> index c4f9f306646e..7e04ecaaa20a 100644
> --- a/tools/testing/selftests/bpf/prog_tests/verifier.c
> +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
> @@ -84,6 +84,7 @@
> #include "verifier_xadd.skel.h"
> #include "verifier_xdp.skel.h"
> #include "verifier_xdp_direct_packet_access.skel.h"
> +#include "verifier_bits_iter.skel.h"
>
> #define MAX_ENTRIES 11
>
> @@ -198,6 +199,7 @@ void test_verifier_var_off(void) { RUN(verifier_var_off); }
> void test_verifier_xadd(void) { RUN(verifier_xadd); }
> void test_verifier_xdp(void) { RUN(verifier_xdp); }
> void test_verifier_xdp_direct_packet_access(void) { RUN(verifier_xdp_direct_packet_access); }
> +void test_verifier_bits_iter(void) { RUN(verifier_bits_iter); }
>
> static int init_test_val_map(struct bpf_object *obj, char *map_name)
> {
> diff --git a/tools/testing/selftests/bpf/progs/verifier_bits_iter.c b/tools/testing/selftests/bpf/progs/verifier_bits_iter.c
> new file mode 100644
> index 000000000000..2f7b62b25638
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/verifier_bits_iter.c
> @@ -0,0 +1,160 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright (c) 2024 Yafang Shao <laoar.shao@gmail.com> */
> +
> +#include "vmlinux.h"
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +
> +#include "bpf_misc.h"
> +#include "task_kfunc_common.h"
> +
> +char _license[] SEC("license") = "GPL";
> +
> +int bpf_iter_bits_new(struct bpf_iter_bits *it, const void *unsafe_ptr__ign,
> + u32 nr_bits) __ksym __weak;
> +int *bpf_iter_bits_next(struct bpf_iter_bits *it) __ksym __weak;
> +void bpf_iter_bits_destroy(struct bpf_iter_bits *it) __ksym __weak;
> +
> +SEC("iter.s/cgroup")
> +__description("bits iter without destroy")
> +__failure __msg("Unreleased reference")
> +int BPF_PROG(no_destroy, struct bpf_iter_meta *meta, struct cgroup *cgrp)
> +{
> + struct bpf_iter_bits it;
> + struct task_struct *p;
> +
> + p = bpf_task_from_pid(1);
> + if (!p)
> + return 1;
> +
> + bpf_iter_bits_new(&it, p->cpus_ptr, 8192);
> +
> + bpf_iter_bits_next(&it);
> + bpf_task_release(p);
> + return 0;
> +}
> +
> +SEC("iter/cgroup")
> +__description("bits iter with uninitialized iter in ->next()")
> +__failure __msg("expected an initialized iter_bits as arg #1")
> +int BPF_PROG(next_uninit, struct bpf_iter_meta *meta, struct cgroup *cgrp)
> +{
> + struct bpf_iter_bits *it = NULL;
> +
> + bpf_iter_bits_next(it);
> + return 0;
> +}
> +
> +SEC("iter/cgroup")
> +__description("bits iter with uninitialized iter in ->destroy()")
> +__failure __msg("expected an initialized iter_bits as arg #1")
> +int BPF_PROG(destroy_uninit, struct bpf_iter_meta *meta, struct cgroup *cgrp)
> +{
> + struct bpf_iter_bits it = {};
> +
> + bpf_iter_bits_destroy(&it);
> + return 0;
> +}
> +
> +SEC("syscall")
> +__description("bits copy 32")
> +__success __retval(10)
> +int bits_copy32(void)
> +{
> + /* 21 bits: --------------------- */
> + u32 data = 0b11111101111101111100001000100101U;
if you define this bit mask as an array of bytes, then you won't have
to handle big-endian in the tests at all
> + int nr = 0, offset = 0;
> + int *bit;
> +
> +#if defined(__TARGET_ARCH_s390)
> + offset = sizeof(u32) - (21 + 7) / 8;
> +#endif
> + bpf_for_each(bits, bit, ((char *)&data) + offset, 21)
> + nr++;
> + return nr;
> +}
> +
> +SEC("syscall")
> +__description("bits copy 64")
> +__success __retval(18)
> +int bits_copy64(void)
> +{
> + /* 34 bits: ~-------- */
> + u64 data = 0xffffefdf0f0f0f0fUL;
> + int nr = 0, offset = 0;
> + int *bit;
> +
> +#if defined(__TARGET_ARCH_s390)
> + offset = sizeof(u64) - (34 + 7) / 8;
> +#endif
> +
> + bpf_for_each(bits, bit, ((char *)&data) + offset, 34)
See above about the byte array, but if we define it differently (not as
a byte array but as long[]), it would be cleaner to have

    #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    u64 data = 0x......UL;
    #else
    u64 data = 0x......UL;
    #endif

where we'd hard-code the bit masks in the proper endianness in one place
and then just do clean `bpf_for_each(bits, bit, &data, <len>) {}` calls
> + nr++;
> + return nr;
> +}
> +
> +SEC("syscall")
> +__description("bits memalloc long-aligned")
> +__success __retval(32) /* 16 * 2 */
> +int bits_memalloc(void)
> +{
> + char data[16];
> + int nr = 0;
> + int *bit;
> +
> + __builtin_memset(&data, 0x48, sizeof(data));
> + bpf_for_each(bits, bit, &data, sizeof(data) * 8)
> + nr++;
> + return nr;
> +}
> +
> +SEC("syscall")
> +__description("bits memalloc non-long-aligned")
> +__success __retval(85) /* 17 * 5*/
> +int bits_memalloc_non_aligned(void)
> +{
> + char data[17];
> + int nr = 0;
> + int *bit;
> +
> + __builtin_memset(&data, 0x1f, sizeof(data));
> + bpf_for_each(bits, bit, &data, sizeof(data) * 8)
> + nr++;
> + return nr;
> +}
> +
> +SEC("syscall")
> +__description("bits memalloc non-aligned-bits")
> +__success __retval(27) /* 8 * 3 + 3 */
> +int bits_memalloc_non_aligned_bits(void)
> +{
> + char data[16];
> + int nr = 0;
> + int *bit;
> +
> + __builtin_memset(&data, 0x31, sizeof(data));
> + /* Different with all other bytes */
> + data[8] = 0xf7;
> +
> + bpf_for_each(bits, bit, &data, 68)
> + nr++;
> + return nr;
> +}
> +
> +
> +SEC("syscall")
> +__description("bit index")
> +__success __retval(8)
> +int bit_index(void)
> +{
> + u64 data = 0x100;
> + int bit_idx = 0;
> + int *bit;
> +
> + bpf_for_each(bits, bit, &data, 64) {
> + if (*bit == 0)
> + continue;
> + bit_idx = *bit;
> + }
> + return bit_idx;
> +}
> --
> 2.30.1 (Apple Git-130)
>
* Re: [PATCH v7 bpf-next 1/2] bpf: Add bits iterator
2024-05-07 3:38 ` Andrii Nakryiko
@ 2024-05-07 13:32 ` Yafang Shao
2024-05-07 17:09 ` Andrii Nakryiko
0 siblings, 1 reply; 11+ messages in thread
From: Yafang Shao @ 2024-05-07 13:32 UTC
To: Andrii Nakryiko
Cc: David Vernet, ast, daniel, john.fastabend, andrii, martin.lau,
eddyz87, song, yonghong.song, kpsingh, sdf, haoluo, jolsa, bpf
On Tue, May 7, 2024 at 11:38 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Sun, May 5, 2024 at 8:35 PM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > Add three new kfuncs for the bits iterator:
> > - bpf_iter_bits_new
> > Initialize a new bits iterator for a given memory area. Due to the
> > limitation of bpf memalloc, the max number of bits that can be iterated
> > over is limited to (4096 * 8).
> > - bpf_iter_bits_next
> > Get the next bit in a bpf_iter_bits
> > - bpf_iter_bits_destroy
> > Destroy a bpf_iter_bits
> >
> > The bits iterator facilitates the iteration of the bits of a memory area,
> > such as cpumask. It can be used in any context and on any address.
> >
> > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > ---
> > kernel/bpf/helpers.c | 140 +++++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 140 insertions(+)
> >
> > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > index 2a69a9a36c0f..83b2a02f795f 100644
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
> > @@ -2744,6 +2744,143 @@ __bpf_kfunc void bpf_preempt_enable(void)
> > preempt_enable();
> > }
> >
> > +struct bpf_iter_bits {
> > + __u64 __opaque[2];
> > +} __aligned(8);
> > +
> > +struct bpf_iter_bits_kern {
> > + union {
> > + unsigned long *bits;
> > + unsigned long bits_copy;
> > + };
> > + u32 nr_bits;
> > + int bit;
> > +} __aligned(8);
> > +
> > +/**
> > + * bpf_iter_bits_new() - Initialize a new bits iterator for a given memory area
> > + * @it: The new bpf_iter_bits to be created
> > + * @unsafe_ptr__ign: A ponter pointing to a memory area to be iterated over
>
> typo: pointer
Thanks for the fix and the other fixes.
>
> > + * @nr_bits: The number of bits to be iterated over. Due to the limitation of
> > + * memalloc, it can't greater than (4096 * 8).
>
> typo: can't be greater
>
> > + *
> > + * This function initializes a new bpf_iter_bits structure for iterating over
> > + * a memory area which is specified by the @unsafe_ptr__ign and @nr_bits. It
> > + * copy the data of the memory area to the newly created bpf_iter_bits @it for
>
> s/copy/copies/
>
> > + * subsequent iteration operations.
> > + *
> > + * On success, 0 is returned. On failure, ERR is returned.
> > + */
> > +__bpf_kfunc int
> > +bpf_iter_bits_new(struct bpf_iter_bits *it, const void *unsafe_ptr__ign, u32 nr_bits)
> > +{
> > + struct bpf_iter_bits_kern *kit = (void *)it;
> > + u32 words = BITS_TO_LONGS(nr_bits);
> > + u32 size = BITS_TO_BYTES(nr_bits);
> > + u32 left, offset;
> > + int err;
> > +
> > + BUILD_BUG_ON(sizeof(struct bpf_iter_bits_kern) != sizeof(struct bpf_iter_bits));
> > + BUILD_BUG_ON(__alignof__(struct bpf_iter_bits_kern) !=
> > + __alignof__(struct bpf_iter_bits));
> > +
> > + if (!unsafe_ptr__ign || !nr_bits) {
> > + kit->bits = NULL;
> > + return -EINVAL;
> > + }
> > +
> > + kit->nr_bits = 0;
> > + kit->bits_copy = 0;
> > + /* Optimization for u64/u32 mask */
> > + if (nr_bits <= 64) {
> > + /* For big-endian, we must calculate the offset */
> > + offset = IS_ENABLED(CONFIG_S390) ? sizeof(u64) - size : 0;
>
> S390 isn't the only big-endian architecture, it's wrong to hard-code just S390
>
> there is __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ check throughout the
> kernel to do this detection
I missed that. Will check it.
>
> > +
> > + err = bpf_probe_read_kernel_common(((char *)&kit->bits_copy) + offset,
> > + size, unsafe_ptr__ign);
> > + if (err)
> > + return -EFAULT;
>
> I'd rewrite the above to something like (not tested, but should give
> the right idea):
>
> long bits = 0;
>
> err = bpf_probe_read_kernel_common(&bits, size, unsafe_ptr__ign);
> if (err)
> return -EFAULT;
>
> #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
> bits = __swab64(bits);
> #endif
>
> /* deal with bit mask of weird size, ensuring upper bits are zero */
> bits <<= 64 - nr_bits;
> bits >>= 64 - nr_bits;
>
> kit->bits_copy = bits;
>
>
> This should take care of both big-endianness, and non-multiple-of-8
> sized bitmasks (I think, we need tests).
looks good, will change it.
>
> pw-bot: cr
>
>
> > +
> > + kit->nr_bits = nr_bits;
> > + kit->bit = -1;
> > + return 0;
> > + }
> > +
> > + /* Fallback to memalloc */
> > + kit->bits = bpf_mem_alloc(&bpf_global_ma, size);
> > + if (!kit->bits)
> > + return -ENOMEM;
> > +
> > + err = bpf_probe_read_kernel_common(kit->bits, words * sizeof(u64), unsafe_ptr__ign);
> > + if (err) {
> > + bpf_mem_free(&bpf_global_ma, kit->bits);
> > + return err;
> > + }
> > +
> > + /* long-aligned */
> > + left = size & (sizeof(u64) - 1);
> > + if (!left)
> > + goto out;
> > +
> > + offset = IS_ENABLED(CONFIG_S390) ? sizeof(u64) - left : 0;
> > + err = bpf_probe_read_kernel_common((char *)(kit->bits + words - 1) + offset, left,
> > + unsafe_ptr__ign + (words - 1) * sizeof(u64));
> > + if (err) {
> > + bpf_mem_free(&bpf_global_ma, kit->bits);
> > + return err;
> > + }
>
> tbh, I'm not sure what the desired behavior here is. David (cc'ed),
> you were dealing with cpumasks, how is the bit mask specified there?
> Is it considered to be a long[] array or a byte[] array? And how does
> that work on big-endian, because I think it makes a difference?
> Please take a look, thanks.
The function find_next_bit() requires the pointer to be of type
"unsigned long *", hence, we must ensure consistency by converting it
here as well. As cpumask represents a bitmap and is always of type
"unsigned long *", it remains unaffected by endianness considerations.
>
> > +
> > +out:
> > + kit->nr_bits = nr_bits;
> > + kit->bit = -1;
> > + return 0;
> > +}
> > +
> > +/**
> > + * bpf_iter_bits_next() - Get the next bit in a bpf_iter_bits
> > + * @it: The bpf_iter_bits to be checked
> > + *
> > + * This function returns a pointer to a number representing the value of the
> > + * next bit in the bits.
> > + *
> > + * If there are no further bit available, it returns NULL.
> > + */
> > +__bpf_kfunc int *bpf_iter_bits_next(struct bpf_iter_bits *it)
> > +{
> > + struct bpf_iter_bits_kern *kit = (void *)it;
> > + u32 nr_bits = kit->nr_bits;
> > + const unsigned long *bits;
> > + int bit;
> > +
> > + if (nr_bits == 0)
> > + return NULL;
> > +
> > + bits = nr_bits <= 64 ? &kit->bits_copy : kit->bits;
> > + bit = find_next_bit(bits, nr_bits, kit->bit + 1);
> > + if (bit >= nr_bits) {
> > + kit->nr_bits = 0;
> > + return NULL;
> > + }
> > +
> > + kit->bit = bit;
> > + return &kit->bit;
> > +}
> > +
> > +/**
> > + * bpf_iter_bits_destroy() - Destroy a bpf_iter_bits
> > + * @it: The bpf_iter_bits to be destroyed
> > + *
> > + * Destroy the resource associated with the bpf_iter_bits.
> > + */
> > +__bpf_kfunc void bpf_iter_bits_destroy(struct bpf_iter_bits *it)
> > +{
> > + struct bpf_iter_bits_kern *kit = (void *)it;
> > +
> > + if (kit->nr_bits <= 64)
> > + return;
> > + bpf_mem_free(&bpf_global_ma, kit->bits);
> > +}
> > +
> > __bpf_kfunc_end_defs();
> >
> > BTF_KFUNCS_START(generic_btf_ids)
> > @@ -2826,6 +2963,9 @@ BTF_ID_FLAGS(func, bpf_wq_set_callback_impl)
> > BTF_ID_FLAGS(func, bpf_wq_start)
> > BTF_ID_FLAGS(func, bpf_preempt_disable)
> > BTF_ID_FLAGS(func, bpf_preempt_enable)
> > +BTF_ID_FLAGS(func, bpf_iter_bits_new, KF_ITER_NEW)
> > +BTF_ID_FLAGS(func, bpf_iter_bits_next, KF_ITER_NEXT | KF_RET_NULL)
> > +BTF_ID_FLAGS(func, bpf_iter_bits_destroy, KF_ITER_DESTROY)
> > BTF_KFUNCS_END(common_btf_ids)
> >
> > static const struct btf_kfunc_id_set common_kfunc_set = {
> > --
> > 2.30.1 (Apple Git-130)
> >
--
Regards
Yafang
* Re: [PATCH v7 bpf-next 2/2] selftests/bpf: Add selftest for bits iter
2024-05-07 3:42 ` Andrii Nakryiko
@ 2024-05-07 13:38 ` Yafang Shao
2024-05-07 17:11 ` Andrii Nakryiko
0 siblings, 1 reply; 11+ messages in thread
From: Yafang Shao @ 2024-05-07 13:38 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
yonghong.song, kpsingh, sdf, haoluo, jolsa, bpf
On Tue, May 7, 2024 at 11:42 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Sun, May 5, 2024 at 8:35 PM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > Add test cases for the bits iter:
> > - positive case
> > - bit mask smaller than 8 bytes
> > - a typical case of having 8-byte bit mask
> > - another typical case where bit mask is > 8 bytes
> > - the index of set bit
> >
> > - negative cases
> > - bpf_iter_bits_destroy() is required after calling
> > bpf_iter_bits_new()
> > - bpf_iter_bits_destroy() can only destroy an initialized iter
> > - bpf_iter_bits_next() must use an initialized iter
> >
> > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > ---
> > .../selftests/bpf/prog_tests/verifier.c | 2 +
> > .../selftests/bpf/progs/verifier_bits_iter.c | 160 ++++++++++++++++++
> > 2 files changed, 162 insertions(+)
> > create mode 100644 tools/testing/selftests/bpf/progs/verifier_bits_iter.c
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
> > index c4f9f306646e..7e04ecaaa20a 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/verifier.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
> > @@ -84,6 +84,7 @@
> > #include "verifier_xadd.skel.h"
> > #include "verifier_xdp.skel.h"
> > #include "verifier_xdp_direct_packet_access.skel.h"
> > +#include "verifier_bits_iter.skel.h"
> >
> > #define MAX_ENTRIES 11
> >
> > @@ -198,6 +199,7 @@ void test_verifier_var_off(void) { RUN(verifier_var_off); }
> > void test_verifier_xadd(void) { RUN(verifier_xadd); }
> > void test_verifier_xdp(void) { RUN(verifier_xdp); }
> > void test_verifier_xdp_direct_packet_access(void) { RUN(verifier_xdp_direct_packet_access); }
> > +void test_verifier_bits_iter(void) { RUN(verifier_bits_iter); }
> >
> > static int init_test_val_map(struct bpf_object *obj, char *map_name)
> > {
> > diff --git a/tools/testing/selftests/bpf/progs/verifier_bits_iter.c b/tools/testing/selftests/bpf/progs/verifier_bits_iter.c
> > new file mode 100644
> > index 000000000000..2f7b62b25638
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/verifier_bits_iter.c
> > @@ -0,0 +1,160 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/* Copyright (c) 2024 Yafang Shao <laoar.shao@gmail.com> */
> > +
> > +#include "vmlinux.h"
> > +#include <bpf/bpf_helpers.h>
> > +#include <bpf/bpf_tracing.h>
> > +
> > +#include "bpf_misc.h"
> > +#include "task_kfunc_common.h"
> > +
> > +char _license[] SEC("license") = "GPL";
> > +
> > +int bpf_iter_bits_new(struct bpf_iter_bits *it, const void *unsafe_ptr__ign,
> > + u32 nr_bits) __ksym __weak;
> > +int *bpf_iter_bits_next(struct bpf_iter_bits *it) __ksym __weak;
> > +void bpf_iter_bits_destroy(struct bpf_iter_bits *it) __ksym __weak;
> > +
> > +SEC("iter.s/cgroup")
> > +__description("bits iter without destroy")
> > +__failure __msg("Unreleased reference")
> > +int BPF_PROG(no_destroy, struct bpf_iter_meta *meta, struct cgroup *cgrp)
> > +{
> > + struct bpf_iter_bits it;
> > + struct task_struct *p;
> > +
> > + p = bpf_task_from_pid(1);
> > + if (!p)
> > + return 1;
> > +
> > + bpf_iter_bits_new(&it, p->cpus_ptr, 8192);
> > +
> > + bpf_iter_bits_next(&it);
> > + bpf_task_release(p);
> > + return 0;
> > +}
> > +
> > +SEC("iter/cgroup")
> > +__description("bits iter with uninitialized iter in ->next()")
> > +__failure __msg("expected an initialized iter_bits as arg #1")
> > +int BPF_PROG(next_uninit, struct bpf_iter_meta *meta, struct cgroup *cgrp)
> > +{
> > + struct bpf_iter_bits *it = NULL;
> > +
> > + bpf_iter_bits_next(it);
> > + return 0;
> > +}
> > +
> > +SEC("iter/cgroup")
> > +__description("bits iter with uninitialized iter in ->destroy()")
> > +__failure __msg("expected an initialized iter_bits as arg #1")
> > +int BPF_PROG(destroy_uninit, struct bpf_iter_meta *meta, struct cgroup *cgrp)
> > +{
> > + struct bpf_iter_bits it = {};
> > +
> > + bpf_iter_bits_destroy(&it);
> > + return 0;
> > +}
> > +
> > +SEC("syscall")
> > +__description("bits copy 32")
> > +__success __retval(10)
> > +int bits_copy32(void)
> > +{
> > + /* 21 bits: --------------------- */
> > + u32 data = 0b11111101111101111100001000100101U;
>
> if you define this bit mask as an array of bytes, then you won't have
> to handle big-endian in the tests at all
This test case provides a clear example of iterating over data of type
u32, offering valuable guidance for users who need to perform such
iterations.
>
>
> > + int nr = 0, offset = 0;
> > + int *bit;
> > +
> > +#if defined(__TARGET_ARCH_s390)
> > + offset = sizeof(u32) - (21 + 7) / 8;
> > +#endif
> > + bpf_for_each(bits, bit, ((char *)&data) + offset, 21)
> > + nr++;
> > + return nr;
> > +}
> > +
> > +SEC("syscall")
> > +__description("bits copy 64")
> > +__success __retval(18)
> > +int bits_copy64(void)
> > +{
> > + /* 34 bits: ~-------- */
> > + u64 data = 0xffffefdf0f0f0f0fUL;
> > + int nr = 0, offset = 0;
> > + int *bit;
> > +
> > +#if defined(__TARGET_ARCH_s390)
> > + offset = sizeof(u64) - (34 + 7) / 8;
> > +#endif
> > +
> > + bpf_for_each(bits, bit, ((char *)&data) + offset, 34)
>
> see above about byte array, but if we define different (not as byte
> array but long[]), it would be cleaner to have
This test case demonstrates how to iterate over data of type u64.
>
> #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
> u64 data = 0x......UL;
> #else
> u64 data = 0x......UL;
> #endif
looks good.
>
> where we'd hard-code bit masks in proper endianness in one place and
> then just do a clean `bpf_for_each(bits, bit, &data, <len>) {}` calls
>
> > + nr++;
> > + return nr;
> > +}
> > +
> > +SEC("syscall")
> > +__description("bits memalloc long-aligned")
> > +__success __retval(32) /* 16 * 2 */
> > +int bits_memalloc(void)
> > +{
> > + char data[16];
> > + int nr = 0;
> > + int *bit;
> > +
> > + __builtin_memset(&data, 0x48, sizeof(data));
> > + bpf_for_each(bits, bit, &data, sizeof(data) * 8)
> > + nr++;
> > + return nr;
> > +}
> > +
> > +SEC("syscall")
> > +__description("bits memalloc non-long-aligned")
> > +__success __retval(85) /* 17 * 5*/
> > +int bits_memalloc_non_aligned(void)
> > +{
> > + char data[17];
> > + int nr = 0;
> > + int *bit;
> > +
> > + __builtin_memset(&data, 0x1f, sizeof(data));
> > + bpf_for_each(bits, bit, &data, sizeof(data) * 8)
> > + nr++;
> > + return nr;
> > +}
> > +
> > +SEC("syscall")
> > +__description("bits memalloc non-aligned-bits")
> > +__success __retval(27) /* 8 * 3 + 3 */
> > +int bits_memalloc_non_aligned_bits(void)
> > +{
> > + char data[16];
> > + int nr = 0;
> > + int *bit;
> > +
> > + __builtin_memset(&data, 0x31, sizeof(data));
> > + /* Different with all other bytes */
> > + data[8] = 0xf7;
> > +
> > + bpf_for_each(bits, bit, &data, 68)
> > + nr++;
> > + return nr;
> > +}
> > +
> > +
> > +SEC("syscall")
> > +__description("bit index")
> > +__success __retval(8)
> > +int bit_index(void)
> > +{
> > + u64 data = 0x100;
> > + int bit_idx = 0;
> > + int *bit;
> > +
> > + bpf_for_each(bits, bit, &data, 64) {
> > + if (*bit == 0)
> > + continue;
> > + bit_idx = *bit;
> > + }
> > + return bit_idx;
> > +}
> > --
> > 2.30.1 (Apple Git-130)
> >
--
Regards
Yafang
* Re: [PATCH v7 bpf-next 1/2] bpf: Add bits iterator
2024-05-07 13:32 ` Yafang Shao
@ 2024-05-07 17:09 ` Andrii Nakryiko
0 siblings, 0 replies; 11+ messages in thread
From: Andrii Nakryiko @ 2024-05-07 17:09 UTC (permalink / raw)
To: Yafang Shao
Cc: David Vernet, ast, daniel, john.fastabend, andrii, martin.lau,
eddyz87, song, yonghong.song, kpsingh, sdf, haoluo, jolsa, bpf
On Tue, May 7, 2024 at 6:32 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Tue, May 7, 2024 at 11:38 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Sun, May 5, 2024 at 8:35 PM Yafang Shao <laoar.shao@gmail.com> wrote:
> > >
> > > Add three new kfuncs for the bits iterator:
> > > - bpf_iter_bits_new
> > > Initialize a new bits iterator for a given memory area. Due to the
> > > limitation of bpf memalloc, the max number of bits that can be iterated
> > > over is limited to (4096 * 8).
> > > - bpf_iter_bits_next
> > > Get the next bit in a bpf_iter_bits
> > > - bpf_iter_bits_destroy
> > > Destroy a bpf_iter_bits
> > >
> > > The bits iterator facilitates the iteration of the bits of a memory area,
> > > such as cpumask. It can be used in any context and on any address.
> > >
> > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > > ---
> > > kernel/bpf/helpers.c | 140 +++++++++++++++++++++++++++++++++++++++++++
> > > 1 file changed, 140 insertions(+)
> > >
> > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > > index 2a69a9a36c0f..83b2a02f795f 100644
> > > --- a/kernel/bpf/helpers.c
> > > +++ b/kernel/bpf/helpers.c
> > > @@ -2744,6 +2744,143 @@ __bpf_kfunc void bpf_preempt_enable(void)
> > > preempt_enable();
> > > }
> > >
> > > +struct bpf_iter_bits {
> > > + __u64 __opaque[2];
> > > +} __aligned(8);
> > > +
> > > +struct bpf_iter_bits_kern {
> > > + union {
> > > + unsigned long *bits;
> > > + unsigned long bits_copy;
> > > + };
> > > + u32 nr_bits;
> > > + int bit;
> > > +} __aligned(8);
> > > +
> > > +/**
> > > + * bpf_iter_bits_new() - Initialize a new bits iterator for a given memory area
> > > + * @it: The new bpf_iter_bits to be created
> > > + * @unsafe_ptr__ign: A ponter pointing to a memory area to be iterated over
> >
> > typo: pointer
>
> Thanks for the fix and the other fixes.
>
> >
> > > + * @nr_bits: The number of bits to be iterated over. Due to the limitation of
> > > + * memalloc, it can't greater than (4096 * 8).
> >
> > typo: can't be greater
> >
> > > + *
> > > + * This function initializes a new bpf_iter_bits structure for iterating over
> > > + * a memory area which is specified by the @unsafe_ptr__ign and @nr_bits. It
> > > + * copy the data of the memory area to the newly created bpf_iter_bits @it for
> >
> > s/copy/copies/
> >
> > > + * subsequent iteration operations.
> > > + *
> > > + * On success, 0 is returned. On failure, ERR is returned.
> > > + */
> > > +__bpf_kfunc int
> > > +bpf_iter_bits_new(struct bpf_iter_bits *it, const void *unsafe_ptr__ign, u32 nr_bits)
> > > +{
> > > + struct bpf_iter_bits_kern *kit = (void *)it;
> > > + u32 words = BITS_TO_LONGS(nr_bits);
> > > + u32 size = BITS_TO_BYTES(nr_bits);
> > > + u32 left, offset;
> > > + int err;
> > > +
> > > + BUILD_BUG_ON(sizeof(struct bpf_iter_bits_kern) != sizeof(struct bpf_iter_bits));
> > > + BUILD_BUG_ON(__alignof__(struct bpf_iter_bits_kern) !=
> > > + __alignof__(struct bpf_iter_bits));
> > > +
> > > + if (!unsafe_ptr__ign || !nr_bits) {
> > > + kit->bits = NULL;
> > > + return -EINVAL;
> > > + }
> > > +
> > > + kit->nr_bits = 0;
> > > + kit->bits_copy = 0;
> > > + /* Optimization for u64/u32 mask */
> > > + if (nr_bits <= 64) {
> > > + /* For big-endian, we must calculate the offset */
> > > + offset = IS_ENABLED(CONFIG_S390) ? sizeof(u64) - size : 0;
> >
> > S390 isn't the only big-endian architecture, it's wrong to hard-code just S390
> >
> > there is __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ check throughout the
> > kernel to do this detection
>
> I missed that. will check it.
>
> >
> > > +
> > > + err = bpf_probe_read_kernel_common(((char *)&kit->bits_copy) + offset,
> > > + size, unsafe_ptr__ign);
> > > + if (err)
> > > + return -EFAULT;
> >
> > I'd rewrite the above to something like (not tested, but should give
> > the right idea):
> >
> > long bits = 0;
> >
> > err = bpf_probe_read_kernel_common(&bits, size, unsafe_ptr__ign);
> > if (err)
> > return -EFAULT;
> >
> > #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
> > bits = __swab64(bits);
> > #endif
> >
> > /* deal with bit mask of weird size, ensuring upper bits are zero */
> > bits <<= 64 - nr_bits;
> > bits >>= 64 - nr_bits;
> >
> > kit->bits_copy = bits;
> >
> >
> > This should take care of both big-endianness, and non-multiple-of-8
> > sized bitmasks (I think, we need tests).
>
> looks good, will change it.
>
> >
> > pw-bot: cr
> >
> >
> > > +
> > > + kit->nr_bits = nr_bits;
> > > + kit->bit = -1;
> > > + return 0;
> > > + }
> > > +
> > > + /* Fallback to memalloc */
> > > + kit->bits = bpf_mem_alloc(&bpf_global_ma, size);
> > > + if (!kit->bits)
> > > + return -ENOMEM;
> > > +
> > > + err = bpf_probe_read_kernel_common(kit->bits, words * sizeof(u64), unsafe_ptr__ign);
> > > + if (err) {
> > > + bpf_mem_free(&bpf_global_ma, kit->bits);
> > > + return err;
> > > + }
> > > +
> > > + /* long-aligned */
> > > + left = size & (sizeof(u64) - 1);
> > > + if (!left)
> > > + goto out;
> > > +
> > > + offset = IS_ENABLED(CONFIG_S390) ? sizeof(u64) - left : 0;
> > > + err = bpf_probe_read_kernel_common((char *)(kit->bits + words - 1) + offset, left,
> > > + unsafe_ptr__ign + (words - 1) * sizeof(u64));
> > > + if (err) {
> > > + bpf_mem_free(&bpf_global_ma, kit->bits);
> > > + return err;
> > > + }
> >
> > > tbh, I'm not sure what the desired behavior here is. David (cc'ed),
> > > you were dealing with cpumasks, how is the bit mask specified there?
> > > Is it considered to be a long[] array or a byte[] array? And how does
> > > that work on big-endian, because I think it makes a difference?
> > > Please take a look, thanks.
>
> The function find_next_bit() requires the pointer to be of type
> "unsigned long *", hence, we must ensure consistency by converting it
> here as well. As cpumask represents a bitmap and is always of type
> "unsigned long *", it remains unaffected by endianness considerations.
>
Right, but the question is whether this iterator should make the same
simplifying assumption or not? I think the motivation for this
iterator was the ability to iterate over CPU masks, so I'm asking (and
that's why I cc'ed David) what we should do to make it work well for
CPU masks.
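
For reference, the long[]-based view discussed here can be sketched in userspace as follows. sketch_find_next_bit() and sketch_count_bits() are hypothetical stand-ins written for illustration; they are not the kernel's find_next_bit(), which works a word at a time. The sketch only shows the indexing rule: bit N lives in word N / BITS_PER_LONG at position N % BITS_PER_LONG, and the search returns nbits when no further bit is set.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* One-word sample bitmap with only bit 8 set, as in the "bit index" test. */
static const unsigned long sketch_word[1] = { 0x100UL };

/*
 * Return the index of the first set bit at or after 'start',
 * or 'nbits' if there is none (hypothetical, bit-at-a-time).
 */
static unsigned long sketch_find_next_bit(const unsigned long *bits,
					  unsigned long nbits,
					  unsigned long start)
{
	assert(bits);

	for (unsigned long i = start; i < nbits; i++) {
		unsigned long word = bits[i / SKETCH_BITS_PER_LONG];

		if (word & (1UL << (i % SKETCH_BITS_PER_LONG)))
			return i;
	}
	return nbits;
}

/* Count set bits the way the iterator walks them: next, next, ... */
static int sketch_count_bits(const unsigned long *bits, unsigned long nbits)
{
	unsigned long bit = sketch_find_next_bit(bits, nbits, 0);
	int nr = 0;

	while (bit < nbits) {
		nr++;
		bit = sketch_find_next_bit(bits, nbits, bit + 1);
	}
	return nr;
}
```

Because the lookup goes through unsigned long values rather than bytes, the result does not depend on the host byte order as long as the caller really passes an array of longs.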
> >
> > > +
> > > +out:
> > > + kit->nr_bits = nr_bits;
> > > + kit->bit = -1;
> > > + return 0;
> > > +}
> > > +
> > > +/**
> > > + * bpf_iter_bits_next() - Get the next bit in a bpf_iter_bits
> > > + * @it: The bpf_iter_bits to be checked
> > > + *
> > > + * This function returns a pointer to a number representing the value of the
> > > + * next bit in the bits.
> > > + *
> > > + * If there are no further bit available, it returns NULL.
> > > + */
> > > +__bpf_kfunc int *bpf_iter_bits_next(struct bpf_iter_bits *it)
> > > +{
> > > + struct bpf_iter_bits_kern *kit = (void *)it;
> > > + u32 nr_bits = kit->nr_bits;
> > > + const unsigned long *bits;
> > > + int bit;
> > > +
> > > + if (nr_bits == 0)
> > > + return NULL;
> > > +
> > > + bits = nr_bits <= 64 ? &kit->bits_copy : kit->bits;
> > > + bit = find_next_bit(bits, nr_bits, kit->bit + 1);
> > > + if (bit >= nr_bits) {
> > > + kit->nr_bits = 0;
> > > + return NULL;
> > > + }
> > > +
> > > + kit->bit = bit;
> > > + return &kit->bit;
> > > +}
> > > +
> > > +/**
> > > + * bpf_iter_bits_destroy() - Destroy a bpf_iter_bits
> > > + * @it: The bpf_iter_bits to be destroyed
> > > + *
> > > + * Destroy the resource associated with the bpf_iter_bits.
> > > + */
> > > +__bpf_kfunc void bpf_iter_bits_destroy(struct bpf_iter_bits *it)
> > > +{
> > > + struct bpf_iter_bits_kern *kit = (void *)it;
> > > +
> > > + if (kit->nr_bits <= 64)
> > > + return;
> > > + bpf_mem_free(&bpf_global_ma, kit->bits);
> > > +}
> > > +
> > > __bpf_kfunc_end_defs();
> > >
> > > BTF_KFUNCS_START(generic_btf_ids)
> > > @@ -2826,6 +2963,9 @@ BTF_ID_FLAGS(func, bpf_wq_set_callback_impl)
> > > BTF_ID_FLAGS(func, bpf_wq_start)
> > > BTF_ID_FLAGS(func, bpf_preempt_disable)
> > > BTF_ID_FLAGS(func, bpf_preempt_enable)
> > > +BTF_ID_FLAGS(func, bpf_iter_bits_new, KF_ITER_NEW)
> > > +BTF_ID_FLAGS(func, bpf_iter_bits_next, KF_ITER_NEXT | KF_RET_NULL)
> > > +BTF_ID_FLAGS(func, bpf_iter_bits_destroy, KF_ITER_DESTROY)
> > > BTF_KFUNCS_END(common_btf_ids)
> > >
> > > static const struct btf_kfunc_id_set common_kfunc_set = {
> > > --
> > > 2.30.1 (Apple Git-130)
> > >
>
>
>
> --
> Regards
> Yafang
* Re: [PATCH v7 bpf-next 2/2] selftests/bpf: Add selftest for bits iter
2024-05-07 13:38 ` Yafang Shao
@ 2024-05-07 17:11 ` Andrii Nakryiko
2024-05-09 2:11 ` Yafang Shao
0 siblings, 1 reply; 11+ messages in thread
From: Andrii Nakryiko @ 2024-05-07 17:11 UTC (permalink / raw)
To: Yafang Shao
Cc: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
yonghong.song, kpsingh, sdf, haoluo, jolsa, bpf
On Tue, May 7, 2024 at 6:39 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Tue, May 7, 2024 at 11:42 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Sun, May 5, 2024 at 8:35 PM Yafang Shao <laoar.shao@gmail.com> wrote:
> > >
> > > Add test cases for the bits iter:
> > > - positive case
> > > - bit mask smaller than 8 bytes
> > > - a typical case of having 8-byte bit mask
> > > - another typical case where bit mask is > 8 bytes
> > > - the index of set bit
> > >
> > > - negative cases
> > > - bpf_iter_bits_destroy() is required after calling
> > > bpf_iter_bits_new()
> > > - bpf_iter_bits_destroy() can only destroy an initialized iter
> > > - bpf_iter_bits_next() must use an initialized iter
> > >
> > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > > ---
> > > .../selftests/bpf/prog_tests/verifier.c | 2 +
> > > .../selftests/bpf/progs/verifier_bits_iter.c | 160 ++++++++++++++++++
> > > 2 files changed, 162 insertions(+)
> > > create mode 100644 tools/testing/selftests/bpf/progs/verifier_bits_iter.c
> > >
> > > diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
> > > index c4f9f306646e..7e04ecaaa20a 100644
> > > --- a/tools/testing/selftests/bpf/prog_tests/verifier.c
> > > +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
> > > @@ -84,6 +84,7 @@
> > > #include "verifier_xadd.skel.h"
> > > #include "verifier_xdp.skel.h"
> > > #include "verifier_xdp_direct_packet_access.skel.h"
> > > +#include "verifier_bits_iter.skel.h"
> > >
> > > #define MAX_ENTRIES 11
> > >
> > > @@ -198,6 +199,7 @@ void test_verifier_var_off(void) { RUN(verifier_var_off); }
> > > void test_verifier_xadd(void) { RUN(verifier_xadd); }
> > > void test_verifier_xdp(void) { RUN(verifier_xdp); }
> > > void test_verifier_xdp_direct_packet_access(void) { RUN(verifier_xdp_direct_packet_access); }
> > > +void test_verifier_bits_iter(void) { RUN(verifier_bits_iter); }
> > >
> > > static int init_test_val_map(struct bpf_object *obj, char *map_name)
> > > {
> > > diff --git a/tools/testing/selftests/bpf/progs/verifier_bits_iter.c b/tools/testing/selftests/bpf/progs/verifier_bits_iter.c
> > > new file mode 100644
> > > index 000000000000..2f7b62b25638
> > > --- /dev/null
> > > +++ b/tools/testing/selftests/bpf/progs/verifier_bits_iter.c
> > > @@ -0,0 +1,160 @@
> > > +// SPDX-License-Identifier: GPL-2.0-only
> > > +/* Copyright (c) 2024 Yafang Shao <laoar.shao@gmail.com> */
> > > +
> > > +#include "vmlinux.h"
> > > +#include <bpf/bpf_helpers.h>
> > > +#include <bpf/bpf_tracing.h>
> > > +
> > > +#include "bpf_misc.h"
> > > +#include "task_kfunc_common.h"
> > > +
> > > +char _license[] SEC("license") = "GPL";
> > > +
> > > +int bpf_iter_bits_new(struct bpf_iter_bits *it, const void *unsafe_ptr__ign,
> > > + u32 nr_bits) __ksym __weak;
> > > +int *bpf_iter_bits_next(struct bpf_iter_bits *it) __ksym __weak;
> > > +void bpf_iter_bits_destroy(struct bpf_iter_bits *it) __ksym __weak;
> > > +
> > > +SEC("iter.s/cgroup")
> > > +__description("bits iter without destroy")
> > > +__failure __msg("Unreleased reference")
> > > +int BPF_PROG(no_destroy, struct bpf_iter_meta *meta, struct cgroup *cgrp)
> > > +{
> > > + struct bpf_iter_bits it;
> > > + struct task_struct *p;
> > > +
> > > + p = bpf_task_from_pid(1);
> > > + if (!p)
> > > + return 1;
> > > +
> > > + bpf_iter_bits_new(&it, p->cpus_ptr, 8192);
> > > +
> > > + bpf_iter_bits_next(&it);
> > > + bpf_task_release(p);
> > > + return 0;
> > > +}
> > > +
> > > +SEC("iter/cgroup")
> > > +__description("bits iter with uninitialized iter in ->next()")
> > > +__failure __msg("expected an initialized iter_bits as arg #1")
> > > +int BPF_PROG(next_uninit, struct bpf_iter_meta *meta, struct cgroup *cgrp)
> > > +{
> > > + struct bpf_iter_bits *it = NULL;
> > > +
> > > + bpf_iter_bits_next(it);
> > > + return 0;
> > > +}
> > > +
> > > +SEC("iter/cgroup")
> > > +__description("bits iter with uninitialized iter in ->destroy()")
> > > +__failure __msg("expected an initialized iter_bits as arg #1")
> > > +int BPF_PROG(destroy_uninit, struct bpf_iter_meta *meta, struct cgroup *cgrp)
> > > +{
> > > + struct bpf_iter_bits it = {};
> > > +
> > > + bpf_iter_bits_destroy(&it);
> > > + return 0;
> > > +}
> > > +
> > > +SEC("syscall")
> > > +__description("bits copy 32")
> > > +__success __retval(10)
> > > +int bits_copy32(void)
> > > +{
> > > + /* 21 bits: --------------------- */
> > > + u32 data = 0b11111101111101111100001000100101U;
> >
> > if you define this bit mask as an array of bytes, then you won't have
> > to handle big-endian in the tests at all
>
> This test case provides a clear example of iterating over data of type
> u32, offering valuable guidance for users who need to perform such
> iterations.
>
> >
> >
> > > + int nr = 0, offset = 0;
> > > + int *bit;
> > > +
> > > +#if defined(__TARGET_ARCH_s390)
> > > + offset = sizeof(u32) - (21 + 7) / 8;
> > > +#endif
> > > + bpf_for_each(bits, bit, ((char *)&data) + offset, 21)
> > > + nr++;
> > > + return nr;
> > > +}
> > > +
> > > +SEC("syscall")
> > > +__description("bits copy 64")
> > > +__success __retval(18)
> > > +int bits_copy64(void)
> > > +{
> > > + /* 34 bits: ~-------- */
> > > + u64 data = 0xffffefdf0f0f0f0fUL;
> > > + int nr = 0, offset = 0;
> > > + int *bit;
> > > +
> > > +#if defined(__TARGET_ARCH_s390)
> > > + offset = sizeof(u64) - (34 + 7) / 8;
> > > +#endif
> > > +
> > > + bpf_for_each(bits, bit, ((char *)&data) + offset, 34)
> >
> > see above about byte array, but if we define different (not as byte
> > array but long[]), it would be cleaner to have
>
> This test case demonstrates how to iterate over data of type u64.
>
> >
> > #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
> > u64 data = 0x......UL;
> > #else
> > u64 data = 0x......UL;
> > #endif
>
> looks good.
>
Please hold off on sending a new revision until we figure out what the
contract should be. Because I feel like it's a (relatively) big
decision whether a bit mask is treated as an array of bytes or as an
array of longs. For little-endian it makes no difference, but for
big-endian it's a big difference and has usability and performance
implications.
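
To make the difference concrete, here is a small host-independent sketch. The helper names (load_le64, load_be64, lowest_set_bit) are made up for illustration. It takes the same 8 bytes, with only bit 0 of the first byte set, and assembles them into a 64-bit word the way a little-endian and a big-endian CPU would load them.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-array bitmap with only bit 0 of byte 0 set. */
static const unsigned char first_bit_bytes[8] = { 0x01 };

/* Assemble 8 bytes the way a little-endian CPU loads a u64. */
static uint64_t load_le64(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Assemble the same 8 bytes the way a big-endian CPU loads a u64. */
static uint64_t load_be64(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Index of the lowest set bit; the word must be non-zero. */
static int lowest_set_bit(uint64_t word)
{
	int idx = 0;

	assert(word != 0);
	while (!(word & 1)) {
		word >>= 1;
		idx++;
	}
	return idx;
}
```

Under the little-endian load the set bit is reported at index 0, the same as in the byte-array view. Under the big-endian load it is reported at index 56, so a byte[] contract would need a per-word byte swap (or a byte-wise walk) on big-endian hosts, while a long[] contract would not.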
> >
> > where we'd hard-code bit masks in proper endianness in one place and
> > then just do a clean `bpf_for_each(bits, bit, &data, <len>) {}` calls
> >
> > > + nr++;
> > > + return nr;
> > > +}
> > > +
> > > +SEC("syscall")
> > > +__description("bits memalloc long-aligned")
> > > +__success __retval(32) /* 16 * 2 */
> > > +int bits_memalloc(void)
> > > +{
> > > + char data[16];
> > > + int nr = 0;
> > > + int *bit;
> > > +
> > > + __builtin_memset(&data, 0x48, sizeof(data));
> > > + bpf_for_each(bits, bit, &data, sizeof(data) * 8)
> > > + nr++;
> > > + return nr;
> > > +}
> > > +
> > > +SEC("syscall")
> > > +__description("bits memalloc non-long-aligned")
> > > +__success __retval(85) /* 17 * 5*/
> > > +int bits_memalloc_non_aligned(void)
> > > +{
> > > + char data[17];
> > > + int nr = 0;
> > > + int *bit;
> > > +
> > > + __builtin_memset(&data, 0x1f, sizeof(data));
> > > + bpf_for_each(bits, bit, &data, sizeof(data) * 8)
> > > + nr++;
> > > + return nr;
> > > +}
> > > +
> > > +SEC("syscall")
> > > +__description("bits memalloc non-aligned-bits")
> > > +__success __retval(27) /* 8 * 3 + 3 */
> > > +int bits_memalloc_non_aligned_bits(void)
> > > +{
> > > + char data[16];
> > > + int nr = 0;
> > > + int *bit;
> > > +
> > > + __builtin_memset(&data, 0x31, sizeof(data));
> > > + /* Different with all other bytes */
> > > + data[8] = 0xf7;
> > > +
> > > + bpf_for_each(bits, bit, &data, 68)
> > > + nr++;
> > > + return nr;
> > > +}
> > > +
> > > +
> > > +SEC("syscall")
> > > +__description("bit index")
> > > +__success __retval(8)
> > > +int bit_index(void)
> > > +{
> > > + u64 data = 0x100;
> > > + int bit_idx = 0;
> > > + int *bit;
> > > +
> > > + bpf_for_each(bits, bit, &data, 64) {
> > > + if (*bit == 0)
> > > + continue;
> > > + bit_idx = *bit;
> > > + }
> > > + return bit_idx;
> > > +}
> > > --
> > > 2.30.1 (Apple Git-130)
> > >
>
>
>
> --
> Regards
> Yafang
* Re: [PATCH v7 bpf-next 2/2] selftests/bpf: Add selftest for bits iter
2024-05-07 17:11 ` Andrii Nakryiko
@ 2024-05-09 2:11 ` Yafang Shao
2024-05-09 22:03 ` Andrii Nakryiko
0 siblings, 1 reply; 11+ messages in thread
From: Yafang Shao @ 2024-05-09 2:11 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
yonghong.song, kpsingh, sdf, haoluo, jolsa, bpf
On Wed, May 8, 2024 at 1:12 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, May 7, 2024 at 6:39 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > On Tue, May 7, 2024 at 11:42 AM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Sun, May 5, 2024 at 8:35 PM Yafang Shao <laoar.shao@gmail.com> wrote:
> > > >
> > > > Add test cases for the bits iter:
> > > > - positive case
> > > > - bit mask smaller than 8 bytes
> > > > - a typical case of having 8-byte bit mask
> > > > - another typical case where bit mask is > 8 bytes
> > > > - the index of set bit
> > > >
> > > > - negative cases
> > > > - bpf_iter_bits_destroy() is required after calling
> > > > bpf_iter_bits_new()
> > > > - bpf_iter_bits_destroy() can only destroy an initialized iter
> > > > - bpf_iter_bits_next() must use an initialized iter
> > > >
> > > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > > > ---
> > > > .../selftests/bpf/prog_tests/verifier.c | 2 +
> > > > .../selftests/bpf/progs/verifier_bits_iter.c | 160 ++++++++++++++++++
> > > > 2 files changed, 162 insertions(+)
> > > > create mode 100644 tools/testing/selftests/bpf/progs/verifier_bits_iter.c
> > > >
> > > > diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
> > > > index c4f9f306646e..7e04ecaaa20a 100644
> > > > --- a/tools/testing/selftests/bpf/prog_tests/verifier.c
> > > > +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
> > > > @@ -84,6 +84,7 @@
> > > > #include "verifier_xadd.skel.h"
> > > > #include "verifier_xdp.skel.h"
> > > > #include "verifier_xdp_direct_packet_access.skel.h"
> > > > +#include "verifier_bits_iter.skel.h"
> > > >
> > > > #define MAX_ENTRIES 11
> > > >
> > > > @@ -198,6 +199,7 @@ void test_verifier_var_off(void) { RUN(verifier_var_off); }
> > > > void test_verifier_xadd(void) { RUN(verifier_xadd); }
> > > > void test_verifier_xdp(void) { RUN(verifier_xdp); }
> > > > void test_verifier_xdp_direct_packet_access(void) { RUN(verifier_xdp_direct_packet_access); }
> > > > +void test_verifier_bits_iter(void) { RUN(verifier_bits_iter); }
> > > >
> > > > static int init_test_val_map(struct bpf_object *obj, char *map_name)
> > > > {
> > > > diff --git a/tools/testing/selftests/bpf/progs/verifier_bits_iter.c b/tools/testing/selftests/bpf/progs/verifier_bits_iter.c
> > > > new file mode 100644
> > > > index 000000000000..2f7b62b25638
> > > > --- /dev/null
> > > > +++ b/tools/testing/selftests/bpf/progs/verifier_bits_iter.c
> > > > @@ -0,0 +1,160 @@
> > > > +// SPDX-License-Identifier: GPL-2.0-only
> > > > +/* Copyright (c) 2024 Yafang Shao <laoar.shao@gmail.com> */
> > > > +
> > > > +#include "vmlinux.h"
> > > > +#include <bpf/bpf_helpers.h>
> > > > +#include <bpf/bpf_tracing.h>
> > > > +
> > > > +#include "bpf_misc.h"
> > > > +#include "task_kfunc_common.h"
> > > > +
> > > > +char _license[] SEC("license") = "GPL";
> > > > +
> > > > +int bpf_iter_bits_new(struct bpf_iter_bits *it, const void *unsafe_ptr__ign,
> > > > + u32 nr_bits) __ksym __weak;
> > > > +int *bpf_iter_bits_next(struct bpf_iter_bits *it) __ksym __weak;
> > > > +void bpf_iter_bits_destroy(struct bpf_iter_bits *it) __ksym __weak;
> > > > +
> > > > +SEC("iter.s/cgroup")
> > > > +__description("bits iter without destroy")
> > > > +__failure __msg("Unreleased reference")
> > > > +int BPF_PROG(no_destroy, struct bpf_iter_meta *meta, struct cgroup *cgrp)
> > > > +{
> > > > + struct bpf_iter_bits it;
> > > > + struct task_struct *p;
> > > > +
> > > > + p = bpf_task_from_pid(1);
> > > > + if (!p)
> > > > + return 1;
> > > > +
> > > > + bpf_iter_bits_new(&it, p->cpus_ptr, 8192);
> > > > +
> > > > + bpf_iter_bits_next(&it);
> > > > + bpf_task_release(p);
> > > > + return 0;
> > > > +}
> > > > +
> > > > +SEC("iter/cgroup")
> > > > +__description("bits iter with uninitialized iter in ->next()")
> > > > +__failure __msg("expected an initialized iter_bits as arg #1")
> > > > +int BPF_PROG(next_uninit, struct bpf_iter_meta *meta, struct cgroup *cgrp)
> > > > +{
> > > > + struct bpf_iter_bits *it = NULL;
> > > > +
> > > > + bpf_iter_bits_next(it);
> > > > + return 0;
> > > > +}
> > > > +
> > > > +SEC("iter/cgroup")
> > > > +__description("bits iter with uninitialized iter in ->destroy()")
> > > > +__failure __msg("expected an initialized iter_bits as arg #1")
> > > > +int BPF_PROG(destroy_uninit, struct bpf_iter_meta *meta, struct cgroup *cgrp)
> > > > +{
> > > > + struct bpf_iter_bits it = {};
> > > > +
> > > > + bpf_iter_bits_destroy(&it);
> > > > + return 0;
> > > > +}
> > > > +
> > > > +SEC("syscall")
> > > > +__description("bits copy 32")
> > > > +__success __retval(10)
> > > > +int bits_copy32(void)
> > > > +{
> > > > + /* 21 bits: --------------------- */
> > > > + u32 data = 0b11111101111101111100001000100101U;
> > >
> > > if you define this bit mask as an array of bytes, then you won't have
> > > to handle big-endian in the tests at all
> >
> > This test case provides a clear example of iterating over data of type
> > u32, offering valuable guidance for users who need to perform such
> > iterations.
> >
> > >
> > >
> > > > + int nr = 0, offset = 0;
> > > > + int *bit;
> > > > +
> > > > +#if defined(__TARGET_ARCH_s390)
> > > > + offset = sizeof(u32) - (21 + 7) / 8;
> > > > +#endif
> > > > + bpf_for_each(bits, bit, ((char *)&data) + offset, 21)
> > > > + nr++;
> > > > + return nr;
> > > > +}
> > > > +
> > > > +SEC("syscall")
> > > > +__description("bits copy 64")
> > > > +__success __retval(18)
> > > > +int bits_copy64(void)
> > > > +{
> > > > + /* 34 bits: ~-------- */
> > > > + u64 data = 0xffffefdf0f0f0f0fUL;
> > > > + int nr = 0, offset = 0;
> > > > + int *bit;
> > > > +
> > > > +#if defined(__TARGET_ARCH_s390)
> > > > + offset = sizeof(u64) - (34 + 7) / 8;
> > > > +#endif
> > > > +
> > > > + bpf_for_each(bits, bit, ((char *)&data) + offset, 34)
> > >
> > > see above about byte array, but if we define different (not as byte
> > > array but long[]), it would be cleaner to have
> >
> > This test case demonstrates how to iterate over data of type u64.
> >
> > >
> > > #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
> > > u64 data = 0x......UL;
> > > #else
> > > u64 data = 0x......UL;
> > > #endif
> >
> > looks good.
> >
>
> Please hold off on sending a new revision until we figure out what the
> contract should be. Because I feel like it's a (relatively) big
> decision whether a bit mask is treated as an array of bytes or as an
> array of longs. For little-endian it makes no difference, but for
> big-endian it's a big difference and has usability and performance
> implications.
Perhaps it would be advantageous to define the interface as follows:

    bpf_iter_bits_new(struct bpf_iter_bits *it, const u64 *unsafe_ptr__ign,
                      u32 words)

This approach eliminates the need to account for endianness.
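
To illustrate why a word-based contract sidesteps endianness, here is a
minimal userspace sketch, not the kernel kfunc. The helper name
collect_set_bits and its out/max parameters are hypothetical; the only
assumption carried over from the proposal is that the mask is an array of
u64 words and that bit i lives in words[i / 64] at position i % 64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace helper, not the kernel kfunc: store the index of
 * every set bit in out[] (up to max entries) and return the total number
 * of set bits. Bit i lives in words[i / 64] at position i % 64.
 */
static unsigned int collect_set_bits(const uint64_t *words, uint32_t nr_words,
				     unsigned int *out, unsigned int max)
{
	unsigned int nr = 0;
	uint32_t w;

	assert(words != NULL);

	for (w = 0; w < nr_words; w++) {
		unsigned int b;

		for (b = 0; b < 64; b++) {
			/* Shifts act on the value, so byte order is irrelevant. */
			if (!((words[w] >> b) & 1))
				continue;
			/* Record the index only while there is room in out[]. */
			if (out && nr < max)
				out[nr] = w * 64 + b;
			nr++;
		}
	}
	return nr;
}
```

Because the caller passes whole words, a test that only cares about the
low 34 bits would mask the value first, for example
0xffffefdf0f0f0f0fULL & ((1ULL << 34) - 1), which leaves 18 set bits on
both little-endian and big-endian hosts.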
--
Regards
Yafang
* Re: [PATCH v7 bpf-next 2/2] selftests/bpf: Add selftest for bits iter
2024-05-09 2:11 ` Yafang Shao
@ 2024-05-09 22:03 ` Andrii Nakryiko
0 siblings, 0 replies; 11+ messages in thread
From: Andrii Nakryiko @ 2024-05-09 22:03 UTC (permalink / raw)
To: Yafang Shao
Cc: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
yonghong.song, kpsingh, sdf, haoluo, jolsa, bpf
On Wed, May 8, 2024 at 7:11 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Wed, May 8, 2024 at 1:12 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Tue, May 7, 2024 at 6:39 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> > >
> > > On Tue, May 7, 2024 at 11:42 AM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > On Sun, May 5, 2024 at 8:35 PM Yafang Shao <laoar.shao@gmail.com> wrote:
> > > > >
> > > > > Add test cases for the bits iter:
> > > > > - positive case
> > > > > - bit mask smaller than 8 bytes
> > > > > - a typical case of having 8-byte bit mask
> > > > > - another typical case where bit mask is > 8 bytes
> > > > > - the index of set bit
> > > > >
> > > > > > - negative cases
> > > > > - bpf_iter_bits_destroy() is required after calling
> > > > > bpf_iter_bits_new()
> > > > > - bpf_iter_bits_destroy() can only destroy an initialized iter
> > > > > - bpf_iter_bits_next() must use an initialized iter
> > > > >
> > > > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > > > > ---
> > > > > .../selftests/bpf/prog_tests/verifier.c | 2 +
> > > > > .../selftests/bpf/progs/verifier_bits_iter.c | 160 ++++++++++++++++++
> > > > > 2 files changed, 162 insertions(+)
> > > > > create mode 100644 tools/testing/selftests/bpf/progs/verifier_bits_iter.c
> > > > >
> > > > > diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
> > > > > index c4f9f306646e..7e04ecaaa20a 100644
> > > > > --- a/tools/testing/selftests/bpf/prog_tests/verifier.c
> > > > > +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
> > > > > @@ -84,6 +84,7 @@
> > > > > #include "verifier_xadd.skel.h"
> > > > > #include "verifier_xdp.skel.h"
> > > > > #include "verifier_xdp_direct_packet_access.skel.h"
> > > > > +#include "verifier_bits_iter.skel.h"
> > > > >
> > > > > #define MAX_ENTRIES 11
> > > > >
> > > > > @@ -198,6 +199,7 @@ void test_verifier_var_off(void) { RUN(verifier_var_off); }
> > > > > void test_verifier_xadd(void) { RUN(verifier_xadd); }
> > > > > void test_verifier_xdp(void) { RUN(verifier_xdp); }
> > > > > void test_verifier_xdp_direct_packet_access(void) { RUN(verifier_xdp_direct_packet_access); }
> > > > > +void test_verifier_bits_iter(void) { RUN(verifier_bits_iter); }
> > > > >
> > > > > static int init_test_val_map(struct bpf_object *obj, char *map_name)
> > > > > {
> > > > > diff --git a/tools/testing/selftests/bpf/progs/verifier_bits_iter.c b/tools/testing/selftests/bpf/progs/verifier_bits_iter.c
> > > > > new file mode 100644
> > > > > index 000000000000..2f7b62b25638
> > > > > --- /dev/null
> > > > > +++ b/tools/testing/selftests/bpf/progs/verifier_bits_iter.c
> > > > > @@ -0,0 +1,160 @@
> > > > > +// SPDX-License-Identifier: GPL-2.0-only
> > > > > +/* Copyright (c) 2024 Yafang Shao <laoar.shao@gmail.com> */
> > > > > +
> > > > > +#include "vmlinux.h"
> > > > > +#include <bpf/bpf_helpers.h>
> > > > > +#include <bpf/bpf_tracing.h>
> > > > > +
> > > > > +#include "bpf_misc.h"
> > > > > +#include "task_kfunc_common.h"
> > > > > +
> > > > > +char _license[] SEC("license") = "GPL";
> > > > > +
> > > > > +int bpf_iter_bits_new(struct bpf_iter_bits *it, const void *unsafe_ptr__ign,
> > > > > + u32 nr_bits) __ksym __weak;
> > > > > +int *bpf_iter_bits_next(struct bpf_iter_bits *it) __ksym __weak;
> > > > > +void bpf_iter_bits_destroy(struct bpf_iter_bits *it) __ksym __weak;
> > > > > +
> > > > > +SEC("iter.s/cgroup")
> > > > > +__description("bits iter without destroy")
> > > > > +__failure __msg("Unreleased reference")
> > > > > +int BPF_PROG(no_destroy, struct bpf_iter_meta *meta, struct cgroup *cgrp)
> > > > > +{
> > > > > + struct bpf_iter_bits it;
> > > > > + struct task_struct *p;
> > > > > +
> > > > > + p = bpf_task_from_pid(1);
> > > > > + if (!p)
> > > > > + return 1;
> > > > > +
> > > > > + bpf_iter_bits_new(&it, p->cpus_ptr, 8192);
> > > > > +
> > > > > + bpf_iter_bits_next(&it);
> > > > > + bpf_task_release(p);
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > +SEC("iter/cgroup")
> > > > > +__description("bits iter with uninitialized iter in ->next()")
> > > > > +__failure __msg("expected an initialized iter_bits as arg #1")
> > > > > +int BPF_PROG(next_uninit, struct bpf_iter_meta *meta, struct cgroup *cgrp)
> > > > > +{
> > > > > + struct bpf_iter_bits *it = NULL;
> > > > > +
> > > > > + bpf_iter_bits_next(it);
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > +SEC("iter/cgroup")
> > > > > +__description("bits iter with uninitialized iter in ->destroy()")
> > > > > +__failure __msg("expected an initialized iter_bits as arg #1")
> > > > > +int BPF_PROG(destroy_uninit, struct bpf_iter_meta *meta, struct cgroup *cgrp)
> > > > > +{
> > > > > + struct bpf_iter_bits it = {};
> > > > > +
> > > > > + bpf_iter_bits_destroy(&it);
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > +SEC("syscall")
> > > > > +__description("bits copy 32")
> > > > > +__success __retval(10)
> > > > > +int bits_copy32(void)
> > > > > +{
> > > > > + /* 21 bits: --------------------- */
> > > > > + u32 data = 0b11111101111101111100001000100101U;
> > > >
> > > > if you define this bit mask as an array of bytes, then you won't have
> > > > to handle big-endian in the tests at all
> > >
> > > This test case provides a clear example of iterating over data of type
> > > u32, offering valuable guidance for users who need to perform such
> > > iterations.
> > >
> > > >
> > > >
> > > > > + int nr = 0, offset = 0;
> > > > > + int *bit;
> > > > > +
> > > > > +#if defined(__TARGET_ARCH_s390)
> > > > > + offset = sizeof(u32) - (21 + 7) / 8;
> > > > > +#endif
> > > > > + bpf_for_each(bits, bit, ((char *)&data) + offset, 21)
> > > > > + nr++;
> > > > > + return nr;
> > > > > +}
> > > > > +
> > > > > +SEC("syscall")
> > > > > +__description("bits copy 64")
> > > > > +__success __retval(18)
> > > > > +int bits_copy64(void)
> > > > > +{
> > > > > + /* 34 bits: ~-------- */
> > > > > + u64 data = 0xffffefdf0f0f0f0fUL;
> > > > > + int nr = 0, offset = 0;
> > > > > + int *bit;
> > > > > +
> > > > > +#if defined(__TARGET_ARCH_s390)
> > > > > + offset = sizeof(u64) - (34 + 7) / 8;
> > > > > +#endif
> > > > > +
> > > > > + bpf_for_each(bits, bit, ((char *)&data) + offset, 34)
> > > >
> > > > see above about byte array, but if we define different (not as byte
> > > > array but long[]), it would be cleaner to have
> > >
> > > This test case demonstrates how to iterate over data of type u64.
> > >
> > > >
> > > > #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
> > > > u64 data = 0x......UL;
> > > > #else
> > > > u64 data = 0x......UL;
> > > > #endif
> > >
> > > looks good.
> > >
> >
> > Please hold off on sending a new revision until we figure out what the
> > contract should be. Because I feel like it's a (relatively) big
> > decision whether a bit mask is treated as an array of bytes or as an
> > array of longs. For little-endian it makes no difference, but for
> > big-endian it's a big difference and has usability and performance
> > implications.
>
> Perhaps it would be advantageous to define the interface as follows:
>
> bpf_iter_bits_new(struct bpf_iter_bits *it, const u64
> *unsafe_ptr__ign, u32 words)
>
> This approach eliminates the need to account for endianness.
I don't mind that, if others don't have any opinion. Let's just
document that by "words" we mean 8-byte integers.
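
For reference, the two candidate contracts discussed in this thread can
be written down as a short userspace sketch. All three helper names
below are hypothetical and none of this is the kernel implementation.
The point is only that "bit i" means different memory locations under
the byte contract and the word contract when the host is big-endian,
and that assembling the word from bytes explicitly makes the two agree
everywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Word contract: bit i is bit (i % 64) of the 8-byte integer words[i / 64]. */
static int test_bit_words(const uint64_t *words, unsigned int i)
{
	assert(words != NULL);
	return (int)((words[i / 64] >> (i % 64)) & 1);
}

/* Byte contract: bit i is bit (i % 8) of bytes[i / 8]. */
static int test_bit_bytes(const uint8_t *bytes, unsigned int i)
{
	assert(bytes != NULL);
	return (bytes[i / 8] >> (i % 8)) & 1;
}

/*
 * Build one word from 8 bytes, least significant byte first, so that the
 * word contract numbers bits exactly like the byte contract on any host.
 */
static uint64_t word_from_bytes(const uint8_t bytes[8])
{
	uint64_t w = 0;
	unsigned int k;

	for (k = 0; k < 8; k++)
		w |= (uint64_t)bytes[k] << (8 * k);
	return w;
}
```

On a little-endian host, casting a byte array to a u64 pointer gives the
same numbering as word_from_bytes(); on a big-endian host it does not,
which is why the documentation should state that "words" are 8-byte
integers in host byte order.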
>
> --
> Regards
> Yafang
Thread overview: 11+ messages
2024-05-06 3:33 [PATCH v7 bpf-next 0/2] bpf: Add a generic bits iterator Yafang Shao
2024-05-06 3:33 ` [PATCH v7 bpf-next 1/2] bpf: Add " Yafang Shao
2024-05-07 3:38 ` Andrii Nakryiko
2024-05-07 13:32 ` Yafang Shao
2024-05-07 17:09 ` Andrii Nakryiko
2024-05-06 3:33 ` [PATCH v7 bpf-next 2/2] selftests/bpf: Add selftest for bits iter Yafang Shao
2024-05-07 3:42 ` Andrii Nakryiko
2024-05-07 13:38 ` Yafang Shao
2024-05-07 17:11 ` Andrii Nakryiko
2024-05-09 2:11 ` Yafang Shao
2024-05-09 22:03 ` Andrii Nakryiko