* [PATCH bpf-next v3 0/3] bpf: Add kfuncs for read-only string operations
@ 2025-03-24 12:03 Viktor Malik
2025-03-24 12:03 ` [PATCH bpf-next v3 1/3] " Viktor Malik
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Viktor Malik @ 2025-03-24 12:03 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Mykola Lysenko, Shuah Khan, Viktor Malik
String operations are commonly used in programming and BPF programs are
no exception. Since it is cumbersome to reimplement them over and over,
this series introduce kfuncs which provide the most common operations.
For now, we only limit ourselves to functions which do not copy memory
since these usually introduce undefined behaviour in case the
source/destination buffers overlap which would have to be prevented by
the verifier.
The kernel already contains implementations for all of these, however,
it is not possible to use them from BPF context. The main reason is that
the verifier is not able to check that it is safe to access the entire
string if the string size is not passed to the kfunc. Therefore, the
unbounded variants of the operations are open-coded with
__get_kernel_nofault used instead of plain dereference to make them
safe.
On the contrary, safety of the bounded variants can be checked by the
verifier so we reuse the kernel implementations which are sometimes
highly optimized in assembly.
The last patch of the series adds a benchmark for comparing performance
of the bounded and unbounded variants which shows that on architectures
with optimized bounded string functions (e.g. strnlen on arm64), the
performance benefit can be significant (140% for 4095B strings).
Changes in v3:
- Open-code unbounded variants with __get_kernel_nofault instead of
dereference (suggested by Alexei).
- Use the __sz suffix for size parameters in bounded variants (suggested
by Eduard and Alexei).
- Make tests more compact (suggested by Eduard).
- Add benchmark.
Viktor Malik (3):
bpf: Add kfuncs for read-only string operations
selftests/bpf: Add tests for string kfuncs
selftests/bpf: Add benchmark for bounded/unbounded string kfuncs
kernel/bpf/helpers.c | 299 ++++++++++++++++++
tools/testing/selftests/bpf/Makefile | 2 +
tools/testing/selftests/bpf/bench.c | 21 ++
.../bpf/benchs/bench_string_kfuncs.c | 259 +++++++++++++++
.../bpf/benchs/run_bench_string_kfuncs.sh | 34 ++
.../selftests/bpf/prog_tests/string_kfuncs.c | 10 +
.../selftests/bpf/progs/string_kfuncs.c | 58 ++++
.../selftests/bpf/progs/string_kfuncs_bench.c | 88 ++++++
8 files changed, 771 insertions(+)
create mode 100644 tools/testing/selftests/bpf/benchs/bench_string_kfuncs.c
create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_string_kfuncs.sh
create mode 100644 tools/testing/selftests/bpf/prog_tests/string_kfuncs.c
create mode 100644 tools/testing/selftests/bpf/progs/string_kfuncs.c
create mode 100644 tools/testing/selftests/bpf/progs/string_kfuncs_bench.c
--
2.48.1
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH bpf-next v3 1/3] bpf: Add kfuncs for read-only string operations
2025-03-24 12:03 [PATCH bpf-next v3 0/3] bpf: Add kfuncs for read-only string operations Viktor Malik
@ 2025-03-24 12:03 ` Viktor Malik
2025-03-28 22:48 ` Andrii Nakryiko
2025-03-24 12:03 ` [PATCH bpf-next v3 2/3] selftests/bpf: Add tests for string kfuncs Viktor Malik
2025-03-24 12:03 ` [PATCH bpf-next v3 3/3] selftests/bpf: Add benchmark for bounded/unbounded " Viktor Malik
2 siblings, 1 reply; 9+ messages in thread
From: Viktor Malik @ 2025-03-24 12:03 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Mykola Lysenko, Shuah Khan, Viktor Malik
String operations are commonly used so this exposes the most common ones
to BPF programs. For now, we limit ourselves to operations which do not
copy memory around.
Unfortunately, most in-kernel implementations assume that strings are
%NUL-terminated, which is not necessarily true, and therefore we cannot
use them directly in BPF context. So, we use distinct approaches for
bounded and unbounded variants of string operations:
- Unbounded variants are open-coded with using __get_kernel_nofault
instead of plain dereference to make them safe.
- Bounded variants use params with the __sz suffix so safety is assured
by the verifier and we can use the in-kernel (potentially optimized)
functions.
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Viktor Malik <vmalik@redhat.com>
---
kernel/bpf/helpers.c | 299 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 299 insertions(+)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 5449756ba102..6f6af4289cd0 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0-only
/* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com
*/
+#include "linux/uaccess.h"
#include <linux/bpf.h>
#include <linux/btf.h>
#include <linux/bpf-cgroup.h>
@@ -3193,6 +3194,291 @@ __bpf_kfunc void bpf_local_irq_restore(unsigned long *flags__irq_flag)
local_irq_restore(*flags__irq_flag);
}
+/* Kfuncs for string operations.
+ *
+ * Since strings are not necessarily %NUL-terminated, we cannot directly call
+ * in-kernel implementations. Instead, unbounded variants are open-coded with
+ * using __get_kernel_nofault instead of plain dereference to make them safe.
+ * Bounded variants use params with the __sz suffix so safety is assured by the
+ * verifier and we can use the in-kernel (potentially optimized) functions.
+ */
+
+/**
+ * bpf_strcmp - Compare two strings
+ * @cs: One string
+ * @ct: Another string
+ */
+__bpf_kfunc int bpf_strcmp(const char *cs, const char *ct)
+{
+ int i = 0, ret = 0;
+ char c1, c2;
+
+ pagefault_disable();
+ while (i++ < XATTR_SIZE_MAX) {
+ __get_kernel_nofault(&c1, cs++, char, cs_out);
+ __get_kernel_nofault(&c2, ct++, char, ct_out);
+ if (c1 != c2) {
+ ret = c1 < c2 ? -1 : 1;
+ goto out;
+ }
+ if (!c1)
+ goto out;
+ }
+cs_out:
+ ret = -1;
+ goto out;
+ct_out:
+ ret = 1;
+out:
+ pagefault_enable();
+ return ret;
+}
+
+/**
+ * bpf_strchr - Find the first occurrence of a character in a string
+ * @s: The string to be searched
+ * @c: The character to search for
+ *
+ * Note that the %NUL-terminator is considered part of the string, and can
+ * be searched for.
+ */
+__bpf_kfunc char *bpf_strchr(const char *s, int c)
+{
+ char *ret = NULL;
+ int i = 0;
+ char sc;
+
+ pagefault_disable();
+ while (i++ < XATTR_SIZE_MAX) {
+ __get_kernel_nofault(&sc, s, char, out);
+ if (sc == (char)c) {
+ ret = (char *)s;
+ break;
+ }
+ if (sc == '\0')
+ break;
+ s++;
+ }
+out:
+ pagefault_enable();
+ return ret;
+}
+
+/**
+ * bpf_strchrnul - Find and return a character in a string, or end of string
+ * @s: The string to be searched
+ * @c: The character to search for
+ *
+ * Returns pointer to first occurrence of 'c' in s. If c is not found, then
+ * return a pointer to the null byte at the end of s.
+ */
+__bpf_kfunc char *bpf_strchrnul(const char *s, int c)
+{
+ char *ret = NULL;
+ int i = 0;
+ char sc;
+
+ pagefault_disable();
+ while (i++ < XATTR_SIZE_MAX) {
+ __get_kernel_nofault(&sc, s, char, out);
+ if (sc == '\0' || sc == (char)c) {
+ ret = (char *)s;
+ break;
+ }
+ s++;
+ }
+out:
+ pagefault_enable();
+ return ret;
+}
+
+/**
+ * bpf_strnchr - Find a character in a length limited string
+ * @s: The string to be searched
+ * @s__sz: The number of characters to be searched
+ * @c: The character to search for
+ *
+ * Note that the %NUL-terminator is considered part of the string, and can
+ * be searched for.
+ */
+__bpf_kfunc char *bpf_strnchr(void *s, u32 s__sz, int c)
+{
+ return strnchr(s, s__sz, c);
+}
+
+/**
+ * bpf_strnchrnul - Find and return a character in a length limited string,
+ * or end of string
+ * @s: The string to be searched
+ * @s__sz: The number of characters to be searched
+ * @c: The character to search for
+ *
+ * Returns pointer to the first occurrence of 'c' in s. If c is not found,
+ * then return a pointer to the last character of the string.
+ */
+__bpf_kfunc char *bpf_strnchrnul(void *s, u32 s__sz, int c)
+{
+ return strnchrnul(s, s__sz, c);
+}
+
+/**
+ * bpf_strrchr - Find the last occurrence of a character in a string
+ * @s: The string to be searched
+ * @c: The character to search for
+ */
+__bpf_kfunc char *bpf_strrchr(const char *s, int c)
+{
+ char *ret = NULL;
+ int i = 0;
+ char sc;
+
+ pagefault_disable();
+ while (i++ < XATTR_SIZE_MAX) {
+ __get_kernel_nofault(&sc, s, char, out);
+ if (sc == '\0')
+ break;
+ if (sc == (char)c)
+ ret = (char *)s;
+ s++;
+ }
+out:
+ pagefault_enable();
+ return (char *)ret;
+}
+
+__bpf_kfunc size_t bpf_strlen(const char *s)
+{
+ int i = 0;
+ char c;
+
+ pagefault_disable();
+ while (i < XATTR_SIZE_MAX) {
+ __get_kernel_nofault(&c, s++, char, out);
+ if (c == '\0')
+ break;
+ i++;
+ }
+out:
+ pagefault_enable();
+ return i;
+}
+
+__bpf_kfunc size_t bpf_strnlen(void *s, u32 s__sz)
+{
+ return strnlen(s, s__sz);
+}
+
+/**
+ * bpf_strspn - Calculate the length of the initial substring of @s which only contain letters in @accept
+ * @s: The string to be searched
+ * @accept: The string to search for
+ */
+__bpf_kfunc size_t bpf_strspn(const char *s, const char *accept)
+{
+ int i = 0;
+ char c;
+
+ pagefault_disable();
+ while (i < XATTR_SIZE_MAX) {
+ __get_kernel_nofault(&c, s++, char, out);
+ if (c == '\0' || !bpf_strchr(accept, c))
+ break;
+ i++;
+ }
+out:
+ pagefault_enable();
+ return i;
+}
+
+/**
+ * strcspn - Calculate the length of the initial substring of @s which does not contain letters in @reject
+ * @s: The string to be searched
+ * @reject: The string to avoid
+ */
+__bpf_kfunc size_t bpf_strcspn(const char *s, const char *reject)
+{
+ int i = 0;
+ char c;
+
+ pagefault_disable();
+ while (i < XATTR_SIZE_MAX) {
+ __get_kernel_nofault(&c, s++, char, out);
+ if (c == '\0' || bpf_strchr(reject, c))
+ break;
+ i++;
+ }
+out:
+ pagefault_enable();
+ return i;
+}
+
+/**
+ * bpf_strpbrk - Find the first occurrence of a set of characters
+ * @cs: The string to be searched
+ * @ct: The characters to search for
+ */
+__bpf_kfunc char *bpf_strpbrk(const char *cs, const char *ct)
+{
+ char *ret = NULL;
+ int i = 0;
+ char c;
+
+ pagefault_disable();
+ while (i++ < XATTR_SIZE_MAX) {
+ __get_kernel_nofault(&c, cs, char, out);
+ if (c == '\0')
+ break;
+ if (bpf_strchr(ct, c)) {
+ ret = (char *)cs;
+ break;
+ }
+ cs++;
+ }
+out:
+ pagefault_enable();
+ return ret;
+}
+
+/**
+ * bpf_strstr - Find the first substring in a %NUL terminated string
+ * @s1: The string to be searched
+ * @s2: The string to search for
+ */
+__bpf_kfunc char *bpf_strstr(const char *s1, const char *s2)
+{
+ size_t l1, l2;
+
+ l2 = bpf_strlen(s2);
+ if (!l2)
+ return (char *)s1;
+ l1 = bpf_strlen(s1);
+ while (l1 >= l2) {
+ l1--;
+ if (!memcmp(s1, s2, l2))
+ return (char *)s1;
+ s1++;
+ }
+ return NULL;
+}
+
+/**
+ * bpf_strnstr - Find the first substring in a length-limited string
+ * @s1: The string to be searched
+ * @s1__sz: The size of @s1
+ * @s2: The string to search for
+ * @s2__sz: The size of @s2
+ */
+__bpf_kfunc char *bpf_strnstr(void *s1, u32 s1__sz, void *s2, u32 s2__sz)
+{
+ /* strnstr() uses strlen() to get the length of s2. Since this is not
+ * safe in BPF context for non-%NUL-terminated strings, use strnlen
+ * first to make it safe.
+ */
+ if (strnlen(s2, s2__sz) == s2__sz)
+ return NULL;
+ return strnstr(s1, s2, s1__sz);
+}
+
__bpf_kfunc_end_defs();
BTF_KFUNCS_START(generic_btf_ids)
@@ -3293,6 +3579,19 @@ BTF_ID_FLAGS(func, bpf_iter_kmem_cache_next, KF_ITER_NEXT | KF_RET_NULL | KF_SLE
BTF_ID_FLAGS(func, bpf_iter_kmem_cache_destroy, KF_ITER_DESTROY | KF_SLEEPABLE)
BTF_ID_FLAGS(func, bpf_local_irq_save)
BTF_ID_FLAGS(func, bpf_local_irq_restore)
+BTF_ID_FLAGS(func, bpf_strcmp);
+BTF_ID_FLAGS(func, bpf_strchr);
+BTF_ID_FLAGS(func, bpf_strchrnul);
+BTF_ID_FLAGS(func, bpf_strnchr);
+BTF_ID_FLAGS(func, bpf_strnchrnul);
+BTF_ID_FLAGS(func, bpf_strrchr);
+BTF_ID_FLAGS(func, bpf_strlen);
+BTF_ID_FLAGS(func, bpf_strnlen);
+BTF_ID_FLAGS(func, bpf_strspn);
+BTF_ID_FLAGS(func, bpf_strcspn);
+BTF_ID_FLAGS(func, bpf_strpbrk);
+BTF_ID_FLAGS(func, bpf_strstr);
+BTF_ID_FLAGS(func, bpf_strnstr);
BTF_KFUNCS_END(common_btf_ids)
static const struct btf_kfunc_id_set common_kfunc_set = {
--
2.48.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH bpf-next v3 2/3] selftests/bpf: Add tests for string kfuncs
2025-03-24 12:03 [PATCH bpf-next v3 0/3] bpf: Add kfuncs for read-only string operations Viktor Malik
2025-03-24 12:03 ` [PATCH bpf-next v3 1/3] " Viktor Malik
@ 2025-03-24 12:03 ` Viktor Malik
2025-03-25 8:33 ` Jiri Olsa
2025-03-24 12:03 ` [PATCH bpf-next v3 3/3] selftests/bpf: Add benchmark for bounded/unbounded " Viktor Malik
2 siblings, 1 reply; 9+ messages in thread
From: Viktor Malik @ 2025-03-24 12:03 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Mykola Lysenko, Shuah Khan, Viktor Malik
The tests use the RUN_TESTS helper which executes BPF programs with
BPF_PROG_TEST_RUN and check for the expected return value.
Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Viktor Malik <vmalik@redhat.com>
---
.../selftests/bpf/prog_tests/string_kfuncs.c | 10 ++++
.../selftests/bpf/progs/string_kfuncs.c | 58 +++++++++++++++++++
2 files changed, 68 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/string_kfuncs.c
create mode 100644 tools/testing/selftests/bpf/progs/string_kfuncs.c
diff --git a/tools/testing/selftests/bpf/prog_tests/string_kfuncs.c b/tools/testing/selftests/bpf/prog_tests/string_kfuncs.c
new file mode 100644
index 000000000000..79dab172eb92
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/string_kfuncs.c
@@ -0,0 +1,10 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (C) 2025 Red Hat, Inc.*/
+#include <test_progs.h>
+#include "string_kfuncs.skel.h"
+
+void test_string_kfuncs(void)
+{
+ RUN_TESTS(string_kfuncs);
+}
+
diff --git a/tools/testing/selftests/bpf/progs/string_kfuncs.c b/tools/testing/selftests/bpf/progs/string_kfuncs.c
new file mode 100644
index 000000000000..9fb1ed5ba1fa
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/string_kfuncs.c
@@ -0,0 +1,58 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (C) 2025 Red Hat, Inc.*/
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+int bpf_strcmp(const char *cs, const char *ct) __ksym;
+char *bpf_strchr(const char *s, int c) __ksym;
+char *bpf_strchrnul(const char *s, int c) __ksym;
+char *bpf_strnchr(void *s, u32 s__sz, int c) __ksym;
+char *bpf_strnchrnul(void *s, u32 s__sz, int c) __ksym;
+char *bpf_strrchr(const char *s, int c) __ksym;
+size_t bpf_strlen(const char *s) __ksym;
+size_t bpf_strnlen(void *s, u32 s__sz) __ksym;
+size_t bpf_strspn(const char *s, const char *accept) __ksym;
+size_t bpf_strcspn(const char *s, const char *reject) __ksym;
+char *bpf_strpbrk(const char *cs, const char *ct) __ksym;
+char *bpf_strstr(const char *s1, const char *s2) __ksym;
+char *bpf_strstr(const char *s1, const char *s2) __ksym;
+char *bpf_strnstr(void *s1, u32 s1__sz, void *s2, u32 s2__sz) __ksym;
+
+char str1[] = "hello world";
+char str2[] = "hello";
+char str3[] = "world";
+char str4[] = "abc";
+char str5[] = "";
+
+#define __test(retval) SEC("syscall") __success __retval(retval)
+
+__test(0) int test_strcmp_eq(void *ctx) { return bpf_strcmp(str1, str1); }
+__test(1) int test_strcmp_neq(void *ctx) { return bpf_strcmp(str1, str2); }
+__test(1) int test_strchr_found(void *ctx) { return bpf_strchr(str1, 'e') - str1; }
+__test(11) int test_strchr_null(void *ctx) { return bpf_strchr(str1, '\0') - str1; }
+__test(0) u64 test_strchr_notfound(void *ctx) { return (u64)bpf_strchr(str1, 'x'); }
+__test(1) int test_strchrnul_found(void *ctx) { return bpf_strchrnul(str1, 'e') - str1; }
+__test(11) int test_strchrnul_notfound(void *ctx) { return bpf_strchrnul(str1, 'x') - str1; }
+__test(1) int test_strnchr_found(void *ctx) { return bpf_strnchr(str1, 5, 'e') - str1; }
+__test(11) int test_strnchr_null(void *ctx) { return bpf_strnchr(str1, 12, '\0') - str1; }
+__test(0) u64 test_strnchr_notfound(void *ctx) { return (u64)bpf_strnchr(str1, 5, 'w'); }
+__test(1) int test_strnchrnul_found(void *ctx) { return bpf_strnchrnul(str1, 5, 'e') - str1; }
+__test(11) int test_strnchrnul_notfound(void *ctx) { return bpf_strnchrnul(str1, 12, 'x') - str1; }
+__test(9) int test_strrchr_found(void *ctx) { return bpf_strrchr(str1, 'l') - str1; }
+__test(0) u64 test_strrchr_notfound(void *ctx) { return (u64)bpf_strrchr(str1, 'x'); }
+__test(11) size_t test_strlen(void *ctx) { return bpf_strlen(str1); }
+__test(11) size_t test_strnlen(void *ctx) { return bpf_strnlen(str1, 12); }
+__test(5) size_t test_strspn(void *ctx) { return bpf_strspn(str1, str2); }
+__test(2) size_t test_strcspn(void *ctx) { return bpf_strcspn(str1, str3); }
+__test(2) int test_strpbrk_found(void *ctx) { return bpf_strpbrk(str1, str3) - str1; }
+__test(0) u64 test_strpbrk_notfound(void *ctx) { return (u64)bpf_strpbrk(str1, str4); }
+__test(6) int test_strstr_found(void *ctx) { return bpf_strstr(str1, str3) - str1; }
+__test(0) u64 test_strstr_notfound(void *ctx) { return (u64)bpf_strstr(str1, str4); }
+__test(0) int test_strstr_empty(void *ctx) { return bpf_strstr(str1, str5) - str1; }
+__test(6) int test_strnstr_found(void *ctx) { return bpf_strnstr(str1, 12, str3, 6) - str1; }
+__test(0) u64 test_strnstr_unsafe(void *ctx) { return (u64)bpf_strnstr(str1, 5, str3, 5); }
+__test(0) u64 test_strnstr_notfound(void *ctx) { return (u64)bpf_strnstr(str1, 12, str4, 4); }
+__test(0) int test_strnstr_empty(void *ctx) { return bpf_strnstr(str1, 5, str5, 1) - str1; }
+
+char _license[] SEC("license") = "GPL";
--
2.48.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH bpf-next v3 3/3] selftests/bpf: Add benchmark for bounded/unbounded string kfuncs
2025-03-24 12:03 [PATCH bpf-next v3 0/3] bpf: Add kfuncs for read-only string operations Viktor Malik
2025-03-24 12:03 ` [PATCH bpf-next v3 1/3] " Viktor Malik
2025-03-24 12:03 ` [PATCH bpf-next v3 2/3] selftests/bpf: Add tests for string kfuncs Viktor Malik
@ 2025-03-24 12:03 ` Viktor Malik
2 siblings, 0 replies; 9+ messages in thread
From: Viktor Malik @ 2025-03-24 12:03 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Mykola Lysenko, Shuah Khan, Viktor Malik
Add a new benchmark using the existing bench infrastructure which
compares performance of bounded and unbounded string kfuncs added in the
previous commits.
Running on x86_64 and arm64, the most significant difference is in the
strlen/strnlen and strstr/strnstr comparisons on arm64:
strlen/strnlen
==============
strlen-1 0.453 ± 0.002M/s (drops 0.000 ± 0.000M/s)
strnlen-1 0.470 ± 0.006M/s (drops 0.000 ± 0.000M/s)
strlen-8 0.459 ± 0.011M/s (drops 0.000 ± 0.000M/s)
strnlen-8 0.451 ± 0.006M/s (drops 0.000 ± 0.000M/s)
strlen-64 0.439 ± 0.007M/s (drops 0.000 ± 0.000M/s)
strnlen-64 0.455 ± 0.006M/s (drops 0.000 ± 0.000M/s)
strlen-512 0.359 ± 0.006M/s (drops 0.000 ± 0.000M/s)
strnlen-512 0.441 ± 0.007M/s (drops 0.000 ± 0.000M/s)
strlen-2048 0.232 ± 0.003M/s (drops 0.000 ± 0.000M/s)
strnlen-2048 0.403 ± 0.005M/s (drops 0.000 ± 0.000M/s)
strlen-4095 0.151 ± 0.001M/s (drops 0.000 ± 0.000M/s)
strnlen-4095 0.362 ± 0.005M/s (drops 0.000 ± 0.000M/s)
strstr/strnstr
==============
strstr-8 0.452 ± 0.005M/s (drops 0.000 ± 0.000M/s)
strnstr-8 0.442 ± 0.006M/s (drops 0.000 ± 0.000M/s)
strstr-64 0.390 ± 0.004M/s (drops 0.000 ± 0.000M/s)
strnstr-64 0.400 ± 0.004M/s (drops 0.000 ± 0.000M/s)
strstr-512 0.228 ± 0.003M/s (drops 0.000 ± 0.000M/s)
strnstr-512 0.256 ± 0.002M/s (drops 0.000 ± 0.000M/s)
strstr-2048 0.095 ± 0.001M/s (drops 0.000 ± 0.000M/s)
strnstr-2048 0.113 ± 0.001M/s (drops 0.000 ± 0.000M/s)
strstr-4095 0.052 ± 0.001M/s (drops 0.000 ± 0.000M/s)
strnstr-4095 0.064 ± 0.001M/s (drops 0.000 ± 0.000M/s)
For strings longer than 64B, the unbounded variants are notably faster,
having as much as 140% performance gain over the bounded variants
(strncmp for strings of length 4095). The reason is that arm64 has an
optimized implementation of strnlen in assembly which is also used
inside strnstr.
On x86_64, which doesn't have any optimized string operations, there is
still an observable difference in strlen/strnlen and strstr/strnstr,
albeit much smaller than for arm64:
strlen/strnlen
==============
strlen-1 7.021 ± 0.036M/s (drops 0.000 ± 0.000M/s)
strnlen-1 7.000 ± 0.038M/s (drops 0.000 ± 0.000M/s)
strlen-8 6.837 ± 0.011M/s (drops 0.000 ± 0.000M/s)
strnlen-8 6.832 ± 0.064M/s (drops 0.000 ± 0.000M/s)
strlen-64 5.638 ± 0.026M/s (drops 0.000 ± 0.000M/s)
strnlen-64 6.010 ± 0.034M/s (drops 0.000 ± 0.000M/s)
strlen-512 3.322 ± 0.011M/s (drops 0.000 ± 0.000M/s)
strnlen-512 3.449 ± 0.014M/s (drops 0.000 ± 0.000M/s)
strlen-2048 1.390 ± 0.007M/s (drops 0.000 ± 0.000M/s)
strnlen-2048 1.429 ± 0.003M/s (drops 0.000 ± 0.000M/s)
strlen-4095 0.786 ± 0.003M/s (drops 0.000 ± 0.000M/s)
strnlen-4095 0.803 ± 0.002M/s (drops 0.000 ± 0.000M/s)
strstr/strnstr
==============
strstr-8 6.031 ± 0.012M/s (drops 0.000 ± 0.000M/s)
strnstr-8 6.322 ± 0.048M/s (drops 0.000 ± 0.000M/s)
strstr-64 3.221 ± 0.054M/s (drops 0.000 ± 0.000M/s)
strnstr-64 3.059 ± 0.025M/s (drops 0.000 ± 0.000M/s)
strstr-512 0.734 ± 0.006M/s (drops 0.000 ± 0.000M/s)
strnstr-512 0.849 ± 0.004M/s (drops 0.000 ± 0.000M/s)
strstr-2048 0.220 ± 0.004M/s (drops 0.000 ± 0.000M/s)
strnstr-2048 0.246 ± 0.002M/s (drops 0.000 ± 0.000M/s)
strstr-4095 0.104 ± 0.000M/s (drops 0.000 ± 0.000M/s)
strnstr-4095 0.122 ± 0.000M/s (drops 0.000 ± 0.000M/s)
The performance gain of the bounded variants on strings over 64B is
3%-6% for strlen/strnlen and 12%-18% for strstr/strnstr. The likely
explanation is that the unbounded variants use __get_kernel_nofault
instead of plain derefence which introduces some small overhead. This
manifests mainly in the above functions as they iterate multiple
strings (i.e. use __get_kernel_nofault more).
For the rest of the functions in the benchmark (strchr/strnchr and
strchrnul/strnchrnul), the performance difference is negligable or
within the bounds of a statistical error, with an exception of
strchr/strnchr on arm64:
strchr/strnchr
==============
strchr-1 0.475 ± 0.010M/s (drops 0.000 ± 0.000M/s)
strnchr-1 0.469 ± 0.008M/s (drops 0.000 ± 0.000M/s)
strchr-8 0.448 ± 0.011M/s (drops 0.000 ± 0.000M/s)
strnchr-8 0.472 ± 0.006M/s (drops 0.000 ± 0.000M/s)
strchr-64 0.432 ± 0.010M/s (drops 0.000 ± 0.000M/s)
strnchr-64 0.445 ± 0.008M/s (drops 0.000 ± 0.000M/s)
strchr-512 0.308 ± 0.003M/s (drops 0.000 ± 0.000M/s)
strnchr-512 0.330 ± 0.005M/s (drops 0.000 ± 0.000M/s)
strchr-2048 0.156 ± 0.002M/s (drops 0.000 ± 0.000M/s)
strnchr-2048 0.186 ± 0.003M/s (drops 0.000 ± 0.000M/s)
strchr-4095 0.094 ± 0.001M/s (drops 0.000 ± 0.000M/s)
strnchr-4095 0.115 ± 0.004M/s (drops 0.000 ± 0.000M/s)
Here, I'm not sure what the reason for the performance benefit is,
possibly a combination of compiler optimizations and
__get_kernel_nofault overhead.
Signed-off-by: Viktor Malik <vmalik@redhat.com>
---
tools/testing/selftests/bpf/Makefile | 2 +
tools/testing/selftests/bpf/bench.c | 21 ++
.../bpf/benchs/bench_string_kfuncs.c | 259 ++++++++++++++++++
.../bpf/benchs/run_bench_string_kfuncs.sh | 34 +++
.../selftests/bpf/progs/string_kfuncs_bench.c | 88 ++++++
5 files changed, 404 insertions(+)
create mode 100644 tools/testing/selftests/bpf/benchs/bench_string_kfuncs.c
create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_string_kfuncs.sh
create mode 100644 tools/testing/selftests/bpf/progs/string_kfuncs_bench.c
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index ca41d47d4ba6..d04f7e78c8ab 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -813,6 +813,7 @@ $(OUTPUT)/bench_local_storage_create.o: $(OUTPUT)/bench_local_storage_create.ske
$(OUTPUT)/bench_bpf_hashmap_lookup.o: $(OUTPUT)/bpf_hashmap_lookup.skel.h
$(OUTPUT)/bench_htab_mem.o: $(OUTPUT)/htab_mem_bench.skel.h
$(OUTPUT)/bench_bpf_crypto.o: $(OUTPUT)/crypto_bench.skel.h
+$(OUTPUT)/bench_string_kfuncs.o: $(OUTPUT)/string_kfuncs_bench.skel.h
$(OUTPUT)/bench.o: bench.h testing_helpers.h $(BPFOBJ)
$(OUTPUT)/bench: LDLIBS += -lm
$(OUTPUT)/bench: $(OUTPUT)/bench.o \
@@ -833,6 +834,7 @@ $(OUTPUT)/bench: $(OUTPUT)/bench.o \
$(OUTPUT)/bench_local_storage_create.o \
$(OUTPUT)/bench_htab_mem.o \
$(OUTPUT)/bench_bpf_crypto.o \
+ $(OUTPUT)/bench_string_kfuncs.o \
#
$(call msg,BINARY,,$@)
$(Q)$(CC) $(CFLAGS) $(LDFLAGS) $(filter %.a %.o,$^) $(LDLIBS) -o $@
diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c
index 1bd403a5ef7b..5aa7f63436f6 100644
--- a/tools/testing/selftests/bpf/bench.c
+++ b/tools/testing/selftests/bpf/bench.c
@@ -283,6 +283,7 @@ extern struct argp bench_local_storage_create_argp;
extern struct argp bench_htab_mem_argp;
extern struct argp bench_trigger_batch_argp;
extern struct argp bench_crypto_argp;
+extern struct argp bench_string_kfuncs_argp;
static const struct argp_child bench_parsers[] = {
{ &bench_ringbufs_argp, 0, "Ring buffers benchmark", 0 },
@@ -297,6 +298,7 @@ static const struct argp_child bench_parsers[] = {
{ &bench_htab_mem_argp, 0, "hash map memory benchmark", 0 },
{ &bench_trigger_batch_argp, 0, "BPF triggering benchmark", 0 },
{ &bench_crypto_argp, 0, "bpf crypto benchmark", 0 },
+ { &bench_string_kfuncs_argp, 0, "string kfuncs benchmark", 0 },
{},
};
@@ -550,6 +552,16 @@ extern const struct bench bench_htab_mem;
extern const struct bench bench_crypto_encrypt;
extern const struct bench bench_crypto_decrypt;
+/* string kfunc benchmarks */
+extern const struct bench bench_string_kfuncs_strlen;
+extern const struct bench bench_string_kfuncs_strnlen;
+extern const struct bench bench_string_kfuncs_strchr;
+extern const struct bench bench_string_kfuncs_strnchr;
+extern const struct bench bench_string_kfuncs_strchrnul;
+extern const struct bench bench_string_kfuncs_strnchrnul;
+extern const struct bench bench_string_kfuncs_strstr;
+extern const struct bench bench_string_kfuncs_strnstr;
+
static const struct bench *benchs[] = {
&bench_count_global,
&bench_count_local,
@@ -609,6 +621,15 @@ static const struct bench *benchs[] = {
&bench_htab_mem,
&bench_crypto_encrypt,
&bench_crypto_decrypt,
+ /* string kfuncs */
+ &bench_string_kfuncs_strlen,
+ &bench_string_kfuncs_strnlen,
+ &bench_string_kfuncs_strchr,
+ &bench_string_kfuncs_strnchr,
+ &bench_string_kfuncs_strchrnul,
+ &bench_string_kfuncs_strnchrnul,
+ &bench_string_kfuncs_strstr,
+ &bench_string_kfuncs_strnstr,
};
static void find_benchmark(void)
diff --git a/tools/testing/selftests/bpf/benchs/bench_string_kfuncs.c b/tools/testing/selftests/bpf/benchs/bench_string_kfuncs.c
new file mode 100644
index 000000000000..a2e11af092ce
--- /dev/null
+++ b/tools/testing/selftests/bpf/benchs/bench_string_kfuncs.c
@@ -0,0 +1,259 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (C) 2025. Red Hat, Inc. */
+#include <argp.h>
+#include "bench.h"
+#include "string_kfuncs_bench.skel.h"
+
+static struct string_kfuncs_ctx {
+ struct string_kfuncs_bench *skel;
+} ctx;
+
+static struct string_kfuncs_args {
+ u32 str_len;
+} args = {
+ .str_len = 32,
+};
+
+enum {
+ ARG_STR_LEN = 5000,
+};
+
+static const struct argp_option opts[] = {
+ { "str-len", ARG_STR_LEN, "STR_LEN", 0, "Set the length of string(s)" },
+ {},
+};
+
+static error_t string_kfuncs_parse_arg(int key, char *arg, struct argp_state *state)
+{
+ switch (key) {
+ case ARG_STR_LEN:
+ args.str_len = strtoul(arg, NULL, 10);
+ if (!args.str_len ||
+ args.str_len >= sizeof(ctx.skel->bss->str)) {
+ fprintf(stderr, "Invalid str len (limit %zu)\n",
+ sizeof(ctx.skel->bss->str) - 1);
+ argp_usage(state);
+ }
+ break;
+ default:
+ return ARGP_ERR_UNKNOWN;
+ }
+
+ return 0;
+}
+
+const struct argp bench_string_kfuncs_argp = {
+ .options = opts,
+ .parser = string_kfuncs_parse_arg,
+};
+
+static void string_kfuncs_validate(void)
+{
+ if (env.consumer_cnt != 0) {
+ fprintf(stderr, "string_kfuncs benchmark doesn't support consumer!\n");
+ exit(1);
+ }
+}
+
+static void string_kfuncs_setup(void)
+{
+ int err;
+ char *str;
+ size_t i, sz, quarter;
+
+ sz = sizeof(ctx.skel->bss->str);
+ if (!sz) {
+ fprintf(stderr, "invalid string size (%zu)\n", sz);
+ exit(1);
+ }
+
+ setup_libbpf();
+
+ ctx.skel = string_kfuncs_bench__open();
+ if (!ctx.skel) {
+ fprintf(stderr, "failed to open skeleton\n");
+ exit(1);
+ }
+
+ /* Fill str with random digits 1-9 */
+ srandom(time(NULL));
+ str = ctx.skel->bss->str;
+ for (i = 0; i < args.str_len - 1; i++)
+ str[i] = '1' + random() % 9;
+
+ /* For strchr and variants - set the last character to '0' */
+ str[args.str_len - 1] = '0';
+ str[args.str_len] = '\0';
+
+ /* For strstr and variants - copy the last quarter of str to substr */
+ quarter = args.str_len / 4;
+ memcpy(ctx.skel->bss->substr, str + args.str_len - quarter, quarter + 1);
+
+ ctx.skel->rodata->str_len = args.str_len;
+
+ err = string_kfuncs_bench__load(ctx.skel);
+ if (err) {
+ fprintf(stderr, "failed to load skeleton\n");
+ string_kfuncs_bench__destroy(ctx.skel);
+ exit(1);
+ }
+}
+
+static void string_kfuncs_attach_prog(struct bpf_program *prog)
+{
+ struct bpf_link *link;
+
+ link = bpf_program__attach(prog);
+ if (!link) {
+ fprintf(stderr, "failed to attach program!\n");
+ exit(1);
+ }
+}
+
+static void string_kfuncs_strlen_setup(void)
+{
+ string_kfuncs_setup();
+ string_kfuncs_attach_prog(ctx.skel->progs.strlen_bench);
+}
+
+static void string_kfuncs_strnlen_setup(void)
+{
+ string_kfuncs_setup();
+ string_kfuncs_attach_prog(ctx.skel->progs.strnlen_bench);
+}
+
+static void string_kfuncs_strchr_setup(void)
+{
+ string_kfuncs_setup();
+ string_kfuncs_attach_prog(ctx.skel->progs.strchr_bench);
+}
+
+static void string_kfuncs_strnchr_setup(void)
+{
+ string_kfuncs_setup();
+ string_kfuncs_attach_prog(ctx.skel->progs.strnchr_bench);
+}
+
+static void string_kfuncs_strchrnul_setup(void)
+{
+ string_kfuncs_setup();
+ string_kfuncs_attach_prog(ctx.skel->progs.strchrnul_bench);
+}
+
+static void string_kfuncs_strnchrnul_setup(void)
+{
+ string_kfuncs_setup();
+ string_kfuncs_attach_prog(ctx.skel->progs.strnchrnul_bench);
+}
+
+static void string_kfuncs_strstr_setup(void)
+{
+ string_kfuncs_setup();
+ string_kfuncs_attach_prog(ctx.skel->progs.strstr_bench);
+}
+
+static void string_kfuncs_strnstr_setup(void)
+{
+ string_kfuncs_setup();
+ string_kfuncs_attach_prog(ctx.skel->progs.strnstr_bench);
+}
+
+static void *string_kfuncs_producer(void *ctx)
+{
+ while (true)
+ (void)syscall(__NR_getpgid);
+ return NULL;
+}
+
+static void string_kfuncs_measure(struct bench_res *res)
+{
+ res->hits = atomic_swap(&ctx.skel->bss->hits, 0);
+}
+
+const struct bench bench_string_kfuncs_strlen = {
+ .name = "string-kfuncs-strlen",
+ .argp = &bench_string_kfuncs_argp,
+ .validate = string_kfuncs_validate,
+ .setup = string_kfuncs_strlen_setup,
+ .producer_thread = string_kfuncs_producer,
+ .measure = string_kfuncs_measure,
+ .report_progress = hits_drops_report_progress,
+ .report_final = hits_drops_report_final,
+};
+
+const struct bench bench_string_kfuncs_strnlen = {
+ .name = "string-kfuncs-strnlen",
+ .argp = &bench_string_kfuncs_argp,
+ .validate = string_kfuncs_validate,
+ .setup = string_kfuncs_strnlen_setup,
+ .producer_thread = string_kfuncs_producer,
+ .measure = string_kfuncs_measure,
+ .report_progress = hits_drops_report_progress,
+ .report_final = hits_drops_report_final,
+};
+
+const struct bench bench_string_kfuncs_strchr = {
+ .name = "string-kfuncs-strchr",
+ .argp = &bench_string_kfuncs_argp,
+ .validate = string_kfuncs_validate,
+ .setup = string_kfuncs_strchr_setup,
+ .producer_thread = string_kfuncs_producer,
+ .measure = string_kfuncs_measure,
+ .report_progress = hits_drops_report_progress,
+ .report_final = hits_drops_report_final,
+};
+
+const struct bench bench_string_kfuncs_strnchr = {
+ .name = "string-kfuncs-strnchr",
+ .argp = &bench_string_kfuncs_argp,
+ .validate = string_kfuncs_validate,
+ .setup = string_kfuncs_strnchr_setup,
+ .producer_thread = string_kfuncs_producer,
+ .measure = string_kfuncs_measure,
+ .report_progress = hits_drops_report_progress,
+ .report_final = hits_drops_report_final,
+};
+
+const struct bench bench_string_kfuncs_strchrnul = {
+ .name = "string-kfuncs-strchrnul",
+ .argp = &bench_string_kfuncs_argp,
+ .validate = string_kfuncs_validate,
+ .setup = string_kfuncs_strchrnul_setup,
+ .producer_thread = string_kfuncs_producer,
+ .measure = string_kfuncs_measure,
+ .report_progress = hits_drops_report_progress,
+ .report_final = hits_drops_report_final,
+};
+
+const struct bench bench_string_kfuncs_strnchrnul = {
+ .name = "string-kfuncs-strnchrnul",
+ .argp = &bench_string_kfuncs_argp,
+ .validate = string_kfuncs_validate,
+ .setup = string_kfuncs_strnchrnul_setup,
+ .producer_thread = string_kfuncs_producer,
+ .measure = string_kfuncs_measure,
+ .report_progress = hits_drops_report_progress,
+ .report_final = hits_drops_report_final,
+};
+
+const struct bench bench_string_kfuncs_strstr = {
+ .name = "string-kfuncs-strstr",
+ .argp = &bench_string_kfuncs_argp,
+ .validate = string_kfuncs_validate,
+ .setup = string_kfuncs_strstr_setup,
+ .producer_thread = string_kfuncs_producer,
+ .measure = string_kfuncs_measure,
+ .report_progress = hits_drops_report_progress,
+ .report_final = hits_drops_report_final,
+};
+
+const struct bench bench_string_kfuncs_strnstr = {
+ .name = "string-kfuncs-strnstr",
+ .argp = &bench_string_kfuncs_argp,
+ .validate = string_kfuncs_validate,
+ .setup = string_kfuncs_strnstr_setup,
+ .producer_thread = string_kfuncs_producer,
+ .measure = string_kfuncs_measure,
+ .report_progress = hits_drops_report_progress,
+ .report_final = hits_drops_report_final,
+};
diff --git a/tools/testing/selftests/bpf/benchs/run_bench_string_kfuncs.sh b/tools/testing/selftests/bpf/benchs/run_bench_string_kfuncs.sh
new file mode 100755
index 000000000000..5e635681cd85
--- /dev/null
+++ b/tools/testing/selftests/bpf/benchs/run_bench_string_kfuncs.sh
@@ -0,0 +1,34 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+source ./benchs/run_common.sh
+
+set -eufo pipefail
+
+header "strlen/strnlen"
+for s in 1 8 64 512 2048 4095; do
+ for b in strlen strnlen; do
+ summarize ${b}-${s} "$($RUN_BENCH --str-len=$s string-kfuncs-${b})"
+ done
+done
+
+header "strchr/strnchr"
+for s in 1 8 64 512 2048 4095; do
+ for b in strchr strnchr; do
+ summarize ${b}-${s} "$($RUN_BENCH --str-len=$s string-kfuncs-${b})"
+ done
+done
+
+header "strchrnul/strnchrnul"
+for s in 1 8 64 512 2048 4095; do
+ for b in strchrnul strnchrnul; do
+ summarize ${b}-${s} "$($RUN_BENCH --str-len=$s string-kfuncs-${b})"
+ done
+done
+
+header "strstr/strnstr"
+for s in 8 64 512 2048 4095; do
+ for b in strstr strnstr; do
+ summarize ${b}-${s} "$($RUN_BENCH --str-len=$s string-kfuncs-${b})"
+ done
+done
diff --git a/tools/testing/selftests/bpf/progs/string_kfuncs_bench.c b/tools/testing/selftests/bpf/progs/string_kfuncs_bench.c
new file mode 100644
index 000000000000..e227c54a5b92
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/string_kfuncs_bench.c
@@ -0,0 +1,88 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (C) 2025. Red Hat, Inc. */
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+#define STR_SZ 4096
+
+size_t bpf_strlen(const char *s) __ksym;
+size_t bpf_strnlen(void *s, u32 s__sz) __ksym;
+char *bpf_strchr(const char *s, int c) __ksym;
+char *bpf_strnchr(void *s, u32 s__sz, int c) __ksym;
+char *bpf_strchrnul(const char *s, int c) __ksym;
+char *bpf_strnchrnul(void *s, u32 s__sz, int c) __ksym;
+char *bpf_strstr(const char *s1, const char *s2) __ksym;
+char *bpf_strnstr(void *s1, u32 s1__sz, void *s2, u32 s2__sz) __ksym;
+
+/* Will be updated by benchmark before program loading */
+const volatile unsigned int str_len = 1;
+long hits = 0;
+char str[STR_SZ];
+char substr[STR_SZ];
+
+char _license[] SEC("license") = "GPL";
+
+SEC("tp/syscalls/sys_enter_getpgid")
+int strlen_bench(void *ctx)
+{
+ if (bpf_strlen(str) > 0)
+ __sync_add_and_fetch(&hits, 1);
+ return 0;
+}
+
+SEC("tp/syscalls/sys_enter_getpgid")
+int strnlen_bench(void *ctx)
+{
+ if (bpf_strnlen(str, str_len + 1) > 0)
+ __sync_add_and_fetch(&hits, 1);
+ return 0;
+}
+
+SEC("tp/syscalls/sys_enter_getpgid")
+int strchr_bench(void *ctx)
+{
+ if (bpf_strchr(str, '0') != NULL)
+ __sync_add_and_fetch(&hits, 1);
+ return 0;
+}
+
+SEC("tp/syscalls/sys_enter_getpgid")
+int strnchr_bench(void *ctx)
+{
+ if (bpf_strnchr(str, str_len + 1, '0') != NULL)
+ __sync_add_and_fetch(&hits, 1);
+ return 0;
+}
+
+SEC("tp/syscalls/sys_enter_getpgid")
+int strchrnul_bench(void *ctx)
+{
+ if (*bpf_strchrnul(str, '0') != '\0')
+ __sync_add_and_fetch(&hits, 1);
+ return 0;
+}
+
+SEC("tp/syscalls/sys_enter_getpgid")
+int strnchrnul_bench(void *ctx)
+{
+ if (*bpf_strnchrnul(str, str_len + 1, '0') != '\0')
+ __sync_add_and_fetch(&hits, 1);
+ return 0;
+}
+
+SEC("tp/syscalls/sys_enter_getpgid")
+int strstr_bench(void *ctx)
+{
+ if (bpf_strstr(str, substr) != NULL)
+ __sync_add_and_fetch(&hits, 1);
+ return 0;
+}
+
+SEC("tp/syscalls/sys_enter_getpgid")
+int strnstr_bench(void *ctx)
+{
+ if (bpf_strnstr(str, str_len + 1, substr, str_len / 4 + 1) != NULL)
+ __sync_add_and_fetch(&hits, 1);
+ return 0;
+}
--
2.48.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH bpf-next v3 2/3] selftests/bpf: Add tests for string kfuncs
2025-03-24 12:03 ` [PATCH bpf-next v3 2/3] selftests/bpf: Add tests for string kfuncs Viktor Malik
@ 2025-03-25 8:33 ` Jiri Olsa
0 siblings, 0 replies; 9+ messages in thread
From: Jiri Olsa @ 2025-03-25 8:33 UTC (permalink / raw)
To: Viktor Malik
Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo,
Mykola Lysenko, Shuah Khan
On Mon, Mar 24, 2025 at 01:03:29PM +0100, Viktor Malik wrote:
> The tests use the RUN_TESTS helper which executes BPF programs with
> BPF_PROG_TEST_RUN and check for the expected return value.
>
> Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
> Signed-off-by: Viktor Malik <vmalik@redhat.com>
> ---
> .../selftests/bpf/prog_tests/string_kfuncs.c | 10 ++++
> .../selftests/bpf/progs/string_kfuncs.c | 58 +++++++++++++++++++
> 2 files changed, 68 insertions(+)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/string_kfuncs.c
> create mode 100644 tools/testing/selftests/bpf/progs/string_kfuncs.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/string_kfuncs.c b/tools/testing/selftests/bpf/prog_tests/string_kfuncs.c
> new file mode 100644
> index 000000000000..79dab172eb92
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/string_kfuncs.c
> @@ -0,0 +1,10 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (C) 2025 Red Hat, Inc.*/
> +#include <test_progs.h>
> +#include "string_kfuncs.skel.h"
> +
> +void test_string_kfuncs(void)
> +{
> + RUN_TESTS(string_kfuncs);
> +}
> +
> diff --git a/tools/testing/selftests/bpf/progs/string_kfuncs.c b/tools/testing/selftests/bpf/progs/string_kfuncs.c
> new file mode 100644
> index 000000000000..9fb1ed5ba1fa
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/string_kfuncs.c
> @@ -0,0 +1,58 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (C) 2025 Red Hat, Inc.*/
> +#include "vmlinux.h"
> +#include <bpf/bpf_helpers.h>
> +#include "bpf_misc.h"
> +
> +int bpf_strcmp(const char *cs, const char *ct) __ksym;
> +char *bpf_strchr(const char *s, int c) __ksym;
> +char *bpf_strchrnul(const char *s, int c) __ksym;
> +char *bpf_strnchr(void *s, u32 s__sz, int c) __ksym;
> +char *bpf_strnchrnul(void *s, u32 s__sz, int c) __ksym;
> +char *bpf_strrchr(const char *s, int c) __ksym;
> +size_t bpf_strlen(const char *s) __ksym;
> +size_t bpf_strnlen(void *s, u32 s__sz) __ksym;
> +size_t bpf_strspn(const char *s, const char *accept) __ksym;
> +size_t bpf_strcspn(const char *s, const char *reject) __ksym;
> +char *bpf_strpbrk(const char *cs, const char *ct) __ksym;
> +char *bpf_strstr(const char *s1, const char *s2) __ksym;
> +char *bpf_strstr(const char *s1, const char *s2) __ksym;
> +char *bpf_strnstr(void *s1, u32 s1__sz, void *s2, u32 s2__sz) __ksym;
hi,
I don't think above declarations are necessary, it should be all
in vmlinux.h already
jirka
> +
> +char str1[] = "hello world";
> +char str2[] = "hello";
> +char str3[] = "world";
> +char str4[] = "abc";
> +char str5[] = "";
> +
> +#define __test(retval) SEC("syscall") __success __retval(retval)
> +
> +__test(0) int test_strcmp_eq(void *ctx) { return bpf_strcmp(str1, str1); }
> +__test(1) int test_strcmp_neq(void *ctx) { return bpf_strcmp(str1, str2); }
> +__test(1) int test_strchr_found(void *ctx) { return bpf_strchr(str1, 'e') - str1; }
> +__test(11) int test_strchr_null(void *ctx) { return bpf_strchr(str1, '\0') - str1; }
> +__test(0) u64 test_strchr_notfound(void *ctx) { return (u64)bpf_strchr(str1, 'x'); }
> +__test(1) int test_strchrnul_found(void *ctx) { return bpf_strchrnul(str1, 'e') - str1; }
> +__test(11) int test_strchrnul_notfound(void *ctx) { return bpf_strchrnul(str1, 'x') - str1; }
> +__test(1) int test_strnchr_found(void *ctx) { return bpf_strnchr(str1, 5, 'e') - str1; }
> +__test(11) int test_strnchr_null(void *ctx) { return bpf_strnchr(str1, 12, '\0') - str1; }
> +__test(0) u64 test_strnchr_notfound(void *ctx) { return (u64)bpf_strnchr(str1, 5, 'w'); }
> +__test(1) int test_strnchrnul_found(void *ctx) { return bpf_strnchrnul(str1, 5, 'e') - str1; }
> +__test(11) int test_strnchrnul_notfound(void *ctx) { return bpf_strnchrnul(str1, 12, 'x') - str1; }
> +__test(9) int test_strrchr_found(void *ctx) { return bpf_strrchr(str1, 'l') - str1; }
> +__test(0) u64 test_strrchr_notfound(void *ctx) { return (u64)bpf_strrchr(str1, 'x'); }
> +__test(11) size_t test_strlen(void *ctx) { return bpf_strlen(str1); }
> +__test(11) size_t test_strnlen(void *ctx) { return bpf_strnlen(str1, 12); }
> +__test(5) size_t test_strspn(void *ctx) { return bpf_strspn(str1, str2); }
> +__test(2) size_t test_strcspn(void *ctx) { return bpf_strcspn(str1, str3); }
> +__test(2) int test_strpbrk_found(void *ctx) { return bpf_strpbrk(str1, str3) - str1; }
> +__test(0) u64 test_strpbrk_notfound(void *ctx) { return (u64)bpf_strpbrk(str1, str4); }
> +__test(6) int test_strstr_found(void *ctx) { return bpf_strstr(str1, str3) - str1; }
> +__test(0) u64 test_strstr_notfound(void *ctx) { return (u64)bpf_strstr(str1, str4); }
> +__test(0) int test_strstr_empty(void *ctx) { return bpf_strstr(str1, str5) - str1; }
> +__test(6) int test_strnstr_found(void *ctx) { return bpf_strnstr(str1, 12, str3, 6) - str1; }
> +__test(0) u64 test_strnstr_unsafe(void *ctx) { return (u64)bpf_strnstr(str1, 5, str3, 5); }
> +__test(0) u64 test_strnstr_notfound(void *ctx) { return (u64)bpf_strnstr(str1, 12, str4, 4); }
> +__test(0) int test_strnstr_empty(void *ctx) { return bpf_strnstr(str1, 5, str5, 1) - str1; }
> +
> +char _license[] SEC("license") = "GPL";
> --
> 2.48.1
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH bpf-next v3 1/3] bpf: Add kfuncs for read-only string operations
2025-03-24 12:03 ` [PATCH bpf-next v3 1/3] " Viktor Malik
@ 2025-03-28 22:48 ` Andrii Nakryiko
2025-04-01 12:48 ` Viktor Malik
0 siblings, 1 reply; 9+ messages in thread
From: Andrii Nakryiko @ 2025-03-28 22:48 UTC (permalink / raw)
To: Viktor Malik
Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Mykola Lysenko, Shuah Khan
On Mon, Mar 24, 2025 at 5:04 AM Viktor Malik <vmalik@redhat.com> wrote:
>
> String operations are commonly used so this exposes the most common ones
> to BPF programs. For now, we limit ourselves to operations which do not
> copy memory around.
>
> Unfortunately, most in-kernel implementations assume that strings are
> %NUL-terminated, which is not necessarily true, and therefore we cannot
> use them directly in BPF context. So, we use distinct approaches for
> bounded and unbounded variants of string operations:
>
> - Unbounded variants are open-coded with using __get_kernel_nofault
> instead of plain dereference to make them safe.
>
> - Bounded variants use params with the __sz suffix so safety is assured
> by the verifier and we can use the in-kernel (potentially optimized)
> functions.
>
> Suggested-by: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: Viktor Malik <vmalik@redhat.com>
> ---
> kernel/bpf/helpers.c | 299 +++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 299 insertions(+)
>
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 5449756ba102..6f6af4289cd0 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -1,6 +1,7 @@
> // SPDX-License-Identifier: GPL-2.0-only
> /* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com
> */
> +#include "linux/uaccess.h"
<> should be used?
> #include <linux/bpf.h>
> #include <linux/btf.h>
> #include <linux/bpf-cgroup.h>
> @@ -3193,6 +3194,291 @@ __bpf_kfunc void bpf_local_irq_restore(unsigned long *flags__irq_flag)
> local_irq_restore(*flags__irq_flag);
> }
>
> +/* Kfuncs for string operations.
> + *
> + * Since strings are not necessarily %NUL-terminated, we cannot directly call
> + * in-kernel implementations. Instead, unbounded variants are open-coded with
> + * using __get_kernel_nofault instead of plain dereference to make them safe.
> + * Bounded variants use params with the __sz suffix so safety is assured by the
> + * verifier and we can use the in-kernel (potentially optimized) functions.
> + */
> +
> +/**
> + * bpf_strcmp - Compare two strings
> + * @cs: One string
> + * @ct: Another string
> + */
> +__bpf_kfunc int bpf_strcmp(const char *cs, const char *ct)
> +{
> + int i = 0, ret = 0;
> + char c1, c2;
> +
> + pagefault_disable();
> + while (i++ < XATTR_SIZE_MAX) {
> + __get_kernel_nofault(&c1, cs++, char, cs_out);
> + __get_kernel_nofault(&c2, ct++, char, ct_out);
nit: should we avoid passing increment statements into macro? It's
succinct and all, but we lose nothing by having cs++; ct++; at the end
of while loop, no?
> + if (c1 != c2) {
> + ret = c1 < c2 ? -1 : 1;
> + goto out;
> + }
> + if (!c1)
> + goto out;
> + }
> +cs_out:
> + ret = -1;
> + goto out;
> +ct_out:
> + ret = 1;
> +out:
> + pagefault_enable();
> + return ret;
> +}
Given valid values are only -1, 0, and 1, should we return -EFAULT
when one or the other string can't be fetched?
Yes, users that don't care will treat -EFAULT as the first string is
smaller than the second, but that's what you have anyways. But having
-EFAULT is still useful, IMO. We can also return -E2BIG if we reach i
== XATTR_SIZE_MAX situation, no?
> +
> +/**
> + * bpf_strchr - Find the first occurrence of a character in a string
> + * @s: The string to be searched
> + * @c: The character to search for
> + *
> + * Note that the %NUL-terminator is considered part of the string, and can
> + * be searched for.
> + */
> +__bpf_kfunc char *bpf_strchr(const char *s, int c)
if we do int -> char here, something breaks?
> +{
> + char *ret = NULL;
> + int i = 0;
> + char sc;
> +
> + pagefault_disable();
> + while (i++ < XATTR_SIZE_MAX) {
> + __get_kernel_nofault(&sc, s, char, out);
> + if (sc == (char)c) {
> + ret = (char *)s;
> + break;
> + }
> + if (sc == '\0')
not very consistent with bpf_strcmp() implementation where you just
did `!c1` for the same. FWIW, when dealing with string characters I
like `sc == '\0'` better, but regardless let's be consistent, at
least.
> + break;
> + s++;
It's like bpf_strcmp and bpf_strchr were written by two different
people, stylistically :)
> + }
> +out:
> + pagefault_enable();
how about we
DEFINE_LOCK_GUARD_0(pagefault, pagefault_disable(), pagefault_enable())
like we do for preempt_{disable,enable}() and simplify all the
implementations significantly?
> + return ret;
> +}
> +
> +/**
> + * bpf_strchrnul - Find and return a character in a string, or end of string
> + * @s: The string to be searched
> + * @c: The character to search for
> + *
> + * Returns pointer to first occurrence of 'c' in s. If c is not found, then
> + * return a pointer to the null byte at the end of s.
> + */
> +__bpf_kfunc char *bpf_strchrnul(const char *s, int c)
> +{
> + char *ret = NULL;
> + int i = 0;
> + char sc;
> +
> + pagefault_disable();
> + while (i++ < XATTR_SIZE_MAX) {
erm... for (i = 0; i < XATTR_SIZE_MAX; i++, s++) ?
what advantage does while() form provide? same question for lots of
other functions. for() is meant for loops like this, no?
> + __get_kernel_nofault(&sc, s, char, out);
> + if (sc == '\0' || sc == (char)c) {
> + ret = (char *)s;
> + break;
> + }
> + s++;
> + }
> +out:
> + pagefault_enable();
> + return ret;
> +}
> +
> +/**
> + * bpf_strnchr - Find a character in a length limited string
> + * @s: The string to be searched
> + * @s__sz: The number of characters to be searched
> + * @c: The character to search for
> + *
> + * Note that the %NUL-terminator is considered part of the string, and can
> + * be searched for.
> + */
> +__bpf_kfunc char *bpf_strnchr(void *s, u32 s__sz, int c)
I'm a bit on the fence here. I can see cases where s would be some
string somewhere (not "trusted" by verifier, because I did BPF CO-RE
based casts, etc). Also I can see how s__sz is non-const known at
runtime only.
I think the performance argument is much less of a priority compared
to the ability to use the helper in a much wider set of cases. WDYT?
Maybe let's just have __get_kernel_nofault() for everything?
If performance is truly that important, we can later have an
optimization in which we detect constant size and "guaranteed"
lifetime and validity of `s`, and use optimized strnchr()
implementation?
But I'd start with a safe and generic __get_kernel_nofault() way for sure.
> +{
> + return strnchr(s, s__sz, c);
> +}
> +
> +/**
> + * bpf_strnchrnul - Find and return a character in a length limited string,
> + * or end of string
> + * @s: The string to be searched
> + * @s__sz: The number of characters to be searched
> + * @c: The character to search for
> + *
> + * Returns pointer to the first occurrence of 'c' in s. If c is not found,
> + * then return a pointer to the last character of the string.
> + */
> +__bpf_kfunc char *bpf_strnchrnul(void *s, u32 s__sz, int c)
> +{
> + return strnchrnul(s, s__sz, c);
> +}
> +
> +/**
> + * bpf_strrchr - Find the last occurrence of a character in a string
> + * @s: The string to be searched
> + * @c: The character to search for
> + */
> +__bpf_kfunc char *bpf_strrchr(const char *s, int c)
`const char *` return? we won't (well, shouldn't!) allow writing into
it from the BPF program.
> +{
> + char *ret = NULL;
> + int i = 0;
> + char sc;
> +
> + pagefault_disable();
> + while (i++ < XATTR_SIZE_MAX) {
> + __get_kernel_nofault(&sc, s, char, out);
> + if (sc == '\0')
> + break;
> + if (sc == (char)c)
> + ret = (char *)s;
> + s++;
> + }
> +out:
> + pagefault_enable();
> + return (char *)ret;
> +}
> +
> +__bpf_kfunc size_t bpf_strlen(const char *s)
> +{
> + int i = 0;
> + char c;
> +
> + pagefault_disable();
> + while (i < XATTR_SIZE_MAX) {
> + __get_kernel_nofault(&c, s++, char, out);
> + if (c == '\0')
> + break;
> + i++;
> + }
> +out:
> + pagefault_enable();
> + return i;
> +}
> +
> +__bpf_kfunc size_t bpf_strnlen(void *s, u32 s__sz)
> +{
> + return strnlen(s, s__sz);
> +}
> +
> +/**
> + * bpf_strspn - Calculate the length of the initial substring of @s which only contain letters in @accept
> + * @s: The string to be searched
> + * @accept: The string to search for
> + */
> +__bpf_kfunc size_t bpf_strspn(const char *s, const char *accept)
> +{
> + int i = 0;
> + char c;
> +
> + pagefault_disable();
> + while (i < XATTR_SIZE_MAX) {
> + __get_kernel_nofault(&c, s++, char, out);
> + if (c == '\0' || !bpf_strchr(accept, c))
hm... so `s` is untrusted/unsafe, but `accept` is? How should verifier
make a distinction? It's `const char *` in the signature, so what
makes one more safe than the other?
> + break;
> + i++;
> + }
> +out:
> + pagefault_enable();
> + return i;
> +}
> +
> +/**
> + * strcspn - Calculate the length of the initial substring of @s which does not contain letters in @reject
> + * @s: The string to be searched
> + * @reject: The string to avoid
> + */
> +__bpf_kfunc size_t bpf_strcspn(const char *s, const char *reject)
> +{
> + int i = 0;
> + char c;
> +
> + pagefault_disable();
> + while (i < XATTR_SIZE_MAX) {
> + __get_kernel_nofault(&c, s++, char, out);
> + if (c == '\0' || bpf_strchr(reject, c))
> + break;
> + i++;
> + }
> +out:
> + pagefault_enable();
> + return i;
> +}
> +
> +/**
> + * bpf_strpbrk - Find the first occurrence of a set of characters
> + * @cs: The string to be searched
> + * @ct: The characters to search for
> + */
> +__bpf_kfunc char *bpf_strpbrk(const char *cs, const char *ct)
wouldn't this be `cs + bpf_strcspn(cs, ct)`?
> +{
> + char *ret = NULL;
> + int i = 0;
> + char c;
> +
> + pagefault_disable();
> + while (i++ < XATTR_SIZE_MAX) {
> + __get_kernel_nofault(&c, cs, char, out);
> + if (c == '\0')
> + break;
> + if (bpf_strchr(ct, c)) {
> + ret = (char *)cs;
> + break;
> + }
> + cs++;
> + }
> +out:
> + pagefault_enable();
> + return ret;
> +}
> +
> +/**
> + * bpf_strstr - Find the first substring in a %NUL terminated string
> + * @s1: The string to be searched
> + * @s2: The string to search for
> + */
> +__bpf_kfunc char *bpf_strstr(const char *s1, const char *s2)
> +{
> + size_t l1, l2;
> +
> + l2 = bpf_strlen(s2);
> + if (!l2)
> + return (char *)s1;
> + l1 = bpf_strlen(s1);
> + while (l1 >= l2) {
> + l1--;
> + if (!memcmp(s1, s2, l2))
> + return (char *)s1;
> + s1++;
> + }
> + return NULL;
no __get_kernel_nofault() anymore?
> +}
> +
> +/**
> + * bpf_strnstr - Find the first substring in a length-limited string
> + * @s1: The string to be searched
> + * @s1__sz: The size of @s1
> + * @s2: The string to search for
> + * @s2__sz: The size of @s2
> + */
> +__bpf_kfunc char *bpf_strnstr(void *s1, u32 s1__sz, void *s2, u32 s2__sz)
> +{
> + /* strnstr() uses strlen() to get the length of s2. Since this is not
> + * safe in BPF context for non-%NUL-terminated strings, use strnlen
> + * first to make it safe.
> + */
> + if (strnlen(s2, s2__sz) == s2__sz)
> + return NULL;
> + return strnstr(s1, s2, s1__sz);
> +}
> +
we have to assume that the string will change from under us, so any
algorithm that does bpf_strlen/strlen/strnlen and then relies on that
length to be true seems fishy...
pw-bot: cr
> __bpf_kfunc_end_defs();
>
> BTF_KFUNCS_START(generic_btf_ids)
> @@ -3293,6 +3579,19 @@ BTF_ID_FLAGS(func, bpf_iter_kmem_cache_next, KF_ITER_NEXT | KF_RET_NULL | KF_SLE
> BTF_ID_FLAGS(func, bpf_iter_kmem_cache_destroy, KF_ITER_DESTROY | KF_SLEEPABLE)
> BTF_ID_FLAGS(func, bpf_local_irq_save)
> BTF_ID_FLAGS(func, bpf_local_irq_restore)
> +BTF_ID_FLAGS(func, bpf_strcmp);
> +BTF_ID_FLAGS(func, bpf_strchr);
> +BTF_ID_FLAGS(func, bpf_strchrnul);
> +BTF_ID_FLAGS(func, bpf_strnchr);
> +BTF_ID_FLAGS(func, bpf_strnchrnul);
> +BTF_ID_FLAGS(func, bpf_strrchr);
> +BTF_ID_FLAGS(func, bpf_strlen);
> +BTF_ID_FLAGS(func, bpf_strnlen);
> +BTF_ID_FLAGS(func, bpf_strspn);
> +BTF_ID_FLAGS(func, bpf_strcspn);
> +BTF_ID_FLAGS(func, bpf_strpbrk);
> +BTF_ID_FLAGS(func, bpf_strstr);
> +BTF_ID_FLAGS(func, bpf_strnstr);
> BTF_KFUNCS_END(common_btf_ids)
>
> static const struct btf_kfunc_id_set common_kfunc_set = {
> --
> 2.48.1
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH bpf-next v3 1/3] bpf: Add kfuncs for read-only string operations
2025-03-28 22:48 ` Andrii Nakryiko
@ 2025-04-01 12:48 ` Viktor Malik
2025-04-01 20:20 ` Andrii Nakryiko
0 siblings, 1 reply; 9+ messages in thread
From: Viktor Malik @ 2025-04-01 12:48 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Mykola Lysenko, Shuah Khan
On 3/28/25 23:48, Andrii Nakryiko wrote:
> On Mon, Mar 24, 2025 at 5:04 AM Viktor Malik <vmalik@redhat.com> wrote:
>>
>> String operations are commonly used so this exposes the most common ones
>> to BPF programs. For now, we limit ourselves to operations which do not
>> copy memory around.
>>
>> Unfortunately, most in-kernel implementations assume that strings are
>> %NUL-terminated, which is not necessarily true, and therefore we cannot
>> use them directly in BPF context. So, we use distinct approaches for
>> bounded and unbounded variants of string operations:
>>
>> - Unbounded variants are open-coded with using __get_kernel_nofault
>> instead of plain dereference to make them safe.
>>
>> - Bounded variants use params with the __sz suffix so safety is assured
>> by the verifier and we can use the in-kernel (potentially optimized)
>> functions.
>>
>> Suggested-by: Alexei Starovoitov <ast@kernel.org>
>> Signed-off-by: Viktor Malik <vmalik@redhat.com>
>> ---
>> kernel/bpf/helpers.c | 299 +++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 299 insertions(+)
>>
>> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
>> index 5449756ba102..6f6af4289cd0 100644
>> --- a/kernel/bpf/helpers.c
>> +++ b/kernel/bpf/helpers.c
>> @@ -1,6 +1,7 @@
>> // SPDX-License-Identifier: GPL-2.0-only
>> /* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com
>> */
>> +#include "linux/uaccess.h"
>
> <> should be used?
Yes.
>
>> #include <linux/bpf.h>
>> #include <linux/btf.h>
>> #include <linux/bpf-cgroup.h>
>> @@ -3193,6 +3194,291 @@ __bpf_kfunc void bpf_local_irq_restore(unsigned long *flags__irq_flag)
>> local_irq_restore(*flags__irq_flag);
>> }
>>
>> +/* Kfuncs for string operations.
>> + *
>> + * Since strings are not necessarily %NUL-terminated, we cannot directly call
>> + * in-kernel implementations. Instead, unbounded variants are open-coded with
>> + * using __get_kernel_nofault instead of plain dereference to make them safe.
>> + * Bounded variants use params with the __sz suffix so safety is assured by the
>> + * verifier and we can use the in-kernel (potentially optimized) functions.
>> + */
>> +
>> +/**
>> + * bpf_strcmp - Compare two strings
>> + * @cs: One string
>> + * @ct: Another string
>> + */
>> +__bpf_kfunc int bpf_strcmp(const char *cs, const char *ct)
>> +{
>> + int i = 0, ret = 0;
>> + char c1, c2;
>> +
>> + pagefault_disable();
>> + while (i++ < XATTR_SIZE_MAX) {
>> + __get_kernel_nofault(&c1, cs++, char, cs_out);
>> + __get_kernel_nofault(&c2, ct++, char, ct_out);
>
> nit: should we avoid passing increment statements into macro? It's
> succinct and all, but we lose nothing by having cs++; ct++; at the end
> of while loop, no?
That's probably a good idea. It shouldn't be a problem having those
increments at the end of the loop so let me update it.
>
>> + if (c1 != c2) {
>> + ret = c1 < c2 ? -1 : 1;
>> + goto out;
>> + }
>> + if (!c1)
>> + goto out;
>> + }
>> +cs_out:
>> + ret = -1;
>> + goto out;
>> +ct_out:
>> + ret = 1;
>> +out:
>> + pagefault_enable();
>> + return ret;
>> +}
>
> Given valid values are only -1, 0, and 1, should we return -EFAULT
> when one or the other string can't be fetched?
>
> Yes, users that don't care will treat -EFAULT as the first string is
> smaller than the second, but that's what you have anyways. But having
> -EFAULT is still useful, IMO. We can also return -E2BIG if we reach i
> == XATTR_SIZE_MAX situation, no?
I was a bit hesitant to make the semantics of bpf_strcmp different from
strcmp. But the truth is that returning errors here may bring some value
so if people are ok with that, I have no problem implementing your
proposal.
But in such a case, I'd suggest that we do the same for the rest of the
string kfuncs, too. That is, return -EINVAL if __get_kernel_nofault
fails and -E2BIG if the string is longer than XATTR_SIZE_MAX, possibly
wrapped in PTR_ERR when the kfunc returns a pointer. What do you think?
>
>> +
>> +/**
>> + * bpf_strchr - Find the first occurrence of a character in a string
>> + * @s: The string to be searched
>> + * @c: The character to search for
>> + *
>> + * Note that the %NUL-terminator is considered part of the string, and can
>> + * be searched for.
>> + */
>> +__bpf_kfunc char *bpf_strchr(const char *s, int c)
>
> if we do int -> char here, something breaks?
No, it shouldn't. IIUC the int comes from the original prototype of libc
strchr and it's there solely for legacy purposes. Let's change it to
char.
>
>> +{
>> + char *ret = NULL;
>> + int i = 0;
>> + char sc;
>> +
>> + pagefault_disable();
>> + while (i++ < XATTR_SIZE_MAX) {
>> + __get_kernel_nofault(&sc, s, char, out);
>> + if (sc == (char)c) {
>> + ret = (char *)s;
>> + break;
>> + }
>> + if (sc == '\0')
>
> not very consistent with bpf_strcmp() implementation where you just
> did `!c1` for the same. FWIW, when dealing with string characters I
> like `sc == '\0'` better, but regardless let's be consistent, at
> least.
>
>> + break;
>> + s++;
>
> It's like bpf_strcmp and bpf_strchr were written by two different
> people, stylistically :)
Yeah, the main reason here is that I've taken the implementations from
lib/string.c so that's where these differences come from. But the truth
is that the BPF kfuncs required quite a lot of changes so it's better to
rewrite them even further and make them consistent among themselves.
I'll have a look into it.
>
>> + }
>> +out:
>> + pagefault_enable();
>
> how about we
>
> DEFINE_LOCK_GUARD_0(pagefault, pagefault_disable(), pagefault_enable())
>
> like we do for preempt_{disable,enable}() and simplify all the
> implementations significantly?
That's neat, I didn't know it. It will a bit more tricky to use here as
__get_kernel_nofault still requires a label but we should at least be
able to get rid of pagefault_{disable,enable}() in each function.
>
>> + return ret;
>> +}
>> +
>> +/**
>> + * bpf_strchrnul - Find and return a character in a string, or end of string
>> + * @s: The string to be searched
>> + * @c: The character to search for
>> + *
>> + * Returns pointer to first occurrence of 'c' in s. If c is not found, then
>> + * return a pointer to the null byte at the end of s.
>> + */
>> +__bpf_kfunc char *bpf_strchrnul(const char *s, int c)
>> +{
>> + char *ret = NULL;
>> + int i = 0;
>> + char sc;
>> +
>> + pagefault_disable();
>> + while (i++ < XATTR_SIZE_MAX) {
>
> erm... for (i = 0; i < XATTR_SIZE_MAX; i++, s++) ?
>
> what advantage does while() form provide? same question for lots of
> other functions. for() is meant for loops like this, no?
Yes, obviously for() is better here, I'll use it.
>
>> + __get_kernel_nofault(&sc, s, char, out);
>> + if (sc == '\0' || sc == (char)c) {
>> + ret = (char *)s;
>> + break;
>> + }
>> + s++;
>> + }
>> +out:
>> + pagefault_enable();
>> + return ret;
>> +}
>> +
>> +/**
>> + * bpf_strnchr - Find a character in a length limited string
>> + * @s: The string to be searched
>> + * @s__sz: The number of characters to be searched
>> + * @c: The character to search for
>> + *
>> + * Note that the %NUL-terminator is considered part of the string, and can
>> + * be searched for.
>> + */
>> +__bpf_kfunc char *bpf_strnchr(void *s, u32 s__sz, int c)
>
> I'm a bit on the fence here. I can see cases where s would be some
> string somewhere (not "trusted" by verifier, because I did BPF CO-RE
> based casts, etc). Also I can see how s__sz is non-const known at
> runtime only.
>
> I think the performance argument is much less of a priority compared
> to the ability to use the helper in a much wider set of cases. WDYT?
> Maybe let's just have __get_kernel_nofault() for everything?
>
> If performance is truly that important, we can later have an
> optimization in which we detect constant size and "guaranteed"
> lifetime and validity of `s`, and use optimized strnchr()
> implementation?
>
> But I'd start with a safe and generic __get_kernel_nofault() way for sure.
Ok, that makes sense. I didn't realize that with this prototype, we're
eliminating a number of uses-cases and I agree that it's more important
than performance.
Also it turned out that the bounded string functions are seldom
optimized on arches and therefore the performance benefit is minimal
(the only real case is strnlen on few arches like arm64).
So let's turn everything to generic __get_kernel_nofault variants for
now. We can think about optimization later, it would be much better if
we could detect situations when the kernel implementaion can be used
even for the unbounded functions. There, the performance gain would be
seen on much more functions.
>
>> +{
>> + return strnchr(s, s__sz, c);
>> +}
>> +
>> +/**
>> + * bpf_strnchrnul - Find and return a character in a length limited string,
>> + * or end of string
>> + * @s: The string to be searched
>> + * @s__sz: The number of characters to be searched
>> + * @c: The character to search for
>> + *
>> + * Returns pointer to the first occurrence of 'c' in s. If c is not found,
>> + * then return a pointer to the last character of the string.
>> + */
>> +__bpf_kfunc char *bpf_strnchrnul(void *s, u32 s__sz, int c)
>> +{
>> + return strnchrnul(s, s__sz, c);
>> +}
>> +
>> +/**
>> + * bpf_strrchr - Find the last occurrence of a character in a string
>> + * @s: The string to be searched
>> + * @c: The character to search for
>> + */
>> +__bpf_kfunc char *bpf_strrchr(const char *s, int c)
>
> `const char *` return? we won't (well, shouldn't!) allow writing into
> it from the BPF program.
Yes, agreed.
>
>> +{
>> + char *ret = NULL;
>> + int i = 0;
>> + char sc;
>> +
>> + pagefault_disable();
>> + while (i++ < XATTR_SIZE_MAX) {
>> + __get_kernel_nofault(&sc, s, char, out);
>> + if (sc == '\0')
>> + break;
>> + if (sc == (char)c)
>> + ret = (char *)s;
>> + s++;
>> + }
>> +out:
>> + pagefault_enable();
>> + return (char *)ret;
>> +}
>> +
>> +__bpf_kfunc size_t bpf_strlen(const char *s)
>> +{
>> + int i = 0;
>> + char c;
>> +
>> + pagefault_disable();
>> + while (i < XATTR_SIZE_MAX) {
>> + __get_kernel_nofault(&c, s++, char, out);
>> + if (c == '\0')
>> + break;
>> + i++;
>> + }
>> +out:
>> + pagefault_enable();
>> + return i;
>> +}
>> +
>> +__bpf_kfunc size_t bpf_strnlen(void *s, u32 s__sz)
>> +{
>> + return strnlen(s, s__sz);
>> +}
>> +
>> +/**
>> + * bpf_strspn - Calculate the length of the initial substring of @s which only contain letters in @accept
>> + * @s: The string to be searched
>> + * @accept: The string to search for
>> + */
>> +__bpf_kfunc size_t bpf_strspn(const char *s, const char *accept)
>> +{
>> + int i = 0;
>> + char c;
>> +
>> + pagefault_disable();
>> + while (i < XATTR_SIZE_MAX) {
>> + __get_kernel_nofault(&c, s++, char, out);
>> + if (c == '\0' || !bpf_strchr(accept, c))
>
> hm... so `s` is untrusted/unsafe, but `accept` is? How should verifier
> make a distinction? It's `const char *` in the signature, so what
> makes one more safe than the other?
accept is untrusted as well, that's why bpf_strchr is used instead of
strchr. Or am I missing something?
>
>> + break;
>> + i++;
>> + }
>> +out:
>> + pagefault_enable();
>> + return i;
>> +}
>> +
>> +/**
>> + * strcspn - Calculate the length of the initial substring of @s which does not contain letters in @reject
>> + * @s: The string to be searched
>> + * @reject: The string to avoid
>> + */
>> +__bpf_kfunc size_t bpf_strcspn(const char *s, const char *reject)
>> +{
>> + int i = 0;
>> + char c;
>> +
>> + pagefault_disable();
>> + while (i < XATTR_SIZE_MAX) {
>> + __get_kernel_nofault(&c, s++, char, out);
>> + if (c == '\0' || bpf_strchr(reject, c))
>> + break;
>> + i++;
>> + }
>> +out:
>> + pagefault_enable();
>> + return i;
>> +}
>> +
>> +/**
>> + * bpf_strpbrk - Find the first occurrence of a set of characters
>> + * @cs: The string to be searched
>> + * @ct: The characters to search for
>> + */
>> +__bpf_kfunc char *bpf_strpbrk(const char *cs, const char *ct)
>
> wouldn't this be `cs + bpf_strcspn(cs, ct)`?
That's IMO correct, unless bpf_strcspn(cs, ct) == strlen(cs). Then it's
NULL. But it's still a nicer implementation.
>
>> +{
>> + char *ret = NULL;
>> + int i = 0;
>> + char c;
>> +
>> + pagefault_disable();
>> + while (i++ < XATTR_SIZE_MAX) {
>> + __get_kernel_nofault(&c, cs, char, out);
>> + if (c == '\0')
>> + break;
>> + if (bpf_strchr(ct, c)) {
>> + ret = (char *)cs;
>> + break;
>> + }
>> + cs++;
>> + }
>> +out:
>> + pagefault_enable();
>> + return ret;
>> +}
>> +
>> +/**
>> + * bpf_strstr - Find the first substring in a %NUL terminated string
>> + * @s1: The string to be searched
>> + * @s2: The string to search for
>> + */
>> +__bpf_kfunc char *bpf_strstr(const char *s1, const char *s2)
>> +{
>> + size_t l1, l2;
>> +
>> + l2 = bpf_strlen(s2);
>> + if (!l2)
>> + return (char *)s1;
>> + l1 = bpf_strlen(s1);
>> + while (l1 >= l2) {
>> + l1--;
>> + if (!memcmp(s1, s2, l2))
>> + return (char *)s1;
>> + s1++;
>> + }
>> + return NULL;
>
> no __get_kernel_nofault() anymore?
It's hidden inside bpf_strlen.
But I assume that the same as below applies (string possibly changing in
between the bpf_strlen and memcmp calls) so we'll need a different
implementation.
>
>> +}
>> +
>> +/**
>> + * bpf_strnstr - Find the first substring in a length-limited string
>> + * @s1: The string to be searched
>> + * @s1__sz: The size of @s1
>> + * @s2: The string to search for
>> + * @s2__sz: The size of @s2
>> + */
>> +__bpf_kfunc char *bpf_strnstr(void *s1, u32 s1__sz, void *s2, u32 s2__sz)
>> +{
>> + /* strnstr() uses strlen() to get the length of s2. Since this is not
>> + * safe in BPF context for non-%NUL-terminated strings, use strnlen
>> + * first to make it safe.
>> + */
>> + if (strnlen(s2, s2__sz) == s2__sz)
>> + return NULL;
>> + return strnstr(s1, s2, s1__sz);
>> +}
>> +
>
> we have to assume that the string will change from under us, so any
> algorithm that does bpf_strlen/strlen/strnlen and then relies on that
> length to be true seems fishy...
Hm, that's a good point, I didn't consider that. I'll think about a
better implementation for bpf_strstr and bpf_strnstr.
Thanks a lot for the thourough review! I'll post v4 soon.
Viktor
>
> pw-bot: cr
>
>> __bpf_kfunc_end_defs();
>>
>> BTF_KFUNCS_START(generic_btf_ids)
>> @@ -3293,6 +3579,19 @@ BTF_ID_FLAGS(func, bpf_iter_kmem_cache_next, KF_ITER_NEXT | KF_RET_NULL | KF_SLE
>> BTF_ID_FLAGS(func, bpf_iter_kmem_cache_destroy, KF_ITER_DESTROY | KF_SLEEPABLE)
>> BTF_ID_FLAGS(func, bpf_local_irq_save)
>> BTF_ID_FLAGS(func, bpf_local_irq_restore)
>> +BTF_ID_FLAGS(func, bpf_strcmp);
>> +BTF_ID_FLAGS(func, bpf_strchr);
>> +BTF_ID_FLAGS(func, bpf_strchrnul);
>> +BTF_ID_FLAGS(func, bpf_strnchr);
>> +BTF_ID_FLAGS(func, bpf_strnchrnul);
>> +BTF_ID_FLAGS(func, bpf_strrchr);
>> +BTF_ID_FLAGS(func, bpf_strlen);
>> +BTF_ID_FLAGS(func, bpf_strnlen);
>> +BTF_ID_FLAGS(func, bpf_strspn);
>> +BTF_ID_FLAGS(func, bpf_strcspn);
>> +BTF_ID_FLAGS(func, bpf_strpbrk);
>> +BTF_ID_FLAGS(func, bpf_strstr);
>> +BTF_ID_FLAGS(func, bpf_strnstr);
>> BTF_KFUNCS_END(common_btf_ids)
>>
>> static const struct btf_kfunc_id_set common_kfunc_set = {
>> --
>> 2.48.1
>>
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH bpf-next v3 1/3] bpf: Add kfuncs for read-only string operations
2025-04-01 12:48 ` Viktor Malik
@ 2025-04-01 20:20 ` Andrii Nakryiko
2025-04-01 20:27 ` Alexei Starovoitov
0 siblings, 1 reply; 9+ messages in thread
From: Andrii Nakryiko @ 2025-04-01 20:20 UTC (permalink / raw)
To: Viktor Malik
Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Mykola Lysenko, Shuah Khan
On Tue, Apr 1, 2025 at 5:48 AM Viktor Malik <vmalik@redhat.com> wrote:
>
> On 3/28/25 23:48, Andrii Nakryiko wrote:
> > On Mon, Mar 24, 2025 at 5:04 AM Viktor Malik <vmalik@redhat.com> wrote:
> >>
> >> String operations are commonly used so this exposes the most common ones
> >> to BPF programs. For now, we limit ourselves to operations which do not
> >> copy memory around.
> >>
> >> Unfortunately, most in-kernel implementations assume that strings are
> >> %NUL-terminated, which is not necessarily true, and therefore we cannot
> >> use them directly in BPF context. So, we use distinct approaches for
> >> bounded and unbounded variants of string operations:
> >>
> >> - Unbounded variants are open-coded with using __get_kernel_nofault
> >> instead of plain dereference to make them safe.
> >>
> >> - Bounded variants use params with the __sz suffix so safety is assured
> >> by the verifier and we can use the in-kernel (potentially optimized)
> >> functions.
> >>
> >> Suggested-by: Alexei Starovoitov <ast@kernel.org>
> >> Signed-off-by: Viktor Malik <vmalik@redhat.com>
> >> ---
> >> kernel/bpf/helpers.c | 299 +++++++++++++++++++++++++++++++++++++++++++
> >> 1 file changed, 299 insertions(+)
> >>
[...]
> >> +
> >> + pagefault_disable();
> >> + while (i++ < XATTR_SIZE_MAX) {
> >> + __get_kernel_nofault(&c1, cs++, char, cs_out);
> >> + __get_kernel_nofault(&c2, ct++, char, ct_out);
> >
> > nit: should we avoid passing increment statements into macro? It's
> > succinct and all, but we lose nothing by having cs++; ct++; at the end
> > of while loop, no?
>
> That's probably a good idea. It shouldn't be a problem having those
> increments at the end of the loop so let me update it.
ok, thanks
>
> >
> >> + if (c1 != c2) {
> >> + ret = c1 < c2 ? -1 : 1;
> >> + goto out;
> >> + }
> >> + if (!c1)
> >> + goto out;
> >> + }
> >> +cs_out:
> >> + ret = -1;
> >> + goto out;
> >> +ct_out:
> >> + ret = 1;
> >> +out:
> >> + pagefault_enable();
> >> + return ret;
> >> +}
> >
> > Given valid values are only -1, 0, and 1, should we return -EFAULT
> > when one or the other string can't be fetched?
> >
> > Yes, users that don't care will treat -EFAULT as the first string is
> > smaller than the second, but that's what you have anyways. But having
> > -EFAULT is still useful, IMO. We can also return -E2BIG if we reach i
> > == XATTR_SIZE_MAX situation, no?
>
> I was a bit hesitant to make the semantics of bpf_strcmp different from
> strcmp. But the truth is that returning errors here may bring some value
> so if people are ok with that, I have no problem implementing your
> proposal.
>
> But in such a case, I'd suggest that we do the same for the rest of the
> string kfuncs, too. That is, return -EINVAL if __get_kernel_nofault
-EFAULT, not -EINVAL
> fails and -E2BIG if the string is longer than XATTR_SIZE_MAX, possibly
> wrapped in PTR_ERR when the kfunc returns a pointer. What do you think?
yep, makes sense. Unlike strcmp() and others, we do have an extra
conditions as you mentioned (memory read faulting, too long/unbounded
strings, etc), so I'd prefer if users were able to tell error
condition from trustworthy result, yep
>
> >
> >> +
> >> +/**
> >> + * bpf_strchr - Find the first occurrence of a character in a string
> >> + * @s: The string to be searched
> >> + * @c: The character to search for
> >> + *
> >> + * Note that the %NUL-terminator is considered part of the string, and can
> >> + * be searched for.
> >> + */
> >> +__bpf_kfunc char *bpf_strchr(const char *s, int c)
> >
> > if we do int -> char here, something breaks?
>
> No, it shouldn't. IIUC the int comes from the original prototype of libc
> strchr and it's there solely for legacy purposes. Let's change it to
> char.
>
+1, I always found that `int c` in glibc confusing
> >
> >> +{
> >> + char *ret = NULL;
> >> + int i = 0;
> >> + char sc;
> >> +
> >> + pagefault_disable();
> >> + while (i++ < XATTR_SIZE_MAX) {
> >> + __get_kernel_nofault(&sc, s, char, out);
> >> + if (sc == (char)c) {
> >> + ret = (char *)s;
> >> + break;
> >> + }
> >> + if (sc == '\0')
> >
> > not very consistent with bpf_strcmp() implementation where you just
> > did `!c1` for the same. FWIW, when dealing with string characters I
> > like `sc == '\0'` better, but regardless let's be consistent, at
> > least.
> >
> >> + break;
> >> + s++;
> >
> > It's like bpf_strcmp and bpf_strchr were written by two different
> > people, stylistically :)
>
> Yeah, the main reason here is that I've taken the implementations from
> lib/string.c so that's where these differences come from. But the truth
> is that the BPF kfuncs required quite a lot of changes so it's better to
> rewrite them even further and make them consistent among themselves.
> I'll have a look into it.
>
> >
> >> + }
> >> +out:
> >> + pagefault_enable();
> >
> > how about we
> >
> > DEFINE_LOCK_GUARD_0(pagefault, pagefault_disable(), pagefault_enable())
> >
> > like we do for preempt_{disable,enable}() and simplify all the
> > implementations significantly?
>
> That's neat, I didn't know it. It will a bit more tricky to use here as
> __get_kernel_nofault still requires a label but we should at least be
> able to get rid of pagefault_{disable,enable}() in each function.
yep, we'll have:
err_out:
return -EFAULT;
But the rest will be a clean loop and no pagefault_disable() clean
ups, distracting from the main logic
[...]
> >> +/**
> >> + * bpf_strnchr - Find a character in a length limited string
> >> + * @s: The string to be searched
> >> + * @s__sz: The number of characters to be searched
> >> + * @c: The character to search for
> >> + *
> >> + * Note that the %NUL-terminator is considered part of the string, and can
> >> + * be searched for.
> >> + */
> >> +__bpf_kfunc char *bpf_strnchr(void *s, u32 s__sz, int c)
> >
> > I'm a bit on the fence here. I can see cases where s would be some
> > string somewhere (not "trusted" by verifier, because I did BPF CO-RE
> > based casts, etc). Also I can see how s__sz is non-const known at
> > runtime only.
> >
> > I think the performance argument is much less of a priority compared
> > to the ability to use the helper in a much wider set of cases. WDYT?
> > Maybe let's just have __get_kernel_nofault() for everything?
> >
> > If performance is truly that important, we can later have an
> > optimization in which we detect constant size and "guaranteed"
> > lifetime and validity of `s`, and use optimized strnchr()
> > implementation?
> >
> > But I'd start with a safe and generic __get_kernel_nofault() way for sure.
>
> Ok, that makes sense. I didn't realize that with this prototype, we're
> eliminating a number of uses-cases and I agree that it's more important
> than performance.
>
> Also it turned out that the bounded string functions are seldom
> optimized on arches and therefore the performance benefit is minimal
> (the only real case is strnlen on few arches like arm64).
>
> So let's turn everything to generic __get_kernel_nofault variants for
> now. We can think about optimization later, it would be much better if
> we could detect situations when the kernel implementaion can be used
> even for the unbounded functions. There, the performance gain would be
> seen on much more functions.
Let's leave it for the future, I suspect string routine performance
will never be a real performance concert, as it will normally be a
small part of otherwise much more expensive custom logic in BPF
program. But we'll have an option to optimize, and that's all that
matters. Generality beats performance for these APIs, though, so let's
generalize first.
>
> >
> >> +{
> >> + return strnchr(s, s__sz, c);
> >> +}
> >> +
[...]
> >> +/**
> >> + * bpf_strspn - Calculate the length of the initial substring of @s which only contain letters in @accept
> >> + * @s: The string to be searched
> >> + * @accept: The string to search for
> >> + */
> >> +__bpf_kfunc size_t bpf_strspn(const char *s, const char *accept)
> >> +{
> >> + int i = 0;
> >> + char c;
> >> +
> >> + pagefault_disable();
> >> + while (i < XATTR_SIZE_MAX) {
> >> + __get_kernel_nofault(&c, s++, char, out);
> >> + if (c == '\0' || !bpf_strchr(accept, c))
> >
> > hm... so `s` is untrusted/unsafe, but `accept` is? How should verifier
> > make a distinction? It's `const char *` in the signature, so what
> > makes one more safe than the other?
>
> accept is untrusted as well, that's why bpf_strchr is used instead of
> strchr. Or am I missing something?
I'm just asking how should verifier know this just looking at the prototype:
__bpf_kfunc size_t bpf_strspn(const char *s, const char *accept)
?
For the next revision, let's add a bunch more "negative cases",
passing known-bad pointers/strings and checking -EFAULT and -E2BIG.
These are very generic APIs, let's test this thoroughly: user instead
of kernel pointers, unbounded/too long strings, just plain invalid
pointers, etc.
>
> >
> >> + break;
> >> + i++;
> >> + }
> >> +out:
> >> + pagefault_enable();
> >> + return i;
> >> +}
> >> +
> >> +/**
> >> + * strcspn - Calculate the length of the initial substring of @s which does not contain letters in @reject
> >> + * @s: The string to be searched
> >> + * @reject: The string to avoid
> >> + */
> >> +__bpf_kfunc size_t bpf_strcspn(const char *s, const char *reject)
> >> +{
> >> + int i = 0;
> >> + char c;
> >> +
> >> + pagefault_disable();
> >> + while (i < XATTR_SIZE_MAX) {
> >> + __get_kernel_nofault(&c, s++, char, out);
> >> + if (c == '\0' || bpf_strchr(reject, c))
> >> + break;
> >> + i++;
> >> + }
> >> +out:
> >> + pagefault_enable();
> >> + return i;
> >> +}
> >> +
> >> +/**
> >> + * bpf_strpbrk - Find the first occurrence of a set of characters
> >> + * @cs: The string to be searched
> >> + * @ct: The characters to search for
> >> + */
> >> +__bpf_kfunc char *bpf_strpbrk(const char *cs, const char *ct)
> >
> > wouldn't this be `cs + bpf_strcspn(cs, ct)`?
>
> That's IMO correct, unless bpf_strcspn(cs, ct) == strlen(cs). Then it's
> NULL. But it's still a nicer implementation.
ah, strlen() is problematic (though we can have __bpf_strcspn()
returning both size_t and the actual character (as out parameter), to
distinguish these situations, don't know. It will be more obvious
while implementing if that makes sense or not (all those algorithms
are pretty straightforward, so code reuse isn't really a big concern
from my POV)
>
> >
> >> +{
> >> + char *ret = NULL;
> >> + int i = 0;
> >> + char c;
> >> +
> >> + pagefault_disable();
> >> + while (i++ < XATTR_SIZE_MAX) {
> >> + __get_kernel_nofault(&c, cs, char, out);
> >> + if (c == '\0')
> >> + break;
> >> + if (bpf_strchr(ct, c)) {
> >> + ret = (char *)cs;
> >> + break;
> >> + }
> >> + cs++;
> >> + }
> >> +out:
> >> + pagefault_enable();
> >> + return ret;
> >> +}
> >> +
> >> +/**
> >> + * bpf_strstr - Find the first substring in a %NUL terminated string
> >> + * @s1: The string to be searched
> >> + * @s2: The string to search for
> >> + */
> >> +__bpf_kfunc char *bpf_strstr(const char *s1, const char *s2)
> >> +{
> >> + size_t l1, l2;
> >> +
> >> + l2 = bpf_strlen(s2);
> >> + if (!l2)
> >> + return (char *)s1;
> >> + l1 = bpf_strlen(s1);
> >> + while (l1 >= l2) {
> >> + l1--;
> >> + if (!memcmp(s1, s2, l2))
> >> + return (char *)s1;
> >> + s1++;
> >> + }
> >> + return NULL;
> >
> > no __get_kernel_nofault() anymore?
>
> It's hidden inside bpf_strlen.
>
> But I assume that the same as below applies (string possibly changing in
> between the bpf_strlen and memcmp calls) so we'll need a different
> implementation.
>
yep, assume the worst. The result might be invalid due to race (as in,
we can report that string is found at position X, but a few
nanoseconds later that's not true anymore), that's fine. But we
shouldn't crash, loop forever, etc. As long as the result is correct
in a non-changing string situation, we should be fine.
> >
> >> +}
> >> +
> >> +/**
> >> + * bpf_strnstr - Find the first substring in a length-limited string
> >> + * @s1: The string to be searched
> >> + * @s1__sz: The size of @s1
> >> + * @s2: The string to search for
> >> + * @s2__sz: The size of @s2
> >> + */
> >> +__bpf_kfunc char *bpf_strnstr(void *s1, u32 s1__sz, void *s2, u32 s2__sz)
> >> +{
> >> + /* strnstr() uses strlen() to get the length of s2. Since this is not
> >> + * safe in BPF context for non-%NUL-terminated strings, use strnlen
> >> + * first to make it safe.
> >> + */
> >> + if (strnlen(s2, s2__sz) == s2__sz)
> >> + return NULL;
> >> + return strnstr(s1, s2, s1__sz);
> >> +}
> >> +
> >
> > we have to assume that the string will change from under us, so any
> > algorithm that does bpf_strlen/strlen/strnlen and then relies on that
> > length to be true seems fishy...
>
> Hm, that's a good point, I didn't consider that. I'll think about a
> better implementation for bpf_strstr and bpf_strnstr.
>
> Thanks a lot for the thourough review! I'll post v4 soon.
>
thanks, let's be a bit more paranoid and add some negative tests while at it
> Viktor
>
> >
> > pw-bot: cr
> >
> >> __bpf_kfunc_end_defs();
> >>
> >> BTF_KFUNCS_START(generic_btf_ids)
> >> @@ -3293,6 +3579,19 @@ BTF_ID_FLAGS(func, bpf_iter_kmem_cache_next, KF_ITER_NEXT | KF_RET_NULL | KF_SLE
> >> BTF_ID_FLAGS(func, bpf_iter_kmem_cache_destroy, KF_ITER_DESTROY | KF_SLEEPABLE)
> >> BTF_ID_FLAGS(func, bpf_local_irq_save)
> >> BTF_ID_FLAGS(func, bpf_local_irq_restore)
> >> +BTF_ID_FLAGS(func, bpf_strcmp);
> >> +BTF_ID_FLAGS(func, bpf_strchr);
> >> +BTF_ID_FLAGS(func, bpf_strchrnul);
> >> +BTF_ID_FLAGS(func, bpf_strnchr);
> >> +BTF_ID_FLAGS(func, bpf_strnchrnul);
> >> +BTF_ID_FLAGS(func, bpf_strrchr);
> >> +BTF_ID_FLAGS(func, bpf_strlen);
> >> +BTF_ID_FLAGS(func, bpf_strnlen);
> >> +BTF_ID_FLAGS(func, bpf_strspn);
> >> +BTF_ID_FLAGS(func, bpf_strcspn);
> >> +BTF_ID_FLAGS(func, bpf_strpbrk);
> >> +BTF_ID_FLAGS(func, bpf_strstr);
> >> +BTF_ID_FLAGS(func, bpf_strnstr);
> >> BTF_KFUNCS_END(common_btf_ids)
> >>
> >> static const struct btf_kfunc_id_set common_kfunc_set = {
> >> --
> >> 2.48.1
> >>
> >
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH bpf-next v3 1/3] bpf: Add kfuncs for read-only string operations
2025-04-01 20:20 ` Andrii Nakryiko
@ 2025-04-01 20:27 ` Alexei Starovoitov
0 siblings, 0 replies; 9+ messages in thread
From: Alexei Starovoitov @ 2025-04-01 20:27 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Viktor Malik, bpf, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
Hao Luo, Jiri Olsa, Mykola Lysenko, Shuah Khan
On Tue, Apr 1, 2025 at 1:21 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> > >
> > >> + }
> > >> +out:
> > >> + pagefault_enable();
> > >
> > > how about we
> > >
> > > DEFINE_LOCK_GUARD_0(pagefault, pagefault_disable(), pagefault_enable())
> > >
> > > like we do for preempt_{disable,enable}() and simplify all the
> > > implementations significantly?
> >
> > That's neat, I didn't know it. It will a bit more tricky to use here as
> > __get_kernel_nofault still requires a label but we should at least be
> > able to get rid of pagefault_{disable,enable}() in each function.
>
> yep, we'll have:
>
> err_out:
> return -EFAULT;
>
> But the rest will be a clean loop and no pagefault_disable() clean
> ups, distracting from the main logic
Just a word of caution...
Please double check that both gcc and clang generate
proper cleanup code when asm goto is used.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-04-01 20:27 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-24 12:03 [PATCH bpf-next v3 0/3] bpf: Add kfuncs for read-only string operations Viktor Malik
2025-03-24 12:03 ` [PATCH bpf-next v3 1/3] " Viktor Malik
2025-03-28 22:48 ` Andrii Nakryiko
2025-04-01 12:48 ` Viktor Malik
2025-04-01 20:20 ` Andrii Nakryiko
2025-04-01 20:27 ` Alexei Starovoitov
2025-03-24 12:03 ` [PATCH bpf-next v3 2/3] selftests/bpf: Add tests for string kfuncs Viktor Malik
2025-03-25 8:33 ` Jiri Olsa
2025-03-24 12:03 ` [PATCH bpf-next v3 3/3] selftests/bpf: Add benchmark for bounded/unbounded " Viktor Malik
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox