From: Matt Bobrowski <mattbobrowski@google.com>
To: Viktor Malik <vmalik@redhat.com>
Cc: bpf@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
Yonghong Song <yonghong.song@linux.dev>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@fomichev.me>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>
Subject: Re: [PATCH bpf-next v4 3/4] bpf: Add kfuncs for read-only string operations
Date: Thu, 8 May 2025 09:41:58 +0000 [thread overview]
Message-ID: <aBx8Zjux0bSgtV04@google.com> (raw)
In-Reply-To: <19913411da8c08170d959207e28262efc0a5d813.1746598898.git.vmalik@redhat.com>
On Wed, May 07, 2025 at 08:40:38AM +0200, Viktor Malik wrote:
> String operations are commonly used so this exposes the most common ones
> to BPF programs. For now, we limit ourselves to operations which do not
> copy memory around.
>
> Unfortunately, most in-kernel implementations assume that strings are
> %NUL-terminated, which is not necessarily true, and therefore we cannot
> use them directly in the BPF context. Instead, we open-code them using
> __get_kernel_nofault instead of plain dereference to make them safe and
> limit the strings length to XATTR_SIZE_MAX to make sure the functions
> terminate. When __get_kernel_nofault fails, functions return -EFAULT.
> Similarly, when the size bound is reached, the functions return -E2BIG.
Curious, why was XATTR_SIZE_MAX chosen as the upper bounds here? Just
an arbitrary value that felt large enough?
> At the moment, strings can be passed to the kfuncs in three forms:
> - string literals (i.e. pointers to read-only maps)
> - global variables (i.e. pointers to read-write maps)
> - stack-allocated buffers
>
> Note that currently, it is not possible to pass strings from the BPF
> program context (like function args) as the verifier doesn't treat them
> as neither PTR_TO_MEM nor PTR_TO_BTF_ID.
>
> Suggested-by: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: Viktor Malik <vmalik@redhat.com>
> ---
> kernel/bpf/helpers.c | 440 +++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 440 insertions(+)
>
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index e3a2662f4e33..8fb7c2ca7ac0 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -23,6 +23,7 @@
> #include <linux/btf_ids.h>
> #include <linux/bpf_mem_alloc.h>
> #include <linux/kasan.h>
> +#include <linux/uaccess.h>
>
> #include "../../lib/kstrtox.h"
>
> @@ -3194,6 +3195,433 @@ __bpf_kfunc void bpf_local_irq_restore(unsigned long *flags__irq_flag)
> local_irq_restore(*flags__irq_flag);
> }
>
> +/* Kfuncs for string operations.
> + *
> + * Since strings are not necessarily %NUL-terminated, we cannot directly call
> + * in-kernel implementations. Instead, we open-code the implementations using
> + * __get_kernel_nofault instead of plain dereference to make them safe.
> + */
Returning an -EFAULT error code for supplied arguments which are
deemed to be invalid is just a very weird semantic IMO. As a BPF
program author, I totally wouldn't expect a BPF kfunc to return
-EINVAL if I had supplied it something that it couldn't quite possibly
handle to begin with. Look at the prior art, being pre-existing BPF
kfuncs, and you'll see how they handle invalidly supplied arguments. I
totally understand that attempting to dereference a NULL-pointer would
ultimately result in a fault being raised, so it may feel rather
natural to also report an -EFAULT error code upon doing some
NULL-pointer checks that hold true, but from an API usability POV it
just seems awkward and wrong.
Another thing that I noticed was that none of these BPF kfunc
arguments make use of the parameter suffixes i.e. __str, __sz. Why is
that exactly? Will leaning on those break you in some way?
> +/**
> + * bpf_strcmp - Compare two strings
> + * @s1: One string
> + * @s2: Another string
> + *
> + * Return:
> + * * %0 - Strings are equal
> + * * %-1 - @s1 is smaller
> + * * %1 - @s2 is smaller
> + * * %-EFAULT - Cannot read one of the strings
> + * * %-E2BIG - One of strings is too large
> + */
> +__bpf_kfunc int bpf_strcmp(const char *s1, const char *s2)
> +{
> + char c1, c2;
> + int i;
> +
> + if (!s1 || !s2)
> + return -EFAULT;
> +
> + guard(pagefault)();
> + for (i = 0; i < XATTR_SIZE_MAX; i++) {
> + __get_kernel_nofault(&c1, s1, char, err_out);
> + __get_kernel_nofault(&c2, s2, char, err_out);
> + if (c1 != c2)
> + return c1 < c2 ? -1 : 1;
> + if (c1 == '\0')
> + return 0;
> + s1++;
> + s2++;
> + }
> + return -E2BIG;
> +err_out:
> + return -EFAULT;
> +}
> +
> +/**
> + * bpf_strchr - Find the first occurrence of a character in a string
> + * @s: The string to be searched
> + * @c: The character to search for
> + *
> + * Note that the %NUL-terminator is considered part of the string, and can
> + * be searched for.
> + *
> + * Return:
> + * * const char * - Pointer to the first occurrence of @c within @s
> + * * %NULL - @c not found in @s
> + * * %-EFAULT - Cannot read @s
> + * * %-E2BIG - @s too large
> + */
> +__bpf_kfunc const char *bpf_strchr(const char *s, char c)
> +{
> + char sc;
> + int i;
> +
> + if (!s)
> + return ERR_PTR(-EFAULT);
> +
> + guard(pagefault)();
> + for (i = 0; i < XATTR_SIZE_MAX; i++) {
> + __get_kernel_nofault(&sc, s, char, err_out);
> + if (sc == c)
> + return s;
> + if (sc == '\0')
> + return NULL;
> + s++;
> + }
> + return ERR_PTR(-E2BIG);
> +err_out:
> + return ERR_PTR(-EFAULT);
> +}
> +
> +/**
> + * bpf_strnchr - Find a character in a length limited string
> + * @s: The string to be searched
> + * @count: The number of characters to be searched
> + * @c: The character to search for
> + *
> + * Note that the %NUL-terminator is considered part of the string, and can
> + * be searched for.
> + *
> + * Return:
> + * * const char * - Pointer to the first occurrence of @c within @s
> + * * %NULL - @c not found in the first @count characters of @s
> + * * %-EFAULT - Cannot read @s
> + * * %-E2BIG - @s too large
> + */
> +__bpf_kfunc const char *bpf_strnchr(const char *s, size_t count, char c)
> +{
> + char sc;
> + int i;
> +
> + if (!s)
> + return ERR_PTR(-EFAULT);
> +
> + guard(pagefault)();
> + for (i = 0; i < count && i < XATTR_SIZE_MAX; i++) {
> + __get_kernel_nofault(&sc, s, char, err_out);
> + if (sc == c)
> + return s;
> + if (sc == '\0')
> + return NULL;
> + s++;
> + }
> + return i == XATTR_SIZE_MAX ? ERR_PTR(-E2BIG) : NULL;
> +err_out:
> + return ERR_PTR(-EFAULT);
> +}
> +
> +/**
> + * bpf_strchrnul - Find and return a character in a string, or end of string
> + * @s: The string to be searched
> + * @c: The character to search for
> + *
> + * Return:
> + * * const char * - Pointer to the first occurrence of @c within @s or pointer
> + * to the null byte at the end of @s when @c is not found
> + * * %-EFAULT - Cannot read @s
> + * * %-E2BIG - @s too large
> + */
> +__bpf_kfunc const char *bpf_strchrnul(const char *s, char c)
> +{
> + char sc;
> + int i;
> +
> + if (!s)
> + return ERR_PTR(-EFAULT);
> +
> + guard(pagefault)();
> + for (i = 0; i < XATTR_SIZE_MAX; i++) {
> + __get_kernel_nofault(&sc, s, char, err_out);
> + if (sc == '\0' || sc == c)
> + return s;
> + s++;
> + }
> + return ERR_PTR(-E2BIG);
> +err_out:
> + return ERR_PTR(-EFAULT);
> +}
> +
> +/**
> + * bpf_strrchr - Find the last occurrence of a character in a string
> + * @s: The string to be searched
> + * @c: The character to search for
> + *
> + * Return:
> + * * const char * - Pointer to the last occurrence of @c within @s
> + * * %NULL - @c not found in @s
> + * * %-EFAULT - Cannot read @s
> + * * %-E2BIG - @s too large
> + */
> +__bpf_kfunc const char *bpf_strrchr(const char *s, int c)
> +{
> + const char *last = NULL;
> + char sc;
> + int i;
> +
> + if (!s)
> + return ERR_PTR(-EFAULT);
> +
> + guard(pagefault)();
> + for (i = 0; i < XATTR_SIZE_MAX; i++) {
> + __get_kernel_nofault(&sc, s, char, err_out);
> + if (sc == '\0')
> + return last;
> + if (sc == c)
> + last = s;
> + s++;
> + }
> + return ERR_PTR(-E2BIG);
> +err_out:
> + return ERR_PTR(-EFAULT);
> +}
> +
> +/**
> + * bpf_strlen - Calculate the length of a string
> + * @s: The string
> + *
> + * Return:
> + * * >=0 - The length of @s
> + * * %-EFAULT - Cannot read @s
> + * * %-E2BIG - @s too large
> + */
> +__bpf_kfunc int bpf_strlen(const char *s)
> +{
> + char c;
> + int i;
> +
> + if (!s)
> + return -EFAULT;
> +
> + guard(pagefault)();
> + for (i = 0; i < XATTR_SIZE_MAX; i++) {
> + __get_kernel_nofault(&c, s, char, err_out);
> + if (c == '\0')
> + return i;
> + s++;
> + }
> + return -E2BIG;
> +err_out:
> + return -EFAULT;
> +}
> +
> +/**
> + * bpf_strlen - Calculate the length of a length-limited string
> + * @s: The string
> + * @count: The maximum number of characters to count
> + *
> + * Return:
> + * * >=0 - The length of @s
> + * * %-EFAULT - Cannot read @s
> + * * %-E2BIG - @s too large
> + */
> +__bpf_kfunc int bpf_strnlen(const char *s, size_t count)
> +{
> + char c;
> + int i;
> +
> + if (!s)
> + return -EFAULT;
> +
> + guard(pagefault)();
> + for (i = 0; i < count && i < XATTR_SIZE_MAX; i++) {
> + __get_kernel_nofault(&c, s, char, err_out);
> + if (c == '\0')
> + return i;
> + s++;
> + }
> + return i == XATTR_SIZE_MAX ? -E2BIG : i;
> +err_out:
> + return -EFAULT;
> +}
> +
> +/**
> + * bpf_strspn - Calculate the length of the initial substring of @s which only
> + * contains letters in @accept
> + * @s: The string to be searched
> + * @accept: The string to search for
> + *
> + * Return:
> + * * >=0 - The length of the initial substring of @s which only contains
> + * letter in @accept
> + * * %-EFAULT - Cannot read @s
> + * * %-E2BIG - @s too large
v> + */
> +__bpf_kfunc int bpf_strspn(const char *s, const char *accept)
> +{
> + const char *p;
> + char c;
> + int i;
> +
> + if (!s || !accept)
> + return -EFAULT;
> +
> + guard(pagefault)();
> + for (i = 0; i < XATTR_SIZE_MAX; i++) {
> + __get_kernel_nofault(&c, s, char, err_out);
> + p = bpf_strchr(accept, c);
> + if (IS_ERR(p))
> + return PTR_ERR(p);
> + if (c == '\0' || !p)
> + return i;
> + s++;
> + }
> + return -E2BIG;
> +err_out:
> + return -EFAULT;
> +}
> +
> +/**
> + * strcspn - Calculate the length of the initial substring of @s which does not
> + * contain letters in @reject
> + * @s: The string to be searched
> + * @reject: The string to avoid
> + *
> + * Return:
> + * * >=0 - The length of the initial substring of @s which does not contain
> + * letters from @reject
> + * * %-EFAULT - Cannot read @s
> + * * %-E2BIG - @s too large
> + */
> +__bpf_kfunc int bpf_strcspn(const char *s, const char *reject)
> +{
> + const char *p;
> + char c;
> + int i;
> +
> + if (!s || !reject)
> + return -EFAULT;
> +
> + guard(pagefault)();
> + for (i = 0; i < XATTR_SIZE_MAX; i++) {
> + __get_kernel_nofault(&c, s, char, err_out);
> + p = bpf_strchr(reject, c);
> + if (IS_ERR(p))
> + return PTR_ERR(p);
> + if (c == '\0' || p)
> + return i;
> + s++;
> + }
> + return -E2BIG;
> +err_out:
> + return -EFAULT;
> +}
> +
> +/**
> + * bpf_strpbrk - Find the first occurrence of a set of characters
> + * @s: The string to be searched
> + * @accept: The characters to search for
> + *
> + * Return:
> + * * const char * - Pointer to the first occurrence of a character from @accept
> + * within @s
> + * * %NULL - No character from @accept found in @s
> + * * %-EFAULT - Cannot read one of the strings
> + * * %-E2BIG - One of the strings is too large
> + */
> +__bpf_kfunc const char *bpf_strpbrk(const char *s, const char *accept)
> +{
> + const char *p;
> + char c;
> + int i;
> +
> + if (!s || !accept)
> + return ERR_PTR(-EFAULT);
> +
> + guard(pagefault)();
> + for (i = 0; i < XATTR_SIZE_MAX; i++) {
> + __get_kernel_nofault(&c, s, char, err_out);
> + if (c == '\0')
> + return NULL;
> + p = bpf_strchr(accept, c);
> + if (IS_ERR(p))
> + return p;
> + if (p)
> + return s;
> + s++;
> + }
> + return ERR_PTR(-E2BIG);
> +err_out:
> + return ERR_PTR(-EFAULT);
> +}
> +
> +/**
> + * bpf_strstr - Find the first substring in a string
> + * @s1: The string to be searched
> + * @s2: The string to search for
> + *
> + * Return:
> + * * const char * - Pointer to the first occurrence of @s2 within @s1
> + * * %NULL - @s2 is not a substring of @s1
> + * * %-EFAULT - Cannot read one of the strings
> + * * %-E2BIG - One of the strings is too large
> + */
> +__bpf_kfunc const char *bpf_strstr(const char *s1, const char *s2)
> +{
> + char c1, c2;
> + int i, j;
> +
> + if (!s1 || !s2)
> + return ERR_PTR(-EFAULT);
> +
> + guard(pagefault)();
> + for (i = 0; i < XATTR_SIZE_MAX; i++) {
> + for (j = 0; j < XATTR_SIZE_MAX; j++) {
> + __get_kernel_nofault(&c1, s1 + j, char, err_out);
> + __get_kernel_nofault(&c2, s2 + j, char, err_out);
> + if (c2 == '\0')
> + return s1;
> + if (c1 == '\0')
> + return NULL;
> + if (c1 != c2)
> + break;
> + }
> + if (j == XATTR_SIZE_MAX)
> + return ERR_PTR(-E2BIG);
> + s1++;
> + }
> + return ERR_PTR(-E2BIG);
> +err_out:
> + return ERR_PTR(-EFAULT);
> +}
> +
> +/**
> + * bpf_strnstr - Find the first substring in a length-limited string
> + * @s1: The string to be searched
> + * @s2: The string to search for
> + * @len: the maximum number of characters to search
Return: ...?
> + */
> +__bpf_kfunc const char *bpf_strnstr(const char *s1, const char *s2, size_t len)
> +{
> + char c1, c2;
> + int i, j;
> +
> + if (!s1 || !s2)
> + return ERR_PTR(-EFAULT);
> +
> + guard(pagefault)();
> + for (i = 0; i < XATTR_SIZE_MAX; i++) {
> + for (j = 0; i + j < len && j < XATTR_SIZE_MAX; j++) {
> + __get_kernel_nofault(&c1, s1 + j, char, err_out);
> + __get_kernel_nofault(&c2, s2 + j, char, err_out);
> + if (c2 == '\0')
> + return s1;
> + if (c1 == '\0')
> + return NULL;
> + if (c1 != c2)
> + break;
> + }
> + if (j == XATTR_SIZE_MAX)
> + return ERR_PTR(-E2BIG);
> + if (i + j == len)
> + return NULL;
> + s1++;
> + }
> + return ERR_PTR(-E2BIG);
> +err_out:
> + return ERR_PTR(-EFAULT);
> +}
> +
> __bpf_kfunc_end_defs();
>
> BTF_KFUNCS_START(generic_btf_ids)
> @@ -3294,6 +3722,18 @@ BTF_ID_FLAGS(func, bpf_iter_kmem_cache_next, KF_ITER_NEXT | KF_RET_NULL | KF_SLE
> BTF_ID_FLAGS(func, bpf_iter_kmem_cache_destroy, KF_ITER_DESTROY | KF_SLEEPABLE)
> BTF_ID_FLAGS(func, bpf_local_irq_save)
> BTF_ID_FLAGS(func, bpf_local_irq_restore)
> +BTF_ID_FLAGS(func, bpf_strcmp);
> +BTF_ID_FLAGS(func, bpf_strchr, KF_RET_NULL);
> +BTF_ID_FLAGS(func, bpf_strchrnul);
> +BTF_ID_FLAGS(func, bpf_strnchr, KF_RET_NULL);
> +BTF_ID_FLAGS(func, bpf_strrchr, KF_RET_NULL);
> +BTF_ID_FLAGS(func, bpf_strlen);
> +BTF_ID_FLAGS(func, bpf_strnlen);
> +BTF_ID_FLAGS(func, bpf_strspn);
> +BTF_ID_FLAGS(func, bpf_strcspn);
> +BTF_ID_FLAGS(func, bpf_strpbrk, KF_RET_NULL);
> +BTF_ID_FLAGS(func, bpf_strstr, KF_RET_NULL);
> +BTF_ID_FLAGS(func, bpf_strnstr, KF_RET_NULL);
> BTF_KFUNCS_END(common_btf_ids)
>
> static const struct btf_kfunc_id_set common_kfunc_set = {
> --
> 2.49.0
>
next prev parent reply other threads:[~2025-05-08 9:42 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-07 6:40 [PATCH bpf-next v4 0/4] bpf: Add kfuncs for read-only string operations Viktor Malik
2025-05-07 6:40 ` [PATCH bpf-next v4 1/4] bpf: Teach vefier to handle const ptrs as args to kfuncs Viktor Malik
2025-05-08 9:08 ` Matt Bobrowski
2025-05-09 16:20 ` Alexei Starovoitov
2025-05-13 6:48 ` Matt Bobrowski
2025-05-13 7:54 ` Viktor Malik
2025-05-13 14:58 ` Alexei Starovoitov
2025-05-13 7:58 ` Viktor Malik
2025-05-07 6:40 ` [PATCH bpf-next v4 2/4] uaccess: Define pagefault lock guard Viktor Malik
2025-05-08 10:00 ` Matt Bobrowski
2025-05-09 18:20 ` Andrii Nakryiko
2025-05-13 6:55 ` Matt Bobrowski
2025-05-07 6:40 ` [PATCH bpf-next v4 3/4] bpf: Add kfuncs for read-only string operations Viktor Malik
2025-05-08 9:41 ` Matt Bobrowski [this message]
2025-05-09 16:39 ` Alexei Starovoitov
2025-05-28 9:05 ` Viktor Malik
2025-05-15 12:32 ` Viktor Malik
2025-05-09 18:20 ` Andrii Nakryiko
2025-05-09 21:37 ` Eduard Zingerman
2025-05-09 22:03 ` Andrii Nakryiko
2025-05-15 12:27 ` Viktor Malik
2025-05-15 17:17 ` Andrii Nakryiko
2025-05-16 5:59 ` Viktor Malik
2025-05-16 15:50 ` Andrii Nakryiko
2025-05-07 6:40 ` [PATCH bpf-next v4 4/4] selftests/bpf: Add tests for string kfuncs Viktor Malik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aBx8Zjux0bSgtV04@google.com \
--to=mattbobrowski@google.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=haoluo@google.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kpsingh@kernel.org \
--cc=martin.lau@linux.dev \
--cc=sdf@fomichev.me \
--cc=song@kernel.org \
--cc=vmalik@redhat.com \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.