* [PATCH bpf-next 1/6] bpf: Don't EFAULT for getsockopt with optval=NULL
2023-04-18 22:53 [PATCH bpf-next 0/6] bpf: handle another corner case in getsockopt Stanislav Fomichev
@ 2023-04-18 22:53 ` Stanislav Fomichev
2023-04-18 22:53 ` [PATCH bpf-next 2/6] selftests/bpf: Verify optval=NULL case Stanislav Fomichev
` (5 subsequent siblings)
6 siblings, 0 replies; 17+ messages in thread
From: Stanislav Fomichev @ 2023-04-18 22:53 UTC (permalink / raw)
To: bpf
Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
kpsingh, sdf, haoluo, jolsa, Martin KaFai Lau
Some socket options do getsockopt with optval=NULL to estimate
the size of the final buffer (which is returned via optlen).
This breaks BPF getsockopt assumptions about permitted
optval buffer size. Let's enforce these assumptions only
when non-NULL optval is provided.
Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
Reported-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/bpf/ZD7Js4fj5YyI2oLd@google.com/T/#mb68daf700f87a9244a15d01d00c3f0e5b08f49f7
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
kernel/bpf/cgroup.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 53edb8ad2471..a06e118a9be5 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -1921,14 +1921,17 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
if (ret < 0)
goto out;
- if (ctx.optlen > max_optlen || ctx.optlen < 0) {
+ if (optval && (ctx.optlen > max_optlen || ctx.optlen < 0)) {
ret = -EFAULT;
goto out;
}
if (ctx.optlen != 0) {
- if (copy_to_user(optval, ctx.optval, ctx.optlen) ||
- put_user(ctx.optlen, optlen)) {
+ if (optval && copy_to_user(optval, ctx.optval, ctx.optlen)) {
+ ret = -EFAULT;
+ goto out;
+ }
+ if (put_user(ctx.optlen, optlen)) {
ret = -EFAULT;
goto out;
}
--
2.40.0.634.g4ca3ef3211-goog
^ permalink raw reply related [flat|nested] 17+ messages in thread* [PATCH bpf-next 2/6] selftests/bpf: Verify optval=NULL case
2023-04-18 22:53 [PATCH bpf-next 0/6] bpf: handle another corner case in getsockopt Stanislav Fomichev
2023-04-18 22:53 ` [PATCH bpf-next 1/6] bpf: Don't EFAULT for getsockopt with optval=NULL Stanislav Fomichev
@ 2023-04-18 22:53 ` Stanislav Fomichev
2023-04-18 22:53 ` [PATCH bpf-next 3/6] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen Stanislav Fomichev
` (4 subsequent siblings)
6 siblings, 0 replies; 17+ messages in thread
From: Stanislav Fomichev @ 2023-04-18 22:53 UTC (permalink / raw)
To: bpf
Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
kpsingh, sdf, haoluo, jolsa
Make sure we get optlen exported instead of getting EFAULT.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
.../selftests/bpf/prog_tests/sockopt_sk.c | 28 +++++++++++++++++++
.../testing/selftests/bpf/progs/sockopt_sk.c | 12 ++++++++
2 files changed, 40 insertions(+)
diff --git a/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c b/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c
index 60d952719d27..4512dd808c33 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c
@@ -3,6 +3,7 @@
#include "cgroup_helpers.h"
#include <linux/tcp.h>
+#include <linux/netlink.h>
#include "sockopt_sk.skel.h"
#ifndef SOL_TCP
@@ -183,6 +184,33 @@ static int getsetsockopt(void)
goto err;
}
+ /* optval=NULL case is handled correctly */
+
+ close(fd);
+ fd = socket(AF_NETLINK, SOCK_RAW, 0);
+ if (fd < 0) {
+ log_err("Failed to create AF_NETLINK socket");
+ return -1;
+ }
+
+ buf.u32 = 1;
+ optlen = sizeof(__u32);
+ err = setsockopt(fd, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP, &buf, optlen);
+ if (err) {
+ log_err("Unexpected getsockopt(NETLINK_ADD_MEMBERSHIP) err=%d errno=%d",
+ err, errno);
+ goto err;
+ }
+
+ optlen = 0;
+ err = getsockopt(fd, SOL_NETLINK, NETLINK_LIST_MEMBERSHIPS, NULL, &optlen);
+ if (err) {
+ log_err("Unexpected getsockopt(NETLINK_LIST_MEMBERSHIPS) err=%d errno=%d",
+ err, errno);
+ goto err;
+ }
+ ASSERT_EQ(optlen, 4, "Unexpected NETLINK_LIST_MEMBERSHIPS value");
+
free(big_buf);
close(fd);
return 0;
diff --git a/tools/testing/selftests/bpf/progs/sockopt_sk.c b/tools/testing/selftests/bpf/progs/sockopt_sk.c
index c8d810010a94..fe1df4cd206e 100644
--- a/tools/testing/selftests/bpf/progs/sockopt_sk.c
+++ b/tools/testing/selftests/bpf/progs/sockopt_sk.c
@@ -32,6 +32,12 @@ int _getsockopt(struct bpf_sockopt *ctx)
__u8 *optval_end = ctx->optval_end;
__u8 *optval = ctx->optval;
struct sockopt_sk *storage;
+ struct bpf_sock *sk;
+
+ /* Bypass AF_NETLINK. */
+ sk = ctx->sk;
+ if (sk && sk->family == AF_NETLINK)
+ return 1;
/* Make sure bpf_get_netns_cookie is callable.
*/
@@ -131,6 +137,12 @@ int _setsockopt(struct bpf_sockopt *ctx)
__u8 *optval_end = ctx->optval_end;
__u8 *optval = ctx->optval;
struct sockopt_sk *storage;
+ struct bpf_sock *sk;
+
+ /* Bypass AF_NETLINK. */
+ sk = ctx->sk;
+ if (sk && sk->family == AF_NETLINK)
+ return 1;
/* Make sure bpf_get_netns_cookie is callable.
*/
--
2.40.0.634.g4ca3ef3211-goog
^ permalink raw reply related [flat|nested] 17+ messages in thread* [PATCH bpf-next 3/6] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen
2023-04-18 22:53 [PATCH bpf-next 0/6] bpf: handle another corner case in getsockopt Stanislav Fomichev
2023-04-18 22:53 ` [PATCH bpf-next 1/6] bpf: Don't EFAULT for getsockopt with optval=NULL Stanislav Fomichev
2023-04-18 22:53 ` [PATCH bpf-next 2/6] selftests/bpf: Verify optval=NULL case Stanislav Fomichev
@ 2023-04-18 22:53 ` Stanislav Fomichev
2023-04-21 15:24 ` Daniel Borkmann
2023-04-18 22:53 ` [PATCH bpf-next 4/6] selftests/bpf: Update EFAULT {g,s}etsockopt selftests Stanislav Fomichev
` (3 subsequent siblings)
6 siblings, 1 reply; 17+ messages in thread
From: Stanislav Fomichev @ 2023-04-18 22:53 UTC (permalink / raw)
To: bpf
Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
kpsingh, sdf, haoluo, jolsa, Martin KaFai Lau
Over time, we've found out several special socket option cases which need
special treatment. And if BPF program doesn't handle them correctly, this
might EFAULT perfectly valid {g,s}setsockopt calls.
The intention of the EFAULT was to make it apparent to the
developers that the program is doing something wrong.
However, this inadvertently might affect production workloads
with the BPF programs that are not too careful.
Let's try to minimize the chance of BPF program screwing up userspace
by ignoring the output of those BPF programs (instead of returning
EFAULT to the userspace). pr_info_ratelimited those cases to
the dmesg to help with figuring out what's going wrong.
Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
kernel/bpf/cgroup.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index a06e118a9be5..af4d20864fb4 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -1826,7 +1826,9 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
ret = 1;
} else if (ctx.optlen > max_optlen || ctx.optlen < -1) {
/* optlen is out of bounds */
- ret = -EFAULT;
+ pr_info_ratelimited(
+ "bpf setsockopt returned unexpected optlen=%d (max_optlen=%d)\n",
+ ctx.optlen, max_optlen);
} else {
/* optlen within bounds, run kernel handler */
ret = 0;
@@ -1922,7 +1924,9 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
goto out;
if (optval && (ctx.optlen > max_optlen || ctx.optlen < 0)) {
- ret = -EFAULT;
+ pr_info_ratelimited(
+ "bpf getsockopt returned unexpected optlen=%d (max_optlen=%d)\n",
+ ctx.optlen, max_optlen);
goto out;
}
--
2.40.0.634.g4ca3ef3211-goog
^ permalink raw reply related [flat|nested] 17+ messages in thread* Re: [PATCH bpf-next 3/6] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen
2023-04-18 22:53 ` [PATCH bpf-next 3/6] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen Stanislav Fomichev
@ 2023-04-21 15:24 ` Daniel Borkmann
2023-04-21 16:09 ` Stanislav Fomichev
0 siblings, 1 reply; 17+ messages in thread
From: Daniel Borkmann @ 2023-04-21 15:24 UTC (permalink / raw)
To: Stanislav Fomichev, bpf
Cc: ast, andrii, martin.lau, song, yhs, john.fastabend, kpsingh,
haoluo, jolsa, Martin KaFai Lau
On 4/19/23 12:53 AM, Stanislav Fomichev wrote:
> Over time, we've found out several special socket option cases which need
> special treatment. And if BPF program doesn't handle them correctly, this
> might EFAULT perfectly valid {g,s}setsockopt calls.
>
> The intention of the EFAULT was to make it apparent to the
> developers that the program is doing something wrong.
> However, this inadvertently might affect production workloads
> with the BPF programs that are not too careful.
Took in the first two for now. It would be good if the commit description
in here could have more details for posterity given this is too vague.
> Let's try to minimize the chance of BPF program screwing up userspace
> by ignoring the output of those BPF programs (instead of returning
> EFAULT to the userspace). pr_info_ratelimited those cases to
> the dmesg to help with figuring out what's going wrong.
>
> Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
> Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
> kernel/bpf/cgroup.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> index a06e118a9be5..af4d20864fb4 100644
> --- a/kernel/bpf/cgroup.c
> +++ b/kernel/bpf/cgroup.c
> @@ -1826,7 +1826,9 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
> ret = 1;
> } else if (ctx.optlen > max_optlen || ctx.optlen < -1) {
> /* optlen is out of bounds */
> - ret = -EFAULT;
> + pr_info_ratelimited(
> + "bpf setsockopt returned unexpected optlen=%d (max_optlen=%d)\n",
> + ctx.optlen, max_optlen);
Does it help any regular user if this log message is seen? I kind of doubt it a bit,
it might create more confusion if log gets spammed with it, imo.
> } else {
> /* optlen within bounds, run kernel handler */
> ret = 0;
> @@ -1922,7 +1924,9 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
> goto out;
>
> if (optval && (ctx.optlen > max_optlen || ctx.optlen < 0)) {
> - ret = -EFAULT;
> + pr_info_ratelimited(
> + "bpf getsockopt returned unexpected optlen=%d (max_optlen=%d)\n",
> + ctx.optlen, max_optlen);
> goto out;
> }
>
>
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: [PATCH bpf-next 3/6] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen
2023-04-21 15:24 ` Daniel Borkmann
@ 2023-04-21 16:09 ` Stanislav Fomichev
2023-04-25 0:56 ` Martin KaFai Lau
0 siblings, 1 reply; 17+ messages in thread
From: Stanislav Fomichev @ 2023-04-21 16:09 UTC (permalink / raw)
To: Daniel Borkmann
Cc: bpf, ast, andrii, martin.lau, song, yhs, john.fastabend, kpsingh,
haoluo, jolsa, Martin KaFai Lau
On Fri, Apr 21, 2023 at 8:24 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 4/19/23 12:53 AM, Stanislav Fomichev wrote:
> > Over time, we've found out several special socket option cases which need
> > special treatment. And if BPF program doesn't handle them correctly, this
> > might EFAULT perfectly valid {g,s}setsockopt calls.
> >
> > The intention of the EFAULT was to make it apparent to the
> > developers that the program is doing something wrong.
> > However, this inadvertently might affect production workloads
> > with the BPF programs that are not too careful.
>
> Took in the first two for now. It would be good if the commit description
> in here could have more details for posterity given this is too vague.
Thanks! Will try to repost next week with more details.
> > Let's try to minimize the chance of BPF program screwing up userspace
> > by ignoring the output of those BPF programs (instead of returning
> > EFAULT to the userspace). pr_info_ratelimited those cases to
> > the dmesg to help with figuring out what's going wrong.
> >
> > Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
> > Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> > kernel/bpf/cgroup.c | 8 ++++++--
> > 1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > index a06e118a9be5..af4d20864fb4 100644
> > --- a/kernel/bpf/cgroup.c
> > +++ b/kernel/bpf/cgroup.c
> > @@ -1826,7 +1826,9 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
> > ret = 1;
> > } else if (ctx.optlen > max_optlen || ctx.optlen < -1) {
> > /* optlen is out of bounds */
> > - ret = -EFAULT;
> > + pr_info_ratelimited(
> > + "bpf setsockopt returned unexpected optlen=%d (max_optlen=%d)\n",
> > + ctx.optlen, max_optlen);
>
> Does it help any regular user if this log message is seen? I kind of doubt it a bit,
> it might create more confusion if log gets spammed with it, imo.
Agreed, but we need some way to let the users know that their bpf
program is doing the wrong thing (if they set the optlen too high for
example).
Any other better alternatives to expose those events?
> > } else {
> > /* optlen within bounds, run kernel handler */
> > ret = 0;
> > @@ -1922,7 +1924,9 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
> > goto out;
> >
> > if (optval && (ctx.optlen > max_optlen || ctx.optlen < 0)) {
> > - ret = -EFAULT;
> > + pr_info_ratelimited(
> > + "bpf getsockopt returned unexpected optlen=%d (max_optlen=%d)\n",
> > + ctx.optlen, max_optlen);
> > goto out;
> > }
> >
> >
>
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: [PATCH bpf-next 3/6] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen
2023-04-21 16:09 ` Stanislav Fomichev
@ 2023-04-25 0:56 ` Martin KaFai Lau
2023-04-25 17:12 ` Stanislav Fomichev
0 siblings, 1 reply; 17+ messages in thread
From: Martin KaFai Lau @ 2023-04-25 0:56 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: bpf, ast, andrii, song, yhs, john.fastabend, kpsingh, haoluo,
jolsa, Martin KaFai Lau, Daniel Borkmann
On 4/21/23 9:09 AM, Stanislav Fomichev wrote:
> On Fri, Apr 21, 2023 at 8:24 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>
>> On 4/19/23 12:53 AM, Stanislav Fomichev wrote:
>>> Over time, we've found out several special socket option cases which need
>>> special treatment. And if BPF program doesn't handle them correctly, this
>>> might EFAULT perfectly valid {g,s}setsockopt calls.
>>>
>>> The intention of the EFAULT was to make it apparent to the
>>> developers that the program is doing something wrong.
>>> However, this inadvertently might affect production workloads
>>> with the BPF programs that are not too careful.
>>
>> Took in the first two for now. It would be good if the commit description
>> in here could have more details for posterity given this is too vague.
>
> Thanks! Will try to repost next week with more details.
>
>>> Let's try to minimize the chance of BPF program screwing up userspace
>>> by ignoring the output of those BPF programs (instead of returning
>>> EFAULT to the userspace). pr_info_ratelimited those cases to
>>> the dmesg to help with figuring out what's going wrong.
>>>
>>> Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
>>> Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
>>> Signed-off-by: Stanislav Fomichev <sdf@google.com>
>>> ---
>>> kernel/bpf/cgroup.c | 8 ++++++--
>>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
>>> index a06e118a9be5..af4d20864fb4 100644
>>> --- a/kernel/bpf/cgroup.c
>>> +++ b/kernel/bpf/cgroup.c
>>> @@ -1826,7 +1826,9 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
>>> ret = 1;
>>> } else if (ctx.optlen > max_optlen || ctx.optlen < -1) {
>>> /* optlen is out of bounds */
>>> - ret = -EFAULT;
>>> + pr_info_ratelimited(
>>> + "bpf setsockopt returned unexpected optlen=%d (max_optlen=%d)\n",
>>> + ctx.optlen, max_optlen);
>>
>> Does it help any regular user if this log message is seen? I kind of doubt it a bit,
>> it might create more confusion if log gets spammed with it, imo.
>
> Agreed, but we need some way to let the users know that their bpf
> program is doing the wrong thing (if they set the optlen too high for
> example).
imo, I also think a printk here will be a noise in dmesg most of the time (but
much better than an unexpected -EFAULT).
> Any other better alternatives to expose those events?
Is it possible to only -EFAULT when the bpf prog setting a ctx.optlen larger
than the "original" user provided optlen?
Ignore for all other cases that is due to the current PAGE_SIZE limitation?
>
>>> } else {
>>> /* optlen within bounds, run kernel handler */
>>> ret = 0;
>>> @@ -1922,7 +1924,9 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
>>> goto out;
>>>
>>> if (optval && (ctx.optlen > max_optlen || ctx.optlen < 0)) {
>>> - ret = -EFAULT;
>>> + pr_info_ratelimited(
>>> + "bpf getsockopt returned unexpected optlen=%d (max_optlen=%d)\n",
>>> + ctx.optlen, max_optlen);
>>> goto out;
>>> }
>>>
>>>
>>
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: [PATCH bpf-next 3/6] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen
2023-04-25 0:56 ` Martin KaFai Lau
@ 2023-04-25 17:12 ` Stanislav Fomichev
2023-04-25 18:31 ` Martin KaFai Lau
0 siblings, 1 reply; 17+ messages in thread
From: Stanislav Fomichev @ 2023-04-25 17:12 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: bpf, ast, andrii, song, yhs, john.fastabend, kpsingh, haoluo,
jolsa, Martin KaFai Lau, Daniel Borkmann
On Mon, Apr 24, 2023 at 5:56 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 4/21/23 9:09 AM, Stanislav Fomichev wrote:
> > On Fri, Apr 21, 2023 at 8:24 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>
> >> On 4/19/23 12:53 AM, Stanislav Fomichev wrote:
> >>> Over time, we've found out several special socket option cases which need
> >>> special treatment. And if BPF program doesn't handle them correctly, this
> >>> might EFAULT perfectly valid {g,s}setsockopt calls.
> >>>
> >>> The intention of the EFAULT was to make it apparent to the
> >>> developers that the program is doing something wrong.
> >>> However, this inadvertently might affect production workloads
> >>> with the BPF programs that are not too careful.
> >>
> >> Took in the first two for now. It would be good if the commit description
> >> in here could have more details for posterity given this is too vague.
> >
> > Thanks! Will try to repost next week with more details.
> >
> >>> Let's try to minimize the chance of BPF program screwing up userspace
> >>> by ignoring the output of those BPF programs (instead of returning
> >>> EFAULT to the userspace). pr_info_ratelimited those cases to
> >>> the dmesg to help with figuring out what's going wrong.
> >>>
> >>> Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
> >>> Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
> >>> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> >>> ---
> >>> kernel/bpf/cgroup.c | 8 ++++++--
> >>> 1 file changed, 6 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> >>> index a06e118a9be5..af4d20864fb4 100644
> >>> --- a/kernel/bpf/cgroup.c
> >>> +++ b/kernel/bpf/cgroup.c
> >>> @@ -1826,7 +1826,9 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
> >>> ret = 1;
> >>> } else if (ctx.optlen > max_optlen || ctx.optlen < -1) {
> >>> /* optlen is out of bounds */
> >>> - ret = -EFAULT;
> >>> + pr_info_ratelimited(
> >>> + "bpf setsockopt returned unexpected optlen=%d (max_optlen=%d)\n",
> >>> + ctx.optlen, max_optlen);
> >>
> >> Does it help any regular user if this log message is seen? I kind of doubt it a bit,
> >> it might create more confusion if log gets spammed with it, imo.
> >
> > Agreed, but we need some way to let the users know that their bpf
> > program is doing the wrong thing (if they set the optlen too high for
> > example).
>
> imo, I also think a printk here will be a noise in dmesg most of the time (but
> much better than an unexpected -EFAULT).
I was thinking for a v2, maybe print it at least once? Similar to
current bpf_warn_invalid_xdp_action?
> > Any other better alternatives to expose those events?
>
> Is it possible to only -EFAULT when the bpf prog setting a ctx.optlen larger
> than the "original" user provided optlen?
> Ignore for all other cases that is due to the current PAGE_SIZE limitation?
Should be possible. That "ctx.optlen > PAGE_SIZE && ctx.optlen <
original_optlen" is the condition where we'd silently ignore BPF
output.
As per above, I'll stick a line to the dmest (similar
bpf_warn_invalid_xdp_action), at least to record that this has
happened once.
LMK if you or Danial still don't see a value in printing this..
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: [PATCH bpf-next 3/6] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen
2023-04-25 17:12 ` Stanislav Fomichev
@ 2023-04-25 18:31 ` Martin KaFai Lau
2023-04-26 17:27 ` Stanislav Fomichev
0 siblings, 1 reply; 17+ messages in thread
From: Martin KaFai Lau @ 2023-04-25 18:31 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: bpf, ast, andrii, song, yhs, john.fastabend, kpsingh, haoluo,
jolsa, Martin KaFai Lau, Daniel Borkmann
On 4/25/23 10:12 AM, Stanislav Fomichev wrote:
> On Mon, Apr 24, 2023 at 5:56 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>> On 4/21/23 9:09 AM, Stanislav Fomichev wrote:
>>> On Fri, Apr 21, 2023 at 8:24 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>>
>>>> On 4/19/23 12:53 AM, Stanislav Fomichev wrote:
>>>>> Over time, we've found out several special socket option cases which need
>>>>> special treatment. And if BPF program doesn't handle them correctly, this
>>>>> might EFAULT perfectly valid {g,s}setsockopt calls.
>>>>>
>>>>> The intention of the EFAULT was to make it apparent to the
>>>>> developers that the program is doing something wrong.
>>>>> However, this inadvertently might affect production workloads
>>>>> with the BPF programs that are not too careful.
>>>>
>>>> Took in the first two for now. It would be good if the commit description
>>>> in here could have more details for posterity given this is too vague.
>>>
>>> Thanks! Will try to repost next week with more details.
>>>
>>>>> Let's try to minimize the chance of BPF program screwing up userspace
>>>>> by ignoring the output of those BPF programs (instead of returning
>>>>> EFAULT to the userspace). pr_info_ratelimited those cases to
>>>>> the dmesg to help with figuring out what's going wrong.
>>>>>
>>>>> Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
>>>>> Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
>>>>> Signed-off-by: Stanislav Fomichev <sdf@google.com>
>>>>> ---
>>>>> kernel/bpf/cgroup.c | 8 ++++++--
>>>>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
>>>>> index a06e118a9be5..af4d20864fb4 100644
>>>>> --- a/kernel/bpf/cgroup.c
>>>>> +++ b/kernel/bpf/cgroup.c
>>>>> @@ -1826,7 +1826,9 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
>>>>> ret = 1;
>>>>> } else if (ctx.optlen > max_optlen || ctx.optlen < -1) {
>>>>> /* optlen is out of bounds */
>>>>> - ret = -EFAULT;
>>>>> + pr_info_ratelimited(
>>>>> + "bpf setsockopt returned unexpected optlen=%d (max_optlen=%d)\n",
>>>>> + ctx.optlen, max_optlen);
>>>>
>>>> Does it help any regular user if this log message is seen? I kind of doubt it a bit,
>>>> it might create more confusion if log gets spammed with it, imo.
>>>
>>> Agreed, but we need some way to let the users know that their bpf
>>> program is doing the wrong thing (if they set the optlen too high for
>>> example).
>>
>> imo, I also think a printk here will be a noise in dmesg most of the time (but
>> much better than an unexpected -EFAULT).
>
> I was thinking for a v2, maybe print it at least once? Similar to
> current bpf_warn_invalid_xdp_action?
>
>
>>> Any other better alternatives to expose those events?
>>
>> Is it possible to only -EFAULT when the bpf prog setting a ctx.optlen larger
>> than the "original" user provided optlen?
Nevermind the "ctx.optlen larger than the original user provided optlen" part. I
mis-read something in __cgroup_bpf_run_filter_getsockopt(). max_optlen is the
right limit that the kernel needs to bound the ctx.optlen written by bpf prog.
>> Ignore for all other cases that is due to the current PAGE_SIZE limitation?
>
> Should be possible. That "ctx.optlen > PAGE_SIZE && ctx.optlen <
> original_optlen" is the condition where we'd silently ignore BPF
> output.
and should the -ve ctx.optlen be treated separately? like in
__cgroup_bpf_run_filter_getsockopt():
if (ctx.optlen < 0) {
ret = -EFAULT;
goto out;
}
if (optval && ctx.optlen > max_optlen) {
ret = original_optlen > PAGE_SIZE ? 0 : -EFAULT;
goto out;
}
> As per above, I'll stick a line to the dmest (similar
> bpf_warn_invalid_xdp_action), at least to record that this has
> happened once.
> LMK if you or Danial still don't see a value in printing this..
pr_info_once? hmm... I think it is ok-ish. At least not a warning.
I think almost all of the time the bpf prog forgets to set it to 0 for the long
optval that it has no interest in. However, to suppress this pr_info_once,
setting optlen to 0 will disable the following cgroup-bpf prog from using the
optval as read-only. The case that the printk that may flag is that the bpf prog
did indeed want to change the long optval?
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: [PATCH bpf-next 3/6] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen
2023-04-25 18:31 ` Martin KaFai Lau
@ 2023-04-26 17:27 ` Stanislav Fomichev
2023-04-26 18:07 ` Martin KaFai Lau
0 siblings, 1 reply; 17+ messages in thread
From: Stanislav Fomichev @ 2023-04-26 17:27 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: bpf, ast, andrii, song, yhs, john.fastabend, kpsingh, haoluo,
jolsa, Martin KaFai Lau, Daniel Borkmann
On Tue, Apr 25, 2023 at 11:31 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 4/25/23 10:12 AM, Stanislav Fomichev wrote:
> > On Mon, Apr 24, 2023 at 5:56 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> >>
> >> On 4/21/23 9:09 AM, Stanislav Fomichev wrote:
> >>> On Fri, Apr 21, 2023 at 8:24 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>>>
> >>>> On 4/19/23 12:53 AM, Stanislav Fomichev wrote:
> >>>>> Over time, we've found out several special socket option cases which need
> >>>>> special treatment. And if BPF program doesn't handle them correctly, this
> >>>>> might EFAULT perfectly valid {g,s}setsockopt calls.
> >>>>>
> >>>>> The intention of the EFAULT was to make it apparent to the
> >>>>> developers that the program is doing something wrong.
> >>>>> However, this inadvertently might affect production workloads
> >>>>> with the BPF programs that are not too careful.
> >>>>
> >>>> Took in the first two for now. It would be good if the commit description
> >>>> in here could have more details for posterity given this is too vague.
> >>>
> >>> Thanks! Will try to repost next week with more details.
> >>>
> >>>>> Let's try to minimize the chance of BPF program screwing up userspace
> >>>>> by ignoring the output of those BPF programs (instead of returning
> >>>>> EFAULT to the userspace). pr_info_ratelimited those cases to
> >>>>> the dmesg to help with figuring out what's going wrong.
> >>>>>
> >>>>> Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
> >>>>> Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
> >>>>> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> >>>>> ---
> >>>>> kernel/bpf/cgroup.c | 8 ++++++--
> >>>>> 1 file changed, 6 insertions(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> >>>>> index a06e118a9be5..af4d20864fb4 100644
> >>>>> --- a/kernel/bpf/cgroup.c
> >>>>> +++ b/kernel/bpf/cgroup.c
> >>>>> @@ -1826,7 +1826,9 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
> >>>>> ret = 1;
> >>>>> } else if (ctx.optlen > max_optlen || ctx.optlen < -1) {
> >>>>> /* optlen is out of bounds */
> >>>>> - ret = -EFAULT;
> >>>>> + pr_info_ratelimited(
> >>>>> + "bpf setsockopt returned unexpected optlen=%d (max_optlen=%d)\n",
> >>>>> + ctx.optlen, max_optlen);
> >>>>
> >>>> Does it help any regular user if this log message is seen? I kind of doubt it a bit,
> >>>> it might create more confusion if log gets spammed with it, imo.
> >>>
> >>> Agreed, but we need some way to let the users know that their bpf
> >>> program is doing the wrong thing (if they set the optlen too high for
> >>> example).
> >>
> >> imo, I also think a printk here will be a noise in dmesg most of the time (but
> >> much better than an unexpected -EFAULT).
> >
> > I was thinking for a v2, maybe print it at least once? Similar to
> > current bpf_warn_invalid_xdp_action?
> >
> >
> >>> Any other better alternatives to expose those events?
> >>
> >> Is it possible to only -EFAULT when the bpf prog setting a ctx.optlen larger
> >> than the "original" user provided optlen?
>
> Nevermind the "ctx.optlen larger than the original user provided optlen" part. I
> mis-read something in __cgroup_bpf_run_filter_getsockopt(). max_optlen is the
> right limit that the kernel needs to bound the ctx.optlen written by bpf prog.
>
> >> Ignore for all other cases that is due to the current PAGE_SIZE limitation?
> >
> > Should be possible. That "ctx.optlen > PAGE_SIZE && ctx.optlen <
> > original_optlen" is the condition where we'd silently ignore BPF
> > output.
>
> and should the -ve ctx.optlen be treated separately? like in
> __cgroup_bpf_run_filter_getsockopt():
>
> if (ctx.optlen < 0) {
> ret = -EFAULT;
> goto out;
> }
>
> if (optval && ctx.optlen > max_optlen) {
> ret = original_optlen > PAGE_SIZE ? 0 : -EFAULT;
> goto out;
> }
Good suggestion. Let me try to distinguish between the two cases:
- bpf prog sets optlen larger than the original_optlen -> EFAULT
- bpf prog forgets to reset optlen to 0 for large optlen -> ignore
> > As per above, I'll stick a line to the dmest (similar
> > bpf_warn_invalid_xdp_action), at least to record that this has
> > happened once.
> > LMK if you or Danial still don't see a value in printing this..
>
> pr_info_once? hmm... I think it is ok-ish. At least not a warning.
>
> I think almost all of the time the bpf prog forgets to set it to 0 for the long
> optval that it has no interest in. However, to suppress this pr_info_once,
> setting optlen to 0 will disable the following cgroup-bpf prog from using the
> optval as read-only. The case that the printk that may flag is that the bpf prog
> did indeed want to change the long optval?
The case we want to printk is where the prog changes some byte in the
first 4k range of the optval and does not touch optlen (or maybe
adjusts optlen to be >PAGE_SIZE and <original_optlen).
I agree that it feels super corner-casy; but it feels like without
some kind of hint, it would be impossible to figure out why it doesn't
work. Or am I overblowing it?
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: [PATCH bpf-next 3/6] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen
2023-04-26 17:27 ` Stanislav Fomichev
@ 2023-04-26 18:07 ` Martin KaFai Lau
0 siblings, 0 replies; 17+ messages in thread
From: Martin KaFai Lau @ 2023-04-26 18:07 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: bpf, ast, andrii, song, yhs, john.fastabend, kpsingh, haoluo,
jolsa, Martin KaFai Lau, Daniel Borkmann
On 4/26/23 10:27 AM, Stanislav Fomichev wrote:
>>> As per above, I'll stick a line to the dmest (similar
>>> bpf_warn_invalid_xdp_action), at least to record that this has
>>> happened once.
>>> LMK if you or Danial still don't see a value in printing this..
>>
>> pr_info_once? hmm... I think it is ok-ish. At least not a warning.
>>
>> I think almost all of the time the bpf prog forgets to set it to 0 for the long
>> optval that it has no interest in. However, to suppress this pr_info_once,
>> setting optlen to 0 will disable the following cgroup-bpf prog from using the
>> optval as read-only. The case that the printk that may flag is that the bpf prog
>> did indeed want to change the long optval?
>
> The case we want to printk is where the prog changes some byte in the
> first 4k range of the optval and does not touch optlen (or maybe
> adjusts optlen to be >PAGE_SIZE and <original_optlen).
> I agree that it feels super corner-casy; but it feels like without
> some kind of hint, it would be impossible to figure out why it doesn't
> work. Or am I overblowing it?
I don't have a better idea how to flag this 'changing the first few bytes of a
long optval is not supported' either. I guess pr_info_once is ok-ish for now to
stop the bleeding in the most common case.
If it can separate the original_optlen > PAGE_SIZE case (ignore and no -EFAULT),
the message probably needs to be less alarming. "bpf setsockopt returned
unexpected optlen" may cause confusion when the bpf prog did not touch the
optval and optlen.
Hopefully this pr_info_once will disappear when the cgroup-bpf prog can directly
read/write the optval without pre-allocation.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH bpf-next 4/6] selftests/bpf: Update EFAULT {g,s}etsockopt selftests
2023-04-18 22:53 [PATCH bpf-next 0/6] bpf: handle another corner case in getsockopt Stanislav Fomichev
` (2 preceding siblings ...)
2023-04-18 22:53 ` [PATCH bpf-next 3/6] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen Stanislav Fomichev
@ 2023-04-18 22:53 ` Stanislav Fomichev
2023-04-18 22:53 ` [PATCH bpf-next 5/6] selftests/bpf: Correctly handle optlen > 4096 Stanislav Fomichev
` (2 subsequent siblings)
6 siblings, 0 replies; 17+ messages in thread
From: Stanislav Fomichev @ 2023-04-18 22:53 UTC (permalink / raw)
To: bpf
Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
kpsingh, sdf, haoluo, jolsa
Instead of assuming EFAULT, let's assume the BPF program's
output is ignored.
Remove "getsockopt: deny arbitrary ctx->retval" because it
was actually testing optlen. We have separate set of tests
for retval.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
.../selftests/bpf/prog_tests/sockopt.c | 42 ++++++-------------
1 file changed, 13 insertions(+), 29 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/sockopt.c b/tools/testing/selftests/bpf/prog_tests/sockopt.c
index aa4debf62fc6..bff7d91d1e1d 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockopt.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockopt.c
@@ -249,7 +249,7 @@ static struct sockopt_test {
.get_optlen = 64,
},
{
- .descr = "getsockopt: deny bigger ctx->optlen",
+ .descr = "getsockopt: ignore bigger ctx->optlen",
.insns = {
/* ctx->optlen = 65 */
BPF_MOV64_IMM(BPF_REG_0, 65),
@@ -268,28 +268,10 @@ static struct sockopt_test {
.attach_type = BPF_CGROUP_GETSOCKOPT,
.expected_attach_type = BPF_CGROUP_GETSOCKOPT,
- .get_optlen = 64,
-
- .error = EFAULT_GETSOCKOPT,
- },
- {
- .descr = "getsockopt: deny arbitrary ctx->retval",
- .insns = {
- /* ctx->retval = 123 */
- BPF_MOV64_IMM(BPF_REG_0, 123),
- BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0,
- offsetof(struct bpf_sockopt, retval)),
-
- /* return 1 */
- BPF_MOV64_IMM(BPF_REG_0, 1),
- BPF_EXIT_INSN(),
- },
- .attach_type = BPF_CGROUP_GETSOCKOPT,
- .expected_attach_type = BPF_CGROUP_GETSOCKOPT,
-
- .get_optlen = 64,
-
- .error = EFAULT_GETSOCKOPT,
+ .get_level = SOL_IP,
+ .get_optname = IP_TOS,
+ .get_optval = {},
+ .get_optlen = 4,
},
{
.descr = "getsockopt: support smaller ctx->optlen",
@@ -627,9 +609,10 @@ static struct sockopt_test {
.attach_type = BPF_CGROUP_SETSOCKOPT,
.expected_attach_type = BPF_CGROUP_SETSOCKOPT,
- .set_optlen = 4,
-
- .error = EFAULT_SETSOCKOPT,
+ .set_level = SOL_IP,
+ .set_optname = IP_TOS,
+ .set_optval = { 1 << 3 },
+ .set_optlen = 1,
},
{
.descr = "setsockopt: deny ctx->optlen > input optlen",
@@ -644,9 +627,10 @@ static struct sockopt_test {
.attach_type = BPF_CGROUP_SETSOCKOPT,
.expected_attach_type = BPF_CGROUP_SETSOCKOPT,
- .set_optlen = 64,
-
- .error = EFAULT_SETSOCKOPT,
+ .set_level = SOL_IP,
+ .set_optname = IP_TOS,
+ .set_optval = { 1 << 3 },
+ .set_optlen = 1,
},
{
.descr = "setsockopt: allow changing ctx->optlen within bounds",
--
2.40.0.634.g4ca3ef3211-goog
^ permalink raw reply related [flat|nested] 17+ messages in thread* [PATCH bpf-next 5/6] selftests/bpf: Correctly handle optlen > 4096
2023-04-18 22:53 [PATCH bpf-next 0/6] bpf: handle another corner case in getsockopt Stanislav Fomichev
` (3 preceding siblings ...)
2023-04-18 22:53 ` [PATCH bpf-next 4/6] selftests/bpf: Update EFAULT {g,s}etsockopt selftests Stanislav Fomichev
@ 2023-04-18 22:53 ` Stanislav Fomichev
2023-04-18 22:53 ` [PATCH bpf-next 6/6] bpf: Document EFAULT changes for sockopt Stanislav Fomichev
2023-04-21 15:20 ` [PATCH bpf-next 0/6] bpf: handle another corner case in getsockopt patchwork-bot+netdevbpf
6 siblings, 0 replies; 17+ messages in thread
From: Stanislav Fomichev @ 2023-04-18 22:53 UTC (permalink / raw)
To: bpf
Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
kpsingh, sdf, haoluo, jolsa
Even though it's not relevant in selftests, the people
might still copy-paste from them. So let's take care
of optlen > 4096 cases explicitly.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
.../progs/cgroup_getset_retval_getsockopt.c | 12 +++++++++
.../progs/cgroup_getset_retval_setsockopt.c | 16 ++++++++++++
.../selftests/bpf/progs/sockopt_inherit.c | 16 ++++++++++--
.../selftests/bpf/progs/sockopt_multi.c | 24 +++++++++++++++---
.../selftests/bpf/progs/sockopt_qos_to_cc.c | 8 +++++-
.../testing/selftests/bpf/progs/sockopt_sk.c | 25 +++++++++++++------
6 files changed, 88 insertions(+), 13 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/cgroup_getset_retval_getsockopt.c b/tools/testing/selftests/bpf/progs/cgroup_getset_retval_getsockopt.c
index b2a409e6382a..b66454886cc4 100644
--- a/tools/testing/selftests/bpf/progs/cgroup_getset_retval_getsockopt.c
+++ b/tools/testing/selftests/bpf/progs/cgroup_getset_retval_getsockopt.c
@@ -20,6 +20,10 @@ int get_retval(struct bpf_sockopt *ctx)
ctx_retval_value = ctx->retval;
__sync_fetch_and_add(&invocations, 1);
+ /* optval larger than PAGE_SIZE use kernel's buffer. */
+ if (ctx->optlen > 4096)
+ ctx->optlen = 0;
+
return 1;
}
@@ -31,6 +35,10 @@ int set_eisconn(struct bpf_sockopt *ctx)
if (bpf_set_retval(-EISCONN))
assertion_error = 1;
+ /* optval larger than PAGE_SIZE use kernel's buffer. */
+ if (ctx->optlen > 4096)
+ ctx->optlen = 0;
+
return 1;
}
@@ -41,5 +49,9 @@ int clear_retval(struct bpf_sockopt *ctx)
ctx->retval = 0;
+ /* optval larger than PAGE_SIZE use kernel's buffer. */
+ if (ctx->optlen > 4096)
+ ctx->optlen = 0;
+
return 1;
}
diff --git a/tools/testing/selftests/bpf/progs/cgroup_getset_retval_setsockopt.c b/tools/testing/selftests/bpf/progs/cgroup_getset_retval_setsockopt.c
index d6e5903e06ba..68fce0311771 100644
--- a/tools/testing/selftests/bpf/progs/cgroup_getset_retval_setsockopt.c
+++ b/tools/testing/selftests/bpf/progs/cgroup_getset_retval_setsockopt.c
@@ -18,6 +18,10 @@ int get_retval(struct bpf_sockopt *ctx)
retval_value = bpf_get_retval();
__sync_fetch_and_add(&invocations, 1);
+ /* optval larger than PAGE_SIZE use kernel's buffer. */
+ if (ctx->optlen > 4096)
+ ctx->optlen = 0;
+
return 1;
}
@@ -29,6 +33,10 @@ int set_eunatch(struct bpf_sockopt *ctx)
if (bpf_set_retval(-EUNATCH))
assertion_error = 1;
+ /* optval larger than PAGE_SIZE use kernel's buffer. */
+ if (ctx->optlen > 4096)
+ ctx->optlen = 0;
+
return 0;
}
@@ -40,6 +48,10 @@ int set_eisconn(struct bpf_sockopt *ctx)
if (bpf_set_retval(-EISCONN))
assertion_error = 1;
+ /* optval larger than PAGE_SIZE use kernel's buffer. */
+ if (ctx->optlen > 4096)
+ ctx->optlen = 0;
+
return 0;
}
@@ -48,5 +60,9 @@ int legacy_eperm(struct bpf_sockopt *ctx)
{
__sync_fetch_and_add(&invocations, 1);
+ /* optval larger than PAGE_SIZE use kernel's buffer. */
+ if (ctx->optlen > 4096)
+ ctx->optlen = 0;
+
return 0;
}
diff --git a/tools/testing/selftests/bpf/progs/sockopt_inherit.c b/tools/testing/selftests/bpf/progs/sockopt_inherit.c
index 9fb241b97291..78e56070dbcf 100644
--- a/tools/testing/selftests/bpf/progs/sockopt_inherit.c
+++ b/tools/testing/selftests/bpf/progs/sockopt_inherit.c
@@ -55,7 +55,7 @@ int _getsockopt(struct bpf_sockopt *ctx)
__u8 *optval = ctx->optval;
if (ctx->level != SOL_CUSTOM)
- return 1; /* only interested in SOL_CUSTOM */
+ goto out; /* only interested in SOL_CUSTOM */
if (optval + 1 > optval_end)
return 0; /* EPERM, bounds check */
@@ -70,6 +70,12 @@ int _getsockopt(struct bpf_sockopt *ctx)
ctx->optlen = 1;
return 1;
+
+out:
+ /* optval larger than PAGE_SIZE use kernel's buffer. */
+ if (ctx->optlen > 4096)
+ ctx->optlen = 0;
+ return 1;
}
SEC("cgroup/setsockopt")
@@ -80,7 +86,7 @@ int _setsockopt(struct bpf_sockopt *ctx)
__u8 *optval = ctx->optval;
if (ctx->level != SOL_CUSTOM)
- return 1; /* only interested in SOL_CUSTOM */
+ goto out; /* only interested in SOL_CUSTOM */
if (optval + 1 > optval_end)
return 0; /* EPERM, bounds check */
@@ -93,4 +99,10 @@ int _setsockopt(struct bpf_sockopt *ctx)
ctx->optlen = -1;
return 1;
+
+out:
+ /* optval larger than PAGE_SIZE use kernel's buffer. */
+ if (ctx->optlen > 4096)
+ ctx->optlen = 0;
+ return 1;
}
diff --git a/tools/testing/selftests/bpf/progs/sockopt_multi.c b/tools/testing/selftests/bpf/progs/sockopt_multi.c
index 177a59069dae..f645eb253c6c 100644
--- a/tools/testing/selftests/bpf/progs/sockopt_multi.c
+++ b/tools/testing/selftests/bpf/progs/sockopt_multi.c
@@ -12,7 +12,7 @@ int _getsockopt_child(struct bpf_sockopt *ctx)
__u8 *optval = ctx->optval;
if (ctx->level != SOL_IP || ctx->optname != IP_TOS)
- return 1;
+ goto out;
if (optval + 1 > optval_end)
return 0; /* EPERM, bounds check */
@@ -26,6 +26,12 @@ int _getsockopt_child(struct bpf_sockopt *ctx)
ctx->optlen = 1;
return 1;
+
+out:
+ /* optval larger than PAGE_SIZE use kernel's buffer. */
+ if (ctx->optlen > 4096)
+ ctx->optlen = 0;
+ return 1;
}
SEC("cgroup/getsockopt")
@@ -35,7 +41,7 @@ int _getsockopt_parent(struct bpf_sockopt *ctx)
__u8 *optval = ctx->optval;
if (ctx->level != SOL_IP || ctx->optname != IP_TOS)
- return 1;
+ goto out;
if (optval + 1 > optval_end)
return 0; /* EPERM, bounds check */
@@ -49,6 +55,12 @@ int _getsockopt_parent(struct bpf_sockopt *ctx)
ctx->optlen = 1;
return 1;
+
+out:
+ /* optval larger than PAGE_SIZE use kernel's buffer. */
+ if (ctx->optlen > 4096)
+ ctx->optlen = 0;
+ return 1;
}
SEC("cgroup/setsockopt")
@@ -58,7 +70,7 @@ int _setsockopt(struct bpf_sockopt *ctx)
__u8 *optval = ctx->optval;
if (ctx->level != SOL_IP || ctx->optname != IP_TOS)
- return 1;
+ goto out;
if (optval + 1 > optval_end)
return 0; /* EPERM, bounds check */
@@ -67,4 +79,10 @@ int _setsockopt(struct bpf_sockopt *ctx)
ctx->optlen = 1;
return 1;
+
+out:
+ /* optval larger than PAGE_SIZE use kernel's buffer. */
+ if (ctx->optlen > 4096)
+ ctx->optlen = 0;
+ return 1;
}
diff --git a/tools/testing/selftests/bpf/progs/sockopt_qos_to_cc.c b/tools/testing/selftests/bpf/progs/sockopt_qos_to_cc.c
index 1bce83b6e3a7..597f3e276cf7 100644
--- a/tools/testing/selftests/bpf/progs/sockopt_qos_to_cc.c
+++ b/tools/testing/selftests/bpf/progs/sockopt_qos_to_cc.c
@@ -19,7 +19,7 @@ int sockopt_qos_to_cc(struct bpf_sockopt *ctx)
char cc_cubic[TCP_CA_NAME_MAX] = "cubic";
if (ctx->level != SOL_IPV6 || ctx->optname != IPV6_TCLASS)
- return 1;
+ goto out;
if (optval + 1 > optval_end)
return 0; /* EPERM, bounds check */
@@ -36,4 +36,10 @@ int sockopt_qos_to_cc(struct bpf_sockopt *ctx)
return 0;
}
return 1;
+
+out:
+ /* optval larger than PAGE_SIZE use kernel's buffer. */
+ if (ctx->optlen > 4096)
+ ctx->optlen = 0;
+ return 1;
}
diff --git a/tools/testing/selftests/bpf/progs/sockopt_sk.c b/tools/testing/selftests/bpf/progs/sockopt_sk.c
index fe1df4cd206e..e1aed858c208 100644
--- a/tools/testing/selftests/bpf/progs/sockopt_sk.c
+++ b/tools/testing/selftests/bpf/progs/sockopt_sk.c
@@ -37,7 +37,7 @@ int _getsockopt(struct bpf_sockopt *ctx)
/* Bypass AF_NETLINK. */
sk = ctx->sk;
if (sk && sk->family == AF_NETLINK)
- return 1;
+ goto out;
/* Make sure bpf_get_netns_cookie is callable.
*/
@@ -52,8 +52,7 @@ int _getsockopt(struct bpf_sockopt *ctx)
* let next BPF program in the cgroup chain or kernel
* handle it.
*/
- ctx->optlen = 0; /* bypass optval>PAGE_SIZE */
- return 1;
+ goto out;
}
if (ctx->level == SOL_SOCKET && ctx->optname == SO_SNDBUF) {
@@ -61,7 +60,7 @@ int _getsockopt(struct bpf_sockopt *ctx)
* let next BPF program in the cgroup chain or kernel
* handle it.
*/
- return 1;
+ goto out;
}
if (ctx->level == SOL_TCP && ctx->optname == TCP_CONGESTION) {
@@ -69,7 +68,7 @@ int _getsockopt(struct bpf_sockopt *ctx)
* let next BPF program in the cgroup chain or kernel
* handle it.
*/
- return 1;
+ goto out;
}
if (ctx->level == SOL_TCP && ctx->optname == TCP_ZEROCOPY_RECEIVE) {
@@ -85,7 +84,7 @@ int _getsockopt(struct bpf_sockopt *ctx)
if (((struct tcp_zerocopy_receive *)optval)->address != 0)
return 0; /* unexpected data */
- return 1;
+ goto out;
}
if (ctx->level == SOL_IP && ctx->optname == IP_FREEBIND) {
@@ -129,6 +128,12 @@ int _getsockopt(struct bpf_sockopt *ctx)
ctx->optlen = 1;
return 1;
+
+out:
+ /* optval larger than PAGE_SIZE use kernel's buffer. */
+ if (ctx->optlen > 4096)
+ ctx->optlen = 0;
+ return 1;
}
SEC("cgroup/setsockopt")
@@ -142,7 +147,7 @@ int _setsockopt(struct bpf_sockopt *ctx)
/* Bypass AF_NETLINK. */
sk = ctx->sk;
if (sk && sk->family == AF_NETLINK)
- return 1;
+ goto out;
/* Make sure bpf_get_netns_cookie is callable.
*/
@@ -224,4 +229,10 @@ int _setsockopt(struct bpf_sockopt *ctx)
*/
return 1;
+
+out:
+ /* optval larger than PAGE_SIZE use kernel's buffer. */
+ if (ctx->optlen > 4096)
+ ctx->optlen = 0;
+ return 1;
}
--
2.40.0.634.g4ca3ef3211-goog
^ permalink raw reply related [flat|nested] 17+ messages in thread* [PATCH bpf-next 6/6] bpf: Document EFAULT changes for sockopt
2023-04-18 22:53 [PATCH bpf-next 0/6] bpf: handle another corner case in getsockopt Stanislav Fomichev
` (4 preceding siblings ...)
2023-04-18 22:53 ` [PATCH bpf-next 5/6] selftests/bpf: Correctly handle optlen > 4096 Stanislav Fomichev
@ 2023-04-18 22:53 ` Stanislav Fomichev
2023-04-19 20:08 ` kernel test robot
2023-04-21 15:20 ` [PATCH bpf-next 0/6] bpf: handle another corner case in getsockopt patchwork-bot+netdevbpf
6 siblings, 1 reply; 17+ messages in thread
From: Stanislav Fomichev @ 2023-04-18 22:53 UTC (permalink / raw)
To: bpf
Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
kpsingh, sdf, haoluo, jolsa
And add examples for how to correctly handle large optlens.
This is less relevant now when we don't EFAULT anymore, but
that's still the correct thing to do.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
Documentation/bpf/prog_cgroup_sockopt.rst | 64 +++++++++++++++++++++--
1 file changed, 60 insertions(+), 4 deletions(-)
diff --git a/Documentation/bpf/prog_cgroup_sockopt.rst b/Documentation/bpf/prog_cgroup_sockopt.rst
index 172f957204bf..dcb8d4681257 100644
--- a/Documentation/bpf/prog_cgroup_sockopt.rst
+++ b/Documentation/bpf/prog_cgroup_sockopt.rst
@@ -29,7 +29,7 @@ chain finish (i.e. kernel ``setsockopt`` handling will *not* be executed).
Note, that ``optlen`` can not be increased beyond the user-supplied
value. It can only be decreased or set to -1. Any other value will
-trigger ``EFAULT``.
+ignore the BPF program's changes to ``optlen``/``optval``.
Return Type
-----------
@@ -45,13 +45,14 @@ sockopt. The BPF hook can observe ``optval``, ``optlen`` and ``retval``
if it's interested in whatever kernel has returned. BPF hook can override
the values above, adjust ``optlen`` and reset ``retval`` to 0. If ``optlen``
has been increased above initial ``getsockopt`` value (i.e. userspace
-buffer is too small), ``EFAULT`` is returned.
+buffer is too small), BPF program's ``optlen`` and ``optval`` are
+ignored.
This hook has access to the cgroup and socket local storage.
Note, that the only acceptable value to set to ``retval`` is 0 and the
original value that the kernel returned. Any other value will trigger
-``EFAULT``.
+ignore BPF program's changes.
Return Type
-----------
@@ -98,10 +99,65 @@ When the ``optval`` is greater than the ``PAGE_SIZE``, the BPF program
indicates that the kernel should use BPF's trimmed ``optval``.
When the BPF program returns with the ``optlen`` greater than
-``PAGE_SIZE``, the userspace will receive ``EFAULT`` errno.
+``PAGE_SIZE``, the userspace will receive original kernel
+buffers without any modifications that the BPF program might have
+applied.
Example
=======
+Recommended way to handle BPF programs is as follows:
+
+```
+SEC("cgroup/getsockopt")
+int getsockopt(struct bpf_sockopt *ctx)
+{
+ /* Custom socket option. */
+ if (ctx->level == MY_SOL && ctx->optname == MY_OPTNAME) {
+ ctx->retval = 0;
+ optval[0] = ...;
+ ctx->optlen = 1;
+ return 1;
+ }
+
+ /* Modify kernel's socket option. */
+ if (ctx->level == SOL_IP && ctx->optname == IP_FREEBIND) {
+ ctx->retval = 0;
+ optval[0] = ...;
+ ctx->optlen = 1;
+ return 1;
+ }
+
+ /* optval larger than PAGE_SIZE use kernel's buffer. */
+ if (ctx->optlen > 4096)
+ ctx->optlen = 0;
+
+ return 1;
+}
+
+SEC("cgroup/setsockopt")
+int setsockopt(struct bpf_sockopt *ctx)
+{
+ /* Custom socket option. */
+ if (ctx->level == MY_SOL && ctx->optname == MY_OPTNAME) {
+ /* do something */
+ ctx->optlen = -1;
+ return 1;
+ }
+
+ /* Modify kernel's socket option. */
+ if (ctx->level == SOL_IP && ctx->optname == IP_FREEBIND) {
+ optval[0] = ...;
+ return 1;
+ }
+
+ /* optval larger than PAGE_SIZE use kernel's buffer. */
+ if (ctx->optlen > 4096)
+ ctx->optlen = 0;
+
+ return 1;
+}
+```
+
See ``tools/testing/selftests/bpf/progs/sockopt_sk.c`` for an example
of BPF program that handles socket options.
--
2.40.0.634.g4ca3ef3211-goog
^ permalink raw reply related [flat|nested] 17+ messages in thread* Re: [PATCH bpf-next 6/6] bpf: Document EFAULT changes for sockopt
2023-04-18 22:53 ` [PATCH bpf-next 6/6] bpf: Document EFAULT changes for sockopt Stanislav Fomichev
@ 2023-04-19 20:08 ` kernel test robot
2023-04-20 18:17 ` Stanislav Fomichev
0 siblings, 1 reply; 17+ messages in thread
From: kernel test robot @ 2023-04-19 20:08 UTC (permalink / raw)
To: Stanislav Fomichev, bpf
Cc: oe-kbuild-all, ast, daniel, andrii, martin.lau, song, yhs,
john.fastabend, kpsingh, sdf, haoluo, jolsa
Hi Stanislav,
kernel test robot noticed the following build warnings:
[auto build test WARNING on bpf-next/master]
url: https://github.com/intel-lab-lkp/linux/commits/Stanislav-Fomichev/bpf-Don-t-EFAULT-for-getsockopt-with-optval-NULL/20230419-065442
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link: https://lore.kernel.org/r/20230418225343.553806-7-sdf%40google.com
patch subject: [PATCH bpf-next 6/6] bpf: Document EFAULT changes for sockopt
reproduce:
# https://github.com/intel-lab-lkp/linux/commit/789f0fbf25934464ae56e0022939fc77d4615d65
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Stanislav-Fomichev/bpf-Don-t-EFAULT-for-getsockopt-with-optval-NULL/20230419-065442
git checkout 789f0fbf25934464ae56e0022939fc77d4615d65
make menuconfig
# enable CONFIG_COMPILE_TEST, CONFIG_WARN_MISSING_DOCUMENTS, CONFIG_WARN_ABI_ERRORS
make htmldocs
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304200301.XukL6sTb-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> Documentation/bpf/prog_cgroup_sockopt.rst:115: WARNING: Unexpected indentation.
>> Documentation/bpf/prog_cgroup_sockopt.rst:111: WARNING: Inline literal start-string without end-string.
>> Documentation/bpf/prog_cgroup_sockopt.rst:111: WARNING: Inline emphasis start-string without end-string.
>> Documentation/bpf/prog_cgroup_sockopt.rst:121: WARNING: Block quote ends without a blank line; unexpected unindent.
>> Documentation/bpf/prog_cgroup_sockopt.rst:159: WARNING: Title level inconsistent:
vim +115 Documentation/bpf/prog_cgroup_sockopt.rst
110
> 111 ```
112 SEC("cgroup/getsockopt")
113 int getsockopt(struct bpf_sockopt *ctx)
114 {
> 115 /* Custom socket option. */
116 if (ctx->level == MY_SOL && ctx->optname == MY_OPTNAME) {
117 ctx->retval = 0;
118 optval[0] = ...;
119 ctx->optlen = 1;
120 return 1;
> 121 }
122
123 /* Modify kernel's socket option. */
124 if (ctx->level == SOL_IP && ctx->optname == IP_FREEBIND) {
125 ctx->retval = 0;
126 optval[0] = ...;
127 ctx->optlen = 1;
128 return 1;
129 }
130
131 /* optval larger than PAGE_SIZE use kernel's buffer. */
132 if (ctx->optlen > 4096)
133 ctx->optlen = 0;
134
135 return 1;
136 }
137
138 SEC("cgroup/setsockopt")
139 int setsockopt(struct bpf_sockopt *ctx)
140 {
141 /* Custom socket option. */
142 if (ctx->level == MY_SOL && ctx->optname == MY_OPTNAME) {
143 /* do something */
144 ctx->optlen = -1;
145 return 1;
146 }
147
148 /* Modify kernel's socket option. */
149 if (ctx->level == SOL_IP && ctx->optname == IP_FREEBIND) {
150 optval[0] = ...;
151 return 1;
152 }
153
154 /* optval larger than PAGE_SIZE use kernel's buffer. */
155 if (ctx->optlen > 4096)
156 ctx->optlen = 0;
157
158 return 1;
> 159 }
160 ```
161
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: [PATCH bpf-next 6/6] bpf: Document EFAULT changes for sockopt
2023-04-19 20:08 ` kernel test robot
@ 2023-04-20 18:17 ` Stanislav Fomichev
0 siblings, 0 replies; 17+ messages in thread
From: Stanislav Fomichev @ 2023-04-20 18:17 UTC (permalink / raw)
To: kernel test robot
Cc: bpf, oe-kbuild-all, ast, daniel, andrii, martin.lau, song, yhs,
john.fastabend, kpsingh, haoluo, jolsa
On Wed, Apr 19, 2023 at 1:08 PM kernel test robot <lkp@intel.com> wrote:
>
> Hi Stanislav,
>
> kernel test robot noticed the following build warnings:
Stupid me using ``` blocks instead of code-block:: c. Will address
once I hear some feedback on the rest of the patches.
> [auto build test WARNING on bpf-next/master]
>
> url: https://github.com/intel-lab-lkp/linux/commits/Stanislav-Fomichev/bpf-Don-t-EFAULT-for-getsockopt-with-optval-NULL/20230419-065442
> base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
> patch link: https://lore.kernel.org/r/20230418225343.553806-7-sdf%40google.com
> patch subject: [PATCH bpf-next 6/6] bpf: Document EFAULT changes for sockopt
> reproduce:
> # https://github.com/intel-lab-lkp/linux/commit/789f0fbf25934464ae56e0022939fc77d4615d65
> git remote add linux-review https://github.com/intel-lab-lkp/linux
> git fetch --no-tags linux-review Stanislav-Fomichev/bpf-Don-t-EFAULT-for-getsockopt-with-optval-NULL/20230419-065442
> git checkout 789f0fbf25934464ae56e0022939fc77d4615d65
> make menuconfig
> # enable CONFIG_COMPILE_TEST, CONFIG_WARN_MISSING_DOCUMENTS, CONFIG_WARN_ABI_ERRORS
> make htmldocs
>
> If you fix the issue, kindly add following tag where applicable
> | Reported-by: kernel test robot <lkp@intel.com>
> | Link: https://lore.kernel.org/oe-kbuild-all/202304200301.XukL6sTb-lkp@intel.com/
>
> All warnings (new ones prefixed by >>):
>
> >> Documentation/bpf/prog_cgroup_sockopt.rst:115: WARNING: Unexpected indentation.
> >> Documentation/bpf/prog_cgroup_sockopt.rst:111: WARNING: Inline literal start-string without end-string.
> >> Documentation/bpf/prog_cgroup_sockopt.rst:111: WARNING: Inline emphasis start-string without end-string.
> >> Documentation/bpf/prog_cgroup_sockopt.rst:121: WARNING: Block quote ends without a blank line; unexpected unindent.
> >> Documentation/bpf/prog_cgroup_sockopt.rst:159: WARNING: Title level inconsistent:
>
> vim +115 Documentation/bpf/prog_cgroup_sockopt.rst
>
> 110
> > 111 ```
> 112 SEC("cgroup/getsockopt")
> 113 int getsockopt(struct bpf_sockopt *ctx)
> 114 {
> > 115 /* Custom socket option. */
> 116 if (ctx->level == MY_SOL && ctx->optname == MY_OPTNAME) {
> 117 ctx->retval = 0;
> 118 optval[0] = ...;
> 119 ctx->optlen = 1;
> 120 return 1;
> > 121 }
> 122
> 123 /* Modify kernel's socket option. */
> 124 if (ctx->level == SOL_IP && ctx->optname == IP_FREEBIND) {
> 125 ctx->retval = 0;
> 126 optval[0] = ...;
> 127 ctx->optlen = 1;
> 128 return 1;
> 129 }
> 130
> 131 /* optval larger than PAGE_SIZE use kernel's buffer. */
> 132 if (ctx->optlen > 4096)
> 133 ctx->optlen = 0;
> 134
> 135 return 1;
> 136 }
> 137
> 138 SEC("cgroup/setsockopt")
> 139 int setsockopt(struct bpf_sockopt *ctx)
> 140 {
> 141 /* Custom socket option. */
> 142 if (ctx->level == MY_SOL && ctx->optname == MY_OPTNAME) {
> 143 /* do something */
> 144 ctx->optlen = -1;
> 145 return 1;
> 146 }
> 147
> 148 /* Modify kernel's socket option. */
> 149 if (ctx->level == SOL_IP && ctx->optname == IP_FREEBIND) {
> 150 optval[0] = ...;
> 151 return 1;
> 152 }
> 153
> 154 /* optval larger than PAGE_SIZE use kernel's buffer. */
> 155 if (ctx->optlen > 4096)
> 156 ctx->optlen = 0;
> 157
> 158 return 1;
> > 159 }
> 160 ```
> 161
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH bpf-next 0/6] bpf: handle another corner case in getsockopt
2023-04-18 22:53 [PATCH bpf-next 0/6] bpf: handle another corner case in getsockopt Stanislav Fomichev
` (5 preceding siblings ...)
2023-04-18 22:53 ` [PATCH bpf-next 6/6] bpf: Document EFAULT changes for sockopt Stanislav Fomichev
@ 2023-04-21 15:20 ` patchwork-bot+netdevbpf
6 siblings, 0 replies; 17+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-04-21 15:20 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
kpsingh, haoluo, jolsa, martin.lau
Hello:
This series was applied to bpf/bpf-next.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:
On Tue, 18 Apr 2023 15:53:37 -0700 you wrote:
> Martin reports another case where getsockopt EFAULTs perfectly
> valid callers. Let's fix it and also replace EFAULT with
> pr_info_ratelimited. That should hopefully make this place
> less error prone.
>
> First 2 patches fix the issue with NETLINK_LIST_MEMBERSHIPS and
> test it.
>
> [...]
Here is the summary with links:
- [bpf-next,1/6] bpf: Don't EFAULT for getsockopt with optval=NULL
https://git.kernel.org/bpf/bpf-next/c/00e74ae08638
- [bpf-next,2/6] selftests/bpf: Verify optval=NULL case
https://git.kernel.org/bpf/bpf-next/c/833d67ecdc5f
- [bpf-next,3/6] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen
(no matching commit)
- [bpf-next,4/6] selftests/bpf: Update EFAULT {g,s}etsockopt selftests
(no matching commit)
- [bpf-next,5/6] selftests/bpf: Correctly handle optlen > 4096
(no matching commit)
- [bpf-next,6/6] bpf: Document EFAULT changes for sockopt
(no matching commit)
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 17+ messages in thread