* [PATCH bpf-next 1/2] bpf: always allocate at least 16 bytes for setsockopt hook
2019-07-29 21:51 [PATCH bpf-next 0/2] bpf: allocate extra memory for setsockopt hook buffer Stanislav Fomichev
@ 2019-07-29 21:51 ` Stanislav Fomichev
2019-07-29 21:51 ` [PATCH bpf-next 2/2] selftests/bpf: extend sockopt_sk selftest with TCP_CONGESTION use case Stanislav Fomichev
2019-08-01 20:58 ` [PATCH bpf-next 0/2] bpf: allocate extra memory for setsockopt hook buffer Alexei Starovoitov
2 siblings, 0 replies; 6+ messages in thread
From: Stanislav Fomichev @ 2019-07-29 21:51 UTC (permalink / raw)
To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev
Since we always allocate memory, allocate just a little bit more
for the BPF program in case it needs to override user input with
a bigger value. The canonical example is TCP_CONGESTION, where the
input string might be too short to hold the override (nv -> bbr or cubic).
16 bytes are chosen to match the size of TCP_CA_NAME_MAX and can
be extended in the future if needed.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 kernel/bpf/cgroup.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 0a00eaca6fae..6a6a154cfa7b 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -964,7 +964,6 @@ static int sockopt_alloc_buf(struct bpf_sockopt_kern *ctx, int max_optlen)
 		return -ENOMEM;
 
 	ctx->optval_end = ctx->optval + max_optlen;
-	ctx->optlen = max_optlen;
 
 	return 0;
 }
@@ -984,7 +983,7 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
 		.level = *level,
 		.optname = *optname,
 	};
-	int ret;
+	int ret, max_optlen;
 
 	/* Opportunistic check to see whether we have any BPF program
 	 * attached to the hook so we don't waste time allocating
@@ -994,10 +993,18 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
 	    __cgroup_bpf_prog_array_is_empty(cgrp, BPF_CGROUP_SETSOCKOPT))
 		return 0;
 
-	ret = sockopt_alloc_buf(&ctx, *optlen);
+	/* Allocate a bit more than the initial user buffer for
+	 * BPF program. The canonical use case is overriding
+	 * TCP_CONGESTION(nv) to TCP_CONGESTION(cubic).
+	 */
+	max_optlen = max_t(int, 16, *optlen);
+
+	ret = sockopt_alloc_buf(&ctx, max_optlen);
 	if (ret)
 		return ret;
 
+	ctx.optlen = *optlen;
+
 	if (copy_from_user(ctx.optval, optval, *optlen) != 0) {
 		ret = -EFAULT;
 		goto out;
@@ -1016,7 +1023,7 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
 	if (ctx.optlen == -1) {
 		/* optlen set to -1, bypass kernel */
 		ret = 1;
-	} else if (ctx.optlen > *optlen || ctx.optlen < -1) {
+	} else if (ctx.optlen > max_optlen || ctx.optlen < -1) {
 		/* optlen is out of bounds */
 		ret = -EFAULT;
 	} else {
@@ -1063,6 +1070,8 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
 	if (ret)
 		return ret;
 
+	ctx.optlen = max_optlen;
+
 	if (!retval) {
 		/* If kernel getsockopt finished successfully,
 		 * copy whatever was returned to the user back
--
2.22.0.709.g102302147b-goog
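
The sizing and validation logic in the patch above can be sketched in user space as follows. This is only an illustration, not the kernel code: hook_buf_size() and check_prog_optlen() are hypothetical helper names, and MIN_HOOK_BUF is an assumed name for the literal 16 used in the patch.

```c
#include <assert.h>
#include <errno.h>

/* Assumed name for the literal 16 in the patch (matches TCP_CA_NAME_MAX). */
#define MIN_HOOK_BUF 16

/* Hypothetical helper: size of the kernel-side copy for a user-supplied
 * optlen, mirroring max_t(int, 16, *optlen).
 */
static int hook_buf_size(int user_optlen)
{
	assert(user_optlen >= 0);
	return user_optlen > MIN_HOOK_BUF ? user_optlen : MIN_HOOK_BUF;
}

/* Hypothetical helper mirroring the check made after the BPF program ran.
 * Returns 1 to bypass the kernel handler, -EFAULT if the program's optlen
 * is out of bounds, and 0 to pass the (possibly rewritten) value on.
 */
static int check_prog_optlen(int prog_optlen, int max_optlen)
{
	if (prog_optlen == -1)
		return 1;
	if (prog_optlen > max_optlen || prog_optlen < -1)
		return -EFAULT;
	return 0;
}
```

With a 2-byte user buffer ("nv"), hook_buf_size(2) yields 16, so a program that sets optlen to 5 ("cubic") passes the check. Checked against the old limit of *optlen (2), the same value would be rejected with -EFAULT.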
* [PATCH bpf-next 2/2] selftests/bpf: extend sockopt_sk selftest with TCP_CONGESTION use case
From: Stanislav Fomichev @ 2019-07-29 21:51 UTC (permalink / raw)
To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev
Ignore SOL_TCP:TCP_CONGESTION in the getsockopt hook and always override
SOL_TCP:TCP_CONGESTION with "cubic" in the setsockopt hook.
Call setsockopt(SOL_TCP, TCP_CONGESTION) with a short optval ("nv")
to make sure the BPF program has enough buffer space to replace it
with "cubic".
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 .../testing/selftests/bpf/progs/sockopt_sk.c  | 22 ++++++++++++++++
 tools/testing/selftests/bpf/test_sockopt_sk.c | 25 +++++++++++++++++++
 2 files changed, 47 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/sockopt_sk.c b/tools/testing/selftests/bpf/progs/sockopt_sk.c
index 076122c898e9..9a3d1c79e6fe 100644
--- a/tools/testing/selftests/bpf/progs/sockopt_sk.c
+++ b/tools/testing/selftests/bpf/progs/sockopt_sk.c
@@ -1,5 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
+#include <string.h>
 #include <netinet/in.h>
+#include <netinet/tcp.h>
 #include <linux/bpf.h>
 #include "bpf_helpers.h"
 
@@ -42,6 +44,14 @@ int _getsockopt(struct bpf_sockopt *ctx)
 		return 1;
 	}
 
+	if (ctx->level == SOL_TCP && ctx->optname == TCP_CONGESTION) {
+		/* Not interested in SOL_TCP:TCP_CONGESTION;
+		 * let next BPF program in the cgroup chain or kernel
+		 * handle it.
+		 */
+		return 1;
+	}
+
 	if (ctx->level != SOL_CUSTOM)
 		return 0; /* EPERM, deny everything except custom level */
 
@@ -91,6 +101,18 @@ int _setsockopt(struct bpf_sockopt *ctx)
 		return 1;
 	}
 
+	if (ctx->level == SOL_TCP && ctx->optname == TCP_CONGESTION) {
+		/* Always use cubic */
+
+		if (optval + 5 > optval_end)
+			return 0; /* EPERM, bounds check */
+
+		memcpy(optval, "cubic", 5);
+		ctx->optlen = 5;
+
+		return 1;
+	}
+
 	if (ctx->level != SOL_CUSTOM)
 		return 0; /* EPERM, deny everything except custom level */
 
diff --git a/tools/testing/selftests/bpf/test_sockopt_sk.c b/tools/testing/selftests/bpf/test_sockopt_sk.c
index 036b652e5ca9..e4f6055d92e9 100644
--- a/tools/testing/selftests/bpf/test_sockopt_sk.c
+++ b/tools/testing/selftests/bpf/test_sockopt_sk.c
@@ -6,6 +6,7 @@
 #include <sys/types.h>
 #include <sys/socket.h>
 #include <netinet/in.h>
+#include <netinet/tcp.h>
 
 #include <linux/filter.h>
 #include <bpf/bpf.h>
@@ -25,6 +26,7 @@ static int getsetsockopt(void)
 	union {
 		char u8[4];
 		__u32 u32;
+		char cc[16]; /* TCP_CA_NAME_MAX */
 	} buf = {};
 	socklen_t optlen;
 
@@ -115,6 +117,29 @@ static int getsetsockopt(void)
 		goto err;
 	}
 
+	/* TCP_CONGESTION can extend the string */
+
+	strcpy(buf.cc, "nv");
+	err = setsockopt(fd, SOL_TCP, TCP_CONGESTION, &buf, strlen("nv"));
+	if (err) {
+		log_err("Failed to call setsockopt(TCP_CONGESTION)");
+		goto err;
+	}
+
+
+	optlen = sizeof(buf.cc);
+	err = getsockopt(fd, SOL_TCP, TCP_CONGESTION, &buf, &optlen);
+	if (err) {
+		log_err("Failed to call getsockopt(TCP_CONGESTION)");
+		goto err;
+	}
+
+	if (strcmp(buf.cc, "cubic") != 0) {
+		log_err("Unexpected getsockopt(TCP_CONGESTION) %s != %s",
+			buf.cc, "cubic");
+		goto err;
+	}
+
 	close(fd);
 	return 0;
 err:
--
2.22.0.709.g102302147b-goog
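
The bounds-checked override done by the setsockopt program in this patch can be tried outside the kernel with a plain C stand-in. This is a sketch under assumptions: struct fake_sockopt and force_cubic() are hypothetical names that only imitate the optval, optval_end and optlen fields the selftest uses. The sketch compares lengths instead of pointers, which should be equivalent to the "optval + 5 > optval_end" test while avoiding out-of-range pointer arithmetic in ordinary C.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the struct bpf_sockopt fields used here. */
struct fake_sockopt {
	char *optval;
	char *optval_end;
	int optlen;
};

/* Hypothetical helper: overwrite the option value with "cubic" if the
 * buffer has room for 5 bytes. Returns 1 on success and 0 (reject) when
 * the buffer is too small, like the BPF program in the selftest.
 */
static int force_cubic(struct fake_sockopt *ctx)
{
	assert(ctx->optval <= ctx->optval_end);

	if (ctx->optval_end - ctx->optval < 5)
		return 0; /* bounds check failed, buffer left untouched */

	memcpy(ctx->optval, "cubic", 5);
	ctx->optlen = 5;

	return 1;
}
```

A 16-byte buffer holding "nv" has room for the 5-byte replacement, so force_cubic() succeeds and sets optlen to 5. A buffer of exactly 2 bytes, which is what the hook saw before patch 1/2, fails the check and is left unchanged.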
* Re: [PATCH bpf-next 0/2] bpf: allocate extra memory for setsockopt hook buffer
From: Alexei Starovoitov @ 2019-08-01 20:58 UTC (permalink / raw)
To: Stanislav Fomichev; +Cc: netdev, bpf, davem, ast, daniel
On Mon, Jul 29, 2019 at 02:51:09PM -0700, Stanislav Fomichev wrote:
> Current setsockopt hook is limited to the size of the buffer that
> user had supplied. Since we always allocate memory and copy the value
> into kernel space, allocate just a little bit more in case BPF
> program needs to override input data with a larger value.
>
> The canonical example is TCP_CONGESTION socket option where
> input buffer is a string and if user calls it with a short string,
> BPF program has no way of extending it.
>
> The tests are extended with TCP_CONGESTION use case.
Applied, Thanks
Please consider integrating test_sockopt* into test_progs.
* Re: [PATCH bpf-next 0/2] bpf: allocate extra memory for setsockopt hook buffer
From: Stanislav Fomichev @ 2019-08-01 21:11 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: Stanislav Fomichev, netdev, bpf, davem, ast, daniel
On 08/01, Alexei Starovoitov wrote:
> On Mon, Jul 29, 2019 at 02:51:09PM -0700, Stanislav Fomichev wrote:
> > Current setsockopt hook is limited to the size of the buffer that
> > user had supplied. Since we always allocate memory and copy the value
> > into kernel space, allocate just a little bit more in case BPF
> > program needs to override input data with a larger value.
> >
> > The canonical example is TCP_CONGESTION socket option where
> > input buffer is a string and if user calls it with a short string,
> > BPF program has no way of extending it.
> >
> > The tests are extended with TCP_CONGESTION use case.
>
> Applied, Thanks
>
> Please consider integrating test_sockopt* into test_progs.
Sure, will take a look. I think I didn't do it initially
because these tests create/move to cgroups and test_progs
do simple tests with BPF_PROG_TEST_RUN.
* Re: [PATCH bpf-next 0/2] bpf: allocate extra memory for setsockopt hook buffer
From: Alexei Starovoitov @ 2019-08-01 21:29 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: Stanislav Fomichev, Network Development, bpf, David S. Miller,
Alexei Starovoitov, Daniel Borkmann
On Thu, Aug 1, 2019 at 2:11 PM Stanislav Fomichev <sdf@fomichev.me> wrote:
>
> On 08/01, Alexei Starovoitov wrote:
> > On Mon, Jul 29, 2019 at 02:51:09PM -0700, Stanislav Fomichev wrote:
> > > Current setsockopt hook is limited to the size of the buffer that
> > > user had supplied. Since we always allocate memory and copy the value
> > > into kernel space, allocate just a little bit more in case BPF
> > > program needs to override input data with a larger value.
> > >
> > > The canonical example is TCP_CONGESTION socket option where
> > > input buffer is a string and if user calls it with a short string,
> > > BPF program has no way of extending it.
> > >
> > > The tests are extended with TCP_CONGESTION use case.
> >
> > Applied, Thanks
> >
> > Please consider integrating test_sockopt* into test_progs.
> Sure, will take a look. I think I didn't do it initially
> because these tests create/move to cgroups and test_progs
> do simple tests with BPF_PROG_TEST_RUN.
I think it would be great to consolidate all tests under test_progs.
Since testing is currently all manual, Daniel and I cannot realistically
run all of them for every patch.
When it's all part of test_progs, testing becomes easier,
especially since test_progs can now run an individual test or subtest.