public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next v2 0/4] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen
@ 2023-04-27 20:04 Stanislav Fomichev
  2023-04-27 20:04 ` [PATCH bpf-next v2 1/4] " Stanislav Fomichev
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Stanislav Fomichev @ 2023-04-27 20:04 UTC (permalink / raw)
  To: bpf
  Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa

optval larger than PAGE_SIZE leads to EFAULT if the BPF program
isn't careful enough. This is often overlooked and might break
completely unrelated socket options. Instead of EFAULT,
let's ignore BPF program buffer changes. See the first patch for
more info.

In addition, clearly document this corner case and reset optlen
in our selftests (in case somebody copy-pastes from them).

Stanislav Fomichev (4):
  bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen
  selftests/bpf: Update EFAULT {g,s}etsockopt selftests
  selftests/bpf: Correctly handle optlen > 4096
  bpf: Document EFAULT changes for sockopt

 Documentation/bpf/prog_cgroup_sockopt.rst     | 57 ++++++++++++-
 kernel/bpf/cgroup.c                           | 12 +++
 .../selftests/bpf/prog_tests/sockopt.c        | 80 +++++++++++++++++--
 .../progs/cgroup_getset_retval_getsockopt.c   | 12 +++
 .../progs/cgroup_getset_retval_setsockopt.c   | 16 ++++
 .../selftests/bpf/progs/sockopt_inherit.c     | 16 +++-
 .../selftests/bpf/progs/sockopt_multi.c       | 24 +++++-
 .../selftests/bpf/progs/sockopt_qos_to_cc.c   |  8 +-
 .../testing/selftests/bpf/progs/sockopt_sk.c  | 25 ++++--
 9 files changed, 230 insertions(+), 20 deletions(-)

-- 
2.40.1.495.gc816e09b53d-goog


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v2 1/4] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen
  2023-04-27 20:04 [PATCH bpf-next v2 0/4] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen Stanislav Fomichev
@ 2023-04-27 20:04 ` Stanislav Fomichev
  2023-05-01  5:51   ` Martin KaFai Lau
  2023-04-27 20:04 ` [PATCH bpf-next v2 2/4] selftests/bpf: Update EFAULT {g,s}etsockopt selftests Stanislav Fomichev
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Stanislav Fomichev @ 2023-04-27 20:04 UTC (permalink / raw)
  To: bpf
  Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, Martin KaFai Lau

With the way the hooks implemented right now, we have a special
condition: optval larger than PAGE_SIZE will expose only first 4k into
BPF; any modifications to the optval are ignored. If the BPF program
doesn't handle this condition by resetting optlen to 0,
the userspace will get EFAULT.

The intention of the EFAULT was to make it apparent to the
developers that the program is doing something wrong.
However, this inadvertently might affect production workloads
with the BPF programs that are not too careful (i.e., returning EFAULT
for perfectly valid setsockopt/getsockopt calls).

Let's try to minimize the chance of BPF program screwing up userspace
by ignoring the output of those BPF programs (instead of returning
EFAULT to the userspace). pr_info_once those cases to
the dmesg to help with figuring out what's going wrong.

Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 kernel/bpf/cgroup.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index a06e118a9be5..e041159a1ce0 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -1826,6 +1826,11 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
 		ret = 1;
 	} else if (ctx.optlen > max_optlen || ctx.optlen < -1) {
 		/* optlen is out of bounds */
+		if (*optlen > PAGE_SIZE && ctx.optlen >= 0) {
+			pr_info_once("bpf setsockopt: ignoring program buffer with optlen=%d (max_optlen=%d)\n",
+				     ctx.optlen, max_optlen);
+			goto out;
+		}
 		ret = -EFAULT;
 	} else {
 		/* optlen within bounds, run kernel handler */
@@ -1881,8 +1886,10 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
 		.optname = optname,
 		.current_task = current,
 	};
+	int orig_optlen;
 	int ret;
 
+	orig_optlen = max_optlen;
 	ctx.optlen = max_optlen;
 	max_optlen = sockopt_alloc_buf(&ctx, max_optlen, &buf);
 	if (max_optlen < 0)
@@ -1922,6 +1929,11 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
 		goto out;
 
 	if (optval && (ctx.optlen > max_optlen || ctx.optlen < 0)) {
+		if (orig_optlen > PAGE_SIZE && ctx.optlen >= 0) {
+			pr_info_once("bpf getsockopt: ignoring program buffer with optlen=%d (max_optlen=%d)\n",
+				     ctx.optlen, max_optlen);
+			goto out;
+		}
 		ret = -EFAULT;
 		goto out;
 	}
-- 
2.40.1.495.gc816e09b53d-goog


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v2 2/4] selftests/bpf: Update EFAULT {g,s}etsockopt selftests
  2023-04-27 20:04 [PATCH bpf-next v2 0/4] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen Stanislav Fomichev
  2023-04-27 20:04 ` [PATCH bpf-next v2 1/4] " Stanislav Fomichev
@ 2023-04-27 20:04 ` Stanislav Fomichev
  2023-04-28 23:57   ` Martin KaFai Lau
  2023-04-27 20:04 ` [PATCH bpf-next v2 3/4] selftests/bpf: Correctly handle optlen > 4096 Stanislav Fomichev
  2023-04-27 20:04 ` [PATCH bpf-next v2 4/4] bpf: Document EFAULT changes for sockopt Stanislav Fomichev
  3 siblings, 1 reply; 15+ messages in thread
From: Stanislav Fomichev @ 2023-04-27 20:04 UTC (permalink / raw)
  To: bpf
  Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa

Instead of assuming EFAULT, let's assume the BPF program's
output is ignored.

Remove "getsockopt: deny arbitrary ctx->retval" because it
was actually testing optlen. We have separate set of tests
for retval.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 .../selftests/bpf/prog_tests/sockopt.c        | 80 +++++++++++++++++--
 1 file changed, 74 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/sockopt.c b/tools/testing/selftests/bpf/prog_tests/sockopt.c
index aa4debf62fc6..8dad30ce910e 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockopt.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockopt.c
@@ -273,10 +273,30 @@ static struct sockopt_test {
 		.error = EFAULT_GETSOCKOPT,
 	},
 	{
-		.descr = "getsockopt: deny arbitrary ctx->retval",
+		.descr = "getsockopt: ignore >PAGE_SIZE optlen",
 		.insns = {
-			/* ctx->retval = 123 */
-			BPF_MOV64_IMM(BPF_REG_0, 123),
+			/* write 0xFF to the first optval byte */
+
+			/* r6 = ctx->optval */
+			BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1,
+				    offsetof(struct bpf_sockopt, optval)),
+			/* r2 = ctx->optval */
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_6),
+			/* r6 = ctx->optval + 1 */
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1),
+
+			/* r7 = ctx->optval_end */
+			BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_1,
+				    offsetof(struct bpf_sockopt, optval_end)),
+
+			/* if (ctx->optval + 1 <= ctx->optval_end) { */
+			BPF_JMP_REG(BPF_JGT, BPF_REG_6, BPF_REG_7, 1),
+			/* ctx->optval[0] = 0xF0 */
+			BPF_ST_MEM(BPF_B, BPF_REG_2, 0, 0xFF),
+			/* } */
+
+			/* ctx->retval = 0 */
+			BPF_MOV64_IMM(BPF_REG_0, 0),
 			BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0,
 				    offsetof(struct bpf_sockopt, retval)),
 
@@ -287,9 +307,10 @@ static struct sockopt_test {
 		.attach_type = BPF_CGROUP_GETSOCKOPT,
 		.expected_attach_type = BPF_CGROUP_GETSOCKOPT,
 
-		.get_optlen = 64,
-
-		.error = EFAULT_GETSOCKOPT,
+		.get_level = 1234,
+		.get_optname = 5678,
+		.get_optval = {}, /* the changes are ignored */
+		.get_optlen = 4096 + 1,
 	},
 	{
 		.descr = "getsockopt: support smaller ctx->optlen",
@@ -648,6 +669,49 @@ static struct sockopt_test {
 
 		.error = EFAULT_SETSOCKOPT,
 	},
+	{
+		.descr = "setsockopt: ignore >PAGE_SIZE optlen",
+		.insns = {
+			/* write 0xFF to the first optval byte */
+
+			/* r6 = ctx->optval */
+			BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1,
+				    offsetof(struct bpf_sockopt, optval)),
+			/* r2 = ctx->optval */
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_6),
+			/* r6 = ctx->optval + 1 */
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1),
+
+			/* r7 = ctx->optval_end */
+			BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_1,
+				    offsetof(struct bpf_sockopt, optval_end)),
+
+			/* if (ctx->optval + 1 <= ctx->optval_end) { */
+			BPF_JMP_REG(BPF_JGT, BPF_REG_6, BPF_REG_7, 1),
+			/* ctx->optval[0] = 0xF0 */
+			BPF_ST_MEM(BPF_B, BPF_REG_2, 0, 0xFF),
+			/* } */
+
+			BPF_MOV64_IMM(BPF_REG_0, 1),
+			BPF_EXIT_INSN(),
+		},
+		.attach_type = BPF_CGROUP_SETSOCKOPT,
+		.expected_attach_type = BPF_CGROUP_SETSOCKOPT,
+
+		.set_level = SOL_IP,
+		.set_optname = IP_TOS,
+		.set_optval = { 1 << 3 },
+		.set_optlen = 4096 + 1,
+
+		.get_level = SOL_IP,
+		.get_optname = IP_TOS,
+#if __BYTE_ORDER == __LITTLE_ENDIAN
+		.get_optval = { 1 << 3, 0, 0, 0 }, /* the changes are ignored */
+#else
+		.get_optval = { 0, 0, 0, 1 << 3 }, /* the changes are ignored */
+#endif
+		.get_optlen = 4,
+	},
 	{
 		.descr = "setsockopt: allow changing ctx->optlen within bounds",
 		.insns = {
@@ -922,6 +986,7 @@ static int run_test(int cgroup_fd, struct sockopt_test *test)
 
 	if (test->get_optlen) {
 		optval = malloc(test->get_optlen);
+		memset(optval, 0, test->get_optlen);
 		socklen_t optlen = test->get_optlen;
 		socklen_t expected_get_optlen = test->get_optlen_ret ?:
 			test->get_optlen;
@@ -946,6 +1011,9 @@ static int run_test(int cgroup_fd, struct sockopt_test *test)
 			goto free_optval;
 		}
 
+		if (optlen > sizeof(test->get_optval))
+			optlen = sizeof(test->get_optval);
+
 		if (memcmp(optval, test->get_optval, optlen) != 0) {
 			errno = 0;
 			log_err("getsockopt returned unexpected optval");
-- 
2.40.1.495.gc816e09b53d-goog


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v2 3/4] selftests/bpf: Correctly handle optlen > 4096
  2023-04-27 20:04 [PATCH bpf-next v2 0/4] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen Stanislav Fomichev
  2023-04-27 20:04 ` [PATCH bpf-next v2 1/4] " Stanislav Fomichev
  2023-04-27 20:04 ` [PATCH bpf-next v2 2/4] selftests/bpf: Update EFAULT {g,s}etsockopt selftests Stanislav Fomichev
@ 2023-04-27 20:04 ` Stanislav Fomichev
  2023-04-27 20:04 ` [PATCH bpf-next v2 4/4] bpf: Document EFAULT changes for sockopt Stanislav Fomichev
  3 siblings, 0 replies; 15+ messages in thread
From: Stanislav Fomichev @ 2023-04-27 20:04 UTC (permalink / raw)
  To: bpf
  Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa

Even though it's not relevant in selftests, the people
might still copy-paste from them. So let's take care
of optlen > 4096 cases explicitly.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 .../progs/cgroup_getset_retval_getsockopt.c   | 12 +++++++++
 .../progs/cgroup_getset_retval_setsockopt.c   | 16 ++++++++++++
 .../selftests/bpf/progs/sockopt_inherit.c     | 16 ++++++++++--
 .../selftests/bpf/progs/sockopt_multi.c       | 24 +++++++++++++++---
 .../selftests/bpf/progs/sockopt_qos_to_cc.c   |  8 +++++-
 .../testing/selftests/bpf/progs/sockopt_sk.c  | 25 +++++++++++++------
 6 files changed, 88 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/cgroup_getset_retval_getsockopt.c b/tools/testing/selftests/bpf/progs/cgroup_getset_retval_getsockopt.c
index b2a409e6382a..b66454886cc4 100644
--- a/tools/testing/selftests/bpf/progs/cgroup_getset_retval_getsockopt.c
+++ b/tools/testing/selftests/bpf/progs/cgroup_getset_retval_getsockopt.c
@@ -20,6 +20,10 @@ int get_retval(struct bpf_sockopt *ctx)
 	ctx_retval_value = ctx->retval;
 	__sync_fetch_and_add(&invocations, 1);
 
+	/* optval larger than PAGE_SIZE use kernel's buffer. */
+	if (ctx->optlen > 4096)
+		ctx->optlen = 0;
+
 	return 1;
 }
 
@@ -31,6 +35,10 @@ int set_eisconn(struct bpf_sockopt *ctx)
 	if (bpf_set_retval(-EISCONN))
 		assertion_error = 1;
 
+	/* optval larger than PAGE_SIZE use kernel's buffer. */
+	if (ctx->optlen > 4096)
+		ctx->optlen = 0;
+
 	return 1;
 }
 
@@ -41,5 +49,9 @@ int clear_retval(struct bpf_sockopt *ctx)
 
 	ctx->retval = 0;
 
+	/* optval larger than PAGE_SIZE use kernel's buffer. */
+	if (ctx->optlen > 4096)
+		ctx->optlen = 0;
+
 	return 1;
 }
diff --git a/tools/testing/selftests/bpf/progs/cgroup_getset_retval_setsockopt.c b/tools/testing/selftests/bpf/progs/cgroup_getset_retval_setsockopt.c
index d6e5903e06ba..68fce0311771 100644
--- a/tools/testing/selftests/bpf/progs/cgroup_getset_retval_setsockopt.c
+++ b/tools/testing/selftests/bpf/progs/cgroup_getset_retval_setsockopt.c
@@ -18,6 +18,10 @@ int get_retval(struct bpf_sockopt *ctx)
 	retval_value = bpf_get_retval();
 	__sync_fetch_and_add(&invocations, 1);
 
+	/* optval larger than PAGE_SIZE use kernel's buffer. */
+	if (ctx->optlen > 4096)
+		ctx->optlen = 0;
+
 	return 1;
 }
 
@@ -29,6 +33,10 @@ int set_eunatch(struct bpf_sockopt *ctx)
 	if (bpf_set_retval(-EUNATCH))
 		assertion_error = 1;
 
+	/* optval larger than PAGE_SIZE use kernel's buffer. */
+	if (ctx->optlen > 4096)
+		ctx->optlen = 0;
+
 	return 0;
 }
 
@@ -40,6 +48,10 @@ int set_eisconn(struct bpf_sockopt *ctx)
 	if (bpf_set_retval(-EISCONN))
 		assertion_error = 1;
 
+	/* optval larger than PAGE_SIZE use kernel's buffer. */
+	if (ctx->optlen > 4096)
+		ctx->optlen = 0;
+
 	return 0;
 }
 
@@ -48,5 +60,9 @@ int legacy_eperm(struct bpf_sockopt *ctx)
 {
 	__sync_fetch_and_add(&invocations, 1);
 
+	/* optval larger than PAGE_SIZE use kernel's buffer. */
+	if (ctx->optlen > 4096)
+		ctx->optlen = 0;
+
 	return 0;
 }
diff --git a/tools/testing/selftests/bpf/progs/sockopt_inherit.c b/tools/testing/selftests/bpf/progs/sockopt_inherit.c
index 9fb241b97291..78e56070dbcf 100644
--- a/tools/testing/selftests/bpf/progs/sockopt_inherit.c
+++ b/tools/testing/selftests/bpf/progs/sockopt_inherit.c
@@ -55,7 +55,7 @@ int _getsockopt(struct bpf_sockopt *ctx)
 	__u8 *optval = ctx->optval;
 
 	if (ctx->level != SOL_CUSTOM)
-		return 1; /* only interested in SOL_CUSTOM */
+		goto out; /* only interested in SOL_CUSTOM */
 
 	if (optval + 1 > optval_end)
 		return 0; /* EPERM, bounds check */
@@ -70,6 +70,12 @@ int _getsockopt(struct bpf_sockopt *ctx)
 	ctx->optlen = 1;
 
 	return 1;
+
+out:
+	/* optval larger than PAGE_SIZE use kernel's buffer. */
+	if (ctx->optlen > 4096)
+		ctx->optlen = 0;
+	return 1;
 }
 
 SEC("cgroup/setsockopt")
@@ -80,7 +86,7 @@ int _setsockopt(struct bpf_sockopt *ctx)
 	__u8 *optval = ctx->optval;
 
 	if (ctx->level != SOL_CUSTOM)
-		return 1; /* only interested in SOL_CUSTOM */
+		goto out; /* only interested in SOL_CUSTOM */
 
 	if (optval + 1 > optval_end)
 		return 0; /* EPERM, bounds check */
@@ -93,4 +99,10 @@ int _setsockopt(struct bpf_sockopt *ctx)
 	ctx->optlen = -1;
 
 	return 1;
+
+out:
+	/* optval larger than PAGE_SIZE use kernel's buffer. */
+	if (ctx->optlen > 4096)
+		ctx->optlen = 0;
+	return 1;
 }
diff --git a/tools/testing/selftests/bpf/progs/sockopt_multi.c b/tools/testing/selftests/bpf/progs/sockopt_multi.c
index 177a59069dae..f645eb253c6c 100644
--- a/tools/testing/selftests/bpf/progs/sockopt_multi.c
+++ b/tools/testing/selftests/bpf/progs/sockopt_multi.c
@@ -12,7 +12,7 @@ int _getsockopt_child(struct bpf_sockopt *ctx)
 	__u8 *optval = ctx->optval;
 
 	if (ctx->level != SOL_IP || ctx->optname != IP_TOS)
-		return 1;
+		goto out;
 
 	if (optval + 1 > optval_end)
 		return 0; /* EPERM, bounds check */
@@ -26,6 +26,12 @@ int _getsockopt_child(struct bpf_sockopt *ctx)
 	ctx->optlen = 1;
 
 	return 1;
+
+out:
+	/* optval larger than PAGE_SIZE use kernel's buffer. */
+	if (ctx->optlen > 4096)
+		ctx->optlen = 0;
+	return 1;
 }
 
 SEC("cgroup/getsockopt")
@@ -35,7 +41,7 @@ int _getsockopt_parent(struct bpf_sockopt *ctx)
 	__u8 *optval = ctx->optval;
 
 	if (ctx->level != SOL_IP || ctx->optname != IP_TOS)
-		return 1;
+		goto out;
 
 	if (optval + 1 > optval_end)
 		return 0; /* EPERM, bounds check */
@@ -49,6 +55,12 @@ int _getsockopt_parent(struct bpf_sockopt *ctx)
 	ctx->optlen = 1;
 
 	return 1;
+
+out:
+	/* optval larger than PAGE_SIZE use kernel's buffer. */
+	if (ctx->optlen > 4096)
+		ctx->optlen = 0;
+	return 1;
 }
 
 SEC("cgroup/setsockopt")
@@ -58,7 +70,7 @@ int _setsockopt(struct bpf_sockopt *ctx)
 	__u8 *optval = ctx->optval;
 
 	if (ctx->level != SOL_IP || ctx->optname != IP_TOS)
-		return 1;
+		goto out;
 
 	if (optval + 1 > optval_end)
 		return 0; /* EPERM, bounds check */
@@ -67,4 +79,10 @@ int _setsockopt(struct bpf_sockopt *ctx)
 	ctx->optlen = 1;
 
 	return 1;
+
+out:
+	/* optval larger than PAGE_SIZE use kernel's buffer. */
+	if (ctx->optlen > 4096)
+		ctx->optlen = 0;
+	return 1;
 }
diff --git a/tools/testing/selftests/bpf/progs/sockopt_qos_to_cc.c b/tools/testing/selftests/bpf/progs/sockopt_qos_to_cc.c
index 1bce83b6e3a7..597f3e276cf7 100644
--- a/tools/testing/selftests/bpf/progs/sockopt_qos_to_cc.c
+++ b/tools/testing/selftests/bpf/progs/sockopt_qos_to_cc.c
@@ -19,7 +19,7 @@ int sockopt_qos_to_cc(struct bpf_sockopt *ctx)
 	char cc_cubic[TCP_CA_NAME_MAX] = "cubic";
 
 	if (ctx->level != SOL_IPV6 || ctx->optname != IPV6_TCLASS)
-		return 1;
+		goto out;
 
 	if (optval + 1 > optval_end)
 		return 0; /* EPERM, bounds check */
@@ -36,4 +36,10 @@ int sockopt_qos_to_cc(struct bpf_sockopt *ctx)
 			return 0;
 	}
 	return 1;
+
+out:
+	/* optval larger than PAGE_SIZE use kernel's buffer. */
+	if (ctx->optlen > 4096)
+		ctx->optlen = 0;
+	return 1;
 }
diff --git a/tools/testing/selftests/bpf/progs/sockopt_sk.c b/tools/testing/selftests/bpf/progs/sockopt_sk.c
index fe1df4cd206e..e1aed858c208 100644
--- a/tools/testing/selftests/bpf/progs/sockopt_sk.c
+++ b/tools/testing/selftests/bpf/progs/sockopt_sk.c
@@ -37,7 +37,7 @@ int _getsockopt(struct bpf_sockopt *ctx)
 	/* Bypass AF_NETLINK. */
 	sk = ctx->sk;
 	if (sk && sk->family == AF_NETLINK)
-		return 1;
+		goto out;
 
 	/* Make sure bpf_get_netns_cookie is callable.
 	 */
@@ -52,8 +52,7 @@ int _getsockopt(struct bpf_sockopt *ctx)
 		 * let next BPF program in the cgroup chain or kernel
 		 * handle it.
 		 */
-		ctx->optlen = 0; /* bypass optval>PAGE_SIZE */
-		return 1;
+		goto out;
 	}
 
 	if (ctx->level == SOL_SOCKET && ctx->optname == SO_SNDBUF) {
@@ -61,7 +60,7 @@ int _getsockopt(struct bpf_sockopt *ctx)
 		 * let next BPF program in the cgroup chain or kernel
 		 * handle it.
 		 */
-		return 1;
+		goto out;
 	}
 
 	if (ctx->level == SOL_TCP && ctx->optname == TCP_CONGESTION) {
@@ -69,7 +68,7 @@ int _getsockopt(struct bpf_sockopt *ctx)
 		 * let next BPF program in the cgroup chain or kernel
 		 * handle it.
 		 */
-		return 1;
+		goto out;
 	}
 
 	if (ctx->level == SOL_TCP && ctx->optname == TCP_ZEROCOPY_RECEIVE) {
@@ -85,7 +84,7 @@ int _getsockopt(struct bpf_sockopt *ctx)
 		if (((struct tcp_zerocopy_receive *)optval)->address != 0)
 			return 0; /* unexpected data */
 
-		return 1;
+		goto out;
 	}
 
 	if (ctx->level == SOL_IP && ctx->optname == IP_FREEBIND) {
@@ -129,6 +128,12 @@ int _getsockopt(struct bpf_sockopt *ctx)
 	ctx->optlen = 1;
 
 	return 1;
+
+out:
+	/* optval larger than PAGE_SIZE use kernel's buffer. */
+	if (ctx->optlen > 4096)
+		ctx->optlen = 0;
+	return 1;
 }
 
 SEC("cgroup/setsockopt")
@@ -142,7 +147,7 @@ int _setsockopt(struct bpf_sockopt *ctx)
 	/* Bypass AF_NETLINK. */
 	sk = ctx->sk;
 	if (sk && sk->family == AF_NETLINK)
-		return 1;
+		goto out;
 
 	/* Make sure bpf_get_netns_cookie is callable.
 	 */
@@ -224,4 +229,10 @@ int _setsockopt(struct bpf_sockopt *ctx)
 			   */
 
 	return 1;
+
+out:
+	/* optval larger than PAGE_SIZE use kernel's buffer. */
+	if (ctx->optlen > 4096)
+		ctx->optlen = 0;
+	return 1;
 }
-- 
2.40.1.495.gc816e09b53d-goog


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v2 4/4] bpf: Document EFAULT changes for sockopt
  2023-04-27 20:04 [PATCH bpf-next v2 0/4] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen Stanislav Fomichev
                   ` (2 preceding siblings ...)
  2023-04-27 20:04 ` [PATCH bpf-next v2 3/4] selftests/bpf: Correctly handle optlen > 4096 Stanislav Fomichev
@ 2023-04-27 20:04 ` Stanislav Fomichev
  3 siblings, 0 replies; 15+ messages in thread
From: Stanislav Fomichev @ 2023-04-27 20:04 UTC (permalink / raw)
  To: bpf
  Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa

And add examples for how to correctly handle large optlens.
This is less relevant now when we don't EFAULT anymore, but
that's still the correct thing to do.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 Documentation/bpf/prog_cgroup_sockopt.rst | 57 ++++++++++++++++++++++-
 1 file changed, 56 insertions(+), 1 deletion(-)

diff --git a/Documentation/bpf/prog_cgroup_sockopt.rst b/Documentation/bpf/prog_cgroup_sockopt.rst
index 172f957204bf..d5cca25135b6 100644
--- a/Documentation/bpf/prog_cgroup_sockopt.rst
+++ b/Documentation/bpf/prog_cgroup_sockopt.rst
@@ -98,10 +98,65 @@ When the ``optval`` is greater than the ``PAGE_SIZE``, the BPF program
   indicates that the kernel should use BPF's trimmed ``optval``.
 
 When the BPF program returns with the ``optlen`` greater than
-``PAGE_SIZE``, the userspace will receive ``EFAULT`` errno.
+``PAGE_SIZE``, the userspace will receive original kernel
+buffers without any modifications that the BPF program might have
+applied.
 
 Example
 =======
 
+Recommended way to handle BPF programs is as follows:
+
+.. code-block:: c
+
+	SEC("cgroup/getsockopt")
+	int getsockopt(struct bpf_sockopt *ctx)
+	{
+		/* Custom socket option. */
+		if (ctx->level == MY_SOL && ctx->optname == MY_OPTNAME) {
+			ctx->retval = 0;
+			optval[0] = ...;
+			ctx->optlen = 1;
+			return 1;
+		}
+
+		/* Modify kernel's socket option. */
+		if (ctx->level == SOL_IP && ctx->optname == IP_FREEBIND) {
+			ctx->retval = 0;
+			optval[0] = ...;
+			ctx->optlen = 1;
+			return 1;
+		}
+
+		/* optval larger than PAGE_SIZE use kernel's buffer. */
+		if (ctx->optlen > 4096)
+			ctx->optlen = 0;
+
+		return 1;
+	}
+
+	SEC("cgroup/setsockopt")
+	int setsockopt(struct bpf_sockopt *ctx)
+	{
+		/* Custom socket option. */
+		if (ctx->level == MY_SOL && ctx->optname == MY_OPTNAME) {
+			/* do something */
+			ctx->optlen = -1;
+			return 1;
+		}
+
+		/* Modify kernel's socket option. */
+		if (ctx->level == SOL_IP && ctx->optname == IP_FREEBIND) {
+			optval[0] = ...;
+			return 1;
+		}
+
+		/* optval larger than PAGE_SIZE use kernel's buffer. */
+		if (ctx->optlen > 4096)
+			ctx->optlen = 0;
+
+		return 1;
+	}
+
 See ``tools/testing/selftests/bpf/progs/sockopt_sk.c`` for an example
 of BPF program that handles socket options.
-- 
2.40.1.495.gc816e09b53d-goog


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v2 2/4] selftests/bpf: Update EFAULT {g,s}etsockopt selftests
  2023-04-27 20:04 ` [PATCH bpf-next v2 2/4] selftests/bpf: Update EFAULT {g,s}etsockopt selftests Stanislav Fomichev
@ 2023-04-28 23:57   ` Martin KaFai Lau
  2023-04-28 23:59     ` Stanislav Fomichev
  0 siblings, 1 reply; 15+ messages in thread
From: Martin KaFai Lau @ 2023-04-28 23:57 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: ast, daniel, andrii, song, yhs, john.fastabend, kpsingh, haoluo,
	jolsa, bpf

On 4/27/23 1:04 PM, Stanislav Fomichev wrote:
> Instead of assuming EFAULT, let's assume the BPF program's
> output is ignored.
> 
> Remove "getsockopt: deny arbitrary ctx->retval" because it
> was actually testing optlen. We have separate set of tests
> for retval.
> 
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>   .../selftests/bpf/prog_tests/sockopt.c        | 80 +++++++++++++++++--
>   1 file changed, 74 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/sockopt.c b/tools/testing/selftests/bpf/prog_tests/sockopt.c
> index aa4debf62fc6..8dad30ce910e 100644
> --- a/tools/testing/selftests/bpf/prog_tests/sockopt.c
> +++ b/tools/testing/selftests/bpf/prog_tests/sockopt.c
> @@ -273,10 +273,30 @@ static struct sockopt_test {
>   		.error = EFAULT_GETSOCKOPT,
>   	},
>   	{
> -		.descr = "getsockopt: deny arbitrary ctx->retval",
> +		.descr = "getsockopt: ignore >PAGE_SIZE optlen",
>   		.insns = {
> -			/* ctx->retval = 123 */
> -			BPF_MOV64_IMM(BPF_REG_0, 123),
> +			/* write 0xFF to the first optval byte */
> +
> +			/* r6 = ctx->optval */
> +			BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1,
> +				    offsetof(struct bpf_sockopt, optval)),
> +			/* r2 = ctx->optval */
> +			BPF_MOV64_REG(BPF_REG_2, BPF_REG_6),
> +			/* r6 = ctx->optval + 1 */
> +			BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1),
> +
> +			/* r7 = ctx->optval_end */
> +			BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_1,
> +				    offsetof(struct bpf_sockopt, optval_end)),
> +
> +			/* if (ctx->optval + 1 <= ctx->optval_end) { */
> +			BPF_JMP_REG(BPF_JGT, BPF_REG_6, BPF_REG_7, 1),
> +			/* ctx->optval[0] = 0xF0 */
> +			BPF_ST_MEM(BPF_B, BPF_REG_2, 0, 0xFF),
> +			/* } */
> +
> +			/* ctx->retval = 0 */
> +			BPF_MOV64_IMM(BPF_REG_0, 0),
>   			BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0,
>   				    offsetof(struct bpf_sockopt, retval)),
>   
> @@ -287,9 +307,10 @@ static struct sockopt_test {
>   		.attach_type = BPF_CGROUP_GETSOCKOPT,
>   		.expected_attach_type = BPF_CGROUP_GETSOCKOPT,
>   
> -		.get_optlen = 64,
> -
> -		.error = EFAULT_GETSOCKOPT,
> +		.get_level = 1234,
> +		.get_optname = 5678,
> +		.get_optval = {}, /* the changes are ignored */
> +		.get_optlen = 4096 + 1,

The patchset looks good. Thanks for taking care of it.

One question, is it safe to the assume 4096 page size for all platforms in the 
selftests?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v2 2/4] selftests/bpf: Update EFAULT {g,s}etsockopt selftests
  2023-04-28 23:57   ` Martin KaFai Lau
@ 2023-04-28 23:59     ` Stanislav Fomichev
  2023-04-29  0:32       ` Stanislav Fomichev
  0 siblings, 1 reply; 15+ messages in thread
From: Stanislav Fomichev @ 2023-04-28 23:59 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: ast, daniel, andrii, song, yhs, john.fastabend, kpsingh, haoluo,
	jolsa, bpf

On Fri, Apr 28, 2023 at 4:57 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 4/27/23 1:04 PM, Stanislav Fomichev wrote:
> > Instead of assuming EFAULT, let's assume the BPF program's
> > output is ignored.
> >
> > Remove "getsockopt: deny arbitrary ctx->retval" because it
> > was actually testing optlen. We have separate set of tests
> > for retval.
> >
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >   .../selftests/bpf/prog_tests/sockopt.c        | 80 +++++++++++++++++--
> >   1 file changed, 74 insertions(+), 6 deletions(-)
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/sockopt.c b/tools/testing/selftests/bpf/prog_tests/sockopt.c
> > index aa4debf62fc6..8dad30ce910e 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/sockopt.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/sockopt.c
> > @@ -273,10 +273,30 @@ static struct sockopt_test {
> >               .error = EFAULT_GETSOCKOPT,
> >       },
> >       {
> > -             .descr = "getsockopt: deny arbitrary ctx->retval",
> > +             .descr = "getsockopt: ignore >PAGE_SIZE optlen",
> >               .insns = {
> > -                     /* ctx->retval = 123 */
> > -                     BPF_MOV64_IMM(BPF_REG_0, 123),
> > +                     /* write 0xFF to the first optval byte */
> > +
> > +                     /* r6 = ctx->optval */
> > +                     BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1,
> > +                                 offsetof(struct bpf_sockopt, optval)),
> > +                     /* r2 = ctx->optval */
> > +                     BPF_MOV64_REG(BPF_REG_2, BPF_REG_6),
> > +                     /* r6 = ctx->optval + 1 */
> > +                     BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1),
> > +
> > +                     /* r7 = ctx->optval_end */
> > +                     BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_1,
> > +                                 offsetof(struct bpf_sockopt, optval_end)),
> > +
> > +                     /* if (ctx->optval + 1 <= ctx->optval_end) { */
> > +                     BPF_JMP_REG(BPF_JGT, BPF_REG_6, BPF_REG_7, 1),
> > +                     /* ctx->optval[0] = 0xF0 */
> > +                     BPF_ST_MEM(BPF_B, BPF_REG_2, 0, 0xFF),
> > +                     /* } */
> > +
> > +                     /* ctx->retval = 0 */
> > +                     BPF_MOV64_IMM(BPF_REG_0, 0),
> >                       BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0,
> >                                   offsetof(struct bpf_sockopt, retval)),
> >
> > @@ -287,9 +307,10 @@ static struct sockopt_test {
> >               .attach_type = BPF_CGROUP_GETSOCKOPT,
> >               .expected_attach_type = BPF_CGROUP_GETSOCKOPT,
> >
> > -             .get_optlen = 64,
> > -
> > -             .error = EFAULT_GETSOCKOPT,
> > +             .get_level = 1234,
> > +             .get_optname = 5678,
> > +             .get_optval = {}, /* the changes are ignored */
> > +             .get_optlen = 4096 + 1,
>
> The patchset looks good. Thanks for taking care of it.
>
> One question, is it safe to the assume 4096 page size for all platforms in the
> selftests?

Good question; let me respin with sysconf() just to be safe..

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v2 2/4] selftests/bpf: Update EFAULT {g,s}etsockopt selftests
  2023-04-28 23:59     ` Stanislav Fomichev
@ 2023-04-29  0:32       ` Stanislav Fomichev
  2023-04-29  0:44         ` Martin KaFai Lau
  0 siblings, 1 reply; 15+ messages in thread
From: Stanislav Fomichev @ 2023-04-29  0:32 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: ast, daniel, andrii, song, yhs, john.fastabend, kpsingh, haoluo,
	jolsa, bpf

On Fri, Apr 28, 2023 at 4:59 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Fri, Apr 28, 2023 at 4:57 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> >
> > On 4/27/23 1:04 PM, Stanislav Fomichev wrote:
> > > Instead of assuming EFAULT, let's assume the BPF program's
> > > output is ignored.
> > >
> > > Remove "getsockopt: deny arbitrary ctx->retval" because it
> > > was actually testing optlen. We have separate set of tests
> > > for retval.
> > >
> > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > ---
> > >   .../selftests/bpf/prog_tests/sockopt.c        | 80 +++++++++++++++++--
> > >   1 file changed, 74 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/tools/testing/selftests/bpf/prog_tests/sockopt.c b/tools/testing/selftests/bpf/prog_tests/sockopt.c
> > > index aa4debf62fc6..8dad30ce910e 100644
> > > --- a/tools/testing/selftests/bpf/prog_tests/sockopt.c
> > > +++ b/tools/testing/selftests/bpf/prog_tests/sockopt.c
> > > @@ -273,10 +273,30 @@ static struct sockopt_test {
> > >               .error = EFAULT_GETSOCKOPT,
> > >       },
> > >       {
> > > -             .descr = "getsockopt: deny arbitrary ctx->retval",
> > > +             .descr = "getsockopt: ignore >PAGE_SIZE optlen",
> > >               .insns = {
> > > -                     /* ctx->retval = 123 */
> > > -                     BPF_MOV64_IMM(BPF_REG_0, 123),
> > > +                     /* write 0xFF to the first optval byte */
> > > +
> > > +                     /* r6 = ctx->optval */
> > > +                     BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1,
> > > +                                 offsetof(struct bpf_sockopt, optval)),
> > > +                     /* r2 = ctx->optval */
> > > +                     BPF_MOV64_REG(BPF_REG_2, BPF_REG_6),
> > > +                     /* r6 = ctx->optval + 1 */
> > > +                     BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1),
> > > +
> > > +                     /* r7 = ctx->optval_end */
> > > +                     BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_1,
> > > +                                 offsetof(struct bpf_sockopt, optval_end)),
> > > +
> > > +                     /* if (ctx->optval + 1 <= ctx->optval_end) { */
> > > +                     BPF_JMP_REG(BPF_JGT, BPF_REG_6, BPF_REG_7, 1),
> > > +                     /* ctx->optval[0] = 0xF0 */
> > > +                     BPF_ST_MEM(BPF_B, BPF_REG_2, 0, 0xFF),
> > > +                     /* } */
> > > +
> > > +                     /* ctx->retval = 0 */
> > > +                     BPF_MOV64_IMM(BPF_REG_0, 0),
> > >                       BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0,
> > >                                   offsetof(struct bpf_sockopt, retval)),
> > >
> > > @@ -287,9 +307,10 @@ static struct sockopt_test {
> > >               .attach_type = BPF_CGROUP_GETSOCKOPT,
> > >               .expected_attach_type = BPF_CGROUP_GETSOCKOPT,
> > >
> > > -             .get_optlen = 64,
> > > -
> > > -             .error = EFAULT_GETSOCKOPT,
> > > +             .get_level = 1234,
> > > +             .get_optname = 5678,
> > > +             .get_optval = {}, /* the changes are ignored */
> > > +             .get_optlen = 4096 + 1,
> >
> > The patchset looks good. Thanks for taking care of it.
> >
> > One question, is it safe to the assume 4096 page size for all platforms in the
> > selftests?
>
> Good question; let me respin with sysconf() just to be safe..

Argh, the compiler yells at me:
error: initializer element is not a compile-time constant

I guess I'm just gonna do #define PAGE_SIZE 4096 and if we do hit some
problems on the other archs, I'll ifdef it in one place.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v2 2/4] selftests/bpf: Update EFAULT {g,s}etsockopt selftests
  2023-04-29  0:32       ` Stanislav Fomichev
@ 2023-04-29  0:44         ` Martin KaFai Lau
  2023-05-01 17:22           ` Stanislav Fomichev
  0 siblings, 1 reply; 15+ messages in thread
From: Martin KaFai Lau @ 2023-04-29  0:44 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: ast, daniel, andrii, song, yhs, john.fastabend, kpsingh, haoluo,
	jolsa, bpf

On 4/28/23 5:32 PM, Stanislav Fomichev wrote:
> On Fri, Apr 28, 2023 at 4:59 PM Stanislav Fomichev <sdf@google.com> wrote:
>>
>> On Fri, Apr 28, 2023 at 4:57 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>>
>>> On 4/27/23 1:04 PM, Stanislav Fomichev wrote:
>>>> Instead of assuming EFAULT, let's assume the BPF program's
>>>> output is ignored.
>>>>
>>>> Remove "getsockopt: deny arbitrary ctx->retval" because it
>>>> was actually testing optlen. We have separate set of tests
>>>> for retval.
>>>>
>>>> Signed-off-by: Stanislav Fomichev <sdf@google.com>
>>>> ---
>>>>    .../selftests/bpf/prog_tests/sockopt.c        | 80 +++++++++++++++++--
>>>>    1 file changed, 74 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/tools/testing/selftests/bpf/prog_tests/sockopt.c b/tools/testing/selftests/bpf/prog_tests/sockopt.c
>>>> index aa4debf62fc6..8dad30ce910e 100644
>>>> --- a/tools/testing/selftests/bpf/prog_tests/sockopt.c
>>>> +++ b/tools/testing/selftests/bpf/prog_tests/sockopt.c
>>>> @@ -273,10 +273,30 @@ static struct sockopt_test {
>>>>                .error = EFAULT_GETSOCKOPT,
>>>>        },
>>>>        {
>>>> -             .descr = "getsockopt: deny arbitrary ctx->retval",
>>>> +             .descr = "getsockopt: ignore >PAGE_SIZE optlen",
>>>>                .insns = {
>>>> -                     /* ctx->retval = 123 */
>>>> -                     BPF_MOV64_IMM(BPF_REG_0, 123),
>>>> +                     /* write 0xFF to the first optval byte */
>>>> +
>>>> +                     /* r6 = ctx->optval */
>>>> +                     BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1,
>>>> +                                 offsetof(struct bpf_sockopt, optval)),
>>>> +                     /* r2 = ctx->optval */
>>>> +                     BPF_MOV64_REG(BPF_REG_2, BPF_REG_6),
>>>> +                     /* r6 = ctx->optval + 1 */
>>>> +                     BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1),
>>>> +
>>>> +                     /* r7 = ctx->optval_end */
>>>> +                     BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_1,
>>>> +                                 offsetof(struct bpf_sockopt, optval_end)),
>>>> +
>>>> +                     /* if (ctx->optval + 1 <= ctx->optval_end) { */
>>>> +                     BPF_JMP_REG(BPF_JGT, BPF_REG_6, BPF_REG_7, 1),
>>>> +                     /* ctx->optval[0] = 0xF0 */
>>>> +                     BPF_ST_MEM(BPF_B, BPF_REG_2, 0, 0xFF),
>>>> +                     /* } */
>>>> +
>>>> +                     /* ctx->retval = 0 */
>>>> +                     BPF_MOV64_IMM(BPF_REG_0, 0),
>>>>                        BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0,
>>>>                                    offsetof(struct bpf_sockopt, retval)),
>>>>
>>>> @@ -287,9 +307,10 @@ static struct sockopt_test {
>>>>                .attach_type = BPF_CGROUP_GETSOCKOPT,
>>>>                .expected_attach_type = BPF_CGROUP_GETSOCKOPT,
>>>>
>>>> -             .get_optlen = 64,
>>>> -
>>>> -             .error = EFAULT_GETSOCKOPT,
>>>> +             .get_level = 1234,
>>>> +             .get_optname = 5678,
>>>> +             .get_optval = {}, /* the changes are ignored */
>>>> +             .get_optlen = 4096 + 1,
>>>
>>> The patchset looks good. Thanks for taking care of it.
>>>
>>> One question, is it safe to the assume 4096 page size for all platforms in the
>>> selftests?
>>
>> Good question; let me respin with sysconf() just to be safe..
> 
> Argh, the compiler yells at me:
> error: initializer element is not a compile-time constant
> 
> I guess I'm just gonna do #define PAGE_SIZE 4096 and if we do hit some
> problems on the other archs, I'll ifdef it in one place.

or run_test() can reinit optlen to sysconf_page_size + 1 if optlen == 4097.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen
  2023-04-27 20:04 ` [PATCH bpf-next v2 1/4] " Stanislav Fomichev
@ 2023-05-01  5:51   ` Martin KaFai Lau
  2023-05-01 16:55     ` Stanislav Fomichev
  0 siblings, 1 reply; 15+ messages in thread
From: Martin KaFai Lau @ 2023-05-01  5:51 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: ast, daniel, andrii, song, yhs, john.fastabend, kpsingh, haoluo,
	jolsa, Martin KaFai Lau, bpf

On 4/27/23 1:04 PM, Stanislav Fomichev wrote:
> @@ -1881,8 +1886,10 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
>   		.optname = optname,
>   		.current_task = current,
>   	};
> +	int orig_optlen;
>   	int ret;
>   
> +	orig_optlen = max_optlen;

For getsockopt, when the kernel's getsockopt finished successfully (the 
following 'if (!retval)' case), how about also setting orig_optlen to the kernel 
returned 'optlen'. For example, the user's orig_optlen is 8096 and the kernel 
returned optlen is 1024. If the bpf prog still sets the ctx.optlen to something 
 > PAGE_SIZE, -EFAULT will be returned.

>   	ctx.optlen = max_optlen;
>   	max_optlen = sockopt_alloc_buf(&ctx, max_optlen, &buf);
>   	if (max_optlen < 0)
> @@ -1922,6 +1929,11 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
>   		goto out;
>   
>   	if (optval && (ctx.optlen > max_optlen || ctx.optlen < 0)) {
> +		if (orig_optlen > PAGE_SIZE && ctx.optlen >= 0) {
> +			pr_info_once("bpf getsockopt: ignoring program buffer with optlen=%d (max_optlen=%d)\n",
> +				     ctx.optlen, max_optlen);
> +			goto out;
> +		}
>   		ret = -EFAULT;
>   		goto out;
>   	}


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen
  2023-05-01  5:51   ` Martin KaFai Lau
@ 2023-05-01 16:55     ` Stanislav Fomichev
  2023-05-01 18:58       ` Martin KaFai Lau
  0 siblings, 1 reply; 15+ messages in thread
From: Stanislav Fomichev @ 2023-05-01 16:55 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: ast, daniel, andrii, song, yhs, john.fastabend, kpsingh, haoluo,
	jolsa, Martin KaFai Lau, bpf

On Sun, Apr 30, 2023 at 10:52 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 4/27/23 1:04 PM, Stanislav Fomichev wrote:
> > @@ -1881,8 +1886,10 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
> >               .optname = optname,
> >               .current_task = current,
> >       };
> > +     int orig_optlen;
> >       int ret;
> >
> > +     orig_optlen = max_optlen;
>
> For getsockopt, when the kernel's getsockopt finished successfully (the
> following 'if (!retval)' case), how about also setting orig_optlen to the kernel
> returned 'optlen'. For example, the user's orig_optlen is 8096 and the kernel
> returned optlen is 1024. If the bpf prog still sets the ctx.optlen to something
>  > PAGE_SIZE, -EFAULT will be returned.

Wouldn't it defeat the purpose? Or am I missing something?

ctx.optlen would still be 8096, not 1024, right (regardless of what
the kernel returns)?
So it would trigger EFAULT case which we try to avoid.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v2 2/4] selftests/bpf: Update EFAULT {g,s}etsockopt selftests
  2023-04-29  0:44         ` Martin KaFai Lau
@ 2023-05-01 17:22           ` Stanislav Fomichev
  2023-05-01 19:04             ` Martin KaFai Lau
  0 siblings, 1 reply; 15+ messages in thread
From: Stanislav Fomichev @ 2023-05-01 17:22 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: ast, daniel, andrii, song, yhs, john.fastabend, kpsingh, haoluo,
	jolsa, bpf

On Fri, Apr 28, 2023 at 5:44 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 4/28/23 5:32 PM, Stanislav Fomichev wrote:
> > On Fri, Apr 28, 2023 at 4:59 PM Stanislav Fomichev <sdf@google.com> wrote:
> >>
> >> On Fri, Apr 28, 2023 at 4:57 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> >>>
> >>> On 4/27/23 1:04 PM, Stanislav Fomichev wrote:
> >>>> Instead of assuming EFAULT, let's assume the BPF program's
> >>>> output is ignored.
> >>>>
> >>>> Remove "getsockopt: deny arbitrary ctx->retval" because it
> >>>> was actually testing optlen. We have separate set of tests
> >>>> for retval.
> >>>>
> >>>> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> >>>> ---
> >>>>    .../selftests/bpf/prog_tests/sockopt.c        | 80 +++++++++++++++++--
> >>>>    1 file changed, 74 insertions(+), 6 deletions(-)
> >>>>
> >>>> diff --git a/tools/testing/selftests/bpf/prog_tests/sockopt.c b/tools/testing/selftests/bpf/prog_tests/sockopt.c
> >>>> index aa4debf62fc6..8dad30ce910e 100644
> >>>> --- a/tools/testing/selftests/bpf/prog_tests/sockopt.c
> >>>> +++ b/tools/testing/selftests/bpf/prog_tests/sockopt.c
> >>>> @@ -273,10 +273,30 @@ static struct sockopt_test {
> >>>>                .error = EFAULT_GETSOCKOPT,
> >>>>        },
> >>>>        {
> >>>> -             .descr = "getsockopt: deny arbitrary ctx->retval",
> >>>> +             .descr = "getsockopt: ignore >PAGE_SIZE optlen",
> >>>>                .insns = {
> >>>> -                     /* ctx->retval = 123 */
> >>>> -                     BPF_MOV64_IMM(BPF_REG_0, 123),
> >>>> +                     /* write 0xFF to the first optval byte */
> >>>> +
> >>>> +                     /* r6 = ctx->optval */
> >>>> +                     BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1,
> >>>> +                                 offsetof(struct bpf_sockopt, optval)),
> >>>> +                     /* r2 = ctx->optval */
> >>>> +                     BPF_MOV64_REG(BPF_REG_2, BPF_REG_6),
> >>>> +                     /* r6 = ctx->optval + 1 */
> >>>> +                     BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1),
> >>>> +
> >>>> +                     /* r7 = ctx->optval_end */
> >>>> +                     BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_1,
> >>>> +                                 offsetof(struct bpf_sockopt, optval_end)),
> >>>> +
> >>>> +                     /* if (ctx->optval + 1 <= ctx->optval_end) { */
> >>>> +                     BPF_JMP_REG(BPF_JGT, BPF_REG_6, BPF_REG_7, 1),
> >>>> +                     /* ctx->optval[0] = 0xF0 */
> >>>> +                     BPF_ST_MEM(BPF_B, BPF_REG_2, 0, 0xFF),
> >>>> +                     /* } */
> >>>> +
> >>>> +                     /* ctx->retval = 0 */
> >>>> +                     BPF_MOV64_IMM(BPF_REG_0, 0),
> >>>>                        BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0,
> >>>>                                    offsetof(struct bpf_sockopt, retval)),
> >>>>
> >>>> @@ -287,9 +307,10 @@ static struct sockopt_test {
> >>>>                .attach_type = BPF_CGROUP_GETSOCKOPT,
> >>>>                .expected_attach_type = BPF_CGROUP_GETSOCKOPT,
> >>>>
> >>>> -             .get_optlen = 64,
> >>>> -
> >>>> -             .error = EFAULT_GETSOCKOPT,
> >>>> +             .get_level = 1234,
> >>>> +             .get_optname = 5678,
> >>>> +             .get_optval = {}, /* the changes are ignored */
> >>>> +             .get_optlen = 4096 + 1,
> >>>
> >>> The patchset looks good. Thanks for taking care of it.
> >>>
> >>> One question, is it safe to the assume 4096 page size for all platforms in the
> >>> selftests?
> >>
> >> Good question; let me respin with sysconf() just to be safe..
> >
> > Argh, the compiler yells at me:
> > error: initializer element is not a compile-time constant
> >
> > I guess I'm just gonna do #define PAGE_SIZE 4096 and if we do hit some
> > problems on the other archs, I'll ifdef it in one place.
>
> or run_test() can reinit optlen to sysconf_page_size + 1 if optlen == 4097.

Maybe I can do something like the following?

               if (test->set_optlen >= PAGE_SIZE) {
                       int num_pages = test->set_optlen / PAGE_SIZE;
                       int remainder = test->set_optlen % PAGE_SIZE;

                       test->set_optlen = num_pages *
sysconf(_SC_PAGESIZE) + remainder;
               }

More verbose, but less magical than depending on 4097. For the BPF
side, I can probably pass proper value via bss..

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen
  2023-05-01 16:55     ` Stanislav Fomichev
@ 2023-05-01 18:58       ` Martin KaFai Lau
  2023-05-01 19:33         ` Stanislav Fomichev
  0 siblings, 1 reply; 15+ messages in thread
From: Martin KaFai Lau @ 2023-05-01 18:58 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: ast, daniel, andrii, song, yhs, john.fastabend, kpsingh, haoluo,
	jolsa, Martin KaFai Lau, bpf

On 5/1/23 9:55 AM, Stanislav Fomichev wrote:
> On Sun, Apr 30, 2023 at 10:52 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>> On 4/27/23 1:04 PM, Stanislav Fomichev wrote:
>>> @@ -1881,8 +1886,10 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
>>>                .optname = optname,
>>>                .current_task = current,
>>>        };
>>> +     int orig_optlen;
>>>        int ret;
>>>
>>> +     orig_optlen = max_optlen;
>>
>> For getsockopt, when the kernel's getsockopt finished successfully (the
>> following 'if (!retval)' case), how about also setting orig_optlen to the kernel
>> returned 'optlen'. For example, the user's orig_optlen is 8096 and the kernel
>> returned optlen is 1024. If the bpf prog still sets the ctx.optlen to something
>>   > PAGE_SIZE, -EFAULT will be returned.
> 
> Wouldn't it defeat the purpose? Or am I missing something?
> 
> ctx.optlen would still be 8096, not 1024, right (regardless of what
> the kernel returns)?
> So it would trigger EFAULT case which we try to avoid.

My understanding is the ctx.optlen should be 1024 after the 'if (!retval)' 
statement.

The 'int __user *optlen' arg has the kernel returned optlen (1024). The 'int 
max_optlen' arg has the original user's optlen (8096).

int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
				       int optname, char __user *optval,
				       int __user *optlen /* 1024 */,
				       int max_optlen /* 8096 */,
				       int retval)
{
	/* ... */

	orig_optlen = max_optlen; /* orig_optlen == 8096 */
	ctx.optlen = max_optlen;  /* ctx.optlen == 8096 */

	
	if (!retval) {
		/* If kernel getsockopt finished successfully,
		 * copy whatever was returned to the user back
		 * into our temporary buffer. Set optlen to the
		 * one that kernel returned as well to let
		 * BPF programs inspect the value.
		 */

		if (get_user(ctx.optlen, optlen)) {
			ret = -EFAULT;
			goto out;
		}

		/* ctx.optlen == 1024 */

		orig_optlen = ctx.optlen;
	}

	/* ... */
}

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v2 2/4] selftests/bpf: Update EFAULT {g,s}etsockopt selftests
  2023-05-01 17:22           ` Stanislav Fomichev
@ 2023-05-01 19:04             ` Martin KaFai Lau
  0 siblings, 0 replies; 15+ messages in thread
From: Martin KaFai Lau @ 2023-05-01 19:04 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: ast, daniel, andrii, song, yhs, john.fastabend, kpsingh, haoluo,
	jolsa, bpf

On 5/1/23 10:22 AM, Stanislav Fomichev wrote:
> On Fri, Apr 28, 2023 at 5:44 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>> On 4/28/23 5:32 PM, Stanislav Fomichev wrote:
>>> On Fri, Apr 28, 2023 at 4:59 PM Stanislav Fomichev <sdf@google.com> wrote:
>>>>
>>>> On Fri, Apr 28, 2023 at 4:57 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>>>>
>>>>> On 4/27/23 1:04 PM, Stanislav Fomichev wrote:
>>>>>> Instead of assuming EFAULT, let's assume the BPF program's
>>>>>> output is ignored.
>>>>>>
>>>>>> Remove "getsockopt: deny arbitrary ctx->retval" because it
>>>>>> was actually testing optlen. We have separate set of tests
>>>>>> for retval.
>>>>>>
>>>>>> Signed-off-by: Stanislav Fomichev <sdf@google.com>
>>>>>> ---
>>>>>>     .../selftests/bpf/prog_tests/sockopt.c        | 80 +++++++++++++++++--
>>>>>>     1 file changed, 74 insertions(+), 6 deletions(-)
>>>>>>
>>>>>> diff --git a/tools/testing/selftests/bpf/prog_tests/sockopt.c b/tools/testing/selftests/bpf/prog_tests/sockopt.c
>>>>>> index aa4debf62fc6..8dad30ce910e 100644
>>>>>> --- a/tools/testing/selftests/bpf/prog_tests/sockopt.c
>>>>>> +++ b/tools/testing/selftests/bpf/prog_tests/sockopt.c
>>>>>> @@ -273,10 +273,30 @@ static struct sockopt_test {
>>>>>>                 .error = EFAULT_GETSOCKOPT,
>>>>>>         },
>>>>>>         {
>>>>>> -             .descr = "getsockopt: deny arbitrary ctx->retval",
>>>>>> +             .descr = "getsockopt: ignore >PAGE_SIZE optlen",
>>>>>>                 .insns = {
>>>>>> -                     /* ctx->retval = 123 */
>>>>>> -                     BPF_MOV64_IMM(BPF_REG_0, 123),
>>>>>> +                     /* write 0xFF to the first optval byte */
>>>>>> +
>>>>>> +                     /* r6 = ctx->optval */
>>>>>> +                     BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1,
>>>>>> +                                 offsetof(struct bpf_sockopt, optval)),
>>>>>> +                     /* r2 = ctx->optval */
>>>>>> +                     BPF_MOV64_REG(BPF_REG_2, BPF_REG_6),
>>>>>> +                     /* r6 = ctx->optval + 1 */
>>>>>> +                     BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1),
>>>>>> +
>>>>>> +                     /* r7 = ctx->optval_end */
>>>>>> +                     BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_1,
>>>>>> +                                 offsetof(struct bpf_sockopt, optval_end)),
>>>>>> +
>>>>>> +                     /* if (ctx->optval + 1 <= ctx->optval_end) { */
>>>>>> +                     BPF_JMP_REG(BPF_JGT, BPF_REG_6, BPF_REG_7, 1),
>>>>>> +                     /* ctx->optval[0] = 0xF0 */
>>>>>> +                     BPF_ST_MEM(BPF_B, BPF_REG_2, 0, 0xFF),
>>>>>> +                     /* } */
>>>>>> +
>>>>>> +                     /* ctx->retval = 0 */
>>>>>> +                     BPF_MOV64_IMM(BPF_REG_0, 0),
>>>>>>                         BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0,
>>>>>>                                     offsetof(struct bpf_sockopt, retval)),
>>>>>>
>>>>>> @@ -287,9 +307,10 @@ static struct sockopt_test {
>>>>>>                 .attach_type = BPF_CGROUP_GETSOCKOPT,
>>>>>>                 .expected_attach_type = BPF_CGROUP_GETSOCKOPT,
>>>>>>
>>>>>> -             .get_optlen = 64,
>>>>>> -
>>>>>> -             .error = EFAULT_GETSOCKOPT,
>>>>>> +             .get_level = 1234,
>>>>>> +             .get_optname = 5678,
>>>>>> +             .get_optval = {}, /* the changes are ignored */
>>>>>> +             .get_optlen = 4096 + 1,
>>>>>
>>>>> The patchset looks good. Thanks for taking care of it.
>>>>>
>>>>> One question, is it safe to the assume 4096 page size for all platforms in the
>>>>> selftests?
>>>>
>>>> Good question; let me respin with sysconf() just to be safe..
>>>
>>> Argh, the compiler yells at me:
>>> error: initializer element is not a compile-time constant
>>>
>>> I guess I'm just gonna do #define PAGE_SIZE 4096 and if we do hit some
>>> problems on the other archs, I'll ifdef it in one place.
>>
>> or run_test() can reinit optlen to sysconf_page_size + 1 if optlen == 4097.
> 
> Maybe I can do something like the following?
> 
>                 if (test->set_optlen >= PAGE_SIZE) {
>                         int num_pages = test->set_optlen / PAGE_SIZE;
>                         int remainder = test->set_optlen % PAGE_SIZE;
> 
>                         test->set_optlen = num_pages *
> sysconf(_SC_PAGESIZE) + remainder;
>                 }
> 
> More verbose, but less magical than depending on 4097. 

LGTM.

> For the BPF side, I can probably pass proper value via bss..

Make sense also.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen
  2023-05-01 18:58       ` Martin KaFai Lau
@ 2023-05-01 19:33         ` Stanislav Fomichev
  0 siblings, 0 replies; 15+ messages in thread
From: Stanislav Fomichev @ 2023-05-01 19:33 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: ast, daniel, andrii, song, yhs, john.fastabend, kpsingh, haoluo,
	jolsa, Martin KaFai Lau, bpf

On Mon, May 1, 2023 at 11:58 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 5/1/23 9:55 AM, Stanislav Fomichev wrote:
> > On Sun, Apr 30, 2023 at 10:52 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> >>
> >> On 4/27/23 1:04 PM, Stanislav Fomichev wrote:
> >>> @@ -1881,8 +1886,10 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
> >>>                .optname = optname,
> >>>                .current_task = current,
> >>>        };
> >>> +     int orig_optlen;
> >>>        int ret;
> >>>
> >>> +     orig_optlen = max_optlen;
> >>
> >> For getsockopt, when the kernel's getsockopt finished successfully (the
> >> following 'if (!retval)' case), how about also setting orig_optlen to the kernel
> >> returned 'optlen'. For example, the user's orig_optlen is 8096 and the kernel
> >> returned optlen is 1024. If the bpf prog still sets the ctx.optlen to something
> >>   > PAGE_SIZE, -EFAULT will be returned.
> >
> > Wouldn't it defeat the purpose? Or am I missing something?
> >
> > ctx.optlen would still be 8096, not 1024, right (regardless of what
> > the kernel returns)?
> > So it would trigger EFAULT case which we try to avoid.
>
> My understanding is the ctx.optlen should be 1024 after the 'if (!retval)'
> statement.

Ah, you're right, thanks! Will add your suggestion.


> The 'int __user *optlen' arg has the kernel returned optlen (1024). The 'int
> max_optlen' arg has the original user's optlen (8096).
>
> int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
>                                        int optname, char __user *optval,
>                                        int __user *optlen /* 1024 */,
>                                        int max_optlen /* 8096 */,
>                                        int retval)
> {
>         /* ... */
>
>         orig_optlen = max_optlen; /* orig_optlen == 8096 */
>         ctx.optlen = max_optlen;  /* ctx.optlen == 8096 */
>
>
>         if (!retval) {
>                 /* If kernel getsockopt finished successfully,
>                  * copy whatever was returned to the user back
>                  * into our temporary buffer. Set optlen to the
>                  * one that kernel returned as well to let
>                  * BPF programs inspect the value.
>                  */
>
>                 if (get_user(ctx.optlen, optlen)) {
>                         ret = -EFAULT;
>                         goto out;
>                 }
>
>                 /* ctx.optlen == 1024 */
>
>                 orig_optlen = ctx.optlen;
>         }
>
>         /* ... */
> }

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-05-01 19:37 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-04-27 20:04 [PATCH bpf-next v2 0/4] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen Stanislav Fomichev
2023-04-27 20:04 ` [PATCH bpf-next v2 1/4] " Stanislav Fomichev
2023-05-01  5:51   ` Martin KaFai Lau
2023-05-01 16:55     ` Stanislav Fomichev
2023-05-01 18:58       ` Martin KaFai Lau
2023-05-01 19:33         ` Stanislav Fomichev
2023-04-27 20:04 ` [PATCH bpf-next v2 2/4] selftests/bpf: Update EFAULT {g,s}etsockopt selftests Stanislav Fomichev
2023-04-28 23:57   ` Martin KaFai Lau
2023-04-28 23:59     ` Stanislav Fomichev
2023-04-29  0:32       ` Stanislav Fomichev
2023-04-29  0:44         ` Martin KaFai Lau
2023-05-01 17:22           ` Stanislav Fomichev
2023-05-01 19:04             ` Martin KaFai Lau
2023-04-27 20:04 ` [PATCH bpf-next v2 3/4] selftests/bpf: Correctly handle optlen > 4096 Stanislav Fomichev
2023-04-27 20:04 ` [PATCH bpf-next v2 4/4] bpf: Document EFAULT changes for sockopt Stanislav Fomichev

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox