* [PATCH bpf-next 0/2] Allow reads from uninit stack
@ 2023-02-16 18:36 Eduard Zingerman
2023-02-16 18:36 ` [PATCH bpf-next 1/2] bpf: " Eduard Zingerman
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Eduard Zingerman @ 2023-02-16 18:36 UTC (permalink / raw)
To: bpf, ast; +Cc: andrii, daniel, martin.lau, kernel-team, yhs, Eduard Zingerman
This patch-set modifies BPF verifier to accept programs that read from
uninitialized stack locations, but only if executed in privileged mode.
This provides significant verification performance gains: 30% to 70% less
processed states for big number of test programs.
The reason for performance gains comes from treating STACK_MISC and
STACK_INVALID as compatible, when cached state is compared to current state
in verifier.c:stacksafe().
The change should not affect safety, because any value read from STACK_MISC
location has full binary range (e.g. 0x00-0xff for byte-sized reads).
Details and measurements are provided in the description for the patch #1.
The change was suggested by Andrii Nakryiko, the initial patch was created
by Alexei Starovoitov. The discussion could be found at [1].
[1] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/
Eduard Zingerman (2):
bpf: Allow reads from uninit stack
selftests/bpf: Tests for uninitialized stack reads
kernel/bpf/verifier.c | 10 ++
.../selftests/bpf/prog_tests/uninit_stack.c | 9 ++
.../selftests/bpf/progs/test_global_func10.c | 6 +-
.../selftests/bpf/progs/uninit_stack.c | 55 +++++++++
tools/testing/selftests/bpf/verifier/calls.c | 13 ++-
.../bpf/verifier/helper_access_var_len.c | 104 ++++++++++++------
.../testing/selftests/bpf/verifier/int_ptr.c | 9 +-
.../selftests/bpf/verifier/search_pruning.c | 13 ++-
tools/testing/selftests/bpf/verifier/sock.c | 27 -----
.../selftests/bpf/verifier/spill_fill.c | 7 +-
.../testing/selftests/bpf/verifier/var_off.c | 52 ---------
11 files changed, 171 insertions(+), 134 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/uninit_stack.c
create mode 100644 tools/testing/selftests/bpf/progs/uninit_stack.c
--
2.39.1
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH bpf-next 1/2] bpf: Allow reads from uninit stack
2023-02-16 18:36 [PATCH bpf-next 0/2] Allow reads from uninit stack Eduard Zingerman
@ 2023-02-16 18:36 ` Eduard Zingerman
2023-02-17 0:36 ` Andrii Nakryiko
2023-02-16 18:36 ` [PATCH bpf-next 2/2] selftests/bpf: Tests for uninitialized stack reads Eduard Zingerman
2023-02-17 20:37 ` [PATCH bpf-next 0/2] Allow reads from uninit stack Daniel Borkmann
2 siblings, 1 reply; 12+ messages in thread
From: Eduard Zingerman @ 2023-02-16 18:36 UTC (permalink / raw)
To: bpf, ast; +Cc: andrii, daniel, martin.lau, kernel-team, yhs, Eduard Zingerman
This commits updates the following functions to allow reads from
uninitialized stack locations when env->allow_uninit_stack option is
enabled:
- check_stack_read_fixed_off()
- check_stack_range_initialized(), called from:
- check_stack_read_var_off()
- check_helper_mem_access()
Such change allows to relax logic in stacksafe() to treat STACK_MISC
and STACK_INVALID in a same way and make the following stack slot
configurations equivalent:
| Cached state | Current state |
| stack slot | stack slot |
|------------------+------------------|
| STACK_INVALID or | STACK_INVALID or |
| STACK_MISC | STACK_SPILL or |
| | STACK_MISC or |
| | STACK_ZERO or |
| | STACK_DYNPTR |
This leads to significant verification speed gains (see below).
The idea was suggested by Andrii Nakryiko [1] and initial patch was
created by Alexei Starovoitov [2].
Currently the env->allow_uninit_stack is allowed for programs loaded
by users with CAP_PERFMON or CAP_SYS_ADMIN capabilities.
A number of test cases from verifier/*.c were expecting uninitialized
stack access to be an error. These test cases were updated to execute
in unprivileged mode (thus preserving the tests).
The test progs/test_global_func10.c expected "invalid indirect access
to stack" error message because of the access to uninitialized memory
region. The test is updated to provoke the same error message by
accessing stack out of allocated range.
The following tests had to be removed because these can't be made
unprivileged:
- verifier/sock.c:
- "sk_storage_get(map, skb->sk, &stack_value, 1): partially init
stack_value"
BPF_PROG_TYPE_SCHED_CLS programs are not executed in unprivileged mode.
- verifier/var_off.c:
- "indirect variable-offset stack access, max_off+size > max_initialized"
- "indirect variable-offset stack access, uninitialized"
These tests verify that access to uninitialized stack values is
detected when stack offset is not a constant. However, variable
stack access is prohibited in unprivileged mode, thus these tests
are no longer valid.
* * *
Here is veristat log comparing this patch with current master on a
set of selftest binaries listed in tools/testing/selftests/bpf/veristat.cfg
and cilium BPF binaries (see [3]):
$ ./veristat -e file,prog,states -C -f 'states_pct<-30' master.log current.log
File Program States (A) States (B) States (DIFF)
-------------------------- -------------------------- ---------- ---------- ----------------
bpf_host.o tail_handle_ipv6_from_host 349 244 -105 (-30.09%)
bpf_host.o tail_handle_nat_fwd_ipv4 1320 895 -425 (-32.20%)
bpf_lxc.o tail_handle_nat_fwd_ipv4 1320 895 -425 (-32.20%)
bpf_sock.o cil_sock4_connect 70 48 -22 (-31.43%)
bpf_sock.o cil_sock4_sendmsg 68 46 -22 (-32.35%)
bpf_xdp.o tail_handle_nat_fwd_ipv4 1554 803 -751 (-48.33%)
bpf_xdp.o tail_lb_ipv4 6457 2473 -3984 (-61.70%)
bpf_xdp.o tail_lb_ipv6 7249 3908 -3341 (-46.09%)
pyperf600_bpf_loop.bpf.o on_event 287 145 -142 (-49.48%)
strobemeta.bpf.o on_event 15915 4772 -11143 (-70.02%)
strobemeta_nounroll2.bpf.o on_event 17087 3820 -13267 (-77.64%)
xdp_synproxy_kern.bpf.o syncookie_tc 21271 6635 -14636 (-68.81%)
xdp_synproxy_kern.bpf.o syncookie_xdp 23122 6024 -17098 (-73.95%)
-------------------------- -------------------------- ---------- ---------- ----------------
Note: I limited selection by states_pct<-30%.
Inspection of differences in pyperf600_bpf_loop behavior shows that
the following patch for the test removes almost all differences:
- a/tools/testing/selftests/bpf/progs/pyperf.h
+ b/tools/testing/selftests/bpf/progs/pyperf.h
@ -266,8 +266,8 @ int __on_event(struct bpf_raw_tracepoint_args *ctx)
}
if (event->pthread_match || !pidData->use_tls) {
- void* frame_ptr;
- FrameData frame;
+ void* frame_ptr = 0;
+ FrameData frame = {};
Symbol sym = {};
int cur_cpu = bpf_get_smp_processor_id();
W/o this patch the difference comes from the following pattern
(for different variables):
static bool get_frame_data(... FrameData *frame ...)
{
...
bpf_probe_read_user(&frame->f_code, ...);
if (!frame->f_code)
return false;
...
bpf_probe_read_user(&frame->co_name, ...);
if (frame->co_name)
...;
}
int __on_event(struct bpf_raw_tracepoint_args *ctx)
{
FrameData frame;
...
get_frame_data(... &frame ...) // indirectly via a bpf_loop & callback
...
}
SEC("raw_tracepoint/kfree_skb")
int on_event(struct bpf_raw_tracepoint_args* ctx)
{
...
ret |= __on_event(ctx);
ret |= __on_event(ctx);
...
}
With regards to value `frame->co_name` the following is important:
- Because of the conditional `if (!frame->f_code)` each call to
__on_event() produces two states, one with `frame->co_name` marked
as STACK_MISC, another with it as is (and marked STACK_INVALID on a
first call).
- The call to bpf_probe_read_user() does not mark stack slots
corresponding to `&frame->co_name` as REG_LIVE_WRITTEN but it marks
these slots as BPF_MISC, this happens because of the following loop
in the check_helper_call():
for (i = 0; i < meta.access_size; i++) {
err = check_mem_access(env, insn_idx, meta.regno, i, BPF_B,
BPF_WRITE, -1, false);
if (err)
return err;
}
Note the size of the write, it is a one byte write for each byte
touched by a helper. The BPF_B write does not lead to write marks
for the target stack slot.
- Which means that w/o this patch when second __on_event() call is
verified `if (frame->co_name)` will propagate read marks first to a
stack slot with STACK_MISC marks and second to a stack slot with
STACK_INVALID marks and these states would be considered different.
[1] https://lore.kernel.org/bpf/CAEf4BzY3e+ZuC6HUa8dCiUovQRg2SzEk7M-dSkqNZyn=xEmnPA@mail.gmail.com/
[2] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/
[3] git@github.com:anakryiko/cilium.git
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
kernel/bpf/verifier.c | 10 ++
.../selftests/bpf/progs/test_global_func10.c | 6 +-
tools/testing/selftests/bpf/verifier/calls.c | 13 ++-
.../bpf/verifier/helper_access_var_len.c | 104 ++++++++++++------
.../testing/selftests/bpf/verifier/int_ptr.c | 9 +-
.../selftests/bpf/verifier/search_pruning.c | 13 ++-
tools/testing/selftests/bpf/verifier/sock.c | 27 -----
.../selftests/bpf/verifier/spill_fill.c | 7 +-
.../testing/selftests/bpf/verifier/var_off.c | 52 ---------
9 files changed, 107 insertions(+), 134 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 272563a0b770..6fbd0e25ccab 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3826,6 +3826,8 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
continue;
if (type == STACK_MISC)
continue;
+ if (type == STACK_INVALID && env->allow_uninit_stack)
+ continue;
verbose(env, "invalid read from stack off %d+%d size %d\n",
off, i, size);
return -EACCES;
@@ -3863,6 +3865,8 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
continue;
if (type == STACK_ZERO)
continue;
+ if (type == STACK_INVALID && env->allow_uninit_stack)
+ continue;
verbose(env, "invalid read from stack off %d+%d size %d\n",
off, i, size);
return -EACCES;
@@ -5761,6 +5765,8 @@ static int check_stack_range_initialized(
}
goto mark;
}
+ if (*stype == STACK_INVALID && env->allow_uninit_stack)
+ goto mark;
if (is_spilled_reg(&state->stack[spi]) &&
(state->stack[spi].spilled_ptr.type == SCALAR_VALUE ||
@@ -13936,6 +13942,10 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_INVALID)
continue;
+ if (env->allow_uninit_stack &&
+ old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_MISC)
+ continue;
+
/* explored stack has more populated slots than current stack
* and these slots were used
*/
diff --git a/tools/testing/selftests/bpf/progs/test_global_func10.c b/tools/testing/selftests/bpf/progs/test_global_func10.c
index 97b7031d0e22..9eb28721fdbc 100644
--- a/tools/testing/selftests/bpf/progs/test_global_func10.c
+++ b/tools/testing/selftests/bpf/progs/test_global_func10.c
@@ -4,12 +4,12 @@
#include <bpf/bpf_helpers.h>
struct Small {
- int x;
+ long x;
};
struct Big {
- int x;
- int y;
+ long x;
+ long y;
};
__noinline int foo(const struct Big *big)
diff --git a/tools/testing/selftests/bpf/verifier/calls.c b/tools/testing/selftests/bpf/verifier/calls.c
index 9d993926bf0e..289ed202ec66 100644
--- a/tools/testing/selftests/bpf/verifier/calls.c
+++ b/tools/testing/selftests/bpf/verifier/calls.c
@@ -2221,19 +2221,22 @@
* that fp-8 stack slot was unused in the fall-through
* branch and will accept the program incorrectly
*/
- BPF_JMP_IMM(BPF_JGT, BPF_REG_1, 2, 2),
+ BPF_EMIT_CALL(BPF_FUNC_get_prandom_u32),
+ BPF_JMP_IMM(BPF_JGT, BPF_REG_0, 2, 2),
BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
BPF_JMP_IMM(BPF_JA, 0, 0, 0),
BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
BPF_LD_MAP_FD(BPF_REG_1, 0),
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN(),
},
- .fixup_map_hash_48b = { 6 },
- .errstr = "invalid indirect read from stack R2 off -8+0 size 8",
- .result = REJECT,
- .prog_type = BPF_PROG_TYPE_XDP,
+ .fixup_map_hash_48b = { 7 },
+ .errstr_unpriv = "invalid indirect read from stack R2 off -8+0 size 8",
+ .result_unpriv = REJECT,
+ /* in privileged mode reads from uninitialized stack locations are permitted */
+ .result = ACCEPT,
},
{
"calls: ctx read at start of subprog",
diff --git a/tools/testing/selftests/bpf/verifier/helper_access_var_len.c b/tools/testing/selftests/bpf/verifier/helper_access_var_len.c
index a6c869a7319c..9c4885885aba 100644
--- a/tools/testing/selftests/bpf/verifier/helper_access_var_len.c
+++ b/tools/testing/selftests/bpf/verifier/helper_access_var_len.c
@@ -29,19 +29,30 @@
{
"helper access to variable memory: stack, bitwise AND, zero included",
.insns = {
- BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, 8),
- BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
- BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -64),
- BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_2, -128),
- BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, -128),
- BPF_ALU64_IMM(BPF_AND, BPF_REG_2, 64),
- BPF_MOV64_IMM(BPF_REG_3, 0),
- BPF_EMIT_CALL(BPF_FUNC_probe_read_kernel),
+ /* set max stack size */
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -128, 0),
+ /* set r3 to a random value */
+ BPF_EMIT_CALL(BPF_FUNC_get_prandom_u32),
+ BPF_MOV64_REG(BPF_REG_3, BPF_REG_0),
+ /* use bitwise AND to limit r3 range to [0, 64] */
+ BPF_ALU64_IMM(BPF_AND, BPF_REG_3, 64),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -64),
+ BPF_MOV64_IMM(BPF_REG_4, 0),
+ /* Call bpf_ringbuf_output(), it is one of a few helper functions with
+ * ARG_CONST_SIZE_OR_ZERO parameter allowed in unpriv mode.
+ * For unpriv this should signal an error, because memory at &fp[-64] is
+ * not initialized.
+ */
+ BPF_EMIT_CALL(BPF_FUNC_ringbuf_output),
BPF_EXIT_INSN(),
},
- .errstr = "invalid indirect read from stack R1 off -64+0 size 64",
- .result = REJECT,
- .prog_type = BPF_PROG_TYPE_TRACEPOINT,
+ .fixup_map_ringbuf = { 4 },
+ .errstr_unpriv = "invalid indirect read from stack R2 off -64+0 size 64",
+ .result_unpriv = REJECT,
+ /* in privileged mode reads from uninitialized stack locations are permitted */
+ .result = ACCEPT,
},
{
"helper access to variable memory: stack, bitwise AND + JMP, wrong max",
@@ -183,20 +194,31 @@
{
"helper access to variable memory: stack, JMP, no min check",
.insns = {
- BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, 8),
- BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
- BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -64),
- BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_2, -128),
- BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, -128),
- BPF_JMP_IMM(BPF_JGT, BPF_REG_2, 64, 3),
- BPF_MOV64_IMM(BPF_REG_3, 0),
- BPF_EMIT_CALL(BPF_FUNC_probe_read_kernel),
+ /* set max stack size */
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -128, 0),
+ /* set r3 to a random value */
+ BPF_EMIT_CALL(BPF_FUNC_get_prandom_u32),
+ BPF_MOV64_REG(BPF_REG_3, BPF_REG_0),
+ /* use JMP to limit r3 range to [0, 64] */
+ BPF_JMP_IMM(BPF_JGT, BPF_REG_3, 64, 6),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -64),
+ BPF_MOV64_IMM(BPF_REG_4, 0),
+ /* Call bpf_ringbuf_output(), it is one of a few helper functions with
+ * ARG_CONST_SIZE_OR_ZERO parameter allowed in unpriv mode.
+ * For unpriv this should signal an error, because memory at &fp[-64] is
+ * not initialized.
+ */
+ BPF_EMIT_CALL(BPF_FUNC_ringbuf_output),
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN(),
},
- .errstr = "invalid indirect read from stack R1 off -64+0 size 64",
- .result = REJECT,
- .prog_type = BPF_PROG_TYPE_TRACEPOINT,
+ .fixup_map_ringbuf = { 4 },
+ .errstr_unpriv = "invalid indirect read from stack R2 off -64+0 size 64",
+ .result_unpriv = REJECT,
+ /* in privileged mode reads from uninitialized stack locations are permitted */
+ .result = ACCEPT,
},
{
"helper access to variable memory: stack, JMP (signed), no min check",
@@ -564,29 +586,41 @@
{
"helper access to variable memory: 8 bytes leak",
.insns = {
- BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, 8),
- BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
- BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -64),
+ /* set max stack size */
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -128, 0),
+ /* set r3 to a random value */
+ BPF_EMIT_CALL(BPF_FUNC_get_prandom_u32),
+ BPF_MOV64_REG(BPF_REG_3, BPF_REG_0),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -64),
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -64),
BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -56),
BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -48),
BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -40),
+ /* Note: fp[-32] left uninitialized */
BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -24),
BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16),
BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -8),
- BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_2, -128),
- BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_10, -128),
- BPF_ALU64_IMM(BPF_AND, BPF_REG_2, 63),
- BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, 1),
- BPF_MOV64_IMM(BPF_REG_3, 0),
- BPF_EMIT_CALL(BPF_FUNC_probe_read_kernel),
- BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -16),
+ /* Limit r3 range to [1, 64] */
+ BPF_ALU64_IMM(BPF_AND, BPF_REG_3, 63),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, 1),
+ BPF_MOV64_IMM(BPF_REG_4, 0),
+ /* Call bpf_ringbuf_output(), it is one of a few helper functions with
+ * ARG_CONST_SIZE_OR_ZERO parameter allowed in unpriv mode.
+ * For unpriv this should signal an error, because memory region [1, 64]
+ * at &fp[-64] is not fully initialized.
+ */
+ BPF_EMIT_CALL(BPF_FUNC_ringbuf_output),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN(),
},
- .errstr = "invalid indirect read from stack R1 off -64+32 size 64",
- .result = REJECT,
- .prog_type = BPF_PROG_TYPE_TRACEPOINT,
+ .fixup_map_ringbuf = { 3 },
+ .errstr_unpriv = "invalid indirect read from stack R2 off -64+32 size 64",
+ .result_unpriv = REJECT,
+ /* in privileged mode reads from uninitialized stack locations are permitted */
+ .result = ACCEPT,
},
{
"helper access to variable memory: 8 bytes no leak (init memory)",
diff --git a/tools/testing/selftests/bpf/verifier/int_ptr.c b/tools/testing/selftests/bpf/verifier/int_ptr.c
index 070893fb2900..02d9e004260b 100644
--- a/tools/testing/selftests/bpf/verifier/int_ptr.c
+++ b/tools/testing/selftests/bpf/verifier/int_ptr.c
@@ -54,12 +54,13 @@
/* bpf_strtoul() */
BPF_EMIT_CALL(BPF_FUNC_strtoul),
- BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN(),
},
- .result = REJECT,
- .prog_type = BPF_PROG_TYPE_CGROUP_SYSCTL,
- .errstr = "invalid indirect read from stack R4 off -16+4 size 8",
+ .result_unpriv = REJECT,
+ .errstr_unpriv = "invalid indirect read from stack R4 off -16+4 size 8",
+ /* in privileged mode reads from uninitialized stack locations are permitted */
+ .result = ACCEPT,
},
{
"ARG_PTR_TO_LONG misaligned",
diff --git a/tools/testing/selftests/bpf/verifier/search_pruning.c b/tools/testing/selftests/bpf/verifier/search_pruning.c
index d63fd8991b03..745d6b5842fd 100644
--- a/tools/testing/selftests/bpf/verifier/search_pruning.c
+++ b/tools/testing/selftests/bpf/verifier/search_pruning.c
@@ -128,9 +128,10 @@
BPF_EXIT_INSN(),
},
.fixup_map_hash_8b = { 3 },
- .errstr = "invalid read from stack off -16+0 size 8",
- .result = REJECT,
- .prog_type = BPF_PROG_TYPE_TRACEPOINT,
+ .errstr_unpriv = "invalid read from stack off -16+0 size 8",
+ .result_unpriv = REJECT,
+ /* in privileged mode reads from uninitialized stack locations are permitted */
+ .result = ACCEPT,
},
{
"precision tracking for u32 spill/fill",
@@ -258,6 +259,8 @@
BPF_EXIT_INSN(),
},
.flags = BPF_F_TEST_STATE_FREQ,
- .errstr = "invalid read from stack off -8+1 size 8",
- .result = REJECT,
+ .errstr_unpriv = "invalid read from stack off -8+1 size 8",
+ .result_unpriv = REJECT,
+ /* in privileged mode reads from uninitialized stack locations are permitted */
+ .result = ACCEPT,
},
diff --git a/tools/testing/selftests/bpf/verifier/sock.c b/tools/testing/selftests/bpf/verifier/sock.c
index d11d0b28be41..108dd3ee1edd 100644
--- a/tools/testing/selftests/bpf/verifier/sock.c
+++ b/tools/testing/selftests/bpf/verifier/sock.c
@@ -530,33 +530,6 @@
.prog_type = BPF_PROG_TYPE_SCHED_CLS,
.result = ACCEPT,
},
-{
- "sk_storage_get(map, skb->sk, &stack_value, 1): partially init stack_value",
- .insns = {
- BPF_MOV64_IMM(BPF_REG_2, 0),
- BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_2, -8),
- BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, offsetof(struct __sk_buff, sk)),
- BPF_JMP_IMM(BPF_JNE, BPF_REG_1, 0, 2),
- BPF_MOV64_IMM(BPF_REG_0, 0),
- BPF_EXIT_INSN(),
- BPF_EMIT_CALL(BPF_FUNC_sk_fullsock),
- BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 2),
- BPF_MOV64_IMM(BPF_REG_0, 0),
- BPF_EXIT_INSN(),
- BPF_MOV64_IMM(BPF_REG_4, 1),
- BPF_MOV64_REG(BPF_REG_3, BPF_REG_10),
- BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, -8),
- BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
- BPF_LD_MAP_FD(BPF_REG_1, 0),
- BPF_EMIT_CALL(BPF_FUNC_sk_storage_get),
- BPF_MOV64_IMM(BPF_REG_0, 0),
- BPF_EXIT_INSN(),
- },
- .fixup_sk_storage_map = { 14 },
- .prog_type = BPF_PROG_TYPE_SCHED_CLS,
- .result = REJECT,
- .errstr = "invalid indirect read from stack",
-},
{
"bpf_map_lookup_elem(smap, &key)",
.insns = {
diff --git a/tools/testing/selftests/bpf/verifier/spill_fill.c b/tools/testing/selftests/bpf/verifier/spill_fill.c
index 9bb302dade23..d1463bf4949a 100644
--- a/tools/testing/selftests/bpf/verifier/spill_fill.c
+++ b/tools/testing/selftests/bpf/verifier/spill_fill.c
@@ -171,9 +171,10 @@
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN(),
},
- .result = REJECT,
- .errstr = "invalid read from stack off -4+0 size 4",
- .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ .result_unpriv = REJECT,
+ .errstr_unpriv = "invalid read from stack off -4+0 size 4",
+ /* in privileged mode reads from uninitialized stack locations are permitted */
+ .result = ACCEPT,
},
{
"Spill a u32 const scalar. Refill as u16. Offset to skb->data",
diff --git a/tools/testing/selftests/bpf/verifier/var_off.c b/tools/testing/selftests/bpf/verifier/var_off.c
index d37f512fad16..b183e26c03f1 100644
--- a/tools/testing/selftests/bpf/verifier/var_off.c
+++ b/tools/testing/selftests/bpf/verifier/var_off.c
@@ -212,31 +212,6 @@
.result = REJECT,
.prog_type = BPF_PROG_TYPE_LWT_IN,
},
-{
- "indirect variable-offset stack access, max_off+size > max_initialized",
- .insns = {
- /* Fill only the second from top 8 bytes of the stack. */
- BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0),
- /* Get an unknown value. */
- BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, 0),
- /* Make it small and 4-byte aligned. */
- BPF_ALU64_IMM(BPF_AND, BPF_REG_2, 4),
- BPF_ALU64_IMM(BPF_SUB, BPF_REG_2, 16),
- /* Add it to fp. We now have either fp-12 or fp-16, but we don't know
- * which. fp-12 size 8 is partially uninitialized stack.
- */
- BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_10),
- /* Dereference it indirectly. */
- BPF_LD_MAP_FD(BPF_REG_1, 0),
- BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
- BPF_MOV64_IMM(BPF_REG_0, 0),
- BPF_EXIT_INSN(),
- },
- .fixup_map_hash_8b = { 5 },
- .errstr = "invalid indirect read from stack R2 var_off",
- .result = REJECT,
- .prog_type = BPF_PROG_TYPE_LWT_IN,
-},
{
"indirect variable-offset stack access, min_off < min_initialized",
.insns = {
@@ -289,33 +264,6 @@
.result = ACCEPT,
.prog_type = BPF_PROG_TYPE_CGROUP_SKB,
},
-{
- "indirect variable-offset stack access, uninitialized",
- .insns = {
- BPF_MOV64_IMM(BPF_REG_2, 6),
- BPF_MOV64_IMM(BPF_REG_3, 28),
- /* Fill the top 16 bytes of the stack. */
- BPF_ST_MEM(BPF_W, BPF_REG_10, -16, 0),
- BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
- /* Get an unknown value. */
- BPF_LDX_MEM(BPF_W, BPF_REG_4, BPF_REG_1, 0),
- /* Make it small and 4-byte aligned. */
- BPF_ALU64_IMM(BPF_AND, BPF_REG_4, 4),
- BPF_ALU64_IMM(BPF_SUB, BPF_REG_4, 16),
- /* Add it to fp. We now have either fp-12 or fp-16, we don't know
- * which, but either way it points to initialized stack.
- */
- BPF_ALU64_REG(BPF_ADD, BPF_REG_4, BPF_REG_10),
- BPF_MOV64_IMM(BPF_REG_5, 8),
- /* Dereference it indirectly. */
- BPF_EMIT_CALL(BPF_FUNC_getsockopt),
- BPF_MOV64_IMM(BPF_REG_0, 0),
- BPF_EXIT_INSN(),
- },
- .errstr = "invalid indirect read from stack R4 var_off",
- .result = REJECT,
- .prog_type = BPF_PROG_TYPE_SOCK_OPS,
-},
{
"indirect variable-offset stack access, ok",
.insns = {
--
2.39.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH bpf-next 2/2] selftests/bpf: Tests for uninitialized stack reads
2023-02-16 18:36 [PATCH bpf-next 0/2] Allow reads from uninit stack Eduard Zingerman
2023-02-16 18:36 ` [PATCH bpf-next 1/2] bpf: " Eduard Zingerman
@ 2023-02-16 18:36 ` Eduard Zingerman
2023-02-17 0:55 ` Andrii Nakryiko
2023-02-17 20:37 ` [PATCH bpf-next 0/2] Allow reads from uninit stack Daniel Borkmann
2 siblings, 1 reply; 12+ messages in thread
From: Eduard Zingerman @ 2023-02-16 18:36 UTC (permalink / raw)
To: bpf, ast; +Cc: andrii, daniel, martin.lau, kernel-team, yhs, Eduard Zingerman
Two testcases to make sure that stack reads from uninitialized
locations are accepted by verifier when executed in privileged mode:
- read from a fixed offset;
- read from a variable offset.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
.../selftests/bpf/prog_tests/uninit_stack.c | 9 +++
.../selftests/bpf/progs/uninit_stack.c | 55 +++++++++++++++++++
2 files changed, 64 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/uninit_stack.c
create mode 100644 tools/testing/selftests/bpf/progs/uninit_stack.c
diff --git a/tools/testing/selftests/bpf/prog_tests/uninit_stack.c b/tools/testing/selftests/bpf/prog_tests/uninit_stack.c
new file mode 100644
index 000000000000..e64c71948491
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/uninit_stack.c
@@ -0,0 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <test_progs.h>
+#include "uninit_stack.skel.h"
+
+void test_uninit_stack(void)
+{
+ RUN_TESTS(uninit_stack);
+}
diff --git a/tools/testing/selftests/bpf/progs/uninit_stack.c b/tools/testing/selftests/bpf/progs/uninit_stack.c
new file mode 100644
index 000000000000..20ff6a22c906
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/uninit_stack.c
@@ -0,0 +1,55 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+/* Read an uninitialized value from stack at a fixed offset */
+SEC("socket")
+__naked int read_uninit_stack_fixed_off(void *ctx)
+{
+ asm volatile (" \
+ // force stack depth to be 128 \
+ *(u64*)(r10 - 128) = r1; \
+ r1 = *(u8 *)(r10 - 8 ); \
+ r1 = *(u8 *)(r10 - 11); \
+ r1 = *(u8 *)(r10 - 13); \
+ r1 = *(u8 *)(r10 - 15); \
+ r1 = *(u16*)(r10 - 16); \
+ r1 = *(u32*)(r10 - 32); \
+ r1 = *(u64*)(r10 - 64); \
+ // read from a spill of a wrong size, it is a separate \
+ // branch in check_stack_read_fixed_off() \
+ *(u32*)(r10 - 72) = r1; \
+ r1 = *(u64*)(r10 - 72); \
+ r0 = 0; \
+ exit; \
+"
+ ::: __clobber_all);
+}
+
+/* Read an uninitialized value from stack at a variable offset */
+SEC("socket")
+__naked int read_uninit_stack_var_off(void *ctx)
+{
+ asm volatile (" \
+ call %[bpf_get_prandom_u32]; \
+ // force stack depth to be 64 \
+ *(u64*)(r10 - 64) = r0; \
+ r0 = -r0; \
+ // give r0 a range [-31, -1] \
+ if r0 s<= -32 goto exit_%=; \
+ if r0 s>= 0 goto exit_%=; \
+ // access stack using r0 \
+ r1 = r10; \
+ r1 += r0; \
+ r2 = *(u8*)(r1 + 0); \
+exit_%=: r0 = 0; \
+ exit; \
+"
+ :
+ : __imm(bpf_get_prandom_u32)
+ : __clobber_all);
+}
+
+char _license[] SEC("license") = "GPL";
--
2.39.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: Allow reads from uninit stack
2023-02-16 18:36 ` [PATCH bpf-next 1/2] bpf: " Eduard Zingerman
@ 2023-02-17 0:36 ` Andrii Nakryiko
2023-02-17 13:13 ` Eduard Zingerman
0 siblings, 1 reply; 12+ messages in thread
From: Andrii Nakryiko @ 2023-02-17 0:36 UTC (permalink / raw)
To: Eduard Zingerman; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs
On Thu, Feb 16, 2023 at 10:36 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> This commits updates the following functions to allow reads from
> uninitialized stack locations when env->allow_uninit_stack option is
> enabled:
> - check_stack_read_fixed_off()
> - check_stack_range_initialized(), called from:
> - check_stack_read_var_off()
> - check_helper_mem_access()
>
> Such change allows to relax logic in stacksafe() to treat STACK_MISC
> and STACK_INVALID in a same way and make the following stack slot
> configurations equivalent:
>
> | Cached state | Current state |
> | stack slot | stack slot |
> |------------------+------------------|
> | STACK_INVALID or | STACK_INVALID or |
> | STACK_MISC | STACK_SPILL or |
> | | STACK_MISC or |
> | | STACK_ZERO or |
> | | STACK_DYNPTR |
>
> This leads to significant verification speed gains (see below).
>
> The idea was suggested by Andrii Nakryiko [1] and initial patch was
> created by Alexei Starovoitov [2].
>
> Currently the env->allow_uninit_stack is allowed for programs loaded
> by users with CAP_PERFMON or CAP_SYS_ADMIN capabilities.
>
> A number of test cases from verifier/*.c were expecting uninitialized
> stack access to be an error. These test cases were updated to execute
> in unprivileged mode (thus preserving the tests).
>
> The test progs/test_global_func10.c expected "invalid indirect access
> to stack" error message because of the access to uninitialized memory
> region. The test is updated to provoke the same error message by
> accessing stack out of allocated range.
>
> The following tests had to be removed because these can't be made
> unprivileged:
> - verifier/sock.c:
> - "sk_storage_get(map, skb->sk, &stack_value, 1): partially init
> stack_value"
> BPF_PROG_TYPE_SCHED_CLS programs are not executed in unprivileged mode.
> - verifier/var_off.c:
> - "indirect variable-offset stack access, max_off+size > max_initialized"
> - "indirect variable-offset stack access, uninitialized"
> These tests verify that access to uninitialized stack values is
> detected when stack offset is not a constant. However, variable
> stack access is prohibited in unprivileged mode, thus these tests
> are no longer valid.
>
> * * *
>
> Here is veristat log comparing this patch with current master on a
> set of selftest binaries listed in tools/testing/selftests/bpf/veristat.cfg
> and cilium BPF binaries (see [3]):
>
> $ ./veristat -e file,prog,states -C -f 'states_pct<-30' master.log current.log
> File Program States (A) States (B) States (DIFF)
> -------------------------- -------------------------- ---------- ---------- ----------------
> bpf_host.o tail_handle_ipv6_from_host 349 244 -105 (-30.09%)
> bpf_host.o tail_handle_nat_fwd_ipv4 1320 895 -425 (-32.20%)
> bpf_lxc.o tail_handle_nat_fwd_ipv4 1320 895 -425 (-32.20%)
> bpf_sock.o cil_sock4_connect 70 48 -22 (-31.43%)
> bpf_sock.o cil_sock4_sendmsg 68 46 -22 (-32.35%)
> bpf_xdp.o tail_handle_nat_fwd_ipv4 1554 803 -751 (-48.33%)
> bpf_xdp.o tail_lb_ipv4 6457 2473 -3984 (-61.70%)
> bpf_xdp.o tail_lb_ipv6 7249 3908 -3341 (-46.09%)
> pyperf600_bpf_loop.bpf.o on_event 287 145 -142 (-49.48%)
> strobemeta.bpf.o on_event 15915 4772 -11143 (-70.02%)
> strobemeta_nounroll2.bpf.o on_event 17087 3820 -13267 (-77.64%)
> xdp_synproxy_kern.bpf.o syncookie_tc 21271 6635 -14636 (-68.81%)
> xdp_synproxy_kern.bpf.o syncookie_xdp 23122 6024 -17098 (-73.95%)
> -------------------------- -------------------------- ---------- ---------- ----------------
>
> Note: I limited selection by states_pct<-30%.
>
> Inspection of differences in pyperf600_bpf_loop behavior shows that
> the following patch for the test removes almost all differences:
>
> - a/tools/testing/selftests/bpf/progs/pyperf.h
> + b/tools/testing/selftests/bpf/progs/pyperf.h
> @ -266,8 +266,8 @ int __on_event(struct bpf_raw_tracepoint_args *ctx)
> }
>
> if (event->pthread_match || !pidData->use_tls) {
> - void* frame_ptr;
> - FrameData frame;
> + void* frame_ptr = 0;
> + FrameData frame = {};
> Symbol sym = {};
> int cur_cpu = bpf_get_smp_processor_id();
>
> W/o this patch the difference comes from the following pattern
> (for different variables):
>
> static bool get_frame_data(... FrameData *frame ...)
> {
> ...
> bpf_probe_read_user(&frame->f_code, ...);
> if (!frame->f_code)
> return false;
> ...
> bpf_probe_read_user(&frame->co_name, ...);
> if (frame->co_name)
> ...;
> }
>
> int __on_event(struct bpf_raw_tracepoint_args *ctx)
> {
> FrameData frame;
> ...
> get_frame_data(... &frame ...) // indirectly via a bpf_loop & callback
> ...
> }
>
> SEC("raw_tracepoint/kfree_skb")
> int on_event(struct bpf_raw_tracepoint_args* ctx)
> {
> ...
> ret |= __on_event(ctx);
> ret |= __on_event(ctx);
> ...
> }
>
> With regards to value `frame->co_name` the following is important:
> - Because of the conditional `if (!frame->f_code)` each call to
> __on_event() produces two states, one with `frame->co_name` marked
> as STACK_MISC, another with it as is (and marked STACK_INVALID on a
> first call).
> - The call to bpf_probe_read_user() does not mark stack slots
> corresponding to `&frame->co_name` as REG_LIVE_WRITTEN but it marks
> these slots as BPF_MISC, this happens because of the following loop
> in the check_helper_call():
>
> for (i = 0; i < meta.access_size; i++) {
> err = check_mem_access(env, insn_idx, meta.regno, i, BPF_B,
> BPF_WRITE, -1, false);
> if (err)
> return err;
> }
>
> Note the size of the write, it is a one byte write for each byte
> touched by a helper. The BPF_B write does not lead to write marks
> for the target stack slot.
> - Which means that w/o this patch when second __on_event() call is
> verified `if (frame->co_name)` will propagate read marks first to a
> stack slot with STACK_MISC marks and second to a stack slot with
> STACK_INVALID marks and these states would be considered different.
>
> [1] https://lore.kernel.org/bpf/CAEf4BzY3e+ZuC6HUa8dCiUovQRg2SzEk7M-dSkqNZyn=xEmnPA@mail.gmail.com/
> [2] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/
> [3] git@github.com:anakryiko/cilium.git
>
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> Suggested-by: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
> kernel/bpf/verifier.c | 10 ++
> .../selftests/bpf/progs/test_global_func10.c | 6 +-
> tools/testing/selftests/bpf/verifier/calls.c | 13 ++-
> .../bpf/verifier/helper_access_var_len.c | 104 ++++++++++++------
> .../testing/selftests/bpf/verifier/int_ptr.c | 9 +-
> .../selftests/bpf/verifier/search_pruning.c | 13 ++-
> tools/testing/selftests/bpf/verifier/sock.c | 27 -----
> .../selftests/bpf/verifier/spill_fill.c | 7 +-
> .../testing/selftests/bpf/verifier/var_off.c | 52 ---------
> 9 files changed, 107 insertions(+), 134 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 272563a0b770..6fbd0e25ccab 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -3826,6 +3826,8 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
> continue;
> if (type == STACK_MISC)
> continue;
> + if (type == STACK_INVALID && env->allow_uninit_stack)
> + continue;
> verbose(env, "invalid read from stack off %d+%d size %d\n",
> off, i, size);
> return -EACCES;
> @@ -3863,6 +3865,8 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
> continue;
> if (type == STACK_ZERO)
> continue;
> + if (type == STACK_INVALID && env->allow_uninit_stack)
> + continue;
> verbose(env, "invalid read from stack off %d+%d size %d\n",
> off, i, size);
> return -EACCES;
> @@ -5761,6 +5765,8 @@ static int check_stack_range_initialized(
> }
> goto mark;
> }
> + if (*stype == STACK_INVALID && env->allow_uninit_stack)
> + goto mark;
should we support clobber and conversion to STACK_MISC like we do for
STACK_ZERO? If yes, probably cleaner to just extend condition to
if ((*stype == STACK_ZERO) || (*stype == STACK_INVALID &&
env->allow_uninit_stack))
?
Other than that, looks good:
Acked-by: Andrii Nakryiko <andrii@kernel.org>
>
> if (is_spilled_reg(&state->stack[spi]) &&
> (state->stack[spi].spilled_ptr.type == SCALAR_VALUE ||
> @@ -13936,6 +13942,10 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
> if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_INVALID)
> continue;
>
> + if (env->allow_uninit_stack &&
> + old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_MISC)
> + continue;
> +
> /* explored stack has more populated slots than current stack
> * and these slots were used
> */
[...]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH bpf-next 2/2] selftests/bpf: Tests for uninitialized stack reads
2023-02-16 18:36 ` [PATCH bpf-next 2/2] selftests/bpf: Tests for uninitialized stack reads Eduard Zingerman
@ 2023-02-17 0:55 ` Andrii Nakryiko
2023-02-17 13:25 ` Eduard Zingerman
0 siblings, 1 reply; 12+ messages in thread
From: Andrii Nakryiko @ 2023-02-17 0:55 UTC (permalink / raw)
To: Eduard Zingerman; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs
On Thu, Feb 16, 2023 at 10:36 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Two testcases to make sure that stack reads from uninitialized
> locations are accepted by verifier when executed in privileged mode:
> - read from a fixed offset;
> - read from a variable offset.
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
> .../selftests/bpf/prog_tests/uninit_stack.c | 9 +++
> .../selftests/bpf/progs/uninit_stack.c | 55 +++++++++++++++++++
> 2 files changed, 64 insertions(+)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/uninit_stack.c
> create mode 100644 tools/testing/selftests/bpf/progs/uninit_stack.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/uninit_stack.c b/tools/testing/selftests/bpf/prog_tests/uninit_stack.c
> new file mode 100644
> index 000000000000..e64c71948491
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/uninit_stack.c
> @@ -0,0 +1,9 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <test_progs.h>
> +#include "uninit_stack.skel.h"
> +
> +void test_uninit_stack(void)
> +{
> + RUN_TESTS(uninit_stack);
> +}
> diff --git a/tools/testing/selftests/bpf/progs/uninit_stack.c b/tools/testing/selftests/bpf/progs/uninit_stack.c
> new file mode 100644
> index 000000000000..20ff6a22c906
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/uninit_stack.c
> @@ -0,0 +1,55 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include "bpf_misc.h"
> +
> +/* Read an uninitialized value from stack at a fixed offset */
> +SEC("socket")
> +__naked int read_uninit_stack_fixed_off(void *ctx)
> +{
> + asm volatile (" \
> + // force stack depth to be 128 \
> + *(u64*)(r10 - 128) = r1; \
> + r1 = *(u8 *)(r10 - 8 ); \
> + r1 = *(u8 *)(r10 - 11); \
> + r1 = *(u8 *)(r10 - 13); \
> + r1 = *(u8 *)(r10 - 15); \
> + r1 = *(u16*)(r10 - 16); \
> + r1 = *(u32*)(r10 - 32); \
> + r1 = *(u64*)(r10 - 64); \
> + // read from a spill of a wrong size, it is a separate \
> + // branch in check_stack_read_fixed_off() \
> + *(u32*)(r10 - 72) = r1; \
> + r1 = *(u64*)(r10 - 72); \
> + r0 = 0; \
> + exit; \
would it be better to
r0 = *(u64*)(r10 - 72);
exit;
to make sure that in the future verifier doesn't smartly optimize out
unused reads?
Either way, looks good to me:
Acked-by: Andrii Nakryiko <andrii@kernel.org>
> +"
> + ::: __clobber_all);
> +}
> +
> +/* Read an uninitialized value from stack at a variable offset */
> +SEC("socket")
> +__naked int read_uninit_stack_var_off(void *ctx)
> +{
> + asm volatile (" \
> + call %[bpf_get_prandom_u32]; \
> + // force stack depth to be 64 \
> + *(u64*)(r10 - 64) = r0; \
> + r0 = -r0; \
> + // give r0 a range [-31, -1] \
> + if r0 s<= -32 goto exit_%=; \
> + if r0 s>= 0 goto exit_%=; \
> + // access stack using r0 \
> + r1 = r10; \
> + r1 += r0; \
> + r2 = *(u8*)(r1 + 0); \
> +exit_%=: r0 = 0; \
> + exit; \
> +"
> + :
> + : __imm(bpf_get_prandom_u32)
> + : __clobber_all);
> +}
> +
> +char _license[] SEC("license") = "GPL";
> --
> 2.39.1
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: Allow reads from uninit stack
2023-02-17 0:36 ` Andrii Nakryiko
@ 2023-02-17 13:13 ` Eduard Zingerman
2023-02-17 21:58 ` Andrii Nakryiko
0 siblings, 1 reply; 12+ messages in thread
From: Eduard Zingerman @ 2023-02-17 13:13 UTC (permalink / raw)
To: Andrii Nakryiko; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs
On Thu, 2023-02-16 at 16:36 -0800, Andrii Nakryiko wrote:
> On Thu, Feb 16, 2023 at 10:36 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > This commits updates the following functions to allow reads from
> > uninitialized stack locations when env->allow_uninit_stack option is
> > enabled:
> > - check_stack_read_fixed_off()
> > - check_stack_range_initialized(), called from:
> > - check_stack_read_var_off()
> > - check_helper_mem_access()
> >
> > Such change allows to relax logic in stacksafe() to treat STACK_MISC
> > and STACK_INVALID in a same way and make the following stack slot
> > configurations equivalent:
> >
> > | Cached state | Current state |
> > | stack slot | stack slot |
> > |------------------+------------------|
> > | STACK_INVALID or | STACK_INVALID or |
> > | STACK_MISC | STACK_SPILL or |
> > | | STACK_MISC or |
> > | | STACK_ZERO or |
> > | | STACK_DYNPTR |
> >
> > This leads to significant verification speed gains (see below).
> >
> > The idea was suggested by Andrii Nakryiko [1] and initial patch was
> > created by Alexei Starovoitov [2].
> >
> > Currently the env->allow_uninit_stack is allowed for programs loaded
> > by users with CAP_PERFMON or CAP_SYS_ADMIN capabilities.
> >
> > A number of test cases from verifier/*.c were expecting uninitialized
> > stack access to be an error. These test cases were updated to execute
> > in unprivileged mode (thus preserving the tests).
> >
> > The test progs/test_global_func10.c expected "invalid indirect access
> > to stack" error message because of the access to uninitialized memory
> > region. The test is updated to provoke the same error message by
> > accessing stack out of allocated range.
> >
> > The following tests had to be removed because these can't be made
> > unprivileged:
> > - verifier/sock.c:
> > - "sk_storage_get(map, skb->sk, &stack_value, 1): partially init
> > stack_value"
> > BPF_PROG_TYPE_SCHED_CLS programs are not executed in unprivileged mode.
> > - verifier/var_off.c:
> > - "indirect variable-offset stack access, max_off+size > max_initialized"
> > - "indirect variable-offset stack access, uninitialized"
> > These tests verify that access to uninitialized stack values is
> > detected when stack offset is not a constant. However, variable
> > stack access is prohibited in unprivileged mode, thus these tests
> > are no longer valid.
> >
> > * * *
> >
> > Here is veristat log comparing this patch with current master on a
> > set of selftest binaries listed in tools/testing/selftests/bpf/veristat.cfg
> > and cilium BPF binaries (see [3]):
> >
> > $ ./veristat -e file,prog,states -C -f 'states_pct<-30' master.log current.log
> > File Program States (A) States (B) States (DIFF)
> > -------------------------- -------------------------- ---------- ---------- ----------------
> > bpf_host.o tail_handle_ipv6_from_host 349 244 -105 (-30.09%)
> > bpf_host.o tail_handle_nat_fwd_ipv4 1320 895 -425 (-32.20%)
> > bpf_lxc.o tail_handle_nat_fwd_ipv4 1320 895 -425 (-32.20%)
> > bpf_sock.o cil_sock4_connect 70 48 -22 (-31.43%)
> > bpf_sock.o cil_sock4_sendmsg 68 46 -22 (-32.35%)
> > bpf_xdp.o tail_handle_nat_fwd_ipv4 1554 803 -751 (-48.33%)
> > bpf_xdp.o tail_lb_ipv4 6457 2473 -3984 (-61.70%)
> > bpf_xdp.o tail_lb_ipv6 7249 3908 -3341 (-46.09%)
> > pyperf600_bpf_loop.bpf.o on_event 287 145 -142 (-49.48%)
> > strobemeta.bpf.o on_event 15915 4772 -11143 (-70.02%)
> > strobemeta_nounroll2.bpf.o on_event 17087 3820 -13267 (-77.64%)
> > xdp_synproxy_kern.bpf.o syncookie_tc 21271 6635 -14636 (-68.81%)
> > xdp_synproxy_kern.bpf.o syncookie_xdp 23122 6024 -17098 (-73.95%)
> > -------------------------- -------------------------- ---------- ---------- ----------------
> >
> > Note: I limited selection by states_pct<-30%.
> >
> > Inspection of differences in pyperf600_bpf_loop behavior shows that
> > the following patch for the test removes almost all differences:
> >
> > - a/tools/testing/selftests/bpf/progs/pyperf.h
> > + b/tools/testing/selftests/bpf/progs/pyperf.h
> > @ -266,8 +266,8 @ int __on_event(struct bpf_raw_tracepoint_args *ctx)
> > }
> >
> > if (event->pthread_match || !pidData->use_tls) {
> > - void* frame_ptr;
> > - FrameData frame;
> > + void* frame_ptr = 0;
> > + FrameData frame = {};
> > Symbol sym = {};
> > int cur_cpu = bpf_get_smp_processor_id();
> >
> > W/o this patch the difference comes from the following pattern
> > (for different variables):
> >
> > static bool get_frame_data(... FrameData *frame ...)
> > {
> > ...
> > bpf_probe_read_user(&frame->f_code, ...);
> > if (!frame->f_code)
> > return false;
> > ...
> > bpf_probe_read_user(&frame->co_name, ...);
> > if (frame->co_name)
> > ...;
> > }
> >
> > int __on_event(struct bpf_raw_tracepoint_args *ctx)
> > {
> > FrameData frame;
> > ...
> > get_frame_data(... &frame ...) // indirectly via a bpf_loop & callback
> > ...
> > }
> >
> > SEC("raw_tracepoint/kfree_skb")
> > int on_event(struct bpf_raw_tracepoint_args* ctx)
> > {
> > ...
> > ret |= __on_event(ctx);
> > ret |= __on_event(ctx);
> > ...
> > }
> >
> > With regards to value `frame->co_name` the following is important:
> > - Because of the conditional `if (!frame->f_code)` each call to
> > __on_event() produces two states, one with `frame->co_name` marked
> > as STACK_MISC, another with it as is (and marked STACK_INVALID on a
> > first call).
> > - The call to bpf_probe_read_user() does not mark stack slots
> > corresponding to `&frame->co_name` as REG_LIVE_WRITTEN but it marks
> > these slots as BPF_MISC, this happens because of the following loop
> > in the check_helper_call():
> >
> > for (i = 0; i < meta.access_size; i++) {
> > err = check_mem_access(env, insn_idx, meta.regno, i, BPF_B,
> > BPF_WRITE, -1, false);
> > if (err)
> > return err;
> > }
> >
> > Note the size of the write, it is a one byte write for each byte
> > touched by a helper. The BPF_B write does not lead to write marks
> > for the target stack slot.
> > - Which means that w/o this patch when second __on_event() call is
> > verified `if (frame->co_name)` will propagate read marks first to a
> > stack slot with STACK_MISC marks and second to a stack slot with
> > STACK_INVALID marks and these states would be considered different.
> >
> > [1] https://lore.kernel.org/bpf/CAEf4BzY3e+ZuC6HUa8dCiUovQRg2SzEk7M-dSkqNZyn=xEmnPA@mail.gmail.com/
> > [2] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/
> > [3] git@github.com:anakryiko/cilium.git
> >
> > Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> > Suggested-by: Alexei Starovoitov <ast@kernel.org>
> > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> > ---
> > kernel/bpf/verifier.c | 10 ++
> > .../selftests/bpf/progs/test_global_func10.c | 6 +-
> > tools/testing/selftests/bpf/verifier/calls.c | 13 ++-
> > .../bpf/verifier/helper_access_var_len.c | 104 ++++++++++++------
> > .../testing/selftests/bpf/verifier/int_ptr.c | 9 +-
> > .../selftests/bpf/verifier/search_pruning.c | 13 ++-
> > tools/testing/selftests/bpf/verifier/sock.c | 27 -----
> > .../selftests/bpf/verifier/spill_fill.c | 7 +-
> > .../testing/selftests/bpf/verifier/var_off.c | 52 ---------
> > 9 files changed, 107 insertions(+), 134 deletions(-)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 272563a0b770..6fbd0e25ccab 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -3826,6 +3826,8 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
> > continue;
> > if (type == STACK_MISC)
> > continue;
> > + if (type == STACK_INVALID && env->allow_uninit_stack)
> > + continue;
> > verbose(env, "invalid read from stack off %d+%d size %d\n",
> > off, i, size);
> > return -EACCES;
> > @@ -3863,6 +3865,8 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
> > continue;
> > if (type == STACK_ZERO)
> > continue;
> > + if (type == STACK_INVALID && env->allow_uninit_stack)
> > + continue;
> > verbose(env, "invalid read from stack off %d+%d size %d\n",
> > off, i, size);
> > return -EACCES;
> > @@ -5761,6 +5765,8 @@ static int check_stack_range_initialized(
> > }
> > goto mark;
> > }
> > + if (*stype == STACK_INVALID && env->allow_uninit_stack)
> > + goto mark;
>
> should we support clobber and conversion to STACK_MISC like we do for
> STACK_ZERO? If yes, probably cleaner to just extend condition to
>
> if ((*stype == STACK_ZERO) || (*stype == STACK_INVALID &&
> env->allow_uninit_stack))
>
> ?
As far as I understand, conversion of STACK_ZERO to STACK_MISC is
necessary for safety reasons (like we can't be sure that memory will
remain STACK_ZERO after clobber call).
However for STACK_INVALID -> STACK_MISC case, I don't think there is a
way to observe such change (apart from log output). After this patch
there would be no difference between STACK_INVALID and STACK_MISC in
privileged mode.
Hence, such change is a matter of style and does not affect verifier
behavior. If you think that the following is more concise:
if ((*stype == STACK_ZERO) ||
(*stype == STACK_INVALID && env->allow_uninit_stack)) {
if (clobber) {
/* helper can write anything into the stack */
*stype = STACK_MISC;
}
goto mark;
}
I can make this update and add appropriate test, checking log output.
Personally, I that intent would be more clear if the current notation
is preserved.
>
>
> Other than that, looks good:
>
> Acked-by: Andrii Nakryiko <andrii@kernel.org>
>
> >
> > if (is_spilled_reg(&state->stack[spi]) &&
> > (state->stack[spi].spilled_ptr.type == SCALAR_VALUE ||
> > @@ -13936,6 +13942,10 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
> > if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_INVALID)
> > continue;
> >
> > + if (env->allow_uninit_stack &&
> > + old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_MISC)
> > + continue;
> > +
> > /* explored stack has more populated slots than current stack
> > * and these slots were used
> > */
>
> [...]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH bpf-next 2/2] selftests/bpf: Tests for uninitialized stack reads
2023-02-17 0:55 ` Andrii Nakryiko
@ 2023-02-17 13:25 ` Eduard Zingerman
2023-02-17 22:00 ` Andrii Nakryiko
0 siblings, 1 reply; 12+ messages in thread
From: Eduard Zingerman @ 2023-02-17 13:25 UTC (permalink / raw)
To: Andrii Nakryiko; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs
On Thu, 2023-02-16 at 16:55 -0800, Andrii Nakryiko wrote:
> On Thu, Feb 16, 2023 at 10:36 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > Two testcases to make sure that stack reads from uninitialized
> > locations are accepted by verifier when executed in privileged mode:
> > - read from a fixed offset;
> > - read from a variable offset.
> >
> > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> > ---
> > .../selftests/bpf/prog_tests/uninit_stack.c | 9 +++
> > .../selftests/bpf/progs/uninit_stack.c | 55 +++++++++++++++++++
> > 2 files changed, 64 insertions(+)
> > create mode 100644 tools/testing/selftests/bpf/prog_tests/uninit_stack.c
> > create mode 100644 tools/testing/selftests/bpf/progs/uninit_stack.c
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/uninit_stack.c b/tools/testing/selftests/bpf/prog_tests/uninit_stack.c
> > new file mode 100644
> > index 000000000000..e64c71948491
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/prog_tests/uninit_stack.c
> > @@ -0,0 +1,9 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include <test_progs.h>
> > +#include "uninit_stack.skel.h"
> > +
> > +void test_uninit_stack(void)
> > +{
> > + RUN_TESTS(uninit_stack);
> > +}
> > diff --git a/tools/testing/selftests/bpf/progs/uninit_stack.c b/tools/testing/selftests/bpf/progs/uninit_stack.c
> > new file mode 100644
> > index 000000000000..20ff6a22c906
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/uninit_stack.c
> > @@ -0,0 +1,55 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include <linux/bpf.h>
> > +#include <bpf/bpf_helpers.h>
> > +#include "bpf_misc.h"
> > +
> > +/* Read an uninitialized value from stack at a fixed offset */
> > +SEC("socket")
> > +__naked int read_uninit_stack_fixed_off(void *ctx)
> > +{
> > + asm volatile (" \
> > + // force stack depth to be 128 \
> > + *(u64*)(r10 - 128) = r1; \
> > + r1 = *(u8 *)(r10 - 8 ); \
> > + r1 = *(u8 *)(r10 - 11); \
> > + r1 = *(u8 *)(r10 - 13); \
> > + r1 = *(u8 *)(r10 - 15); \
> > + r1 = *(u16*)(r10 - 16); \
> > + r1 = *(u32*)(r10 - 32); \
> > + r1 = *(u64*)(r10 - 64); \
> > + // read from a spill of a wrong size, it is a separate \
> > + // branch in check_stack_read_fixed_off() \
> > + *(u32*)(r10 - 72) = r1; \
> > + r1 = *(u64*)(r10 - 72); \
> > + r0 = 0; \
> > + exit; \
>
> would it be better to
>
> r0 = *(u64*)(r10 - 72);
> exit;
>
> to make sure that in the future verifier doesn't smartly optimize out
> unused reads?
Are there plans for such optimizations? If there are, many tests might
be in trouble. I thought that this is delegated to the C compiler.
For this particular case the rewrite might look as:
asm volatile (" \
r0 = 0; \
/* force stack depth to be 128 */ \
*(u64*)(r10 - 128) = r1; \
r1 = *(u8 *)(r10 - 8 ); \
r0 += r1; \
r1 = *(u8 *)(r10 - 11); \
r0 += r1; \
r1 = *(u8 *)(r10 - 13); \
r0 += r1; \
r1 = *(u8 *)(r10 - 15); \
r0 += r1; \
r1 = *(u16*)(r10 - 16); \
r0 += r1; \
r1 = *(u32*)(r10 - 32); \
r0 += r1; \
r1 = *(u64*)(r10 - 64); \
r0 += r1; \
/* read from a spill of a wrong size, it is a separate \
* branch in check_stack_read_fixed_off() \
*/ \
*(u32*)(r10 - 72) = r1; \
r1 = *(u64*)(r10 - 72); \
r0 += r1; \
exit; \
"
::: __clobber_all);
It works but is kinda ugly.
---
Orthogonal to the above issue, I found that use of the '//' comments
in the asm code w/o newlines is invalid, as it makes rest of the
string a comment. I changed '\n\' line endings to '\' just before
sending the patch and did not verify the change.
=> The patch-set would have to be resent.
>
>
> Either way, looks good to me:
>
> Acked-by: Andrii Nakryiko <andrii@kernel.org>
>
> > +"
> > + ::: __clobber_all);
> > +}
> > +
> > +/* Read an uninitialized value from stack at a variable offset */
> > +SEC("socket")
> > +__naked int read_uninit_stack_var_off(void *ctx)
> > +{
> > + asm volatile (" \
> > + call %[bpf_get_prandom_u32]; \
> > + // force stack depth to be 64 \
> > + *(u64*)(r10 - 64) = r0; \
> > + r0 = -r0; \
> > + // give r0 a range [-31, -1] \
> > + if r0 s<= -32 goto exit_%=; \
> > + if r0 s>= 0 goto exit_%=; \
> > + // access stack using r0 \
> > + r1 = r10; \
> > + r1 += r0; \
> > + r2 = *(u8*)(r1 + 0); \
> > +exit_%=: r0 = 0; \
> > + exit; \
> > +"
> > + :
> > + : __imm(bpf_get_prandom_u32)
> > + : __clobber_all);
> > +}
> > +
> > +char _license[] SEC("license") = "GPL";
> > --
> > 2.39.1
> >
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH bpf-next 0/2] Allow reads from uninit stack
2023-02-16 18:36 [PATCH bpf-next 0/2] Allow reads from uninit stack Eduard Zingerman
2023-02-16 18:36 ` [PATCH bpf-next 1/2] bpf: " Eduard Zingerman
2023-02-16 18:36 ` [PATCH bpf-next 2/2] selftests/bpf: Tests for uninitialized stack reads Eduard Zingerman
@ 2023-02-17 20:37 ` Daniel Borkmann
2023-02-17 20:46 ` Eduard Zingerman
2 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2023-02-17 20:37 UTC (permalink / raw)
To: Eduard Zingerman, bpf, ast; +Cc: andrii, martin.lau, kernel-team, yhs
On 2/16/23 7:36 PM, Eduard Zingerman wrote:
> This patch-set modifies BPF verifier to accept programs that read from
> uninitialized stack locations, but only if executed in privileged mode.
> This provides significant verification performance gains: 30% to 70% less
> processed states for big number of test programs.
>
> The reason for performance gains comes from treating STACK_MISC and
> STACK_INVALID as compatible, when cached state is compared to current state
> in verifier.c:stacksafe().
>
> The change should not affect safety, because any value read from STACK_MISC
> location has full binary range (e.g. 0x00-0xff for byte-sized reads).
>
> Details and measurements are provided in the description for the patch #1.
>
> The change was suggested by Andrii Nakryiko, the initial patch was created
> by Alexei Starovoitov. The discussion could be found at [1].
>
> [1] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/
Ptal, looks like BPF CI is complaining:
https://github.com/kernel-patches/bpf/actions/runs/4205832876/jobs/7298488977
[...]
GEN-SKEL [test_progs] bpf_mod_race.skel.h
GEN-SKEL [test_progs] trace_dummy_st_ops.skel.h
libbpf: sec 'socket': corrupted program 'read_uninit_stack_fixed_off', offset 0, size 0
Error: failed to open BPF object file: Invalid argument
GEN-SKEL [test_progs] test_raw_tp_test_run.skel.h
make: *** [Makefile:578: /tmp/work/bpf/bpf/tools/testing/selftests/bpf/uninit_stack.skel.h] Error 234
make: *** Deleting file '/tmp/work/bpf/bpf/tools/testing/selftests/bpf/uninit_stack.skel.h'
make: *** Waiting for unfinished jobs....
make: Leaving directory '/tmp/work/bpf/bpf/tools/testing/selftests/bpf'
Error: Process completed with exit code 2.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH bpf-next 0/2] Allow reads from uninit stack
2023-02-17 20:37 ` [PATCH bpf-next 0/2] Allow reads from uninit stack Daniel Borkmann
@ 2023-02-17 20:46 ` Eduard Zingerman
0 siblings, 0 replies; 12+ messages in thread
From: Eduard Zingerman @ 2023-02-17 20:46 UTC (permalink / raw)
To: Daniel Borkmann, bpf, ast; +Cc: andrii, martin.lau, kernel-team, yhs
On Fri, 2023-02-17 at 21:37 +0100, Daniel Borkmann wrote:
[...]
> Ptal, looks like BPF CI is complaining:
>
> https://github.com/kernel-patches/bpf/actions/runs/4205832876/jobs/7298488977
>
Yes, I messed up comments in the asm blocks when replaced '\n\' line
endings with '\' before sending the patch w/o re-testing.
Sorry about that.
I'm waiting for answers from Andrii and will resend the patch-set.
---
Here is how the tests should look like:
/* Read an uninitialized value from stack at a fixed offset */
SEC("socket")
__naked int read_uninit_stack_fixed_off(void *ctx)
{
asm volatile (" \
r0 = 0; \
/* force stack depth to be 128 */ \
*(u64*)(r10 - 128) = r1; \
r1 = *(u8 *)(r10 - 8 ); \
r0 += r1; \
r1 = *(u8 *)(r10 - 11); \
r1 = *(u8 *)(r10 - 13); \
r1 = *(u8 *)(r10 - 15); \
r1 = *(u16*)(r10 - 16); \
r1 = *(u32*)(r10 - 32); \
r1 = *(u64*)(r10 - 64); \
/* read from a spill of a wrong size, it is a separate \
* branch in check_stack_read_fixed_off() \
*/ \
*(u32*)(r10 - 72) = r1; \
r1 = *(u64*)(r10 - 72); \
r0 = 0; \
exit; \
"
::: __clobber_all);
}
/* Read an uninitialized value from stack at a variable offset */
SEC("socket")
__naked int read_uninit_stack_var_off(void *ctx)
{
asm volatile (" \
call %[bpf_get_prandom_u32]; \
/* force stack depth to be 64 */ \
*(u64*)(r10 - 64) = r0; \
r0 = -r0; \
/* give r0 a range [-31, -1] */ \
if r0 s<= -32 goto exit_%=; \
if r0 s>= 0 goto exit_%=; \
/* access stack using r0 */ \
r1 = r10; \
r1 += r0; \
r2 = *(u8*)(r1 + 0); \
exit_%=: r0 = 0; \
exit; \
"
:
: __imm(bpf_get_prandom_u32)
: __clobber_all);
}
[...]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: Allow reads from uninit stack
2023-02-17 13:13 ` Eduard Zingerman
@ 2023-02-17 21:58 ` Andrii Nakryiko
0 siblings, 0 replies; 12+ messages in thread
From: Andrii Nakryiko @ 2023-02-17 21:58 UTC (permalink / raw)
To: Eduard Zingerman; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs
On Fri, Feb 17, 2023 at 5:13 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2023-02-16 at 16:36 -0800, Andrii Nakryiko wrote:
> > On Thu, Feb 16, 2023 at 10:36 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > This commits updates the following functions to allow reads from
> > > uninitialized stack locations when env->allow_uninit_stack option is
> > > enabled:
> > > - check_stack_read_fixed_off()
> > > - check_stack_range_initialized(), called from:
> > > - check_stack_read_var_off()
> > > - check_helper_mem_access()
> > >
> > > Such change allows to relax logic in stacksafe() to treat STACK_MISC
> > > and STACK_INVALID in a same way and make the following stack slot
> > > configurations equivalent:
> > >
> > > | Cached state | Current state |
> > > | stack slot | stack slot |
> > > |------------------+------------------|
> > > | STACK_INVALID or | STACK_INVALID or |
> > > | STACK_MISC | STACK_SPILL or |
> > > | | STACK_MISC or |
> > > | | STACK_ZERO or |
> > > | | STACK_DYNPTR |
> > >
> > > This leads to significant verification speed gains (see below).
> > >
> > > The idea was suggested by Andrii Nakryiko [1] and initial patch was
> > > created by Alexei Starovoitov [2].
> > >
> > > Currently the env->allow_uninit_stack is allowed for programs loaded
> > > by users with CAP_PERFMON or CAP_SYS_ADMIN capabilities.
> > >
> > > A number of test cases from verifier/*.c were expecting uninitialized
> > > stack access to be an error. These test cases were updated to execute
> > > in unprivileged mode (thus preserving the tests).
> > >
> > > The test progs/test_global_func10.c expected "invalid indirect access
> > > to stack" error message because of the access to uninitialized memory
> > > region. The test is updated to provoke the same error message by
> > > accessing stack out of allocated range.
> > >
> > > The following tests had to be removed because these can't be made
> > > unprivileged:
> > > - verifier/sock.c:
> > > - "sk_storage_get(map, skb->sk, &stack_value, 1): partially init
> > > stack_value"
> > > BPF_PROG_TYPE_SCHED_CLS programs are not executed in unprivileged mode.
> > > - verifier/var_off.c:
> > > - "indirect variable-offset stack access, max_off+size > max_initialized"
> > > - "indirect variable-offset stack access, uninitialized"
> > > These tests verify that access to uninitialized stack values is
> > > detected when stack offset is not a constant. However, variable
> > > stack access is prohibited in unprivileged mode, thus these tests
> > > are no longer valid.
> > >
> > > * * *
> > >
> > > Here is veristat log comparing this patch with current master on a
> > > set of selftest binaries listed in tools/testing/selftests/bpf/veristat.cfg
> > > and cilium BPF binaries (see [3]):
> > >
> > > $ ./veristat -e file,prog,states -C -f 'states_pct<-30' master.log current.log
> > > File Program States (A) States (B) States (DIFF)
> > > -------------------------- -------------------------- ---------- ---------- ----------------
> > > bpf_host.o tail_handle_ipv6_from_host 349 244 -105 (-30.09%)
> > > bpf_host.o tail_handle_nat_fwd_ipv4 1320 895 -425 (-32.20%)
> > > bpf_lxc.o tail_handle_nat_fwd_ipv4 1320 895 -425 (-32.20%)
> > > bpf_sock.o cil_sock4_connect 70 48 -22 (-31.43%)
> > > bpf_sock.o cil_sock4_sendmsg 68 46 -22 (-32.35%)
> > > bpf_xdp.o tail_handle_nat_fwd_ipv4 1554 803 -751 (-48.33%)
> > > bpf_xdp.o tail_lb_ipv4 6457 2473 -3984 (-61.70%)
> > > bpf_xdp.o tail_lb_ipv6 7249 3908 -3341 (-46.09%)
> > > pyperf600_bpf_loop.bpf.o on_event 287 145 -142 (-49.48%)
> > > strobemeta.bpf.o on_event 15915 4772 -11143 (-70.02%)
> > > strobemeta_nounroll2.bpf.o on_event 17087 3820 -13267 (-77.64%)
> > > xdp_synproxy_kern.bpf.o syncookie_tc 21271 6635 -14636 (-68.81%)
> > > xdp_synproxy_kern.bpf.o syncookie_xdp 23122 6024 -17098 (-73.95%)
> > > -------------------------- -------------------------- ---------- ---------- ----------------
> > >
> > > Note: I limited selection by states_pct<-30%.
> > >
> > > Inspection of differences in pyperf600_bpf_loop behavior shows that
> > > the following patch for the test removes almost all differences:
> > >
> > > - a/tools/testing/selftests/bpf/progs/pyperf.h
> > > + b/tools/testing/selftests/bpf/progs/pyperf.h
> > > @ -266,8 +266,8 @ int __on_event(struct bpf_raw_tracepoint_args *ctx)
> > > }
> > >
> > > if (event->pthread_match || !pidData->use_tls) {
> > > - void* frame_ptr;
> > > - FrameData frame;
> > > + void* frame_ptr = 0;
> > > + FrameData frame = {};
> > > Symbol sym = {};
> > > int cur_cpu = bpf_get_smp_processor_id();
> > >
> > > W/o this patch the difference comes from the following pattern
> > > (for different variables):
> > >
> > > static bool get_frame_data(... FrameData *frame ...)
> > > {
> > > ...
> > > bpf_probe_read_user(&frame->f_code, ...);
> > > if (!frame->f_code)
> > > return false;
> > > ...
> > > bpf_probe_read_user(&frame->co_name, ...);
> > > if (frame->co_name)
> > > ...;
> > > }
> > >
> > > int __on_event(struct bpf_raw_tracepoint_args *ctx)
> > > {
> > > FrameData frame;
> > > ...
> > > get_frame_data(... &frame ...) // indirectly via a bpf_loop & callback
> > > ...
> > > }
> > >
> > > SEC("raw_tracepoint/kfree_skb")
> > > int on_event(struct bpf_raw_tracepoint_args* ctx)
> > > {
> > > ...
> > > ret |= __on_event(ctx);
> > > ret |= __on_event(ctx);
> > > ...
> > > }
> > >
> > > With regards to value `frame->co_name` the following is important:
> > > - Because of the conditional `if (!frame->f_code)` each call to
> > > __on_event() produces two states, one with `frame->co_name` marked
> > > as STACK_MISC, another with it as is (and marked STACK_INVALID on a
> > > first call).
> > > - The call to bpf_probe_read_user() does not mark stack slots
> > > corresponding to `&frame->co_name` as REG_LIVE_WRITTEN but it marks
> > > these slots as BPF_MISC, this happens because of the following loop
> > > in the check_helper_call():
> > >
> > > for (i = 0; i < meta.access_size; i++) {
> > > err = check_mem_access(env, insn_idx, meta.regno, i, BPF_B,
> > > BPF_WRITE, -1, false);
> > > if (err)
> > > return err;
> > > }
> > >
> > > Note the size of the write, it is a one byte write for each byte
> > > touched by a helper. The BPF_B write does not lead to write marks
> > > for the target stack slot.
> > > - Which means that w/o this patch when second __on_event() call is
> > > verified `if (frame->co_name)` will propagate read marks first to a
> > > stack slot with STACK_MISC marks and second to a stack slot with
> > > STACK_INVALID marks and these states would be considered different.
> > >
> > > [1] https://lore.kernel.org/bpf/CAEf4BzY3e+ZuC6HUa8dCiUovQRg2SzEk7M-dSkqNZyn=xEmnPA@mail.gmail.com/
> > > [2] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/
> > > [3] git@github.com:anakryiko/cilium.git
> > >
> > > Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> > > Suggested-by: Alexei Starovoitov <ast@kernel.org>
> > > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> > > ---
> > > kernel/bpf/verifier.c | 10 ++
> > > .../selftests/bpf/progs/test_global_func10.c | 6 +-
> > > tools/testing/selftests/bpf/verifier/calls.c | 13 ++-
> > > .../bpf/verifier/helper_access_var_len.c | 104 ++++++++++++------
> > > .../testing/selftests/bpf/verifier/int_ptr.c | 9 +-
> > > .../selftests/bpf/verifier/search_pruning.c | 13 ++-
> > > tools/testing/selftests/bpf/verifier/sock.c | 27 -----
> > > .../selftests/bpf/verifier/spill_fill.c | 7 +-
> > > .../testing/selftests/bpf/verifier/var_off.c | 52 ---------
> > > 9 files changed, 107 insertions(+), 134 deletions(-)
> > >
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 272563a0b770..6fbd0e25ccab 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -3826,6 +3826,8 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
> > > continue;
> > > if (type == STACK_MISC)
> > > continue;
> > > + if (type == STACK_INVALID && env->allow_uninit_stack)
> > > + continue;
> > > verbose(env, "invalid read from stack off %d+%d size %d\n",
> > > off, i, size);
> > > return -EACCES;
> > > @@ -3863,6 +3865,8 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
> > > continue;
> > > if (type == STACK_ZERO)
> > > continue;
> > > + if (type == STACK_INVALID && env->allow_uninit_stack)
> > > + continue;
> > > verbose(env, "invalid read from stack off %d+%d size %d\n",
> > > off, i, size);
> > > return -EACCES;
> > > @@ -5761,6 +5765,8 @@ static int check_stack_range_initialized(
> > > }
> > > goto mark;
> > > }
> > > + if (*stype == STACK_INVALID && env->allow_uninit_stack)
> > > + goto mark;
> >
> > should we support clobber and conversion to STACK_MISC like we do for
> > STACK_ZERO? If yes, probably cleaner to just extend condition to
> >
> > if ((*stype == STACK_ZERO) || (*stype == STACK_INVALID &&
> > env->allow_uninit_stack))
> >
> > ?
>
> As far as I understand, conversion of STACK_ZERO to STACK_MISC is
> necessary for safety reasons (like we can't be sure that memory will
> remain STACK_ZERO after clobber call).
>
> However for STACK_INVALID -> STACK_MISC case, I don't think there is a
> way to observe such change (apart from log output). After this patch
> there would be no difference between STACK_INVALID and STACK_MISC in
> privileged mode.
>
> Hence, such change is a matter of style and does not affect verifier
> behavior. If you think that the following is more concise:
>
> if ((*stype == STACK_ZERO) ||
> (*stype == STACK_INVALID && env->allow_uninit_stack)) {
> if (clobber) {
> /* helper can write anything into the stack */
> *stype = STACK_MISC;
> }
> goto mark;
> }
>
> I can make this update and add appropriate test, checking log output.
> Personally, I that intent would be more clear if the current notation
> is preserved.
It seems like the clobber flag is used when helper is writing out data
into stack memory. So it makes sense to represent that memory as
initialized but unknown, that is STACK_MISC. It's not INVALID anymore
after the helper call, is my point.
>
> >
> >
> > Other than that, looks good:
> >
> > Acked-by: Andrii Nakryiko <andrii@kernel.org>
> >
> > >
> > > if (is_spilled_reg(&state->stack[spi]) &&
> > > (state->stack[spi].spilled_ptr.type == SCALAR_VALUE ||
> > > @@ -13936,6 +13942,10 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
> > > if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_INVALID)
> > > continue;
> > >
> > > + if (env->allow_uninit_stack &&
> > > + old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_MISC)
> > > + continue;
> > > +
> > > /* explored stack has more populated slots than current stack
> > > * and these slots were used
> > > */
> >
> > [...]
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH bpf-next 2/2] selftests/bpf: Tests for uninitialized stack reads
2023-02-17 13:25 ` Eduard Zingerman
@ 2023-02-17 22:00 ` Andrii Nakryiko
2023-02-17 22:06 ` Eduard Zingerman
0 siblings, 1 reply; 12+ messages in thread
From: Andrii Nakryiko @ 2023-02-17 22:00 UTC (permalink / raw)
To: Eduard Zingerman; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs
On Fri, Feb 17, 2023 at 5:25 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2023-02-16 at 16:55 -0800, Andrii Nakryiko wrote:
> > On Thu, Feb 16, 2023 at 10:36 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > Two testcases to make sure that stack reads from uninitialized
> > > locations are accepted by verifier when executed in privileged mode:
> > > - read from a fixed offset;
> > > - read from a variable offset.
> > >
> > > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> > > ---
> > > .../selftests/bpf/prog_tests/uninit_stack.c | 9 +++
> > > .../selftests/bpf/progs/uninit_stack.c | 55 +++++++++++++++++++
> > > 2 files changed, 64 insertions(+)
> > > create mode 100644 tools/testing/selftests/bpf/prog_tests/uninit_stack.c
> > > create mode 100644 tools/testing/selftests/bpf/progs/uninit_stack.c
> > >
> > > diff --git a/tools/testing/selftests/bpf/prog_tests/uninit_stack.c b/tools/testing/selftests/bpf/prog_tests/uninit_stack.c
> > > new file mode 100644
> > > index 000000000000..e64c71948491
> > > --- /dev/null
> > > +++ b/tools/testing/selftests/bpf/prog_tests/uninit_stack.c
> > > @@ -0,0 +1,9 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +
> > > +#include <test_progs.h>
> > > +#include "uninit_stack.skel.h"
> > > +
> > > +void test_uninit_stack(void)
> > > +{
> > > + RUN_TESTS(uninit_stack);
> > > +}
> > > diff --git a/tools/testing/selftests/bpf/progs/uninit_stack.c b/tools/testing/selftests/bpf/progs/uninit_stack.c
> > > new file mode 100644
> > > index 000000000000..20ff6a22c906
> > > --- /dev/null
> > > +++ b/tools/testing/selftests/bpf/progs/uninit_stack.c
> > > @@ -0,0 +1,55 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +
> > > +#include <linux/bpf.h>
> > > +#include <bpf/bpf_helpers.h>
> > > +#include "bpf_misc.h"
> > > +
> > > +/* Read an uninitialized value from stack at a fixed offset */
> > > +SEC("socket")
> > > +__naked int read_uninit_stack_fixed_off(void *ctx)
> > > +{
> > > + asm volatile (" \
> > > + // force stack depth to be 128 \
> > > + *(u64*)(r10 - 128) = r1; \
> > > + r1 = *(u8 *)(r10 - 8 ); \
> > > + r1 = *(u8 *)(r10 - 11); \
> > > + r1 = *(u8 *)(r10 - 13); \
> > > + r1 = *(u8 *)(r10 - 15); \
> > > + r1 = *(u16*)(r10 - 16); \
> > > + r1 = *(u32*)(r10 - 32); \
> > > + r1 = *(u64*)(r10 - 64); \
> > > + // read from a spill of a wrong size, it is a separate \
> > > + // branch in check_stack_read_fixed_off() \
> > > + *(u32*)(r10 - 72) = r1; \
> > > + r1 = *(u64*)(r10 - 72); \
> > > + r0 = 0; \
> > > + exit; \
> >
> > would it be better to
> >
> > r0 = *(u64*)(r10 - 72);
> > exit;
> >
> > to make sure that in the future verifier doesn't smartly optimize out
> > unused reads?
>
> Are there plans for such optimizations? If there are, many tests might
> be in trouble. I thought that this is delegated to the C compiler.
I'm not aware of them, just hypothetical concern
>
> For this particular case the rewrite might look as:
>
> asm volatile (" \
> r0 = 0; \
> /* force stack depth to be 128 */ \
> *(u64*)(r10 - 128) = r1; \
> r1 = *(u8 *)(r10 - 8 ); \
> r0 += r1; \
> r1 = *(u8 *)(r10 - 11); \
> r0 += r1; \
> r1 = *(u8 *)(r10 - 13); \
> r0 += r1; \
> r1 = *(u8 *)(r10 - 15); \
> r0 += r1; \
> r1 = *(u16*)(r10 - 16); \
> r0 += r1; \
> r1 = *(u32*)(r10 - 32); \
> r0 += r1; \
> r1 = *(u64*)(r10 - 64); \
> r0 += r1; \
> /* read from a spill of a wrong size, it is a separate \
> * branch in check_stack_read_fixed_off() \
> */ \
> *(u32*)(r10 - 72) = r1; \
> r1 = *(u64*)(r10 - 72); \
> r0 += r1; \
> exit; \
> "
> ::: __clobber_all);
>
> It works but is kinda ugly.
nah, no need
>
> ---
>
> Orthogonal to the above issue, I found that use of the '//' comments
> in the asm code w/o newlines is invalid, as it makes rest of the
> string a comment. I changed '\n\' line endings to '\' just before
> sending the patch and did not verify the change.
> => The patch-set would have to be resent.
I was wondering about that, but assumed you tested it ;) so yeah,
please fix and resend. (in that sense having each line separately
quoted allows much easier commenting, but we've decided on this style,
so let's stick to it
>
> >
> >
> > Either way, looks good to me:
> >
> > Acked-by: Andrii Nakryiko <andrii@kernel.org>
> >
> > > +"
> > > + ::: __clobber_all);
> > > +}
> > > +
> > > +/* Read an uninitialized value from stack at a variable offset */
> > > +SEC("socket")
> > > +__naked int read_uninit_stack_var_off(void *ctx)
> > > +{
> > > + asm volatile (" \
> > > + call %[bpf_get_prandom_u32]; \
> > > + // force stack depth to be 64 \
> > > + *(u64*)(r10 - 64) = r0; \
> > > + r0 = -r0; \
> > > + // give r0 a range [-31, -1] \
> > > + if r0 s<= -32 goto exit_%=; \
> > > + if r0 s>= 0 goto exit_%=; \
> > > + // access stack using r0 \
> > > + r1 = r10; \
> > > + r1 += r0; \
> > > + r2 = *(u8*)(r1 + 0); \
> > > +exit_%=: r0 = 0; \
> > > + exit; \
> > > +"
> > > + :
> > > + : __imm(bpf_get_prandom_u32)
> > > + : __clobber_all);
> > > +}
> > > +
> > > +char _license[] SEC("license") = "GPL";
> > > --
> > > 2.39.1
> > >
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH bpf-next 2/2] selftests/bpf: Tests for uninitialized stack reads
2023-02-17 22:00 ` Andrii Nakryiko
@ 2023-02-17 22:06 ` Eduard Zingerman
0 siblings, 0 replies; 12+ messages in thread
From: Eduard Zingerman @ 2023-02-17 22:06 UTC (permalink / raw)
To: Andrii Nakryiko; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs
On Fri, 2023-02-17 at 14:00 -0800, Andrii Nakryiko wrote:
[...]
> > Orthogonal to the above issue, I found that use of the '//' comments
> > in the asm code w/o newlines is invalid, as it makes rest of the
> > string a comment. I changed '\n\' line endings to '\' just before
> > sending the patch and did not verify the change.
> > => The patch-set would have to be resent.
>
> I was wondering about that, but assumed you tested it ;) so yeah,
> please fix and resend. (in that sense having each line separately
> quoted allows much easier commenting, but we've decided on this style,
> so let's stick to it
And syntax highlighting for strings vs comments :'(
Thank you for the replies, I'll update the patch #1 and resend.
[...]
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2023-02-17 22:06 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-02-16 18:36 [PATCH bpf-next 0/2] Allow reads from uninit stack Eduard Zingerman
2023-02-16 18:36 ` [PATCH bpf-next 1/2] bpf: " Eduard Zingerman
2023-02-17 0:36 ` Andrii Nakryiko
2023-02-17 13:13 ` Eduard Zingerman
2023-02-17 21:58 ` Andrii Nakryiko
2023-02-16 18:36 ` [PATCH bpf-next 2/2] selftests/bpf: Tests for uninitialized stack reads Eduard Zingerman
2023-02-17 0:55 ` Andrii Nakryiko
2023-02-17 13:25 ` Eduard Zingerman
2023-02-17 22:00 ` Andrii Nakryiko
2023-02-17 22:06 ` Eduard Zingerman
2023-02-17 20:37 ` [PATCH bpf-next 0/2] Allow reads from uninit stack Daniel Borkmann
2023-02-17 20:46 ` Eduard Zingerman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox