* [PATCH bpf-next 0/2] bpf: stack depth tracking fix @ 2017-12-22 21:33 Alexei Starovoitov 2017-12-22 21:33 ` [PATCH bpf-next 1/2] bpf: fix maximum stack depth tracking logic Alexei Starovoitov 2017-12-22 21:33 ` [PATCH bpf-next 2/2] selftests/bpf: additional stack depth tests Alexei Starovoitov 0 siblings, 2 replies; 7+ messages in thread From: Alexei Starovoitov @ 2017-12-22 21:33 UTC (permalink / raw) To: David S . Miller; +Cc: Daniel Borkmann, Jann Horn, netdev, kernel-team Jann reported an issue with stack depth tracking. Fix it and add tests. Alexei Starovoitov (2): bpf: fix maximum stack depth tracking logic selftests/bpf: additional stack depth tests include/linux/bpf_verifier.h | 17 ++++++++++ kernel/bpf/verifier.c | 15 +++++---- tools/testing/selftests/bpf/test_verifier.c | 50 +++++++++++++++++++++++++++++ 3 files changed, 76 insertions(+), 6 deletions(-) -- 2.9.5 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH bpf-next 1/2] bpf: fix maximum stack depth tracking logic 2017-12-22 21:33 [PATCH bpf-next 0/2] bpf: stack depth tracking fix Alexei Starovoitov @ 2017-12-22 21:33 ` Alexei Starovoitov 2017-12-23 1:38 ` Jann Horn 2017-12-22 21:33 ` [PATCH bpf-next 2/2] selftests/bpf: additional stack depth tests Alexei Starovoitov 1 sibling, 1 reply; 7+ messages in thread From: Alexei Starovoitov @ 2017-12-22 21:33 UTC (permalink / raw) To: David S . Miller; +Cc: Daniel Borkmann, Jann Horn, netdev, kernel-team instead of computing max stack depth for current call chain only track the maximum possible stack depth of any function at given frame position. Such algorithm is simple and fast, but conservative, since it overestimates amount of stack used. Consider: main() // stack 32 { A(); B(); } A(){} // stack 256 B() // stack 64 { A(); } since A() is called at frame[1] and frame[2], the algorithm will estimate the max stack depth as 32 + 256 + 256 and will reject such program, though real max is 32 + 64 + 256. Fortunately the algorithm is good enough in practice. The alternative would be to track max stack of every function in the fast pass through the verifier and then do additional CFG walk just to compute max total. Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> --- include/linux/bpf_verifier.h | 17 +++++++++++++++++ kernel/bpf/verifier.c | 15 +++++++++------ 2 files changed, 26 insertions(+), 6 deletions(-) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index aaac589e490c..adc65ef093d1 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -194,7 +194,24 @@ struct bpf_verifier_env { struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */ struct bpf_verifer_log log; u32 subprog_starts[BPF_MAX_SUBPROGS]; + /* computes the stack depth of each bpf function */ u16 subprog_stack_depth[BPF_MAX_SUBPROGS + 1]; + /* gradually computes the maximum stack depth out of all functions + * called at this frame position. Example: + * main() + * { + * a(); + * b(); + * } + * b() + * { + * a(); + * } + * frame_stack_depth[0] = stack depth of main() + * frame_stack_depth[1] = max { stack depth of a(), stack depth of b() } + * frame_stack_depth[2] = stack depth of a() + */ + u16 frame_stack_depth[MAX_CALL_FRAMES]; u32 subprog_cnt; }; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 4ae46b2cba88..096681bcb312 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -1434,15 +1434,18 @@ static int update_stack_depth(struct bpf_verifier_env *env, struct bpf_verifier_state *cur = env->cur_state; int i; + if (stack < -off) + /* update known max for given subprogram */ + env->subprog_stack_depth[func->subprogno] = -off; + + stack = env->frame_stack_depth[func->frameno]; if (stack >= -off) return 0; + env->frame_stack_depth[func->frameno] = -off; - /* update known max for given subprogram */ - env->subprog_stack_depth[func->subprogno] = -off; - - /* compute the total for current call chain */ - for (i = 0; i <= cur->curframe; i++) { - u32 depth = env->subprog_stack_depth[cur->frame[i]->subprogno]; + /* some frame changed its max, recompute the total */ + for (i = 0; i < MAX_CALL_FRAMES; i++) { + u32 depth = env->frame_stack_depth[i]; /* round up to 32-bytes, since this is granularity * of interpreter stack sizes -- 2.9.5 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: fix maximum stack depth tracking logic 2017-12-22 21:33 ` [PATCH bpf-next 1/2] bpf: fix maximum stack depth tracking logic Alexei Starovoitov @ 2017-12-23 1:38 ` Jann Horn 2017-12-23 2:07 ` Alexei Starovoitov 0 siblings, 1 reply; 7+ messages in thread From: Jann Horn @ 2017-12-23 1:38 UTC (permalink / raw) To: Alexei Starovoitov Cc: David S . Miller, Daniel Borkmann, Network Development, kernel-team On Fri, Dec 22, 2017 at 10:33 PM, Alexei Starovoitov <ast@kernel.org> wrote: > instead of computing max stack depth for current call chain only > track the maximum possible stack depth of any function at given > frame position. Such algorithm is simple and fast, but conservative, > since it overestimates amount of stack used. Consider: > main() // stack 32 > { > A(); > B(); > } > > A(){} // stack 256 > > B() // stack 64 > { > A(); > } > > since A() is called at frame[1] and frame[2], the algorithm > will estimate the max stack depth as 32 + 256 + 256 and will reject > such program, though real max is 32 + 64 + 256. > > Fortunately the algorithm is good enough in practice. The alternative > would be to track max stack of every function in the fast pass through > the verifier and then do additional CFG walk just to compute max total. > > Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)") > Reported-by: Jann Horn <jannh@google.com> > Signed-off-by: Alexei Starovoitov <ast@kernel.org> Does this work in cases where multiple invocations of a function have different stack access patterns because their inputs have different bounds? Consider this pseudocode example: void main(void) { func1(0); func1(1); func2(1); } void func1(int alloc_or_recurse) { if (alloc_or_recurse) { frame_pointer[-300] = 1; } else { func2(alloc_or_recurse); } } void func2(int alloc_or_recurse) { if (alloc_or_recurse) { frame_pointer[-300] = 1; } } AFAICS this will work as follows: Call to func1->func2 runs without any stack accesses because the verifier can prove that alloc_or_recurse is 0. Second call to func1 allocates stack memory, bumping up frame_stack_depth[1]. Second call to func2 allocates stack memory, leaving frame_stack_depth[1] the same. So I think this will still pass the verifier, even though func1 and func2 will end up with 300 bytes stack memory each, causing the func1->func2 call to use more stack memory than permitted. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: fix maximum stack depth tracking logic 2017-12-23 1:38 ` Jann Horn @ 2017-12-23 2:07 ` Alexei Starovoitov 2017-12-23 2:26 ` Jann Horn 0 siblings, 1 reply; 7+ messages in thread From: Alexei Starovoitov @ 2017-12-23 2:07 UTC (permalink / raw) To: Jann Horn Cc: Alexei Starovoitov, David S . Miller, Daniel Borkmann, Network Development, kernel-team On Sat, Dec 23, 2017 at 02:38:29AM +0100, Jann Horn wrote: > On Fri, Dec 22, 2017 at 10:33 PM, Alexei Starovoitov <ast@kernel.org> wrote: > > instead of computing max stack depth for current call chain only > > track the maximum possible stack depth of any function at given > > frame position. Such algorithm is simple and fast, but conservative, > > since it overestimates amount of stack used. Consider: > > main() // stack 32 > > { > > A(); > > B(); > > } > > > > A(){} // stack 256 > > > > B() // stack 64 > > { > > A(); > > } > > > > since A() is called at frame[1] and frame[2], the algorithm > > will estimate the max stack depth as 32 + 256 + 256 and will reject > > such program, though real max is 32 + 64 + 256. > > > > Fortunately the algorithm is good enough in practice. The alternative > > would be to track max stack of every function in the fast pass through > > the verifier and then do additional CFG walk just to compute max total. > > > > Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)") > > Reported-by: Jann Horn <jannh@google.com> > > Signed-off-by: Alexei Starovoitov <ast@kernel.org> > > Does this work in cases where multiple invocations of a function have > different stack access patterns because their inputs have different > bounds? > > Consider this pseudocode example: > > void main(void) { > func1(0); > func1(1); > func2(1); > } > void func1(int alloc_or_recurse) { > if (alloc_or_recurse) { > frame_pointer[-300] = 1; > } else { > func2(alloc_or_recurse); > } > } > void func2(int alloc_or_recurse) { > if (alloc_or_recurse) { > frame_pointer[-300] = 1; > } > } > > AFAICS this will work as follows: > > Call to func1->func2 runs without any stack accesses because the > verifier can prove that alloc_or_recurse is 0. argh. right. I guess that ruins my attemp to do the stack check inline with the main verifier pass. Do you see an algorithm that can do it without extra cfg walk at the end? ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: fix maximum stack depth tracking logic 2017-12-23 2:07 ` Alexei Starovoitov @ 2017-12-23 2:26 ` Jann Horn 2017-12-23 3:03 ` Alexei Starovoitov 0 siblings, 1 reply; 7+ messages in thread From: Jann Horn @ 2017-12-23 2:26 UTC (permalink / raw) To: Alexei Starovoitov Cc: Alexei Starovoitov, David S . Miller, Daniel Borkmann, Network Development, kernel-team On Sat, Dec 23, 2017 at 3:07 AM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > On Sat, Dec 23, 2017 at 02:38:29AM +0100, Jann Horn wrote: >> On Fri, Dec 22, 2017 at 10:33 PM, Alexei Starovoitov <ast@kernel.org> wrote: >> > instead of computing max stack depth for current call chain only >> > track the maximum possible stack depth of any function at given >> > frame position. Such algorithm is simple and fast, but conservative, >> > since it overestimates amount of stack used. Consider: >> > main() // stack 32 >> > { >> > A(); >> > B(); >> > } >> > >> > A(){} // stack 256 >> > >> > B() // stack 64 >> > { >> > A(); >> > } >> > >> > since A() is called at frame[1] and frame[2], the algorithm >> > will estimate the max stack depth as 32 + 256 + 256 and will reject >> > such program, though real max is 32 + 64 + 256. >> > >> > Fortunately the algorithm is good enough in practice. The alternative >> > would be to track max stack of every function in the fast pass through >> > the verifier and then do additional CFG walk just to compute max total. >> > >> > Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)") >> > Reported-by: Jann Horn <jannh@google.com> >> > Signed-off-by: Alexei Starovoitov <ast@kernel.org> >> >> Does this work in cases where multiple invocations of a function have >> different stack access patterns because their inputs have different >> bounds? >> >> Consider this pseudocode example: >> >> void main(void) { >> func1(0); >> func1(1); >> func2(1); >> } >> void func1(int alloc_or_recurse) { >> if (alloc_or_recurse) { >> frame_pointer[-300] = 1; >> } else { >> func2(alloc_or_recurse); >> } >> } >> void func2(int alloc_or_recurse) { >> if (alloc_or_recurse) { >> frame_pointer[-300] = 1; >> } >> } >> >> AFAICS this will work as follows: >> >> Call to func1->func2 runs without any stack accesses because the >> verifier can prove that alloc_or_recurse is 0. > > argh. right. > I guess that ruins my attemp to do the stack check inline > with the main verifier pass. > Do you see an algorithm that can do it without extra > cfg walk at the end? A crappy heuristic would be to forbid recursion (calling a function that is already present somewhere in the call stack) and then sum up the maximum stack depths of all functions at the end and see whether the sum is bigger than the maximum stack size. While it'd be horribly conservative, it might work for now? 512 bytes are a lot of stack. Or as a more complicated, but slightly less conservative heuristic, you could forbid recursion, record the maximum number of stack frames (max_stack_frames), then at the end select the top max_stack_frames functions with the biggest stack sizes and sum up their sizes? (Or if you want to support recursion, check max_stack_frames*biggest_frame_size<=MAX_BPF_STACK.) Anything else I can come up with is probably more complicated than an extra cfg walk. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: fix maximum stack depth tracking logic 2017-12-23 2:26 ` Jann Horn @ 2017-12-23 3:03 ` Alexei Starovoitov 0 siblings, 0 replies; 7+ messages in thread From: Alexei Starovoitov @ 2017-12-23 3:03 UTC (permalink / raw) To: Jann Horn Cc: Alexei Starovoitov, David S . Miller, Daniel Borkmann, Network Development, kernel-team On Sat, Dec 23, 2017 at 03:26:27AM +0100, Jann Horn wrote: > On Sat, Dec 23, 2017 at 3:07 AM, Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: > > On Sat, Dec 23, 2017 at 02:38:29AM +0100, Jann Horn wrote: > >> On Fri, Dec 22, 2017 at 10:33 PM, Alexei Starovoitov <ast@kernel.org> wrote: > >> > instead of computing max stack depth for current call chain only > >> > track the maximum possible stack depth of any function at given > >> > frame position. Such algorithm is simple and fast, but conservative, > >> > since it overestimates amount of stack used. Consider: > >> > main() // stack 32 > >> > { > >> > A(); > >> > B(); > >> > } > >> > > >> > A(){} // stack 256 > >> > > >> > B() // stack 64 > >> > { > >> > A(); > >> > } > >> > > >> > since A() is called at frame[1] and frame[2], the algorithm > >> > will estimate the max stack depth as 32 + 256 + 256 and will reject > >> > such program, though real max is 32 + 64 + 256. > >> > > >> > Fortunately the algorithm is good enough in practice. The alternative > >> > would be to track max stack of every function in the fast pass through > >> > the verifier and then do additional CFG walk just to compute max total. > >> > > >> > Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)") > >> > Reported-by: Jann Horn <jannh@google.com> > >> > Signed-off-by: Alexei Starovoitov <ast@kernel.org> > >> > >> Does this work in cases where multiple invocations of a function have > >> different stack access patterns because their inputs have different > >> bounds? > >> > >> Consider this pseudocode example: > >> > >> void main(void) { > >> func1(0); > >> func1(1); > >> func2(1); > >> } > >> void func1(int alloc_or_recurse) { > >> if (alloc_or_recurse) { > >> frame_pointer[-300] = 1; > >> } else { > >> func2(alloc_or_recurse); > >> } > >> } > >> void func2(int alloc_or_recurse) { > >> if (alloc_or_recurse) { > >> frame_pointer[-300] = 1; > >> } > >> } > >> > >> AFAICS this will work as follows: > >> > >> Call to func1->func2 runs without any stack accesses because the > >> verifier can prove that alloc_or_recurse is 0. > > > > argh. right. > > I guess that ruins my attemp to do the stack check inline > > with the main verifier pass. > > Do you see an algorithm that can do it without extra > > cfg walk at the end? > > A crappy heuristic would be to forbid recursion (calling a function > that is already present somewhere in the call stack) the recursion is already forbidden. It's a 'back-edge' from cfg point of view. There are 3 tests that cover that in few variants. > and then sum up > the maximum stack depths of all functions at the end and see whether > the sum is bigger than the maximum stack size. While it'd be horribly > conservative, it might work for now? 512 bytes are a lot of stack. That's what I tried first while developing bpf calls. It should have been good enough, but round up to 32-byte makes even tiny function into sizeable and both *_noinline.c tests fail to load. Arguably they are artificial stress test that push the limits with number of calls, argument passing and pointer accesses, but I don't want to scale them down. > Or as a more complicated, but slightly less conservative heuristic, > you could forbid recursion, record the maximum number of stack frames > (max_stack_frames), then at the end select the top max_stack_frames > functions with the biggest stack sizes and sum up their sizes? it's pretty much the same as previous one. In these two tests I'm building long stack chain out of all functions. > Anything else I can come up with is probably more complicated than an > extra cfg walk. I guess we have to generalize (or remember) cfg anyway for future use, so will code it up now and see how it looks. Thank you for the feedback! ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH bpf-next 2/2] selftests/bpf: additional stack depth tests 2017-12-22 21:33 [PATCH bpf-next 0/2] bpf: stack depth tracking fix Alexei Starovoitov 2017-12-22 21:33 ` [PATCH bpf-next 1/2] bpf: fix maximum stack depth tracking logic Alexei Starovoitov @ 2017-12-22 21:33 ` Alexei Starovoitov 1 sibling, 0 replies; 7+ messages in thread From: Alexei Starovoitov @ 2017-12-22 21:33 UTC (permalink / raw) To: David S . Miller; +Cc: Daniel Borkmann, Jann Horn, netdev, kernel-team to test inner logic of stack depth tracking Signed-off-by: Alexei Starovoitov <ast@kernel.org> --- tools/testing/selftests/bpf/test_verifier.c | 50 +++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+) diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c index 71fb0be81b78..e0b21c95c3bc 100644 --- a/tools/testing/selftests/bpf/test_verifier.c +++ b/tools/testing/selftests/bpf/test_verifier.c @@ -8764,6 +8764,56 @@ static struct bpf_test tests[] = { .result = REJECT, }, { + "calls: stack depth check using three frames. test1", + .insns = { + /* main */ + BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, 4), /* call A */ + BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, 5), /* call B */ + BPF_ST_MEM(BPF_B, BPF_REG_10, -32, 0), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + /* A */ + BPF_ST_MEM(BPF_B, BPF_REG_10, -256, 0), + BPF_EXIT_INSN(), + /* B */ + BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, -3), /* call A */ + BPF_ST_MEM(BPF_B, BPF_REG_10, -64, 0), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_XDP, + /* though stack_main=32, stack_A=256, stack_B=64 + * and max(main+A, main+A+B) < 512 the program is rejected + * since stack tracking is conservative + * frame[0]=32, frame[1]=256, frame[2]=256 + */ + .errstr = "combined stack size", + .result = REJECT, + }, + { + "calls: stack depth check using three frames. test2", + .insns = { + /* main */ + BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, 4), /* call A */ + BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, 5), /* call B */ + BPF_ST_MEM(BPF_B, BPF_REG_10, -32, 0), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + /* A */ + BPF_ST_MEM(BPF_B, BPF_REG_10, -64, 0), + BPF_EXIT_INSN(), + /* B */ + BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, -3), /* call A */ + BPF_ST_MEM(BPF_B, BPF_REG_10, -256, 0), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_XDP, + /* though stack_main=32, stack_A=64, stack_B=256 + * and max(main+A, main+A+B) < 512 the program is accepted + * since frame[0]=32, frame[1]=256, frame[2]=64 + */ + .result = ACCEPT, + }, + { "calls: spill into caller stack frame", .insns = { BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), -- 2.9.5 ^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2017-12-23 3:03 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-12-22 21:33 [PATCH bpf-next 0/2] bpf: stack depth tracking fix Alexei Starovoitov 2017-12-22 21:33 ` [PATCH bpf-next 1/2] bpf: fix maximum stack depth tracking logic Alexei Starovoitov 2017-12-23 1:38 ` Jann Horn 2017-12-23 2:07 ` Alexei Starovoitov 2017-12-23 2:26 ` Jann Horn 2017-12-23 3:03 ` Alexei Starovoitov 2017-12-22 21:33 ` [PATCH bpf-next 2/2] selftests/bpf: additional stack depth tests Alexei Starovoitov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).