* [PATCH bpf-next 0/2] bpf: stack depth tracking fix
@ 2017-12-22 21:33 Alexei Starovoitov
2017-12-22 21:33 ` [PATCH bpf-next 1/2] bpf: fix maximum stack depth tracking logic Alexei Starovoitov
2017-12-22 21:33 ` [PATCH bpf-next 2/2] selftests/bpf: additional stack depth tests Alexei Starovoitov
0 siblings, 2 replies; 7+ messages in thread
From: Alexei Starovoitov @ 2017-12-22 21:33 UTC (permalink / raw)
To: David S . Miller; +Cc: Daniel Borkmann, Jann Horn, netdev, kernel-team
Jann reported an issue with stack depth tracking.
Fix it and add tests.
Alexei Starovoitov (2):
bpf: fix maximum stack depth tracking logic
selftests/bpf: additional stack depth tests
include/linux/bpf_verifier.h | 17 ++++++++++
kernel/bpf/verifier.c | 15 +++++----
tools/testing/selftests/bpf/test_verifier.c | 50 +++++++++++++++++++++++++++++
3 files changed, 76 insertions(+), 6 deletions(-)
--
2.9.5
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH bpf-next 1/2] bpf: fix maximum stack depth tracking logic
2017-12-22 21:33 [PATCH bpf-next 0/2] bpf: stack depth tracking fix Alexei Starovoitov
@ 2017-12-22 21:33 ` Alexei Starovoitov
2017-12-23 1:38 ` Jann Horn
2017-12-22 21:33 ` [PATCH bpf-next 2/2] selftests/bpf: additional stack depth tests Alexei Starovoitov
1 sibling, 1 reply; 7+ messages in thread
From: Alexei Starovoitov @ 2017-12-22 21:33 UTC (permalink / raw)
To: David S . Miller; +Cc: Daniel Borkmann, Jann Horn, netdev, kernel-team
instead of computing max stack depth for current call chain only
track the maximum possible stack depth of any function at given
frame position. Such algorithm is simple and fast, but conservative,
since it overestimates amount of stack used. Consider:
main() // stack 32
{
A();
B();
}
A(){} // stack 256
B() // stack 64
{
A();
}
since A() is called at frame[1] and frame[2], the algorithm
will estimate the max stack depth as 32 + 256 + 256 and will reject
such program, though real max is 32 + 64 + 256.
Fortunately the algorithm is good enough in practice. The alternative
would be to track max stack of every function in the fast pass through
the verifier and then do additional CFG walk just to compute max total.
Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
include/linux/bpf_verifier.h | 17 +++++++++++++++++
kernel/bpf/verifier.c | 15 +++++++++------
2 files changed, 26 insertions(+), 6 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index aaac589e490c..adc65ef093d1 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -194,7 +194,24 @@ struct bpf_verifier_env {
struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */
struct bpf_verifer_log log;
u32 subprog_starts[BPF_MAX_SUBPROGS];
+ /* computes the stack depth of each bpf function */
u16 subprog_stack_depth[BPF_MAX_SUBPROGS + 1];
+ /* gradually computes the maximum stack depth out of all functions
+ * called at this frame position. Example:
+ * main()
+ * {
+ * a();
+ * b();
+ * }
+ * b()
+ * {
+ * a();
+ * }
+ * frame_stack_depth[0] = stack depth of main()
+ * frame_stack_depth[1] = max { stack depth of a(), stack depth of b() }
+ * frame_stack_depth[2] = stack depth of a()
+ */
+ u16 frame_stack_depth[MAX_CALL_FRAMES];
u32 subprog_cnt;
};
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 4ae46b2cba88..096681bcb312 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1434,15 +1434,18 @@ static int update_stack_depth(struct bpf_verifier_env *env,
struct bpf_verifier_state *cur = env->cur_state;
int i;
+ if (stack < -off)
+ /* update known max for given subprogram */
+ env->subprog_stack_depth[func->subprogno] = -off;
+
+ stack = env->frame_stack_depth[func->frameno];
if (stack >= -off)
return 0;
+ env->frame_stack_depth[func->frameno] = -off;
- /* update known max for given subprogram */
- env->subprog_stack_depth[func->subprogno] = -off;
-
- /* compute the total for current call chain */
- for (i = 0; i <= cur->curframe; i++) {
- u32 depth = env->subprog_stack_depth[cur->frame[i]->subprogno];
+ /* some frame changed its max, recompute the total */
+ for (i = 0; i < MAX_CALL_FRAMES; i++) {
+ u32 depth = env->frame_stack_depth[i];
/* round up to 32-bytes, since this is granularity
* of interpreter stack sizes
--
2.9.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH bpf-next 2/2] selftests/bpf: additional stack depth tests
2017-12-22 21:33 [PATCH bpf-next 0/2] bpf: stack depth tracking fix Alexei Starovoitov
2017-12-22 21:33 ` [PATCH bpf-next 1/2] bpf: fix maximum stack depth tracking logic Alexei Starovoitov
@ 2017-12-22 21:33 ` Alexei Starovoitov
1 sibling, 0 replies; 7+ messages in thread
From: Alexei Starovoitov @ 2017-12-22 21:33 UTC (permalink / raw)
To: David S . Miller; +Cc: Daniel Borkmann, Jann Horn, netdev, kernel-team
to test inner logic of stack depth tracking
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
tools/testing/selftests/bpf/test_verifier.c | 50 +++++++++++++++++++++++++++++
1 file changed, 50 insertions(+)
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 71fb0be81b78..e0b21c95c3bc 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -8764,6 +8764,56 @@ static struct bpf_test tests[] = {
.result = REJECT,
},
{
+ "calls: stack depth check using three frames. test1",
+ .insns = {
+ /* main */
+ BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, 4), /* call A */
+ BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, 5), /* call B */
+ BPF_ST_MEM(BPF_B, BPF_REG_10, -32, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ /* A */
+ BPF_ST_MEM(BPF_B, BPF_REG_10, -256, 0),
+ BPF_EXIT_INSN(),
+ /* B */
+ BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, -3), /* call A */
+ BPF_ST_MEM(BPF_B, BPF_REG_10, -64, 0),
+ BPF_EXIT_INSN(),
+ },
+ .prog_type = BPF_PROG_TYPE_XDP,
+ /* though stack_main=32, stack_A=256, stack_B=64
+ * and max(main+A, main+A+B) < 512 the program is rejected
+ * since stack tracking is conservative
+ * frame[0]=32, frame[1]=256, frame[2]=256
+ */
+ .errstr = "combined stack size",
+ .result = REJECT,
+ },
+ {
+ "calls: stack depth check using three frames. test2",
+ .insns = {
+ /* main */
+ BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, 4), /* call A */
+ BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, 5), /* call B */
+ BPF_ST_MEM(BPF_B, BPF_REG_10, -32, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ /* A */
+ BPF_ST_MEM(BPF_B, BPF_REG_10, -64, 0),
+ BPF_EXIT_INSN(),
+ /* B */
+ BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, -3), /* call A */
+ BPF_ST_MEM(BPF_B, BPF_REG_10, -256, 0),
+ BPF_EXIT_INSN(),
+ },
+ .prog_type = BPF_PROG_TYPE_XDP,
+ /* though stack_main=32, stack_A=64, stack_B=256
+ * and max(main+A, main+A+B) < 512 the program is accepted
+ * since frame[0]=32, frame[1]=256, frame[2]=64
+ */
+ .result = ACCEPT,
+ },
+ {
"calls: spill into caller stack frame",
.insns = {
BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
--
2.9.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: fix maximum stack depth tracking logic
2017-12-22 21:33 ` [PATCH bpf-next 1/2] bpf: fix maximum stack depth tracking logic Alexei Starovoitov
@ 2017-12-23 1:38 ` Jann Horn
2017-12-23 2:07 ` Alexei Starovoitov
0 siblings, 1 reply; 7+ messages in thread
From: Jann Horn @ 2017-12-23 1:38 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: David S . Miller, Daniel Borkmann, Network Development,
kernel-team
On Fri, Dec 22, 2017 at 10:33 PM, Alexei Starovoitov <ast@kernel.org> wrote:
> instead of computing max stack depth for current call chain only
> track the maximum possible stack depth of any function at given
> frame position. Such algorithm is simple and fast, but conservative,
> since it overestimates amount of stack used. Consider:
> main() // stack 32
> {
> A();
> B();
> }
>
> A(){} // stack 256
>
> B() // stack 64
> {
> A();
> }
>
> since A() is called at frame[1] and frame[2], the algorithm
> will estimate the max stack depth as 32 + 256 + 256 and will reject
> such program, though real max is 32 + 64 + 256.
>
> Fortunately the algorithm is good enough in practice. The alternative
> would be to track max stack of every function in the fast pass through
> the verifier and then do additional CFG walk just to compute max total.
>
> Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)")
> Reported-by: Jann Horn <jannh@google.com>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Does this work in cases where multiple invocations of a function have
different stack access patterns because their inputs have different
bounds?
Consider this pseudocode example:
void main(void) {
func1(0);
func1(1);
func2(1);
}
void func1(int alloc_or_recurse) {
if (alloc_or_recurse) {
frame_pointer[-300] = 1;
} else {
func2(alloc_or_recurse);
}
}
void func2(int alloc_or_recurse) {
if (alloc_or_recurse) {
frame_pointer[-300] = 1;
}
}
AFAICS this will work as follows:
Call to func1->func2 runs without any stack accesses because the
verifier can prove that alloc_or_recurse is 0.
Second call to func1 allocates stack memory, bumping up
frame_stack_depth[1].
Second call to func2 allocates stack memory, leaving
frame_stack_depth[1] the same.
So I think this will still pass the verifier, even though func1
and func2 will end up with 300 bytes stack memory each, causing
the func1->func2 call to use more stack memory than permitted.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: fix maximum stack depth tracking logic
2017-12-23 1:38 ` Jann Horn
@ 2017-12-23 2:07 ` Alexei Starovoitov
2017-12-23 2:26 ` Jann Horn
0 siblings, 1 reply; 7+ messages in thread
From: Alexei Starovoitov @ 2017-12-23 2:07 UTC (permalink / raw)
To: Jann Horn
Cc: Alexei Starovoitov, David S . Miller, Daniel Borkmann,
Network Development, kernel-team
On Sat, Dec 23, 2017 at 02:38:29AM +0100, Jann Horn wrote:
> On Fri, Dec 22, 2017 at 10:33 PM, Alexei Starovoitov <ast@kernel.org> wrote:
> > instead of computing max stack depth for current call chain only
> > track the maximum possible stack depth of any function at given
> > frame position. Such algorithm is simple and fast, but conservative,
> > since it overestimates amount of stack used. Consider:
> > main() // stack 32
> > {
> > A();
> > B();
> > }
> >
> > A(){} // stack 256
> >
> > B() // stack 64
> > {
> > A();
> > }
> >
> > since A() is called at frame[1] and frame[2], the algorithm
> > will estimate the max stack depth as 32 + 256 + 256 and will reject
> > such program, though real max is 32 + 64 + 256.
> >
> > Fortunately the algorithm is good enough in practice. The alternative
> > would be to track max stack of every function in the fast pass through
> > the verifier and then do additional CFG walk just to compute max total.
> >
> > Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)")
> > Reported-by: Jann Horn <jannh@google.com>
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
>
> Does this work in cases where multiple invocations of a function have
> different stack access patterns because their inputs have different
> bounds?
>
> Consider this pseudocode example:
>
> void main(void) {
> func1(0);
> func1(1);
> func2(1);
> }
> void func1(int alloc_or_recurse) {
> if (alloc_or_recurse) {
> frame_pointer[-300] = 1;
> } else {
> func2(alloc_or_recurse);
> }
> }
> void func2(int alloc_or_recurse) {
> if (alloc_or_recurse) {
> frame_pointer[-300] = 1;
> }
> }
>
> AFAICS this will work as follows:
>
> Call to func1->func2 runs without any stack accesses because the
> verifier can prove that alloc_or_recurse is 0.
argh. right.
I guess that ruins my attemp to do the stack check inline
with the main verifier pass.
Do you see an algorithm that can do it without extra
cfg walk at the end?
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: fix maximum stack depth tracking logic
2017-12-23 2:07 ` Alexei Starovoitov
@ 2017-12-23 2:26 ` Jann Horn
2017-12-23 3:03 ` Alexei Starovoitov
0 siblings, 1 reply; 7+ messages in thread
From: Jann Horn @ 2017-12-23 2:26 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Alexei Starovoitov, David S . Miller, Daniel Borkmann,
Network Development, kernel-team
On Sat, Dec 23, 2017 at 3:07 AM, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
> On Sat, Dec 23, 2017 at 02:38:29AM +0100, Jann Horn wrote:
>> On Fri, Dec 22, 2017 at 10:33 PM, Alexei Starovoitov <ast@kernel.org> wrote:
>> > instead of computing max stack depth for current call chain only
>> > track the maximum possible stack depth of any function at given
>> > frame position. Such algorithm is simple and fast, but conservative,
>> > since it overestimates amount of stack used. Consider:
>> > main() // stack 32
>> > {
>> > A();
>> > B();
>> > }
>> >
>> > A(){} // stack 256
>> >
>> > B() // stack 64
>> > {
>> > A();
>> > }
>> >
>> > since A() is called at frame[1] and frame[2], the algorithm
>> > will estimate the max stack depth as 32 + 256 + 256 and will reject
>> > such program, though real max is 32 + 64 + 256.
>> >
>> > Fortunately the algorithm is good enough in practice. The alternative
>> > would be to track max stack of every function in the fast pass through
>> > the verifier and then do additional CFG walk just to compute max total.
>> >
>> > Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)")
>> > Reported-by: Jann Horn <jannh@google.com>
>> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
>>
>> Does this work in cases where multiple invocations of a function have
>> different stack access patterns because their inputs have different
>> bounds?
>>
>> Consider this pseudocode example:
>>
>> void main(void) {
>> func1(0);
>> func1(1);
>> func2(1);
>> }
>> void func1(int alloc_or_recurse) {
>> if (alloc_or_recurse) {
>> frame_pointer[-300] = 1;
>> } else {
>> func2(alloc_or_recurse);
>> }
>> }
>> void func2(int alloc_or_recurse) {
>> if (alloc_or_recurse) {
>> frame_pointer[-300] = 1;
>> }
>> }
>>
>> AFAICS this will work as follows:
>>
>> Call to func1->func2 runs without any stack accesses because the
>> verifier can prove that alloc_or_recurse is 0.
>
> argh. right.
> I guess that ruins my attemp to do the stack check inline
> with the main verifier pass.
> Do you see an algorithm that can do it without extra
> cfg walk at the end?
A crappy heuristic would be to forbid recursion (calling a function
that is already present somewhere in the call stack) and then sum up
the maximum stack depths of all functions at the end and see whether
the sum is bigger than the maximum stack size. While it'd be horribly
conservative, it might work for now? 512 bytes are a lot of stack.
Or as a more complicated, but slightly less conservative heuristic,
you could forbid recursion, record the maximum number of stack frames
(max_stack_frames), then at the end select the top max_stack_frames
functions with the biggest stack sizes and sum up their sizes? (Or if
you want to support recursion, check
max_stack_frames*biggest_frame_size<=MAX_BPF_STACK.)
Anything else I can come up with is probably more complicated than an
extra cfg walk.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: fix maximum stack depth tracking logic
2017-12-23 2:26 ` Jann Horn
@ 2017-12-23 3:03 ` Alexei Starovoitov
0 siblings, 0 replies; 7+ messages in thread
From: Alexei Starovoitov @ 2017-12-23 3:03 UTC (permalink / raw)
To: Jann Horn
Cc: Alexei Starovoitov, David S . Miller, Daniel Borkmann,
Network Development, kernel-team
On Sat, Dec 23, 2017 at 03:26:27AM +0100, Jann Horn wrote:
> On Sat, Dec 23, 2017 at 3:07 AM, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> > On Sat, Dec 23, 2017 at 02:38:29AM +0100, Jann Horn wrote:
> >> On Fri, Dec 22, 2017 at 10:33 PM, Alexei Starovoitov <ast@kernel.org> wrote:
> >> > instead of computing max stack depth for current call chain only
> >> > track the maximum possible stack depth of any function at given
> >> > frame position. Such algorithm is simple and fast, but conservative,
> >> > since it overestimates amount of stack used. Consider:
> >> > main() // stack 32
> >> > {
> >> > A();
> >> > B();
> >> > }
> >> >
> >> > A(){} // stack 256
> >> >
> >> > B() // stack 64
> >> > {
> >> > A();
> >> > }
> >> >
> >> > since A() is called at frame[1] and frame[2], the algorithm
> >> > will estimate the max stack depth as 32 + 256 + 256 and will reject
> >> > such program, though real max is 32 + 64 + 256.
> >> >
> >> > Fortunately the algorithm is good enough in practice. The alternative
> >> > would be to track max stack of every function in the fast pass through
> >> > the verifier and then do additional CFG walk just to compute max total.
> >> >
> >> > Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)")
> >> > Reported-by: Jann Horn <jannh@google.com>
> >> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> >>
> >> Does this work in cases where multiple invocations of a function have
> >> different stack access patterns because their inputs have different
> >> bounds?
> >>
> >> Consider this pseudocode example:
> >>
> >> void main(void) {
> >> func1(0);
> >> func1(1);
> >> func2(1);
> >> }
> >> void func1(int alloc_or_recurse) {
> >> if (alloc_or_recurse) {
> >> frame_pointer[-300] = 1;
> >> } else {
> >> func2(alloc_or_recurse);
> >> }
> >> }
> >> void func2(int alloc_or_recurse) {
> >> if (alloc_or_recurse) {
> >> frame_pointer[-300] = 1;
> >> }
> >> }
> >>
> >> AFAICS this will work as follows:
> >>
> >> Call to func1->func2 runs without any stack accesses because the
> >> verifier can prove that alloc_or_recurse is 0.
> >
> > argh. right.
> > I guess that ruins my attemp to do the stack check inline
> > with the main verifier pass.
> > Do you see an algorithm that can do it without extra
> > cfg walk at the end?
>
> A crappy heuristic would be to forbid recursion (calling a function
> that is already present somewhere in the call stack)
the recursion is already forbidden. It's a 'back-edge' from cfg point
of view. There are 3 tests that cover that in few variants.
> and then sum up
> the maximum stack depths of all functions at the end and see whether
> the sum is bigger than the maximum stack size. While it'd be horribly
> conservative, it might work for now? 512 bytes are a lot of stack.
That's what I tried first while developing bpf calls.
It should have been good enough, but round up to 32-byte makes even
tiny function into sizeable and both *_noinline.c tests fail to load.
Arguably they are artificial stress test that push the limits with
number of calls, argument passing and pointer accesses, but I don't
want to scale them down.
> Or as a more complicated, but slightly less conservative heuristic,
> you could forbid recursion, record the maximum number of stack frames
> (max_stack_frames), then at the end select the top max_stack_frames
> functions with the biggest stack sizes and sum up their sizes?
it's pretty much the same as previous one. In these two tests I'm
building long stack chain out of all functions.
> Anything else I can come up with is probably more complicated than an
> extra cfg walk.
I guess we have to generalize (or remember) cfg anyway for future use,
so will code it up now and see how it looks.
Thank you for the feedback!
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2017-12-23 3:03 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-12-22 21:33 [PATCH bpf-next 0/2] bpf: stack depth tracking fix Alexei Starovoitov
2017-12-22 21:33 ` [PATCH bpf-next 1/2] bpf: fix maximum stack depth tracking logic Alexei Starovoitov
2017-12-23 1:38 ` Jann Horn
2017-12-23 2:07 ` Alexei Starovoitov
2017-12-23 2:26 ` Jann Horn
2017-12-23 3:03 ` Alexei Starovoitov
2017-12-22 21:33 ` [PATCH bpf-next 2/2] selftests/bpf: additional stack depth tests Alexei Starovoitov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).