* [PATCH v2] bpf: fix stackmap overflow check in __bpf_get_stackid() @ 2025-07-29 16:56 Arnaud Lecomte 2025-07-29 22:45 ` Yonghong Song 0 siblings, 1 reply; 28+ messages in thread From: Arnaud Lecomte @ 2025-07-29 16:56 UTC (permalink / raw) To: song, jolsa, ast, daniel, andrii, martin.lau, eddyz87, yonghong.song, john.fastabend, kpsingh, sdf, haoluo Cc: bpf, linux-kernel, syzkaller-bugs, syzbot+c9b724fbb41cf2538b7b, Arnaud Lecomte Syzkaller reported a KASAN slab-out-of-bounds write in __bpf_get_stackid() when copying stack trace data. The issue occurs when the perf trace contains more stack entries than the stack map bucket can hold, leading to an out-of-bounds write in the bucket's data array. For build_id mode, we use sizeof(struct bpf_stack_build_id) to determine capacity, and for normal mode we use sizeof(u64). Reported-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b Tested-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> --- Changes in v2: - Use utilty stack_map_data_size to compute map stack map size --- kernel/bpf/stackmap.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index 3615c06b7dfa..6f225d477f07 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -230,7 +230,7 @@ static long __bpf_get_stackid(struct bpf_map *map, struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map); struct stack_map_bucket *bucket, *new_bucket, *old_bucket; u32 skip = flags & BPF_F_SKIP_FIELD_MASK; - u32 hash, id, trace_nr, trace_len, i; + u32 hash, id, trace_nr, trace_len, i, max_depth; bool user = flags & BPF_F_USER_STACK; u64 *ips; bool hash_matches; @@ -241,6 +241,12 @@ static long __bpf_get_stackid(struct bpf_map *map, trace_nr = trace->nr - skip; trace_len = trace_nr * sizeof(u64); + + /* Clamp the trace to max allowed depth */ + max_depth = smap->map.value_size / stack_map_data_size(map); + if (trace_nr > max_depth) + trace_nr = max_depth; + ips = trace->ip + skip; hash = jhash2((u32 *)ips, trace_len / sizeof(u32), 0); id = hash & (smap->n_buckets - 1); -- 2.43.0 ^ permalink raw reply related [flat|nested] 28+ messages in thread
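For context, the overflow arithmetic behind the report can be sketched with illustrative numbers (not taken from the syzbot reproducer): a bucket only has room for map->value_size bytes of trace data, while the perf callchain can be deeper than that.

	/* illustrative values only, non-build_id map */
	u32 elem_size = sizeof(u64);            /* 8 bytes per entry            */
	u32 capacity  = 32 / elem_size;         /* value_size = 32 -> 4 entries */
	u32 trace_nr  = 20;                     /* entries in the perf trace    */
	u32 trace_len = trace_nr * sizeof(u64); /* 160 bytes                    */
	/* without clamping trace_nr to the bucket capacity, copying
	 * trace_len bytes into the bucket's 32-byte data[] array runs past
	 * the allocation -- the slab-out-of-bounds write reported above
	 */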
* Re: [PATCH v2] bpf: fix stackmap overflow check in __bpf_get_stackid() 2025-07-29 16:56 [PATCH v2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte @ 2025-07-29 22:45 ` Yonghong Song 2025-07-30 7:10 ` Arnaud Lecomte 0 siblings, 1 reply; 28+ messages in thread From: Yonghong Song @ 2025-07-29 22:45 UTC (permalink / raw) To: Arnaud Lecomte, song, jolsa, ast, daniel, andrii, martin.lau, eddyz87, john.fastabend, kpsingh, sdf, haoluo Cc: bpf, linux-kernel, syzkaller-bugs, syzbot+c9b724fbb41cf2538b7b On 7/29/25 9:56 AM, Arnaud Lecomte wrote: > Syzkaller reported a KASAN slab-out-of-bounds write in __bpf_get_stackid() > when copying stack trace data. The issue occurs when the perf trace > contains more stack entries than the stack map bucket can hold, > leading to an out-of-bounds write in the bucket's data array. > For build_id mode, we use sizeof(struct bpf_stack_build_id) > to determine capacity, and for normal mode we use sizeof(u64). > > Reported-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b > Tested-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com > Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> Could you add a selftest? This way folks can easily find out what is the problem and why this fix solves the issue correctly. > --- > Changes in v2: > - Use utilty stack_map_data_size to compute map stack map size > --- > kernel/bpf/stackmap.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c > index 3615c06b7dfa..6f225d477f07 100644 > --- a/kernel/bpf/stackmap.c > +++ b/kernel/bpf/stackmap.c > @@ -230,7 +230,7 @@ static long __bpf_get_stackid(struct bpf_map *map, > struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map); > struct stack_map_bucket *bucket, *new_bucket, *old_bucket; > u32 skip = flags & BPF_F_SKIP_FIELD_MASK; > - u32 hash, id, trace_nr, trace_len, i; > + u32 hash, id, trace_nr, trace_len, i, max_depth; > bool user = flags & BPF_F_USER_STACK; > u64 *ips; > bool hash_matches; > @@ -241,6 +241,12 @@ static long __bpf_get_stackid(struct bpf_map *map, > > trace_nr = trace->nr - skip; > trace_len = trace_nr * sizeof(u64); > + > + /* Clamp the trace to max allowed depth */ > + max_depth = smap->map.value_size / stack_map_data_size(map); > + if (trace_nr > max_depth) > + trace_nr = max_depth; > + > ips = trace->ip + skip; > hash = jhash2((u32 *)ips, trace_len / sizeof(u32), 0); > id = hash & (smap->n_buckets - 1); ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2] bpf: fix stackmap overflow check in __bpf_get_stackid() 2025-07-29 22:45 ` Yonghong Song @ 2025-07-30 7:10 ` Arnaud Lecomte 2025-08-01 18:16 ` Lecomte, Arnaud 0 siblings, 1 reply; 28+ messages in thread From: Arnaud Lecomte @ 2025-07-30 7:10 UTC (permalink / raw) To: Yonghong Song, song, jolsa, ast, daniel, andrii, martin.lau, eddyz87, john.fastabend, kpsingh, sdf, haoluo Cc: bpf, linux-kernel, syzkaller-bugs, syzbot+c9b724fbb41cf2538b7b On 29/07/2025 23:45, Yonghong Song wrote: > > > On 7/29/25 9:56 AM, Arnaud Lecomte wrote: >> Syzkaller reported a KASAN slab-out-of-bounds write in >> __bpf_get_stackid() >> when copying stack trace data. The issue occurs when the perf trace >> contains more stack entries than the stack map bucket can hold, >> leading to an out-of-bounds write in the bucket's data array. >> For build_id mode, we use sizeof(struct bpf_stack_build_id) >> to determine capacity, and for normal mode we use sizeof(u64). >> >> Reported-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com >> Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b >> Tested-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com >> Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> > > Could you add a selftest? This way folks can easily find out what is > the problem and why this fix solves the issue correctly. > Sure, will be done after work Thanks, Arnaud >> --- >> Changes in v2: >> - Use utilty stack_map_data_size to compute map stack map size >> --- >> kernel/bpf/stackmap.c | 8 +++++++- >> 1 file changed, 7 insertions(+), 1 deletion(-) >> >> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c >> index 3615c06b7dfa..6f225d477f07 100644 >> --- a/kernel/bpf/stackmap.c >> +++ b/kernel/bpf/stackmap.c >> @@ -230,7 +230,7 @@ static long __bpf_get_stackid(struct bpf_map *map, >> struct bpf_stack_map *smap = container_of(map, struct >> bpf_stack_map, map); >> struct stack_map_bucket *bucket, *new_bucket, *old_bucket; >> u32 skip = flags & BPF_F_SKIP_FIELD_MASK; >> - u32 hash, id, trace_nr, trace_len, i; >> + u32 hash, id, trace_nr, trace_len, i, max_depth; >> bool user = flags & BPF_F_USER_STACK; >> u64 *ips; >> bool hash_matches; >> @@ -241,6 +241,12 @@ static long __bpf_get_stackid(struct bpf_map *map, >> trace_nr = trace->nr - skip; >> trace_len = trace_nr * sizeof(u64); >> + >> + /* Clamp the trace to max allowed depth */ >> + max_depth = smap->map.value_size / stack_map_data_size(map); >> + if (trace_nr > max_depth) >> + trace_nr = max_depth; >> + >> ips = trace->ip + skip; >> hash = jhash2((u32 *)ips, trace_len / sizeof(u32), 0); >> id = hash & (smap->n_buckets - 1); > > ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2] bpf: fix stackmap overflow check in __bpf_get_stackid() 2025-07-30 7:10 ` Arnaud Lecomte @ 2025-08-01 18:16 ` Lecomte, Arnaud 2025-08-05 20:49 ` Arnaud Lecomte 0 siblings, 1 reply; 28+ messages in thread From: Lecomte, Arnaud @ 2025-08-01 18:16 UTC (permalink / raw) To: Yonghong Song, song, jolsa, ast, daniel, andrii, martin.lau, eddyz87, john.fastabend, kpsingh, sdf, haoluo Cc: bpf, linux-kernel, syzkaller-bugs, syzbot+c9b724fbb41cf2538b7b Well, it turns out it is less straightforward than it looked like to detect the memory corruption without KASAN. I am currently in holidays for the next 3 days so I've limited access to a computer. I should be able to sort this out on monday. Thanks, Arnaud On 30/07/2025 08:10, Arnaud Lecomte wrote: > On 29/07/2025 23:45, Yonghong Song wrote: >> >> >> On 7/29/25 9:56 AM, Arnaud Lecomte wrote: >>> Syzkaller reported a KASAN slab-out-of-bounds write in >>> __bpf_get_stackid() >>> when copying stack trace data. The issue occurs when the perf trace >>> contains more stack entries than the stack map bucket can hold, >>> leading to an out-of-bounds write in the bucket's data array. >>> For build_id mode, we use sizeof(struct bpf_stack_build_id) >>> to determine capacity, and for normal mode we use sizeof(u64). >>> >>> Reported-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com >>> Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b >>> Tested-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com >>> Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> >> >> Could you add a selftest? This way folks can easily find out what is >> the problem and why this fix solves the issue correctly. >> > Sure, will be done after work > Thanks, > Arnaud >>> --- >>> Changes in v2: >>> - Use utilty stack_map_data_size to compute map stack map size >>> --- >>> kernel/bpf/stackmap.c | 8 +++++++- >>> 1 file changed, 7 insertions(+), 1 deletion(-) >>> >>> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c >>> index 3615c06b7dfa..6f225d477f07 100644 >>> --- a/kernel/bpf/stackmap.c >>> +++ b/kernel/bpf/stackmap.c >>> @@ -230,7 +230,7 @@ static long __bpf_get_stackid(struct bpf_map *map, >>> struct bpf_stack_map *smap = container_of(map, struct >>> bpf_stack_map, map); >>> struct stack_map_bucket *bucket, *new_bucket, *old_bucket; >>> u32 skip = flags & BPF_F_SKIP_FIELD_MASK; >>> - u32 hash, id, trace_nr, trace_len, i; >>> + u32 hash, id, trace_nr, trace_len, i, max_depth; >>> bool user = flags & BPF_F_USER_STACK; >>> u64 *ips; >>> bool hash_matches; >>> @@ -241,6 +241,12 @@ static long __bpf_get_stackid(struct bpf_map *map, >>> trace_nr = trace->nr - skip; >>> trace_len = trace_nr * sizeof(u64); >>> + >>> + /* Clamp the trace to max allowed depth */ >>> + max_depth = smap->map.value_size / stack_map_data_size(map); >>> + if (trace_nr > max_depth) >>> + trace_nr = max_depth; >>> + >>> ips = trace->ip + skip; >>> hash = jhash2((u32 *)ips, trace_len / sizeof(u32), 0); >>> id = hash & (smap->n_buckets - 1); >> >> ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2] bpf: fix stackmap overflow check in __bpf_get_stackid() 2025-08-01 18:16 ` Lecomte, Arnaud @ 2025-08-05 20:49 ` Arnaud Lecomte 2025-08-06 1:52 ` Yonghong Song 0 siblings, 1 reply; 28+ messages in thread From: Arnaud Lecomte @ 2025-08-05 20:49 UTC (permalink / raw) To: Yonghong Song, song, jolsa, ast, daniel, andrii, martin.lau, eddyz87, john.fastabend, kpsingh, sdf, haoluo Cc: bpf, linux-kernel, syzkaller-bugs, syzbot+c9b724fbb41cf2538b7b Hi, I gave it several tries and I can't find a nice to do see properly. The main challenge is to find a way to detect memory corruption. I wanted to place a canary value by tweaking the map size but we don't have a way from a BPF program perspective to access to the size of a stack_map_bucket. If we decide to do this computation manually, we would end-up with maintainability issues: #include "vmlinux.h" #include "bpf/bpf_helpers.h" #define MAX_STACK_DEPTH 32 #define CANARY_VALUE 0xBADCAFE /* Calculate size based on known layout: * - fnode: sizeof(void*) * - hash: 4 bytes * - nr: 4 bytes * - data: MAX_STACK_DEPTH * 8 bytes * - canary: 8 bytes */ #define VALUE_SIZE (sizeof(void*) + 4 + 4 + (MAX_STACK_DEPTH * 8) + 8) struct { __uint(type, BPF_MAP_TYPE_STACK_TRACE); __uint(max_entries, 1); __uint(value_size, VALUE_SIZE); __uint(key_size, sizeof(u32)); } stackmap SEC(".maps"); static __attribute__((noinline)) void recursive_helper(int depth) { if (depth <= 0) return; asm volatile("" ::: "memory"); recursive_helper(depth - 1); } SEC("kprobe/do_sys_open") int test_stack_overflow(void *ctx) { u32 key = 0; u64 *stack = bpf_map_lookup_elem(&stackmap, &key); if (!stack) return 0; stack[MAX_STACK_DEPTH] = CANARY_VALUE; /* Force minimum stack depth */ recursive_helper(MAX_STACK_DEPTH + 10); (void)bpf_get_stackid(ctx, &stackmap, 0); return 0; } char _license[] SEC("license") = "GPL"; On 01/08/2025 19:16, Lecomte, Arnaud wrote: > Well, it turns out it is less straightforward than it looked like to > detect the memory corruption > without KASAN. I am currently in holidays for the next 3 days so I've > limited access to a > computer. I should be able to sort this out on monday. > > Thanks, > Arnaud > > On 30/07/2025 08:10, Arnaud Lecomte wrote: >> On 29/07/2025 23:45, Yonghong Song wrote: >>> >>> >>> On 7/29/25 9:56 AM, Arnaud Lecomte wrote: >>>> Syzkaller reported a KASAN slab-out-of-bounds write in >>>> __bpf_get_stackid() >>>> when copying stack trace data. The issue occurs when the perf trace >>>> contains more stack entries than the stack map bucket can hold, >>>> leading to an out-of-bounds write in the bucket's data array. >>>> For build_id mode, we use sizeof(struct bpf_stack_build_id) >>>> to determine capacity, and for normal mode we use sizeof(u64). >>>> >>>> Reported-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com >>>> Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b >>>> Tested-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com >>>> Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> >>> >>> Could you add a selftest? This way folks can easily find out what is >>> the problem and why this fix solves the issue correctly. 
>>> >> Sure, will be done after work >> Thanks, >> Arnaud >>>> --- >>>> Changes in v2: >>>> - Use utilty stack_map_data_size to compute map stack map size >>>> --- >>>> kernel/bpf/stackmap.c | 8 +++++++- >>>> 1 file changed, 7 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c >>>> index 3615c06b7dfa..6f225d477f07 100644 >>>> --- a/kernel/bpf/stackmap.c >>>> +++ b/kernel/bpf/stackmap.c >>>> @@ -230,7 +230,7 @@ static long __bpf_get_stackid(struct bpf_map *map, >>>> struct bpf_stack_map *smap = container_of(map, struct >>>> bpf_stack_map, map); >>>> struct stack_map_bucket *bucket, *new_bucket, *old_bucket; >>>> u32 skip = flags & BPF_F_SKIP_FIELD_MASK; >>>> - u32 hash, id, trace_nr, trace_len, i; >>>> + u32 hash, id, trace_nr, trace_len, i, max_depth; >>>> bool user = flags & BPF_F_USER_STACK; >>>> u64 *ips; >>>> bool hash_matches; >>>> @@ -241,6 +241,12 @@ static long __bpf_get_stackid(struct bpf_map >>>> *map, >>>> trace_nr = trace->nr - skip; >>>> trace_len = trace_nr * sizeof(u64); >>>> + >>>> + /* Clamp the trace to max allowed depth */ >>>> + max_depth = smap->map.value_size / stack_map_data_size(map); >>>> + if (trace_nr > max_depth) >>>> + trace_nr = max_depth; >>>> + >>>> ips = trace->ip + skip; >>>> hash = jhash2((u32 *)ips, trace_len / sizeof(u32), 0); >>>> id = hash & (smap->n_buckets - 1); >>> >>> ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2] bpf: fix stackmap overflow check in __bpf_get_stackid() 2025-08-05 20:49 ` Arnaud Lecomte @ 2025-08-06 1:52 ` Yonghong Song 2025-08-07 17:50 ` [PATCH 1/2] bpf: refactor max_depth computation in bpf_get_stack() Arnaud Lecomte 0 siblings, 1 reply; 28+ messages in thread From: Yonghong Song @ 2025-08-06 1:52 UTC (permalink / raw) To: Arnaud Lecomte, song, jolsa, ast, daniel, andrii, martin.lau, eddyz87, john.fastabend, kpsingh, sdf, haoluo Cc: bpf, linux-kernel, syzkaller-bugs, syzbot+c9b724fbb41cf2538b7b On 8/5/25 1:49 PM, Arnaud Lecomte wrote: > Hi, > I gave it several tries and I can't find a nice to do see properly. > The main challenge is to find a way to detect memory corruption. I > wanted to place a canary value > by tweaking the map size but we don't have a way from a BPF program > perspective to access to the size > of a stack_map_bucket. If we decide to do this computation manually, > we would end-up with maintainability > issues: > #include "vmlinux.h" > #include "bpf/bpf_helpers.h" > > #define MAX_STACK_DEPTH 32 > #define CANARY_VALUE 0xBADCAFE > > /* Calculate size based on known layout: > * - fnode: sizeof(void*) > * - hash: 4 bytes > * - nr: 4 bytes > * - data: MAX_STACK_DEPTH * 8 bytes > * - canary: 8 bytes > */ > #define VALUE_SIZE (sizeof(void*) + 4 + 4 + (MAX_STACK_DEPTH * 8) + 8) > > struct { > __uint(type, BPF_MAP_TYPE_STACK_TRACE); > __uint(max_entries, 1); > __uint(value_size, VALUE_SIZE); > __uint(key_size, sizeof(u32)); > } stackmap SEC(".maps"); > > static __attribute__((noinline)) void recursive_helper(int depth) { > if (depth <= 0) return; > asm volatile("" ::: "memory"); > recursive_helper(depth - 1); > } > > SEC("kprobe/do_sys_open") > int test_stack_overflow(void *ctx) { > u32 key = 0; > u64 *stack = bpf_map_lookup_elem(&stackmap, &key); > if (!stack) return 0; > > stack[MAX_STACK_DEPTH] = CANARY_VALUE; > > /* Force minimum stack depth */ > recursive_helper(MAX_STACK_DEPTH + 10); > > (void)bpf_get_stackid(ctx, &stackmap, 0); > return 0; > } > > char _license[] SEC("license") = "GPL"; It looks like it hard to trigger memory corruption inside the kernel. Maybe kasan can detect it for your specific example. If without selftests, you can do the following: __bpf_get_stack() already solved the problem you tried to fix. I suggest you refactor some portions of the code in __bpf_get_stack() to set trace_nr properly, and then you can use that refactored function in __bpf_get_stackid(). So two patches: 1. refactor portion of codes (related elem_size/trace_nr) in __bpf_get_stack(). 2. fix the issue in __bpf_get_stackid() with newly created function. > > On 01/08/2025 19:16, Lecomte, Arnaud wrote: >> Well, it turns out it is less straightforward than it looked like to >> detect the memory corruption >> without KASAN. I am currently in holidays for the next 3 days so >> I've limited access to a >> computer. I should be able to sort this out on monday. >> >> Thanks, >> Arnaud >> >> On 30/07/2025 08:10, Arnaud Lecomte wrote: >>> On 29/07/2025 23:45, Yonghong Song wrote: >>>> >>>> >>>> On 7/29/25 9:56 AM, Arnaud Lecomte wrote: >>>>> Syzkaller reported a KASAN slab-out-of-bounds write in >>>>> __bpf_get_stackid() >>>>> when copying stack trace data. The issue occurs when the perf trace >>>>> contains more stack entries than the stack map bucket can hold, >>>>> leading to an out-of-bounds write in the bucket's data array. >>>>> For build_id mode, we use sizeof(struct bpf_stack_build_id) >>>>> to determine capacity, and for normal mode we use sizeof(u64). 
>>>>> >>>>> Reported-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com >>>>> Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b >>>>> Tested-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com >>>>> Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> >>>> >>>> Could you add a selftest? This way folks can easily find out what is >>>> the problem and why this fix solves the issue correctly. >>>> >>> Sure, will be done after work >>> Thanks, >>> Arnaud >>>>> --- >>>>> Changes in v2: >>>>> - Use utilty stack_map_data_size to compute map stack map size >>>>> --- >>>>> kernel/bpf/stackmap.c | 8 +++++++- >>>>> 1 file changed, 7 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c >>>>> index 3615c06b7dfa..6f225d477f07 100644 >>>>> --- a/kernel/bpf/stackmap.c >>>>> +++ b/kernel/bpf/stackmap.c >>>>> @@ -230,7 +230,7 @@ static long __bpf_get_stackid(struct bpf_map >>>>> *map, >>>>> struct bpf_stack_map *smap = container_of(map, struct >>>>> bpf_stack_map, map); >>>>> struct stack_map_bucket *bucket, *new_bucket, *old_bucket; >>>>> u32 skip = flags & BPF_F_SKIP_FIELD_MASK; >>>>> - u32 hash, id, trace_nr, trace_len, i; >>>>> + u32 hash, id, trace_nr, trace_len, i, max_depth; >>>>> bool user = flags & BPF_F_USER_STACK; >>>>> u64 *ips; >>>>> bool hash_matches; >>>>> @@ -241,6 +241,12 @@ static long __bpf_get_stackid(struct bpf_map >>>>> *map, >>>>> trace_nr = trace->nr - skip; >>>>> trace_len = trace_nr * sizeof(u64); >>>>> + >>>>> + /* Clamp the trace to max allowed depth */ >>>>> + max_depth = smap->map.value_size / stack_map_data_size(map); >>>>> + if (trace_nr > max_depth) >>>>> + trace_nr = max_depth; >>>>> + >>>>> ips = trace->ip + skip; >>>>> hash = jhash2((u32 *)ips, trace_len / sizeof(u32), 0); >>>>> id = hash & (smap->n_buckets - 1); >>>> >>>> ^ permalink raw reply [flat|nested] 28+ messages in thread
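The clamping that "already solved the problem" in __bpf_get_stack() is visible in the diff context of the next patch; roughly, the existing code does:

	num_elem = size / elem_size;
	max_depth = num_elem + skip;
	if (sysctl_perf_event_max_stack < max_depth)
		max_depth = sysctl_perf_event_max_stack;
	...
	trace_nr = trace->nr - skip;
	trace_nr = (trace_nr <= num_elem) ? trace_nr : num_elem;
	copy_len = trace_nr * elem_size;

__bpf_get_stackid() has no equivalent of the final clamp on trace_nr, which is what the suggested second patch adds.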
* [PATCH 1/2] bpf: refactor max_depth computation in bpf_get_stack() 2025-08-06 1:52 ` Yonghong Song @ 2025-08-07 17:50 ` Arnaud Lecomte 2025-08-07 17:52 ` [PATCH 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte ` (2 more replies) 0 siblings, 3 replies; 28+ messages in thread From: Arnaud Lecomte @ 2025-08-07 17:50 UTC (permalink / raw) To: yonghong.song Cc: andrii, ast, bpf, contact, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs A new helper function stack_map_calculate_max_depth() that computes the max depth for a stackmap. Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> --- kernel/bpf/stackmap.c | 38 ++++++++++++++++++++++++++++++-------- 1 file changed, 30 insertions(+), 8 deletions(-) diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index 3615c06b7dfa..14e034045310 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -42,6 +42,31 @@ static inline int stack_map_data_size(struct bpf_map *map) sizeof(struct bpf_stack_build_id) : sizeof(u64); } +/** + * stack_map_calculate_max_depth - Calculate maximum allowed stack trace depth + * @map_size: Size of the buffer/map value in bytes + * @elem_size: Size of each stack trace element + * @map_flags: BPF stack trace flags (BPF_F_USER_STACK, BPF_F_USER_BUILD_ID, ...) + * + * Return: Maximum number of stack trace entries that can be safely stored, + * or -EINVAL if size is not a multiple of elem_size + */ +static u32 stack_map_calculate_max_depth(u32 map_size, u32 map_elem_size, u64 map_flags) +{ + u32 max_depth; + u32 skip = map_flags & BPF_F_SKIP_FIELD_MASK; + + if (unlikely(map_size%map_elem_size)) + return -EINVAL; + + max_depth = map_size / map_elem_size; + max_depth += skip; + if (max_depth > sysctl_perf_event_max_stack) + return sysctl_perf_event_max_stack; + + return max_depth; +} + static int prealloc_elems_and_freelist(struct bpf_stack_map *smap) { u64 elem_size = sizeof(struct stack_map_bucket) + @@ -406,7 +431,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, struct perf_callchain_entry *trace_in, void *buf, u32 size, u64 flags, bool may_fault) { - u32 trace_nr, copy_len, elem_size, num_elem, max_depth; + u32 trace_nr, copy_len, elem_size, max_depth; bool user_build_id = flags & BPF_F_USER_BUILD_ID; bool crosstask = task && task != current; u32 skip = flags & BPF_F_SKIP_FIELD_MASK; @@ -423,8 +448,6 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, goto clear; elem_size = user_build_id ? sizeof(struct bpf_stack_build_id) : sizeof(u64); - if (unlikely(size % elem_size)) - goto clear; /* cannot get valid user stack for task without user_mode regs */ if (task && user && !user_mode(regs)) @@ -438,10 +461,9 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, goto clear; } - num_elem = size / elem_size; - max_depth = num_elem + skip; - if (sysctl_perf_event_max_stack < max_depth) - max_depth = sysctl_perf_event_max_stack; + max_depth = stack_map_calculate_max_depth(size, elem_size, flags); + if (max_depth < 0) + goto err_fault; if (may_fault) rcu_read_lock(); /* need RCU for perf's callchain below */ @@ -461,7 +483,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, } trace_nr = trace->nr - skip; - trace_nr = (trace_nr <= num_elem) ? 
trace_nr : num_elem; + trace_nr = min(trace_nr, max_depth - skip); copy_len = trace_nr * elem_size; ips = trace->ip + skip; -- 2.43.0 ^ permalink raw reply related [flat|nested] 28+ messages in thread
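A worked example of the new helper, assuming the default sysctl_perf_event_max_stack of 127: with size = 40, elem_size = sizeof(u64) = 8 and skip = 2, stack_map_calculate_max_depth() returns min(40/8 + 2, 127) = 7, and the caller then copies at most max_depth - skip = 5 entries -- the same bound the removed num_elem variable used to enforce.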
* [PATCH 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() 2025-08-07 17:50 ` [PATCH 1/2] bpf: refactor max_depth computation in bpf_get_stack() Arnaud Lecomte @ 2025-08-07 17:52 ` Arnaud Lecomte 2025-08-07 19:05 ` Yonghong Song 2025-08-07 19:01 ` [PATCH 1/2] bpf: refactor max_depth computation in bpf_get_stack() Yonghong Song 2025-08-08 7:30 ` [syzbot ci] " syzbot ci 2 siblings, 1 reply; 28+ messages in thread From: Arnaud Lecomte @ 2025-08-07 17:52 UTC (permalink / raw) To: yonghong.song Cc: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs, Arnaud Lecomte Syzkaller reported a KASAN slab-out-of-bounds write in __bpf_get_stackid() when copying stack trace data. The issue occurs when the perf trace contains more stack entries than the stack map bucket can hold, leading to an out-of-bounds write in the bucket's data array. Reported-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> --- kernel/bpf/stackmap.c | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index 14e034045310..d7ef840971f0 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -250,7 +250,7 @@ get_callchain_entry_for_task(struct task_struct *task, u32 max_depth) } static long __bpf_get_stackid(struct bpf_map *map, - struct perf_callchain_entry *trace, u64 flags) + struct perf_callchain_entry *trace, u64 flags, u32 max_depth) { struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map); struct stack_map_bucket *bucket, *new_bucket, *old_bucket; @@ -266,6 +266,8 @@ static long __bpf_get_stackid(struct bpf_map *map, trace_nr = trace->nr - skip; trace_len = trace_nr * sizeof(u64); + trace_nr = min(trace_nr, max_depth - skip); + ips = trace->ip + skip; hash = jhash2((u32 *)ips, trace_len / sizeof(u32), 0); id = hash & (smap->n_buckets - 1); @@ -325,19 +327,19 @@ static long __bpf_get_stackid(struct bpf_map *map, BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map, u64, flags) { - u32 max_depth = map->value_size / stack_map_data_size(map); - u32 skip = flags & BPF_F_SKIP_FIELD_MASK; + u32 elem_size = stack_map_data_size(map); bool user = flags & BPF_F_USER_STACK; struct perf_callchain_entry *trace; bool kernel = !user; + u32 max_depth; if (unlikely(flags & ~(BPF_F_SKIP_FIELD_MASK | BPF_F_USER_STACK | BPF_F_FAST_STACK_CMP | BPF_F_REUSE_STACKID))) return -EINVAL; - max_depth += skip; - if (max_depth > sysctl_perf_event_max_stack) - max_depth = sysctl_perf_event_max_stack; + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); + if (max_depth < 0) + return -EFAULT; trace = get_perf_callchain(regs, 0, kernel, user, max_depth, false, false); @@ -346,7 +348,7 @@ BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map, /* couldn't fetch the stack trace */ return -EFAULT; - return __bpf_get_stackid(map, trace, flags); + return __bpf_get_stackid(map, trace, flags, max_depth); } const struct bpf_func_proto bpf_get_stackid_proto = { @@ -378,6 +380,7 @@ BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, bool kernel, user; __u64 nr_kernel; int ret; + u32 elem_size, pe_max_depth; /* perf_sample_data doesn't have callchain, use bpf_get_stackid */ if (!(event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)) @@ 
-396,24 +399,25 @@ BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, return -EFAULT; nr_kernel = count_kernel_ip(trace); - + elem_size = stack_map_data_size(map); if (kernel) { __u64 nr = trace->nr; trace->nr = nr_kernel; - ret = __bpf_get_stackid(map, trace, flags); + pe_max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); + ret = __bpf_get_stackid(map, trace, flags, pe_max_depth); /* restore nr */ trace->nr = nr; } else { /* user */ u64 skip = flags & BPF_F_SKIP_FIELD_MASK; - skip += nr_kernel; if (skip > BPF_F_SKIP_FIELD_MASK) return -EFAULT; flags = (flags & ~BPF_F_SKIP_FIELD_MASK) | skip; - ret = __bpf_get_stackid(map, trace, flags); + pe_max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); + ret = __bpf_get_stackid(map, trace, flags, pe_max_depth); } return ret; } -- 2.43.0 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() 2025-08-07 17:52 ` [PATCH 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte @ 2025-08-07 19:05 ` Yonghong Song 0 siblings, 0 replies; 28+ messages in thread From: Yonghong Song @ 2025-08-07 19:05 UTC (permalink / raw) To: Arnaud Lecomte Cc: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs On 8/7/25 10:52 AM, Arnaud Lecomte wrote: > Syzkaller reported a KASAN slab-out-of-bounds write in __bpf_get_stackid() > when copying stack trace data. The issue occurs when the perf trace > contains more stack entries than the stack map bucket can hold, > leading to an out-of-bounds write in the bucket's data array. > > Reported-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b > Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> > --- > kernel/bpf/stackmap.c | 26 +++++++++++++++----------- > 1 file changed, 15 insertions(+), 11 deletions(-) > > diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c > index 14e034045310..d7ef840971f0 100644 > --- a/kernel/bpf/stackmap.c > +++ b/kernel/bpf/stackmap.c > @@ -250,7 +250,7 @@ get_callchain_entry_for_task(struct task_struct *task, u32 max_depth) > } > > static long __bpf_get_stackid(struct bpf_map *map, > - struct perf_callchain_entry *trace, u64 flags) > + struct perf_callchain_entry *trace, u64 flags, u32 max_depth) > { > struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map); > struct stack_map_bucket *bucket, *new_bucket, *old_bucket; > @@ -266,6 +266,8 @@ static long __bpf_get_stackid(struct bpf_map *map, > > trace_nr = trace->nr - skip; > trace_len = trace_nr * sizeof(u64); > + trace_nr = min(trace_nr, max_depth - skip); > + > ips = trace->ip + skip; > hash = jhash2((u32 *)ips, trace_len / sizeof(u32), 0); > id = hash & (smap->n_buckets - 1); > @@ -325,19 +327,19 @@ static long __bpf_get_stackid(struct bpf_map *map, > BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map, > u64, flags) > { > - u32 max_depth = map->value_size / stack_map_data_size(map); > - u32 skip = flags & BPF_F_SKIP_FIELD_MASK; > + u32 elem_size = stack_map_data_size(map); > bool user = flags & BPF_F_USER_STACK; > struct perf_callchain_entry *trace; > bool kernel = !user; > + u32 max_depth; > > if (unlikely(flags & ~(BPF_F_SKIP_FIELD_MASK | BPF_F_USER_STACK | > BPF_F_FAST_STACK_CMP | BPF_F_REUSE_STACKID))) > return -EINVAL; > > - max_depth += skip; > - if (max_depth > sysctl_perf_event_max_stack) > - max_depth = sysctl_perf_event_max_stack; > + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); > + if (max_depth < 0) > + return -EFAULT; the above condition is not needed. > > trace = get_perf_callchain(regs, 0, kernel, user, max_depth, > false, false); > @@ -346,7 +348,7 @@ BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map, > /* couldn't fetch the stack trace */ > return -EFAULT; > > - return __bpf_get_stackid(map, trace, flags); > + return __bpf_get_stackid(map, trace, flags, max_depth); > } > > const struct bpf_func_proto bpf_get_stackid_proto = { > @@ -378,6 +380,7 @@ BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, > bool kernel, user; > __u64 nr_kernel; > int ret; > + u32 elem_size, pe_max_depth; pe_max_depth -> max_depth. 
> > /* perf_sample_data doesn't have callchain, use bpf_get_stackid */ > if (!(event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)) > @@ -396,24 +399,25 @@ BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, > return -EFAULT; > > nr_kernel = count_kernel_ip(trace); > - > + elem_size = stack_map_data_size(map); > if (kernel) { > __u64 nr = trace->nr; > > trace->nr = nr_kernel; > - ret = __bpf_get_stackid(map, trace, flags); > + pe_max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); > + ret = __bpf_get_stackid(map, trace, flags, pe_max_depth); > > /* restore nr */ > trace->nr = nr; > } else { /* user */ > u64 skip = flags & BPF_F_SKIP_FIELD_MASK; > - please keep an empty line here. > skip += nr_kernel; > if (skip > BPF_F_SKIP_FIELD_MASK) > return -EFAULT; > > flags = (flags & ~BPF_F_SKIP_FIELD_MASK) | skip; > - ret = __bpf_get_stackid(map, trace, flags); > + pe_max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); > + ret = __bpf_get_stackid(map, trace, flags, pe_max_depth); > } > return ret; > } ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/2] bpf: refactor max_depth computation in bpf_get_stack() 2025-08-07 17:50 ` [PATCH 1/2] bpf: refactor max_depth computation in bpf_get_stack() Arnaud Lecomte 2025-08-07 17:52 ` [PATCH 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte @ 2025-08-07 19:01 ` Yonghong Song 2025-08-07 19:07 ` Yonghong Song 2025-08-08 7:30 ` [syzbot ci] " syzbot ci 2 siblings, 1 reply; 28+ messages in thread From: Yonghong Song @ 2025-08-07 19:01 UTC (permalink / raw) To: Arnaud Lecomte Cc: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs On 8/7/25 10:50 AM, Arnaud Lecomte wrote: > A new helper function stack_map_calculate_max_depth() that > computes the max depth for a stackmap. > > Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> > --- > kernel/bpf/stackmap.c | 38 ++++++++++++++++++++++++++++++-------- > 1 file changed, 30 insertions(+), 8 deletions(-) > > diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c > index 3615c06b7dfa..14e034045310 100644 > --- a/kernel/bpf/stackmap.c > +++ b/kernel/bpf/stackmap.c > @@ -42,6 +42,31 @@ static inline int stack_map_data_size(struct bpf_map *map) > sizeof(struct bpf_stack_build_id) : sizeof(u64); > } > > +/** > + * stack_map_calculate_max_depth - Calculate maximum allowed stack trace depth > + * @map_size: Size of the buffer/map value in bytes > + * @elem_size: Size of each stack trace element > + * @map_flags: BPF stack trace flags (BPF_F_USER_STACK, BPF_F_USER_BUILD_ID, ...) > + * > + * Return: Maximum number of stack trace entries that can be safely stored, > + * or -EINVAL if size is not a multiple of elem_size -EINVAL is not needed here. See below. > + */ > +static u32 stack_map_calculate_max_depth(u32 map_size, u32 map_elem_size, u64 map_flags) map_elem_size -> elem_size > +{ > + u32 max_depth; > + u32 skip = map_flags & BPF_F_SKIP_FIELD_MASK; reverse Christmas tree? > + > + if (unlikely(map_size%map_elem_size)) > + return -EINVAL; The above should not be here. The checking 'map_size % map_elem_size' is only needed for bpf_get_stack(), not applicable for bpf_get_stackid(). > + > + max_depth = map_size / map_elem_size; > + max_depth += skip; > + if (max_depth > sysctl_perf_event_max_stack) > + return sysctl_perf_event_max_stack; > + > + return max_depth; > +} > + > static int prealloc_elems_and_freelist(struct bpf_stack_map *smap) > { > u64 elem_size = sizeof(struct stack_map_bucket) + > @@ -406,7 +431,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, > struct perf_callchain_entry *trace_in, > void *buf, u32 size, u64 flags, bool may_fault) > { > - u32 trace_nr, copy_len, elem_size, num_elem, max_depth; > + u32 trace_nr, copy_len, elem_size, max_depth; > bool user_build_id = flags & BPF_F_USER_BUILD_ID; > bool crosstask = task && task != current; > u32 skip = flags & BPF_F_SKIP_FIELD_MASK; > @@ -423,8 +448,6 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, > goto clear; > > elem_size = user_build_id ? sizeof(struct bpf_stack_build_id) : sizeof(u64); > - if (unlikely(size % elem_size)) > - goto clear; Please keep this one. 
> > /* cannot get valid user stack for task without user_mode regs */ > if (task && user && !user_mode(regs)) > @@ -438,10 +461,9 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, > goto clear; > } > > - num_elem = size / elem_size; > - max_depth = num_elem + skip; > - if (sysctl_perf_event_max_stack < max_depth) > - max_depth = sysctl_perf_event_max_stack; > + max_depth = stack_map_calculate_max_depth(size, elem_size, flags); > + if (max_depth < 0) > + goto err_fault; max_depth is never less than 0. > > if (may_fault) > rcu_read_lock(); /* need RCU for perf's callchain below */ > @@ -461,7 +483,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, > } > > trace_nr = trace->nr - skip; > - trace_nr = (trace_nr <= num_elem) ? trace_nr : num_elem; > + trace_nr = min(trace_nr, max_depth - skip); > copy_len = trace_nr * elem_size; > > ips = trace->ip + skip; ^ permalink raw reply [flat|nested] 28+ messages in thread
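The reason the flagged check can never fire: both max_depth and the return type of stack_map_calculate_max_depth() are u32, so "max_depth < 0" compares an unsigned value against zero and is always false (GCC warns about this with -Wtype-limits). The error path is dead code, and dropping it as suggested loses nothing.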
* Re: [PATCH 1/2] bpf: refactor max_depth computation in bpf_get_stack() 2025-08-07 19:01 ` [PATCH 1/2] bpf: refactor max_depth computation in bpf_get_stack() Yonghong Song @ 2025-08-07 19:07 ` Yonghong Song 2025-08-09 11:56 ` [PATCH v2 " Arnaud Lecomte 0 siblings, 1 reply; 28+ messages in thread From: Yonghong Song @ 2025-08-07 19:07 UTC (permalink / raw) To: Arnaud Lecomte Cc: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs On 8/7/25 12:01 PM, Yonghong Song wrote: > > > On 8/7/25 10:50 AM, Arnaud Lecomte wrote: >> A new helper function stack_map_calculate_max_depth() that >> computes the max depth for a stackmap. >> >> Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> >> --- >> kernel/bpf/stackmap.c | 38 ++++++++++++++++++++++++++++++-------- >> 1 file changed, 30 insertions(+), 8 deletions(-) >> >> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c >> index 3615c06b7dfa..14e034045310 100644 >> --- a/kernel/bpf/stackmap.c >> +++ b/kernel/bpf/stackmap.c >> @@ -42,6 +42,31 @@ static inline int stack_map_data_size(struct >> bpf_map *map) >> sizeof(struct bpf_stack_build_id) : sizeof(u64); >> } >> +/** >> + * stack_map_calculate_max_depth - Calculate maximum allowed stack >> trace depth >> + * @map_size: Size of the buffer/map value in bytes >> + * @elem_size: Size of each stack trace element >> + * @map_flags: BPF stack trace flags (BPF_F_USER_STACK, >> BPF_F_USER_BUILD_ID, ...) One more thing: map_flags -> flags, as 'flags is used in bpf_get_stackid/bpf_get_stack etc. >> + * >> + * Return: Maximum number of stack trace entries that can be safely >> stored, >> + * or -EINVAL if size is not a multiple of elem_size > > -EINVAL is not needed here. See below. [...] ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH v2 1/2] bpf: refactor max_depth computation in bpf_get_stack() 2025-08-07 19:07 ` Yonghong Song @ 2025-08-09 11:56 ` Arnaud Lecomte 2025-08-09 11:58 ` [PATCH v2 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte 0 siblings, 1 reply; 28+ messages in thread From: Arnaud Lecomte @ 2025-08-09 11:56 UTC (permalink / raw) To: yonghong.song Cc: andrii, ast, bpf, contact, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs A new helper function stack_map_calculate_max_depth() that computes the max depth for a stackmap. Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> --- kernel/bpf/stackmap.c | 30 ++++++++++++++++++++++++------ 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index 3615c06b7dfa..532447606532 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -42,6 +42,27 @@ static inline int stack_map_data_size(struct bpf_map *map) sizeof(struct bpf_stack_build_id) : sizeof(u64); } +/** + * stack_map_calculate_max_depth - Calculate maximum allowed stack trace depth + * @map_size: Size of the buffer/map value in bytes + * @elem_size: Size of each stack trace element + * @flags: BPF stack trace flags (BPF_F_USER_STACK, BPF_F_USER_BUILD_ID, ...) + * + * Return: Maximum number of stack trace entries that can be safely stored + */ +static u32 stack_map_calculate_max_depth(u32 map_size, u32 elem_size, u64 flags) +{ + u32 skip = flags & BPF_F_SKIP_FIELD_MASK; + u32 max_depth; + + max_depth = map_size / elem_size; + max_depth += skip; + if (max_depth > sysctl_perf_event_max_stack) + return sysctl_perf_event_max_stack; + + return max_depth; +} + static int prealloc_elems_and_freelist(struct bpf_stack_map *smap) { u64 elem_size = sizeof(struct stack_map_bucket) + @@ -406,7 +427,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, struct perf_callchain_entry *trace_in, void *buf, u32 size, u64 flags, bool may_fault) { - u32 trace_nr, copy_len, elem_size, num_elem, max_depth; + u32 trace_nr, copy_len, elem_size, max_depth; bool user_build_id = flags & BPF_F_USER_BUILD_ID; bool crosstask = task && task != current; u32 skip = flags & BPF_F_SKIP_FIELD_MASK; @@ -438,10 +459,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, goto clear; } - num_elem = size / elem_size; - max_depth = num_elem + skip; - if (sysctl_perf_event_max_stack < max_depth) - max_depth = sysctl_perf_event_max_stack; + max_depth = stack_map_calculate_max_depth(size, elem_size, flags); if (may_fault) rcu_read_lock(); /* need RCU for perf's callchain below */ @@ -461,7 +479,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, } trace_nr = trace->nr - skip; - trace_nr = (trace_nr <= num_elem) ? trace_nr : num_elem; + trace_nr = min(trace_nr, max_depth - skip); copy_len = trace_nr * elem_size; ips = trace->ip + skip; -- 2.43.0 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v2 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() 2025-08-09 11:56 ` [PATCH v2 " Arnaud Lecomte @ 2025-08-09 11:58 ` Arnaud Lecomte 2025-08-09 12:09 ` [PATCH RESEND v2 1/2] bpf: refactor max_depth computation in bpf_get_stack() Arnaud Lecomte 0 siblings, 1 reply; 28+ messages in thread From: Arnaud Lecomte @ 2025-08-09 11:58 UTC (permalink / raw) To: yonghong.song Cc: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs, Arnaud Lecomte Syzkaller reported a KASAN slab-out-of-bounds write in __bpf_get_stackid() when copying stack trace data. The issue occurs when the perf trace contains more stack entries than the stack map bucket can hold, leading to an out-of-bounds write in the bucket's data array. Reported-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> --- kernel/bpf/stackmap.c | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index 532447606532..30c4f7f2ccd1 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -246,7 +246,7 @@ get_callchain_entry_for_task(struct task_struct *task, u32 max_depth) } static long __bpf_get_stackid(struct bpf_map *map, - struct perf_callchain_entry *trace, u64 flags) + struct perf_callchain_entry *trace, u64 flags, u32 max_depth) { struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map); struct stack_map_bucket *bucket, *new_bucket, *old_bucket; @@ -262,6 +262,8 @@ static long __bpf_get_stackid(struct bpf_map *map, trace_nr = trace->nr - skip; trace_len = trace_nr * sizeof(u64); + trace_nr = min(trace_nr, max_depth - skip); + ips = trace->ip + skip; hash = jhash2((u32 *)ips, trace_len / sizeof(u32), 0); id = hash & (smap->n_buckets - 1); @@ -321,19 +323,19 @@ static long __bpf_get_stackid(struct bpf_map *map, BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map, u64, flags) { - u32 max_depth = map->value_size / stack_map_data_size(map); - u32 skip = flags & BPF_F_SKIP_FIELD_MASK; + u32 elem_size = stack_map_data_size(map); bool user = flags & BPF_F_USER_STACK; struct perf_callchain_entry *trace; bool kernel = !user; + u32 max_depth; if (unlikely(flags & ~(BPF_F_SKIP_FIELD_MASK | BPF_F_USER_STACK | BPF_F_FAST_STACK_CMP | BPF_F_REUSE_STACKID))) return -EINVAL; - max_depth += skip; - if (max_depth > sysctl_perf_event_max_stack) - max_depth = sysctl_perf_event_max_stack; + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); + if (max_depth < 0) + return -EFAULT; trace = get_perf_callchain(regs, 0, kernel, user, max_depth, false, false); @@ -342,7 +344,7 @@ BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map, /* couldn't fetch the stack trace */ return -EFAULT; - return __bpf_get_stackid(map, trace, flags); + return __bpf_get_stackid(map, trace, flags, max_depth); } const struct bpf_func_proto bpf_get_stackid_proto = { @@ -374,6 +376,7 @@ BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, bool kernel, user; __u64 nr_kernel; int ret; + u32 elem_size, pe_max_depth; /* perf_sample_data doesn't have callchain, use bpf_get_stackid */ if (!(event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)) @@ -392,24 +395,25 @@ BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, return -EFAULT; nr_kernel 
= count_kernel_ip(trace); - + elem_size = stack_map_data_size(map); if (kernel) { __u64 nr = trace->nr; trace->nr = nr_kernel; - ret = __bpf_get_stackid(map, trace, flags); + pe_max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); + ret = __bpf_get_stackid(map, trace, flags, pe_max_depth); /* restore nr */ trace->nr = nr; } else { /* user */ u64 skip = flags & BPF_F_SKIP_FIELD_MASK; - skip += nr_kernel; if (skip > BPF_F_SKIP_FIELD_MASK) return -EFAULT; flags = (flags & ~BPF_F_SKIP_FIELD_MASK) | skip; - ret = __bpf_get_stackid(map, trace, flags); + pe_max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); + ret = __bpf_get_stackid(map, trace, flags, pe_max_depth); } return ret; } -- 2.43.0 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH RESEND v2 1/2] bpf: refactor max_depth computation in bpf_get_stack() 2025-08-09 11:58 ` [PATCH v2 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte @ 2025-08-09 12:09 ` Arnaud Lecomte 2025-08-09 12:14 ` [PATCH RESEND v2 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte 2025-08-12 4:39 ` [PATCH RESEND v2 1/2] bpf: refactor max_depth computation in bpf_get_stack() Yonghong Song 0 siblings, 2 replies; 28+ messages in thread From: Arnaud Lecomte @ 2025-08-09 12:09 UTC (permalink / raw) To: yonghong.song Cc: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs, Arnaud Lecomte A new helper function stack_map_calculate_max_depth() that computes the max depth for a stackmap. Changes in v2: - Removed the checking 'map_size % map_elem_size' from stack_map_calculate_max_depth - Changed stack_map_calculate_max_depth params name to be more generic Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> --- kernel/bpf/stackmap.c | 30 ++++++++++++++++++++++++------ 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index 3615c06b7dfa..532447606532 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -42,6 +42,27 @@ static inline int stack_map_data_size(struct bpf_map *map) sizeof(struct bpf_stack_build_id) : sizeof(u64); } +/** + * stack_map_calculate_max_depth - Calculate maximum allowed stack trace depth + * @map_size: Size of the buffer/map value in bytes + * @elem_size: Size of each stack trace element + * @flags: BPF stack trace flags (BPF_F_USER_STACK, BPF_F_USER_BUILD_ID, ...) + * + * Return: Maximum number of stack trace entries that can be safely stored + */ +static u32 stack_map_calculate_max_depth(u32 map_size, u32 elem_size, u64 flags) +{ + u32 skip = flags & BPF_F_SKIP_FIELD_MASK; + u32 max_depth; + + max_depth = map_size / elem_size; + max_depth += skip; + if (max_depth > sysctl_perf_event_max_stack) + return sysctl_perf_event_max_stack; + + return max_depth; +} + static int prealloc_elems_and_freelist(struct bpf_stack_map *smap) { u64 elem_size = sizeof(struct stack_map_bucket) + @@ -406,7 +427,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, struct perf_callchain_entry *trace_in, void *buf, u32 size, u64 flags, bool may_fault) { - u32 trace_nr, copy_len, elem_size, num_elem, max_depth; + u32 trace_nr, copy_len, elem_size, max_depth; bool user_build_id = flags & BPF_F_USER_BUILD_ID; bool crosstask = task && task != current; u32 skip = flags & BPF_F_SKIP_FIELD_MASK; @@ -438,10 +459,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, goto clear; } - num_elem = size / elem_size; - max_depth = num_elem + skip; - if (sysctl_perf_event_max_stack < max_depth) - max_depth = sysctl_perf_event_max_stack; + max_depth = stack_map_calculate_max_depth(size, elem_size, flags); if (may_fault) rcu_read_lock(); /* need RCU for perf's callchain below */ @@ -461,7 +479,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, } trace_nr = trace->nr - skip; - trace_nr = (trace_nr <= num_elem) ? trace_nr : num_elem; + trace_nr = min(trace_nr, max_depth - skip); copy_len = trace_nr * elem_size; ips = trace->ip + skip; -- 2.43.0 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH RESEND v2 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() 2025-08-09 12:09 ` [PATCH RESEND v2 1/2] bpf: refactor max_depth computation in bpf_get_stack() Arnaud Lecomte @ 2025-08-09 12:14 ` Arnaud Lecomte 2025-08-12 4:39 ` [PATCH RESEND v2 1/2] bpf: refactor max_depth computation in bpf_get_stack() Yonghong Song 1 sibling, 0 replies; 28+ messages in thread From: Arnaud Lecomte @ 2025-08-09 12:14 UTC (permalink / raw) To: yonghong.song Cc: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs, Arnaud Lecomte Syzkaller reported a KASAN slab-out-of-bounds write in __bpf_get_stackid() when copying stack trace data. The issue occurs when the perf trace contains more stack entries than the stack map bucket can hold, leading to an out-of-bounds write in the bucket's data array. Changes in v2: - Fixed max_depth names across get stack id Reported-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> --- kernel/bpf/stackmap.c | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index 532447606532..b3995724776c 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -246,7 +246,7 @@ get_callchain_entry_for_task(struct task_struct *task, u32 max_depth) } static long __bpf_get_stackid(struct bpf_map *map, - struct perf_callchain_entry *trace, u64 flags) + struct perf_callchain_entry *trace, u64 flags, u32 max_depth) { struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map); struct stack_map_bucket *bucket, *new_bucket, *old_bucket; @@ -262,6 +262,8 @@ static long __bpf_get_stackid(struct bpf_map *map, trace_nr = trace->nr - skip; trace_len = trace_nr * sizeof(u64); + trace_nr = min(trace_nr, max_depth - skip); + ips = trace->ip + skip; hash = jhash2((u32 *)ips, trace_len / sizeof(u32), 0); id = hash & (smap->n_buckets - 1); @@ -321,19 +323,17 @@ static long __bpf_get_stackid(struct bpf_map *map, BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map, u64, flags) { - u32 max_depth = map->value_size / stack_map_data_size(map); - u32 skip = flags & BPF_F_SKIP_FIELD_MASK; + u32 elem_size = stack_map_data_size(map); bool user = flags & BPF_F_USER_STACK; struct perf_callchain_entry *trace; bool kernel = !user; + u32 max_depth; if (unlikely(flags & ~(BPF_F_SKIP_FIELD_MASK | BPF_F_USER_STACK | BPF_F_FAST_STACK_CMP | BPF_F_REUSE_STACKID))) return -EINVAL; - max_depth += skip; - if (max_depth > sysctl_perf_event_max_stack) - max_depth = sysctl_perf_event_max_stack; + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); trace = get_perf_callchain(regs, 0, kernel, user, max_depth, false, false); @@ -342,7 +342,7 @@ BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map, /* couldn't fetch the stack trace */ return -EFAULT; - return __bpf_get_stackid(map, trace, flags); + return __bpf_get_stackid(map, trace, flags, max_depth); } const struct bpf_func_proto bpf_get_stackid_proto = { @@ -374,6 +374,7 @@ BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, bool kernel, user; __u64 nr_kernel; int ret; + u32 elem_size, max_depth; /* perf_sample_data doesn't have callchain, use bpf_get_stackid */ if (!(event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)) @@ -392,16 +393,18 @@ 
BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, return -EFAULT; nr_kernel = count_kernel_ip(trace); - + elem_size = stack_map_data_size(map); if (kernel) { __u64 nr = trace->nr; trace->nr = nr_kernel; - ret = __bpf_get_stackid(map, trace, flags); + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); + ret = __bpf_get_stackid(map, trace, flags, max_depth); /* restore nr */ trace->nr = nr; } else { /* user */ + u64 skip = flags & BPF_F_SKIP_FIELD_MASK; skip += nr_kernel; @@ -409,7 +412,8 @@ BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, return -EFAULT; flags = (flags & ~BPF_F_SKIP_FIELD_MASK) | skip; - ret = __bpf_get_stackid(map, trace, flags); + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); + ret = __bpf_get_stackid(map, trace, flags, max_depth); } return ret; } -- 2.43.0 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH RESEND v2 1/2] bpf: refactor max_depth computation in bpf_get_stack() 2025-08-09 12:09 ` [PATCH RESEND v2 1/2] bpf: refactor max_depth computation in bpf_get_stack() Arnaud Lecomte 2025-08-09 12:14 ` [PATCH RESEND v2 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte @ 2025-08-12 4:39 ` Yonghong Song 2025-08-12 19:30 ` [PATCH bpf-next v3 " Arnaud Lecomte 2025-08-12 19:32 ` [PATCH RESEND v2 " Arnaud Lecomte 1 sibling, 2 replies; 28+ messages in thread From: Yonghong Song @ 2025-08-12 4:39 UTC (permalink / raw) To: Arnaud Lecomte Cc: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs On 8/9/25 5:09 AM, Arnaud Lecomte wrote: > A new helper function stack_map_calculate_max_depth() that > computes the max depth for a stackmap. Please add 'bpf-next' in the subject like [PATCH bpf-next v2 1/2] so CI can properly test the patch set. > > Changes in v2: > - Removed the checking 'map_size % map_elem_size' from stack_map_calculate_max_depth > - Changed stack_map_calculate_max_depth params name to be more generic > > Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> > --- > kernel/bpf/stackmap.c | 30 ++++++++++++++++++++++++------ > 1 file changed, 24 insertions(+), 6 deletions(-) > > diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c > index 3615c06b7dfa..532447606532 100644 > --- a/kernel/bpf/stackmap.c > +++ b/kernel/bpf/stackmap.c > @@ -42,6 +42,27 @@ static inline int stack_map_data_size(struct bpf_map *map) > sizeof(struct bpf_stack_build_id) : sizeof(u64); > } > > +/** > + * stack_map_calculate_max_depth - Calculate maximum allowed stack trace depth > + * @map_size: Size of the buffer/map value in bytes let us rename 'map_size' to 'size' since the size represents size of buffer or map, not just for map. > + * @elem_size: Size of each stack trace element > + * @flags: BPF stack trace flags (BPF_F_USER_STACK, BPF_F_USER_BUILD_ID, ...) > + * > + * Return: Maximum number of stack trace entries that can be safely stored > + */ > +static u32 stack_map_calculate_max_depth(u32 map_size, u32 elem_size, u64 flags) map_size -> size Also, you can replace 'flags' to 'skip', so below 'u32 skip = flags & BPF_F_SKIP_FIELD_MASK' is not necessary. 
> +{ > + u32 skip = flags & BPF_F_SKIP_FIELD_MASK; > + u32 max_depth; > + > + max_depth = map_size / elem_size; > + max_depth += skip; > + if (max_depth > sysctl_perf_event_max_stack) > + return sysctl_perf_event_max_stack; > + > + return max_depth; > +} > + > static int prealloc_elems_and_freelist(struct bpf_stack_map *smap) > { > u64 elem_size = sizeof(struct stack_map_bucket) + > @@ -406,7 +427,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, > struct perf_callchain_entry *trace_in, > void *buf, u32 size, u64 flags, bool may_fault) > { > - u32 trace_nr, copy_len, elem_size, num_elem, max_depth; > + u32 trace_nr, copy_len, elem_size, max_depth; > bool user_build_id = flags & BPF_F_USER_BUILD_ID; > bool crosstask = task && task != current; > u32 skip = flags & BPF_F_SKIP_FIELD_MASK; > @@ -438,10 +459,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, > goto clear; > } > > - num_elem = size / elem_size; > - max_depth = num_elem + skip; > - if (sysctl_perf_event_max_stack < max_depth) > - max_depth = sysctl_perf_event_max_stack; > + max_depth = stack_map_calculate_max_depth(size, elem_size, flags); > > if (may_fault) > rcu_read_lock(); /* need RCU for perf's callchain below */ > @@ -461,7 +479,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, > } > > trace_nr = trace->nr - skip; > - trace_nr = (trace_nr <= num_elem) ? trace_nr : num_elem; > + trace_nr = min(trace_nr, max_depth - skip); > copy_len = trace_nr * elem_size; > > ips = trace->ip + skip; ^ permalink raw reply [flat|nested] 28+ messages in thread
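[Editorial sketch] If both suggestions were applied literally, the helper might look roughly like the following; this is only a sketch of the reviewer's idea, and the v3 actually posted later in the thread keeps the flags parameter:

	static u32 stack_map_calculate_max_depth(u32 size, u32 elem_size, u32 skip)
	{
		u32 max_depth;

		/* entries that fit in the buffer, plus the frames to be skipped */
		max_depth = size / elem_size;
		max_depth += skip;
		if (max_depth > sysctl_perf_event_max_stack)
			return sysctl_perf_event_max_stack;

		return max_depth;
	}

	/* a caller would then extract skip itself, e.g.: */
	max_depth = stack_map_calculate_max_depth(size, elem_size,
						  flags & BPF_F_SKIP_FIELD_MASK);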
* [PATCH bpf-next v3 1/2] bpf: refactor max_depth computation in bpf_get_stack() 2025-08-12 4:39 ` [PATCH RESEND v2 1/2] bpf: refactor max_depth computation in bpf_get_stack() Yonghong Song @ 2025-08-12 19:30 ` Arnaud Lecomte 2025-08-12 19:32 ` [PATCH bpf-next v3 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte 2025-08-13 5:54 ` [PATCH bpf-next v3 1/2] bpf: refactor max_depth computation in bpf_get_stack() Yonghong Song 2025-08-12 19:32 ` [PATCH RESEND v2 " Arnaud Lecomte 1 sibling, 2 replies; 28+ messages in thread From: Arnaud Lecomte @ 2025-08-12 19:30 UTC (permalink / raw) To: yonghong.song Cc: andrii, ast, bpf, contact, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs A new helper function stack_map_calculate_max_depth() that computes the max depth for a stackmap. Changes in v2: - Removed the checking 'map_size % map_elem_size' from stack_map_calculate_max_depth - Changed stack_map_calculate_max_depth params name to be more generic Changes in v3: - Changed map size param to size in max depth helper Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> --- kernel/bpf/stackmap.c | 30 ++++++++++++++++++++++++------ 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index 3615c06b7dfa..a267567e36dd 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -42,6 +42,27 @@ static inline int stack_map_data_size(struct bpf_map *map) sizeof(struct bpf_stack_build_id) : sizeof(u64); } +/** + * stack_map_calculate_max_depth - Calculate maximum allowed stack trace depth + * @size: Size of the buffer/map value in bytes + * @elem_size: Size of each stack trace element + * @flags: BPF stack trace flags (BPF_F_USER_STACK, BPF_F_USER_BUILD_ID, ...) + * + * Return: Maximum number of stack trace entries that can be safely stored + */ +static u32 stack_map_calculate_max_depth(u32 size, u32 elem_size, u64 flags) +{ + u32 skip = flags & BPF_F_SKIP_FIELD_MASK; + u32 max_depth; + + max_depth = size / elem_size; + max_depth += skip; + if (max_depth > sysctl_perf_event_max_stack) + return sysctl_perf_event_max_stack; + + return max_depth; +} + static int prealloc_elems_and_freelist(struct bpf_stack_map *smap) { u64 elem_size = sizeof(struct stack_map_bucket) + @@ -406,7 +427,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, struct perf_callchain_entry *trace_in, void *buf, u32 size, u64 flags, bool may_fault) { - u32 trace_nr, copy_len, elem_size, num_elem, max_depth; + u32 trace_nr, copy_len, elem_size, max_depth; bool user_build_id = flags & BPF_F_USER_BUILD_ID; bool crosstask = task && task != current; u32 skip = flags & BPF_F_SKIP_FIELD_MASK; @@ -438,10 +459,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, goto clear; } - num_elem = size / elem_size; - max_depth = num_elem + skip; - if (sysctl_perf_event_max_stack < max_depth) - max_depth = sysctl_perf_event_max_stack; + max_depth = stack_map_calculate_max_depth(size, elem_size, flags); if (may_fault) rcu_read_lock(); /* need RCU for perf's callchain below */ @@ -461,7 +479,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, } trace_nr = trace->nr - skip; - trace_nr = (trace_nr <= num_elem) ? 
trace_nr : num_elem; + trace_nr = min(trace_nr, max_depth - skip); copy_len = trace_nr * elem_size; ips = trace->ip + skip; -- 2.43.0 ^ permalink raw reply related [flat|nested] 28+ messages in thread
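[Editorial sketch] A quick worked example of the refactored __bpf_get_stack() path, using assumed values and the default sysctl_perf_event_max_stack of 127:

	size      = 80;      /* caller's buffer, in bytes               */
	elem_size = 8;       /* sizeof(u64), non-build_id case          */
	skip      = 3;       /* from BPF_F_SKIP_FIELD_MASK              */

	max_depth = stack_map_calculate_max_depth(80, 8, flags);  /* 80/8 + 3 = 13 */
	/* get_perf_callchain() is asked for at most 13 entries                    */
	trace_nr  = min(trace->nr - 3, 13 - 3);                   /* at most 10    */
	copy_len  = trace_nr * 8;                                 /* at most 80 bytes, fits the buffer */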
* [PATCH bpf-next v3 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() 2025-08-12 19:30 ` [PATCH bpf-next v3 " Arnaud Lecomte @ 2025-08-12 19:32 ` Arnaud Lecomte 2025-08-13 5:59 ` Yonghong Song 2025-08-13 5:54 ` [PATCH bpf-next v3 1/2] bpf: refactor max_depth computation in bpf_get_stack() Yonghong Song 1 sibling, 1 reply; 28+ messages in thread From: Arnaud Lecomte @ 2025-08-12 19:32 UTC (permalink / raw) To: yonghong.song Cc: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs, Arnaud Lecomte Syzkaller reported a KASAN slab-out-of-bounds write in __bpf_get_stackid() when copying stack trace data. The issue occurs when the perf trace contains more stack entries than the stack map bucket can hold, leading to an out-of-bounds write in the bucket's data array. Changes in v2: - Fixed max_depth names across get stack id Reported-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> --- kernel/bpf/stackmap.c | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index a267567e36dd..e1ee18cbbbb2 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -246,7 +246,7 @@ get_callchain_entry_for_task(struct task_struct *task, u32 max_depth) } static long __bpf_get_stackid(struct bpf_map *map, - struct perf_callchain_entry *trace, u64 flags) + struct perf_callchain_entry *trace, u64 flags, u32 max_depth) { struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map); struct stack_map_bucket *bucket, *new_bucket, *old_bucket; @@ -262,6 +262,8 @@ static long __bpf_get_stackid(struct bpf_map *map, trace_nr = trace->nr - skip; trace_len = trace_nr * sizeof(u64); + trace_nr = min(trace_nr, max_depth - skip); + ips = trace->ip + skip; hash = jhash2((u32 *)ips, trace_len / sizeof(u32), 0); id = hash & (smap->n_buckets - 1); @@ -321,19 +323,17 @@ static long __bpf_get_stackid(struct bpf_map *map, BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map, u64, flags) { - u32 max_depth = map->value_size / stack_map_data_size(map); - u32 skip = flags & BPF_F_SKIP_FIELD_MASK; + u32 elem_size = stack_map_data_size(map); bool user = flags & BPF_F_USER_STACK; struct perf_callchain_entry *trace; bool kernel = !user; + u32 max_depth; if (unlikely(flags & ~(BPF_F_SKIP_FIELD_MASK | BPF_F_USER_STACK | BPF_F_FAST_STACK_CMP | BPF_F_REUSE_STACKID))) return -EINVAL; - max_depth += skip; - if (max_depth > sysctl_perf_event_max_stack) - max_depth = sysctl_perf_event_max_stack; + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); trace = get_perf_callchain(regs, 0, kernel, user, max_depth, false, false); @@ -342,7 +342,7 @@ BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map, /* couldn't fetch the stack trace */ return -EFAULT; - return __bpf_get_stackid(map, trace, flags); + return __bpf_get_stackid(map, trace, flags, max_depth); } const struct bpf_func_proto bpf_get_stackid_proto = { @@ -374,6 +374,7 @@ BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, bool kernel, user; __u64 nr_kernel; int ret; + u32 elem_size, max_depth; /* perf_sample_data doesn't have callchain, use bpf_get_stackid */ if (!(event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)) @@ -392,16 +393,18 @@ BPF_CALL_3(bpf_get_stackid_pe, 
struct bpf_perf_event_data_kern *, ctx, return -EFAULT; nr_kernel = count_kernel_ip(trace); - + elem_size = stack_map_data_size(map); if (kernel) { __u64 nr = trace->nr; trace->nr = nr_kernel; - ret = __bpf_get_stackid(map, trace, flags); + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); + ret = __bpf_get_stackid(map, trace, flags, max_depth); /* restore nr */ trace->nr = nr; } else { /* user */ + u64 skip = flags & BPF_F_SKIP_FIELD_MASK; skip += nr_kernel; @@ -409,7 +412,8 @@ BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, return -EFAULT; flags = (flags & ~BPF_F_SKIP_FIELD_MASK) | skip; - ret = __bpf_get_stackid(map, trace, flags); + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); + ret = __bpf_get_stackid(map, trace, flags, max_depth); } return ret; } -- 2.43.0 ^ permalink raw reply related [flat|nested] 28+ messages in thread
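[Editorial sketch] elem_size is taken from stack_map_data_size() rather than hard-coded because the per-entry size depends on the map mode; a rough comparison, assuming a 256-byte value_size and a 32-byte struct bpf_stack_build_id (the size on typical builds):

	/* non-build_id map: one raw instruction pointer per entry */
	max_depth = stack_map_calculate_max_depth(256, sizeof(u64), flags);
	                                          /* 256/8  = 32 entries + skip */

	/* build_id map: one struct bpf_stack_build_id per entry */
	max_depth = stack_map_calculate_max_depth(256, sizeof(struct bpf_stack_build_id), flags);
	                                          /* 256/32 =  8 entries + skip */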
* Re: [PATCH bpf-next v3 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() 2025-08-12 19:32 ` [PATCH bpf-next v3 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte @ 2025-08-13 5:59 ` Yonghong Song 2025-08-13 20:46 ` [PATCH bpf-next v4 1/2] bpf: refactor max_depth computation in bpf_get_stack() Arnaud Lecomte 0 siblings, 1 reply; 28+ messages in thread From: Yonghong Song @ 2025-08-13 5:59 UTC (permalink / raw) To: Arnaud Lecomte Cc: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs On 8/12/25 12:32 PM, Arnaud Lecomte wrote: > Syzkaller reported a KASAN slab-out-of-bounds write in __bpf_get_stackid() > when copying stack trace data. The issue occurs when the perf trace > contains more stack entries than the stack map bucket can hold, > leading to an out-of-bounds write in the bucket's data array. > > Changes in v2: > - Fixed max_depth names across get stack id > > Reported-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b > Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> LGTM with a few nits below. Acked-by: Yonghong Song <yonghong.song@linux.dev> > --- > kernel/bpf/stackmap.c | 24 ++++++++++++++---------- > 1 file changed, 14 insertions(+), 10 deletions(-) > > diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c > index a267567e36dd..e1ee18cbbbb2 100644 > --- a/kernel/bpf/stackmap.c > +++ b/kernel/bpf/stackmap.c > @@ -246,7 +246,7 @@ get_callchain_entry_for_task(struct task_struct *task, u32 max_depth) > } > > static long __bpf_get_stackid(struct bpf_map *map, > - struct perf_callchain_entry *trace, u64 flags) > + struct perf_callchain_entry *trace, u64 flags, u32 max_depth) > { > struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map); > struct stack_map_bucket *bucket, *new_bucket, *old_bucket; > @@ -262,6 +262,8 @@ static long __bpf_get_stackid(struct bpf_map *map, > > trace_nr = trace->nr - skip; > trace_len = trace_nr * sizeof(u64); > + trace_nr = min(trace_nr, max_depth - skip); > + > ips = trace->ip + skip; > hash = jhash2((u32 *)ips, trace_len / sizeof(u32), 0); > id = hash & (smap->n_buckets - 1); > @@ -321,19 +323,17 @@ static long __bpf_get_stackid(struct bpf_map *map, > BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map, > u64, flags) > { > - u32 max_depth = map->value_size / stack_map_data_size(map); > - u32 skip = flags & BPF_F_SKIP_FIELD_MASK; > + u32 elem_size = stack_map_data_size(map); > bool user = flags & BPF_F_USER_STACK; > struct perf_callchain_entry *trace; > bool kernel = !user; > + u32 max_depth; > > if (unlikely(flags & ~(BPF_F_SKIP_FIELD_MASK | BPF_F_USER_STACK | > BPF_F_FAST_STACK_CMP | BPF_F_REUSE_STACKID))) > return -EINVAL; > > - max_depth += skip; > - if (max_depth > sysctl_perf_event_max_stack) > - max_depth = sysctl_perf_event_max_stack; > + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); > > trace = get_perf_callchain(regs, 0, kernel, user, max_depth, > false, false); > @@ -342,7 +342,7 @@ BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map, > /* couldn't fetch the stack trace */ > return -EFAULT; > > - return __bpf_get_stackid(map, trace, flags); > + return __bpf_get_stackid(map, trace, flags, max_depth); > } > > const struct bpf_func_proto bpf_get_stackid_proto = { > @@ -374,6 +374,7 @@ BPF_CALL_3(bpf_get_stackid_pe, struct 
bpf_perf_event_data_kern *, ctx, > bool kernel, user; > __u64 nr_kernel; > int ret; > + u32 elem_size, max_depth; > > /* perf_sample_data doesn't have callchain, use bpf_get_stackid */ > if (!(event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)) > @@ -392,16 +393,18 @@ BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, > return -EFAULT; > > nr_kernel = count_kernel_ip(trace); > - > + elem_size = stack_map_data_size(map); > if (kernel) { > __u64 nr = trace->nr; > > trace->nr = nr_kernel; > - ret = __bpf_get_stackid(map, trace, flags); > + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); > + ret = __bpf_get_stackid(map, trace, flags, max_depth); > > /* restore nr */ > trace->nr = nr; > } else { /* user */ > + Remove the above empty line. > u64 skip = flags & BPF_F_SKIP_FIELD_MASK; > > skip += nr_kernel; > @@ -409,7 +412,8 @@ BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, > return -EFAULT; > > flags = (flags & ~BPF_F_SKIP_FIELD_MASK) | skip; > - ret = __bpf_get_stackid(map, trace, flags); > + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); > + ret = __bpf_get_stackid(map, trace, flags, max_depth); > } > return ret; > } ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH bpf-next v4 1/2] bpf: refactor max_depth computation in bpf_get_stack() 2025-08-13 5:59 ` Yonghong Song @ 2025-08-13 20:46 ` Arnaud Lecomte 2025-08-13 20:55 ` [PATCH bpf-next v4 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte 0 siblings, 1 reply; 28+ messages in thread From: Arnaud Lecomte @ 2025-08-13 20:46 UTC (permalink / raw) To: yonghong.song Cc: andrii, ast, bpf, contact, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs A new helper function stack_map_calculate_max_depth() that computes the max depth for a stackmap. Changes in v2: - Removed the checking 'map_size % map_elem_size' from stack_map_calculate_max_depth - Changed stack_map_calculate_max_depth params name to be more generic Changes in v3: - Changed map size param to size in max depth helper Changes in v4: - Fixed indentation in max depth helper for args Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> --- kernel/bpf/stackmap.c | 30 ++++++++++++++++++++++++------ 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index 3615c06b7dfa..b9cc6c72a2a5 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -42,6 +42,27 @@ static inline int stack_map_data_size(struct bpf_map *map) sizeof(struct bpf_stack_build_id) : sizeof(u64); } +/** + * stack_map_calculate_max_depth - Calculate maximum allowed stack trace depth + * @size: Size of the buffer/map value in bytes + * @elem_size: Size of each stack trace element + * @flags: BPF stack trace flags (BPF_F_USER_STACK, BPF_F_USER_BUILD_ID, ...) + * + * Return: Maximum number of stack trace entries that can be safely stored + */ +static u32 stack_map_calculate_max_depth(u32 size, u32 elem_size, u64 flags) +{ + u32 skip = flags & BPF_F_SKIP_FIELD_MASK; + u32 max_depth; + + max_depth = size / elem_size; + max_depth += skip; + if (max_depth > sysctl_perf_event_max_stack) + return sysctl_perf_event_max_stack; + + return max_depth; +} + static int prealloc_elems_and_freelist(struct bpf_stack_map *smap) { u64 elem_size = sizeof(struct stack_map_bucket) + @@ -406,7 +427,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, struct perf_callchain_entry *trace_in, void *buf, u32 size, u64 flags, bool may_fault) { - u32 trace_nr, copy_len, elem_size, num_elem, max_depth; + u32 trace_nr, copy_len, elem_size, max_depth; bool user_build_id = flags & BPF_F_USER_BUILD_ID; bool crosstask = task && task != current; u32 skip = flags & BPF_F_SKIP_FIELD_MASK; @@ -438,10 +459,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, goto clear; } - num_elem = size / elem_size; - max_depth = num_elem + skip; - if (sysctl_perf_event_max_stack < max_depth) - max_depth = sysctl_perf_event_max_stack; + max_depth = stack_map_calculate_max_depth(size, elem_size, flags); if (may_fault) rcu_read_lock(); /* need RCU for perf's callchain below */ @@ -461,7 +479,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, } trace_nr = trace->nr - skip; - trace_nr = (trace_nr <= num_elem) ? trace_nr : num_elem; + trace_nr = min(trace_nr, max_depth - skip); copy_len = trace_nr * elem_size; ips = trace->ip + skip; -- 2.43.0 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH bpf-next v4 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() 2025-08-13 20:46 ` [PATCH bpf-next v4 1/2] bpf: refactor max_depth computation in bpf_get_stack() Arnaud Lecomte @ 2025-08-13 20:55 ` Arnaud Lecomte 2025-08-18 13:49 ` Lecomte, Arnaud 0 siblings, 1 reply; 28+ messages in thread From: Arnaud Lecomte @ 2025-08-13 20:55 UTC (permalink / raw) To: yonghong.song Cc: andrii, ast, bpf, contact, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs Syzkaller reported a KASAN slab-out-of-bounds write in __bpf_get_stackid() when copying stack trace data. The issue occurs when the perf trace contains more stack entries than the stack map bucket can hold, leading to an out-of-bounds write in the bucket's data array. Changes in v2: - Fixed max_depth names across get stack id Changes in v4: - Removed unnecessary empty line in __bpf_get_stackid Reported-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> --- kernel/bpf/stackmap.c | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index b9cc6c72a2a5..318f150460bb 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -246,7 +246,7 @@ get_callchain_entry_for_task(struct task_struct *task, u32 max_depth) } static long __bpf_get_stackid(struct bpf_map *map, - struct perf_callchain_entry *trace, u64 flags) + struct perf_callchain_entry *trace, u64 flags, u32 max_depth) { struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map); struct stack_map_bucket *bucket, *new_bucket, *old_bucket; @@ -262,6 +262,8 @@ static long __bpf_get_stackid(struct bpf_map *map, trace_nr = trace->nr - skip; trace_len = trace_nr * sizeof(u64); + trace_nr = min(trace_nr, max_depth - skip); + ips = trace->ip + skip; hash = jhash2((u32 *)ips, trace_len / sizeof(u32), 0); id = hash & (smap->n_buckets - 1); @@ -321,19 +323,17 @@ static long __bpf_get_stackid(struct bpf_map *map, BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map, u64, flags) { - u32 max_depth = map->value_size / stack_map_data_size(map); - u32 skip = flags & BPF_F_SKIP_FIELD_MASK; + u32 elem_size = stack_map_data_size(map); bool user = flags & BPF_F_USER_STACK; struct perf_callchain_entry *trace; bool kernel = !user; + u32 max_depth; if (unlikely(flags & ~(BPF_F_SKIP_FIELD_MASK | BPF_F_USER_STACK | BPF_F_FAST_STACK_CMP | BPF_F_REUSE_STACKID))) return -EINVAL; - max_depth += skip; - if (max_depth > sysctl_perf_event_max_stack) - max_depth = sysctl_perf_event_max_stack; + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); trace = get_perf_callchain(regs, 0, kernel, user, max_depth, false, false); @@ -342,7 +342,7 @@ BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map, /* couldn't fetch the stack trace */ return -EFAULT; - return __bpf_get_stackid(map, trace, flags); + return __bpf_get_stackid(map, trace, flags, max_depth); } const struct bpf_func_proto bpf_get_stackid_proto = { @@ -374,6 +374,7 @@ BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, bool kernel, user; __u64 nr_kernel; int ret; + u32 elem_size, max_depth; /* perf_sample_data doesn't have callchain, use bpf_get_stackid */ if (!(event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)) @@ -392,12 +393,13 @@ 
BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, return -EFAULT; nr_kernel = count_kernel_ip(trace); - + elem_size = stack_map_data_size(map); if (kernel) { __u64 nr = trace->nr; trace->nr = nr_kernel; - ret = __bpf_get_stackid(map, trace, flags); + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); + ret = __bpf_get_stackid(map, trace, flags, max_depth); /* restore nr */ trace->nr = nr; @@ -409,7 +411,8 @@ BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, return -EFAULT; flags = (flags & ~BPF_F_SKIP_FIELD_MASK) | skip; - ret = __bpf_get_stackid(map, trace, flags); + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); + ret = __bpf_get_stackid(map, trace, flags, max_depth); } return ret; } -- 2.43.0 ^ permalink raw reply related [flat|nested] 28+ messages in thread
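[Editorial sketch] max_depth is recomputed inside each branch of bpf_get_stackid_pe() because the effective skip changes for the user-stack case; a rough walk-through with assumed numbers:

	/* assume trace->nr = 20, nr_kernel = 12, caller passed skip = 2 */

	/* kernel branch: the trace is temporarily truncated to its kernel part */
	trace->nr = nr_kernel;                                  /* 12            */
	max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags);

	/* user branch: skip every kernel entry plus the requested frames */
	skip += nr_kernel;                                      /* 2 + 12 = 14   */
	flags = (flags & ~BPF_F_SKIP_FIELD_MASK) | skip;
	max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags);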
* Re: [PATCH bpf-next v4 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() 2025-08-13 20:55 ` [PATCH bpf-next v4 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte @ 2025-08-18 13:49 ` Lecomte, Arnaud 2025-08-18 16:57 ` Yonghong Song 0 siblings, 1 reply; 28+ messages in thread From: Lecomte, Arnaud @ 2025-08-18 13:49 UTC (permalink / raw) To: song, jolsa Cc: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs, yonghong.song Hey, Just forwarding the patch to the associated maintainers with `stackmap.c`. Have a great day, Cheers On 13/08/2025 21:55, Arnaud Lecomte wrote: > Syzkaller reported a KASAN slab-out-of-bounds write in __bpf_get_stackid() > when copying stack trace data. The issue occurs when the perf trace > contains more stack entries than the stack map bucket can hold, > leading to an out-of-bounds write in the bucket's data array. > > Changes in v2: > - Fixed max_depth names across get stack id > > Changes in v4: > - Removed unnecessary empty line in __bpf_get_stackid > > Reported-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b > Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> > --- > kernel/bpf/stackmap.c | 23 +++++++++++++---------- > 1 file changed, 13 insertions(+), 10 deletions(-) > > diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c > index b9cc6c72a2a5..318f150460bb 100644 > --- a/kernel/bpf/stackmap.c > +++ b/kernel/bpf/stackmap.c > @@ -246,7 +246,7 @@ get_callchain_entry_for_task(struct task_struct *task, u32 max_depth) > } > > static long __bpf_get_stackid(struct bpf_map *map, > - struct perf_callchain_entry *trace, u64 flags) > + struct perf_callchain_entry *trace, u64 flags, u32 max_depth) > { > struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map); > struct stack_map_bucket *bucket, *new_bucket, *old_bucket; > @@ -262,6 +262,8 @@ static long __bpf_get_stackid(struct bpf_map *map, > > trace_nr = trace->nr - skip; > trace_len = trace_nr * sizeof(u64); > + trace_nr = min(trace_nr, max_depth - skip); > + > ips = trace->ip + skip; > hash = jhash2((u32 *)ips, trace_len / sizeof(u32), 0); > id = hash & (smap->n_buckets - 1); > @@ -321,19 +323,17 @@ static long __bpf_get_stackid(struct bpf_map *map, > BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map, > u64, flags) > { > - u32 max_depth = map->value_size / stack_map_data_size(map); > - u32 skip = flags & BPF_F_SKIP_FIELD_MASK; > + u32 elem_size = stack_map_data_size(map); > bool user = flags & BPF_F_USER_STACK; > struct perf_callchain_entry *trace; > bool kernel = !user; > + u32 max_depth; > > if (unlikely(flags & ~(BPF_F_SKIP_FIELD_MASK | BPF_F_USER_STACK | > BPF_F_FAST_STACK_CMP | BPF_F_REUSE_STACKID))) > return -EINVAL; > > - max_depth += skip; > - if (max_depth > sysctl_perf_event_max_stack) > - max_depth = sysctl_perf_event_max_stack; > + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); > > trace = get_perf_callchain(regs, 0, kernel, user, max_depth, > false, false); > @@ -342,7 +342,7 @@ BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map, > /* couldn't fetch the stack trace */ > return -EFAULT; > > - return __bpf_get_stackid(map, trace, flags); > + return __bpf_get_stackid(map, trace, flags, max_depth); > } > > const struct bpf_func_proto bpf_get_stackid_proto = { > @@ -374,6 +374,7 @@ 
BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, > bool kernel, user; > __u64 nr_kernel; > int ret; > + u32 elem_size, max_depth; > > /* perf_sample_data doesn't have callchain, use bpf_get_stackid */ > if (!(event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)) > @@ -392,12 +393,13 @@ BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, > return -EFAULT; > > nr_kernel = count_kernel_ip(trace); > - > + elem_size = stack_map_data_size(map); > if (kernel) { > __u64 nr = trace->nr; > > trace->nr = nr_kernel; > - ret = __bpf_get_stackid(map, trace, flags); > + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); > + ret = __bpf_get_stackid(map, trace, flags, max_depth); > > /* restore nr */ > trace->nr = nr; > @@ -409,7 +411,8 @@ BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx, > return -EFAULT; > > flags = (flags & ~BPF_F_SKIP_FIELD_MASK) | skip; > - ret = __bpf_get_stackid(map, trace, flags); > + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); > + ret = __bpf_get_stackid(map, trace, flags, max_depth); > } > return ret; > } ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH bpf-next v4 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() 2025-08-18 13:49 ` Lecomte, Arnaud @ 2025-08-18 16:57 ` Yonghong Song 2025-08-18 17:02 ` Yonghong Song 0 siblings, 1 reply; 28+ messages in thread From: Yonghong Song @ 2025-08-18 16:57 UTC (permalink / raw) To: Lecomte, Arnaud, song, jolsa Cc: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, kpsingh, linux-kernel, martin.lau, sdf, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs On 8/18/25 6:49 AM, Lecomte, Arnaud wrote: > Hey, > Just forwarding the patch to the associated maintainers with > `stackmap.c`. Arnaud, please add Ack (provided in comments for v3) to make things easier for maintainers. Also, looks like all your patch sets (v1 to v4) in the same thread. It would be good to have all these versions in separate thread. Please look at some examples in bpf mailing list. > Have a great day, > Cheers > > On 13/08/2025 21:55, Arnaud Lecomte wrote: >> Syzkaller reported a KASAN slab-out-of-bounds write in >> __bpf_get_stackid() >> when copying stack trace data. The issue occurs when the perf trace >> contains more stack entries than the stack map bucket can hold, >> leading to an out-of-bounds write in the bucket's data array. >> >> Changes in v2: >> - Fixed max_depth names across get stack id >> >> Changes in v4: >> - Removed unnecessary empty line in __bpf_get_stackid >> >> Reported-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com >> Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b >> Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> >> --- >> kernel/bpf/stackmap.c | 23 +++++++++++++---------- >> 1 file changed, 13 insertions(+), 10 deletions(-) >> [...] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH bpf-next v4 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() 2025-08-18 16:57 ` Yonghong Song @ 2025-08-18 17:02 ` Yonghong Song 2025-08-19 16:20 ` Arnaud Lecomte 0 siblings, 1 reply; 28+ messages in thread From: Yonghong Song @ 2025-08-18 17:02 UTC (permalink / raw) To: Lecomte, Arnaud, song, jolsa Cc: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, kpsingh, linux-kernel, martin.lau, sdf, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs On 8/18/25 9:57 AM, Yonghong Song wrote: > > > On 8/18/25 6:49 AM, Lecomte, Arnaud wrote: >> Hey, >> Just forwarding the patch to the associated maintainers with >> `stackmap.c`. > > Arnaud, please add Ack (provided in comments for v3) to make things > easier > for maintainers. > > Also, looks like all your patch sets (v1 to v4) in the same thread. sorry, it should be v3 and v4 in the same thread. > It would be good to have all these versions in separate thread. > Please look at some examples in bpf mailing list. > >> Have a great day, >> Cheers >> >> On 13/08/2025 21:55, Arnaud Lecomte wrote: >>> Syzkaller reported a KASAN slab-out-of-bounds write in >>> __bpf_get_stackid() >>> when copying stack trace data. The issue occurs when the perf trace >>> contains more stack entries than the stack map bucket can hold, >>> leading to an out-of-bounds write in the bucket's data array. >>> >>> Changes in v2: >>> - Fixed max_depth names across get stack id >>> >>> Changes in v4: >>> - Removed unnecessary empty line in __bpf_get_stackid >>> >>> Reported-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com >>> Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b >>> Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> >>> --- >>> kernel/bpf/stackmap.c | 23 +++++++++++++---------- >>> 1 file changed, 13 insertions(+), 10 deletions(-) >>> > [...] > > ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH bpf-next v4 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() 2025-08-18 17:02 ` Yonghong Song @ 2025-08-19 16:20 ` Arnaud Lecomte 0 siblings, 0 replies; 28+ messages in thread From: Arnaud Lecomte @ 2025-08-19 16:20 UTC (permalink / raw) To: Yonghong Song, song, jolsa Cc: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, kpsingh, linux-kernel, martin.lau, sdf, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs On 18/08/2025 18:02, Yonghong Song wrote: > > > On 8/18/25 9:57 AM, Yonghong Song wrote: >> >> >> On 8/18/25 6:49 AM, Lecomte, Arnaud wrote: >>> Hey, >>> Just forwarding the patch to the associated maintainers with >>> `stackmap.c`. >> >> Arnaud, please add Ack (provided in comments for v3) to make things >> easier >> for maintainers. >> >> Also, looks like all your patch sets (v1 to v4) in the same thread. > > sorry, it should be v3 and v4 in the same thread. > Hey, ty for the feedback ! I am going to provide the link to the v3 in the v4 commit and resent the v4 with the Acked-by. >> It would be good to have all these versions in separate thread. >> Please look at some examples in bpf mailing list. >> >>> Have a great day, >>> Cheers >>> >>> On 13/08/2025 21:55, Arnaud Lecomte wrote: >>>> Syzkaller reported a KASAN slab-out-of-bounds write in >>>> __bpf_get_stackid() >>>> when copying stack trace data. The issue occurs when the perf trace >>>> contains more stack entries than the stack map bucket can hold, >>>> leading to an out-of-bounds write in the bucket's data array. >>>> >>>> Changes in v2: >>>> - Fixed max_depth names across get stack id >>>> >>>> Changes in v4: >>>> - Removed unnecessary empty line in __bpf_get_stackid >>>> >>>> Reported-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com >>>> Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b >>>> Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> >>>> --- >>>> kernel/bpf/stackmap.c | 23 +++++++++++++---------- >>>> 1 file changed, 13 insertions(+), 10 deletions(-) >>>> >> [...] >> >> > > ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH bpf-next v3 1/2] bpf: refactor max_depth computation in bpf_get_stack() 2025-08-12 19:30 ` [PATCH bpf-next v3 " Arnaud Lecomte 2025-08-12 19:32 ` [PATCH bpf-next v3 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte @ 2025-08-13 5:54 ` Yonghong Song 1 sibling, 0 replies; 28+ messages in thread From: Yonghong Song @ 2025-08-13 5:54 UTC (permalink / raw) To: Arnaud Lecomte Cc: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs On 8/12/25 12:30 PM, Arnaud Lecomte wrote: > A new helper function stack_map_calculate_max_depth() that > computes the max depth for a stackmap. > > Changes in v2: > - Removed the checking 'map_size % map_elem_size' from > stack_map_calculate_max_depth > - Changed stack_map_calculate_max_depth params name to be more generic > > Changes in v3: > - Changed map size param to size in max depth helper > > Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> LGTM with a small nit below. Acked-by: Yonghong Song <yonghong.song@linux.dev> > --- > kernel/bpf/stackmap.c | 30 ++++++++++++++++++++++++------ > 1 file changed, 24 insertions(+), 6 deletions(-) > > diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c > index 3615c06b7dfa..a267567e36dd 100644 > --- a/kernel/bpf/stackmap.c > +++ b/kernel/bpf/stackmap.c > @@ -42,6 +42,27 @@ static inline int stack_map_data_size(struct bpf_map *map) > sizeof(struct bpf_stack_build_id) : sizeof(u64); > } > > +/** > + * stack_map_calculate_max_depth - Calculate maximum allowed stack trace depth > + * @size: Size of the buffer/map value in bytes > + * @elem_size: Size of each stack trace element > + * @flags: BPF stack trace flags (BPF_F_USER_STACK, BPF_F_USER_BUILD_ID, ...) Let us have consistent format, e.g. * @size: Size of ... * @elem_size: Size of ... * @flags: BPF stack trace ... > + * > + * Return: Maximum number of stack trace entries that can be safely stored > + */ > +static u32 stack_map_calculate_max_depth(u32 size, u32 elem_size, u64 flags) > +{ > + u32 skip = flags & BPF_F_SKIP_FIELD_MASK; > + u32 max_depth; > + > + max_depth = size / elem_size; > + max_depth += skip; > + if (max_depth > sysctl_perf_event_max_stack) > + return sysctl_perf_event_max_stack; > + > + return max_depth; > +} > + > static int prealloc_elems_and_freelist(struct bpf_stack_map *smap) > { > u64 elem_size = sizeof(struct stack_map_bucket) + > @@ -406,7 +427,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, > struct perf_callchain_entry *trace_in, > void *buf, u32 size, u64 flags, bool may_fault) > { > - u32 trace_nr, copy_len, elem_size, num_elem, max_depth; > + u32 trace_nr, copy_len, elem_size, max_depth; > bool user_build_id = flags & BPF_F_USER_BUILD_ID; > bool crosstask = task && task != current; > u32 skip = flags & BPF_F_SKIP_FIELD_MASK; > @@ -438,10 +459,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, > goto clear; > } > > - num_elem = size / elem_size; > - max_depth = num_elem + skip; > - if (sysctl_perf_event_max_stack < max_depth) > - max_depth = sysctl_perf_event_max_stack; > + max_depth = stack_map_calculate_max_depth(size, elem_size, flags); > > if (may_fault) > rcu_read_lock(); /* need RCU for perf's callchain below */ > @@ -461,7 +479,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, > } > > trace_nr = trace->nr - skip; > - trace_nr = (trace_nr <= num_elem) ? 
trace_nr : num_elem; > + trace_nr = min(trace_nr, max_depth - skip); > copy_len = trace_nr * elem_size; > > ips = trace->ip + skip; ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH RESEND v2 1/2] bpf: refactor max_depth computation in bpf_get_stack() 2025-08-12 4:39 ` [PATCH RESEND v2 1/2] bpf: refactor max_depth computation in bpf_get_stack() Yonghong Song 2025-08-12 19:30 ` [PATCH bpf-next v3 " Arnaud Lecomte @ 2025-08-12 19:32 ` Arnaud Lecomte 1 sibling, 0 replies; 28+ messages in thread From: Arnaud Lecomte @ 2025-08-12 19:32 UTC (permalink / raw) To: Yonghong Song Cc: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot+c9b724fbb41cf2538b7b, syzkaller-bugs Thanks Yonghong for your feedbacks and your patience ! On 12/08/2025 05:39, Yonghong Song wrote: > > > On 8/9/25 5:09 AM, Arnaud Lecomte wrote: >> A new helper function stack_map_calculate_max_depth() that >> computes the max depth for a stackmap. > > Please add 'bpf-next' in the subject like [PATCH bpf-next v2 1/2] > so CI can properly test the patch set. > >> >> Changes in v2: >> - Removed the checking 'map_size % map_elem_size' from >> stack_map_calculate_max_depth >> - Changed stack_map_calculate_max_depth params name to be more generic >> >> Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> >> --- >> kernel/bpf/stackmap.c | 30 ++++++++++++++++++++++++------ >> 1 file changed, 24 insertions(+), 6 deletions(-) >> >> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c >> index 3615c06b7dfa..532447606532 100644 >> --- a/kernel/bpf/stackmap.c >> +++ b/kernel/bpf/stackmap.c >> @@ -42,6 +42,27 @@ static inline int stack_map_data_size(struct >> bpf_map *map) >> sizeof(struct bpf_stack_build_id) : sizeof(u64); >> } >> +/** >> + * stack_map_calculate_max_depth - Calculate maximum allowed stack >> trace depth >> + * @map_size: Size of the buffer/map value in bytes > > let us rename 'map_size' to 'size' since the size represents size of > buffer or map, not just for map. > >> + * @elem_size: Size of each stack trace element >> + * @flags: BPF stack trace flags (BPF_F_USER_STACK, >> BPF_F_USER_BUILD_ID, ...) >> + * >> + * Return: Maximum number of stack trace entries that can be safely >> stored >> + */ >> +static u32 stack_map_calculate_max_depth(u32 map_size, u32 >> elem_size, u64 flags) > > map_size -> size > Also, you can replace 'flags' to 'skip', so below 'u32 skip = flags & > BPF_F_SKIP_FIELD_MASK' > is not necessary. 
> >> +{ >> + u32 skip = flags & BPF_F_SKIP_FIELD_MASK; >> + u32 max_depth; >> + >> + max_depth = map_size / elem_size; >> + max_depth += skip; >> + if (max_depth > sysctl_perf_event_max_stack) >> + return sysctl_perf_event_max_stack; >> + >> + return max_depth; >> +} >> + >> static int prealloc_elems_and_freelist(struct bpf_stack_map *smap) >> { >> u64 elem_size = sizeof(struct stack_map_bucket) + >> @@ -406,7 +427,7 @@ static long __bpf_get_stack(struct pt_regs *regs, >> struct task_struct *task, >> struct perf_callchain_entry *trace_in, >> void *buf, u32 size, u64 flags, bool may_fault) >> { >> - u32 trace_nr, copy_len, elem_size, num_elem, max_depth; >> + u32 trace_nr, copy_len, elem_size, max_depth; >> bool user_build_id = flags & BPF_F_USER_BUILD_ID; >> bool crosstask = task && task != current; >> u32 skip = flags & BPF_F_SKIP_FIELD_MASK; >> @@ -438,10 +459,7 @@ static long __bpf_get_stack(struct pt_regs >> *regs, struct task_struct *task, >> goto clear; >> } >> - num_elem = size / elem_size; >> - max_depth = num_elem + skip; >> - if (sysctl_perf_event_max_stack < max_depth) >> - max_depth = sysctl_perf_event_max_stack; >> + max_depth = stack_map_calculate_max_depth(size, elem_size, flags); >> if (may_fault) >> rcu_read_lock(); /* need RCU for perf's callchain below */ >> @@ -461,7 +479,7 @@ static long __bpf_get_stack(struct pt_regs *regs, >> struct task_struct *task, >> } >> trace_nr = trace->nr - skip; >> - trace_nr = (trace_nr <= num_elem) ? trace_nr : num_elem; >> + trace_nr = min(trace_nr, max_depth - skip); >> copy_len = trace_nr * elem_size; >> ips = trace->ip + skip; > > ^ permalink raw reply [flat|nested] 28+ messages in thread
* [syzbot ci] Re: bpf: refactor max_depth computation in bpf_get_stack() 2025-08-07 17:50 ` [PATCH 1/2] bpf: refactor max_depth computation in bpf_get_stack() Arnaud Lecomte 2025-08-07 17:52 ` [PATCH 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte 2025-08-07 19:01 ` [PATCH 1/2] bpf: refactor max_depth computation in bpf_get_stack() Yonghong Song @ 2025-08-08 7:30 ` syzbot ci 2 siblings, 0 replies; 28+ messages in thread From: syzbot ci @ 2025-08-08 7:30 UTC (permalink / raw) To: andrii, ast, bpf, contact, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, sdf, song, syzbot, syzkaller-bugs, yonghong.song Cc: syzbot, syzkaller-bugs syzbot ci has tested the following series [v1] bpf: refactor max_depth computation in bpf_get_stack() https://lore.kernel.org/all/20250807175032.7381-1-contact@arnaud-lcm.com * [PATCH 1/2] bpf: refactor max_depth computation in bpf_get_stack() * [PATCH 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() and found the following issues: * KASAN: stack-out-of-bounds Write in __bpf_get_stack * PANIC: double fault in its_return_thunk Full report is available here: https://ci.syzbot.org/series/2af1b227-99e3-4e64-ac23-827848a4b8a5 *** KASAN: stack-out-of-bounds Write in __bpf_get_stack tree: bpf-next URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next.git base: f3af62b6cee8af9f07012051874af2d2a451f0e5 arch: amd64 compiler: Debian clang version 20.1.7 (++20250616065708+6146a88f6049-1~exp1~20250616065826.132), Debian LLD 20.1.7 config: https://ci.syzbot.org/builds/5e5c6698-7b84-4bf2-a1ee-1b6223c8d4c3/config C repro: https://ci.syzbot.org/findings/1355d710-d133-43fd-9061-18b2de6844a4/c_repro syz repro: https://ci.syzbot.org/findings/1355d710-d133-43fd-9061-18b2de6844a4/syz_repro netdevsim netdevsim1 netdevsim0: renamed from eth0 netdevsim netdevsim1 netdevsim1: renamed from eth1 ================================================================== BUG: KASAN: stack-out-of-bounds in __bpf_get_stack+0x54a/0xa70 kernel/bpf/stackmap.c:501 Write of size 208 at addr ffffc90003655ee8 by task syz-executor/5952 CPU: 1 UID: 0 PID: 5952 Comm: syz-executor Not tainted 6.16.0-syzkaller-11113-gf3af62b6cee8-dirty #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xca/0x240 mm/kasan/report.c:482 kasan_report+0x118/0x150 mm/kasan/report.c:595 check_region_inline mm/kasan/generic.c:-1 [inline] kasan_check_range+0x2b0/0x2c0 mm/kasan/generic.c:189 __asan_memcpy+0x40/0x70 mm/kasan/shadow.c:106 __bpf_get_stack+0x54a/0xa70 kernel/bpf/stackmap.c:501 ____bpf_get_stack kernel/bpf/stackmap.c:525 [inline] bpf_get_stack+0x33/0x50 kernel/bpf/stackmap.c:522 ____bpf_get_stack_raw_tp kernel/trace/bpf_trace.c:1835 [inline] bpf_get_stack_raw_tp+0x1a9/0x220 kernel/trace/bpf_trace.c:1825 bpf_prog_4e330ebee64cb698+0x43/0x4b bpf_dispatcher_nop_func include/linux/bpf.h:1332 [inline] __bpf_prog_run include/linux/filter.h:718 [inline] bpf_prog_run include/linux/filter.h:725 [inline] __bpf_trace_run kernel/trace/bpf_trace.c:2257 [inline] bpf_trace_run10+0x2e4/0x500 kernel/trace/bpf_trace.c:2306 __bpf_trace_percpu_alloc_percpu+0x364/0x400 include/trace/events/percpu.h:11 __do_trace_percpu_alloc_percpu include/trace/events/percpu.h:11 [inline] trace_percpu_alloc_percpu include/trace/events/percpu.h:11 [inline] 
pcpu_alloc_noprof+0x1534/0x16b0 mm/percpu.c:1892 fib_nh_common_init+0x9c/0x3b0 net/ipv4/fib_semantics.c:620 fib6_nh_init+0x1608/0x1ff0 net/ipv6/route.c:3671 ip6_route_info_create_nh+0x16a/0xab0 net/ipv6/route.c:3892 ip6_route_add+0x6e/0x1b0 net/ipv6/route.c:3944 addrconf_add_mroute net/ipv6/addrconf.c:2552 [inline] addrconf_add_dev+0x24f/0x340 net/ipv6/addrconf.c:2570 addrconf_dev_config net/ipv6/addrconf.c:3479 [inline] addrconf_init_auto_addrs+0x57c/0xa30 net/ipv6/addrconf.c:3567 addrconf_notify+0xacc/0x1010 net/ipv6/addrconf.c:3740 notifier_call_chain+0x1b6/0x3e0 kernel/notifier.c:85 call_netdevice_notifiers_extack net/core/dev.c:2267 [inline] call_netdevice_notifiers net/core/dev.c:2281 [inline] __dev_notify_flags+0x18d/0x2e0 net/core/dev.c:-1 netif_change_flags+0xe8/0x1a0 net/core/dev.c:9608 do_setlink+0xc55/0x41c0 net/core/rtnetlink.c:3143 rtnl_changelink net/core/rtnetlink.c:3761 [inline] __rtnl_newlink net/core/rtnetlink.c:3920 [inline] rtnl_newlink+0x160b/0x1c70 net/core/rtnetlink.c:4057 rtnetlink_rcv_msg+0x7cf/0xb70 net/core/rtnetlink.c:6946 netlink_rcv_skb+0x208/0x470 net/netlink/af_netlink.c:2552 netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline] netlink_unicast+0x82f/0x9e0 net/netlink/af_netlink.c:1346 netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1896 sock_sendmsg_nosec net/socket.c:714 [inline] __sock_sendmsg+0x21c/0x270 net/socket.c:729 __sys_sendto+0x3bd/0x520 net/socket.c:2228 __do_sys_sendto net/socket.c:2235 [inline] __se_sys_sendto net/socket.c:2231 [inline] __x64_sys_sendto+0xde/0x100 net/socket.c:2231 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fec5c790a7c Code: 2a 5f 02 00 44 8b 4c 24 2c 4c 8b 44 24 20 89 c5 44 8b 54 24 28 48 8b 54 24 18 b8 2c 00 00 00 48 8b 74 24 10 8b 7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 34 89 ef 48 89 44 24 08 e8 70 5f 02 00 48 8b RSP: 002b:00007fff7b55f7b0 EFLAGS: 00000293 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007fec5d4e35c0 RCX: 00007fec5c790a7c RDX: 0000000000000030 RSI: 00007fec5d4e3610 RDI: 0000000000000006 RBP: 0000000000000000 R08: 00007fff7b55f804 R09: 000000000000000c R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000006 R13: 0000000000000000 R14: 00007fec5d4e3610 R15: 0000000000000000 </TASK> The buggy address belongs to stack of task syz-executor/5952 and is located at offset 296 in frame: __bpf_get_stack+0x0/0xa70 include/linux/mmap_lock.h:-1 This frame has 1 object: [32, 36) 'rctx.i' The buggy address belongs to a 8-page vmalloc region starting at 0xffffc90003650000 allocated at copy_process+0x54b/0x3c00 kernel/fork.c:2002 The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888024c63200 pfn:0x24c62 flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff) raw: 00fff00000000000 0000000000000000 dead000000000122 0000000000000000 raw: ffff888024c63200 0000000000000000 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 0, migratetype Unmovable, gfp_mask 0x2dc2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_ZERO|__GFP_NOWARN), pid 5845, tgid 5845 (syz-executor), ts 59049058263, free_ts 59031992240 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x240/0x2a0 mm/page_alloc.c:1851 prep_new_page mm/page_alloc.c:1859 [inline] get_page_from_freelist+0x21e4/0x22c0 mm/page_alloc.c:3858 
__alloc_frozen_pages_noprof+0x181/0x370 mm/page_alloc.c:5148 alloc_pages_mpol+0x232/0x4a0 mm/mempolicy.c:2416 alloc_frozen_pages_noprof mm/mempolicy.c:2487 [inline] alloc_pages_noprof+0xa9/0x190 mm/mempolicy.c:2507 vm_area_alloc_pages mm/vmalloc.c:3642 [inline] __vmalloc_area_node mm/vmalloc.c:3720 [inline] __vmalloc_node_range_noprof+0x97d/0x12f0 mm/vmalloc.c:3893 __vmalloc_node_noprof+0xc2/0x110 mm/vmalloc.c:3956 alloc_thread_stack_node kernel/fork.c:318 [inline] dup_task_struct+0x3e7/0x860 kernel/fork.c:879 copy_process+0x54b/0x3c00 kernel/fork.c:2002 kernel_clone+0x21e/0x840 kernel/fork.c:2603 __do_sys_clone3 kernel/fork.c:2907 [inline] __se_sys_clone3+0x256/0x2d0 kernel/fork.c:2886 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f page last free pid 5907 tgid 5907 stack trace: reset_page_owner include/linux/page_owner.h:25 [inline] free_pages_prepare mm/page_alloc.c:1395 [inline] __free_frozen_pages+0xbc4/0xd30 mm/page_alloc.c:2895 vfree+0x25a/0x400 mm/vmalloc.c:3434 kcov_put kernel/kcov.c:439 [inline] kcov_close+0x28/0x50 kernel/kcov.c:535 __fput+0x44c/0xa70 fs/file_table.c:468 task_work_run+0x1d4/0x260 kernel/task_work.c:227 exit_task_work include/linux/task_work.h:40 [inline] do_exit+0x6b5/0x2300 kernel/exit.c:966 do_group_exit+0x21c/0x2d0 kernel/exit.c:1107 get_signal+0x1286/0x1340 kernel/signal.c:3034 arch_do_signal_or_restart+0x9a/0x750 arch/x86/kernel/signal.c:337 exit_to_user_mode_loop+0x75/0x110 kernel/entry/common.c:40 exit_to_user_mode_prepare include/linux/irq-entry-common.h:225 [inline] syscall_exit_to_user_mode_work include/linux/entry-common.h:175 [inline] syscall_exit_to_user_mode include/linux/entry-common.h:210 [inline] do_syscall_64+0x2bd/0x3b0 arch/x86/entry/syscall_64.c:100 entry_SYSCALL_64_after_hwframe+0x77/0x7f Memory state around the buggy address: ffffc90003655e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffffc90003655e80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffffc90003655f00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 f2 f2 ^ ffffc90003655f80: 00 00 00 00 00 00 00 00 00 00 f3 f3 f3 f3 f3 f3 ffffc90003656000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== *** PANIC: double fault in its_return_thunk tree: bpf-next URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next.git base: f3af62b6cee8af9f07012051874af2d2a451f0e5 arch: amd64 compiler: Debian clang version 20.1.7 (++20250616065708+6146a88f6049-1~exp1~20250616065826.132), Debian LLD 20.1.7 config: https://ci.syzbot.org/builds/5e5c6698-7b84-4bf2-a1ee-1b6223c8d4c3/config C repro: https://ci.syzbot.org/findings/1bf5dce6-467f-4bcd-9357-2726101d2ad1/c_repro syz repro: https://ci.syzbot.org/findings/1bf5dce6-467f-4bcd-9357-2726101d2ad1/syz_repro traps: PANIC: double fault, error_code: 0x0 Oops: double fault: 0000 [#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 5789 Comm: syz-executor930 Not tainted 6.16.0-syzkaller-11113-gf3af62b6cee8-dirty #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 RIP: 0010:its_return_thunk+0x0/0x10 arch/x86/lib/retpoline.S:412 Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc <c3> cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 e9 6b 2b b9 f5 cc RSP: 0018:ffffffffa0000877 EFLAGS: 00010246 RAX: 2161df6de464b300 RBX: 4800be48c0315641 RCX: 2161df6de464b300 RDX: 
0000000000000000 RSI: ffffffff8dba01ee RDI: ffff888105cc9cc0 RBP: eb7a3aa9e9c95e41 R08: ffffffff81000130 R09: ffffffff81000130 R10: ffffffff81d017ac R11: ffffffff8b7707da R12: 3145ffff888028c3 R13: ee8948f875894cf6 R14: 000002baf8c68348 R15: e1cb3861e8c93100 FS: 0000555557cbc380(0000) GS:ffff8880b862a000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffa0000868 CR3: 0000000028468000 CR4: 00000000000006f0 Call Trace: Modules linked in: ---[ end trace 0000000000000000 ]--- RIP: 0010:its_return_thunk+0x0/0x10 arch/x86/lib/retpoline.S:412 Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc <c3> cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 e9 6b 2b b9 f5 cc RSP: 0018:ffffffffa0000877 EFLAGS: 00010246 RAX: 2161df6de464b300 RBX: 4800be48c0315641 RCX: 2161df6de464b300 RDX: 0000000000000000 RSI: ffffffff8dba01ee RDI: ffff888105cc9cc0 RBP: eb7a3aa9e9c95e41 R08: ffffffff81000130 R09: ffffffff81000130 R10: ffffffff81d017ac R11: ffffffff8b7707da R12: 3145ffff888028c3 R13: ee8948f875894cf6 R14: 000002baf8c68348 R15: e1cb3861e8c93100 FS: 0000555557cbc380(0000) GS:ffff8880b862a000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffa0000868 CR3: 0000000028468000 CR4: 00000000000006f0 ---------------- Code disassembly (best guess): 0: cc int3 1: cc int3 2: cc int3 3: cc int3 4: cc int3 5: cc int3 6: cc int3 7: cc int3 8: cc int3 9: cc int3 a: cc int3 b: cc int3 c: cc int3 d: cc int3 e: cc int3 f: cc int3 10: cc int3 11: cc int3 12: cc int3 13: cc int3 14: cc int3 15: cc int3 16: cc int3 17: cc int3 18: cc int3 19: cc int3 1a: cc int3 1b: cc int3 1c: cc int3 1d: cc int3 1e: cc int3 1f: cc int3 20: cc int3 21: cc int3 22: cc int3 23: cc int3 24: cc int3 25: cc int3 26: cc int3 27: cc int3 28: cc int3 29: cc int3 * 2a: c3 ret <-- trapping instruction 2b: cc int3 2c: 90 nop 2d: 90 nop 2e: 90 nop 2f: 90 nop 30: 90 nop 31: 90 nop 32: 90 nop 33: 90 nop 34: 90 nop 35: 90 nop 36: 90 nop 37: 90 nop 38: 90 nop 39: 90 nop 3a: e9 6b 2b b9 f5 jmp 0xf5b92baa 3f: cc int3 *** If these findings have caused you to resend the series or submit a separate fix, please add the following tag to your commit message: Tested-by: syzbot@syzkaller.appspotmail.com --- This report is generated by a bot. It may contain errors. syzbot ci engineers can be reached at syzkaller@googlegroups.com. ^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads: [~2025-08-19 16:21 UTC | newest]

Thread overview: 28+ messages

2025-07-29 16:56 [PATCH v2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte
2025-07-29 22:45 ` Yonghong Song
2025-07-30  7:10 ` Arnaud Lecomte
2025-08-01 18:16 ` Lecomte, Arnaud
2025-08-05 20:49 ` Arnaud Lecomte
2025-08-06  1:52 ` Yonghong Song
2025-08-07 17:50 ` [PATCH 1/2] bpf: refactor max_depth computation in bpf_get_stack() Arnaud Lecomte
2025-08-07 17:52 ` [PATCH 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte
2025-08-07 19:05 ` Yonghong Song
2025-08-07 19:01 ` [PATCH 1/2] bpf: refactor max_depth computation in bpf_get_stack() Yonghong Song
2025-08-07 19:07 ` Yonghong Song
2025-08-09 11:56 ` [PATCH v2 " Arnaud Lecomte
2025-08-09 11:58 ` [PATCH v2 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte
2025-08-09 12:09 ` [PATCH RESEND v2 1/2] bpf: refactor max_depth computation in bpf_get_stack() Arnaud Lecomte
2025-08-09 12:14 ` [PATCH RESEND v2 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte
2025-08-12  4:39 ` [PATCH RESEND v2 1/2] bpf: refactor max_depth computation in bpf_get_stack() Yonghong Song
2025-08-12 19:30 ` [PATCH bpf-next v3 " Arnaud Lecomte
2025-08-12 19:32 ` [PATCH bpf-next v3 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte
2025-08-13  5:59 ` Yonghong Song
2025-08-13 20:46 ` [PATCH bpf-next v4 1/2] bpf: refactor max_depth computation in bpf_get_stack() Arnaud Lecomte
2025-08-13 20:55 ` [PATCH bpf-next v4 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid() Arnaud Lecomte
2025-08-18 13:49 ` Lecomte, Arnaud
2025-08-18 16:57 ` Yonghong Song
2025-08-18 17:02 ` Yonghong Song
2025-08-19 16:20 ` Arnaud Lecomte
2025-08-13  5:54 ` [PATCH bpf-next v3 1/2] bpf: refactor max_depth computation in bpf_get_stack() Yonghong Song
2025-08-12 19:32 ` [PATCH RESEND v2 " Arnaud Lecomte
2025-08-08  7:30 ` [syzbot ci] " syzbot ci