* [RESEND] Fwd: [BUG] list corruption in __bpf_lru_node_move () 【 bug found and suggestions for fixing it】
[not found] <263a77e4-9ba8-f9e2-4aaf-5e2854d487e5@huaweicloud.com>
@ 2025-03-10 2:19 ` Hou Tao
2025-03-11 0:46 ` Martin KaFai Lau
0 siblings, 1 reply; 2+ messages in thread
From: Hou Tao @ 2025-03-10 2:19 UTC (permalink / raw)
To: Strforexc yn
Cc: Martin KaFai Lau, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
linux-kernel, bpf
Resend due to the HTML part in the reply. Sorry for the inconvenience.
Hi,
On 3/5/2025 9:28 PM, Strforexc yn wrote:
> Hi Maintainers,
>
> When using our customized Syzkaller to fuzz the latest Linux kernel,
> the following crash was triggered.
> Kernel Config : https://github.com/Strforexc/LinuxKernelbug/blob/main/.config
>
> A kernel BUG was reported due to list corruption during BPF LRU node movement.
> The issue occurs when the node being moved is the sole element in its list and
> also the next_inactive_rotation candidate. After moving, the list became empty,
> but next_inactive_rotation incorrectly pointed to the moved node, causing later
> operations to corrupt the list.
The list pointed to by next_inactive_rotation is a doubly linked list
(aka, struct list_head), so a non-empty list contains at least two
nodes: the head of the list and the sole element. When the node being
moved is the last element in the list, next_inactive_rotation will point
to the head of the list after the move. So I don't think the analysis
or the fix below is correct.
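
The point above can be illustrated with a minimal user-space sketch. It is
not the kernel code: the list helpers are simplified stand-ins for the ones
in include/linux/list.h, and rotation_lands_on_head() is a hypothetical
helper that only mirrors the pointer adjustment shown in the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-in for the kernel's struct list_head (illustrative only). */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void list_del(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

static void list_move(struct list_head *n, struct list_head *head)
{
	list_del(n);
	list_add(n, head);
}

/*
 * Hypothetical helper: model a one-node inactive list whose only node is
 * also the rotation candidate, then move that node to the active list.
 * Returns true if the rotation pointer ends up on the (now empty)
 * inactive list head.
 */
bool rotation_lands_on_head(void)
{
	struct list_head inactive, active, node;
	struct list_head *next_inactive_rotation;

	init_list_head(&inactive);
	init_list_head(&active);
	list_add(&node, &inactive);
	next_inactive_rotation = &node;

	/* The sole element's prev is the list head, never the element itself. */
	assert(node.prev == &inactive);
	assert(node.prev != &node);

	if (&node == next_inactive_rotation)
		next_inactive_rotation = next_inactive_rotation->prev;
	list_move(&node, &active);

	return next_inactive_rotation == &inactive &&
	       inactive.next == &inactive && inactive.prev == &inactive;
}
```

In this model the inner condition of the proposed patch (prev == &node->list)
cannot hold while the node is linked into a well-formed list, and the
rotation pointer is left on a valid, self-pointing head rather than on the
moved node.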
>
> Here is my fix suggestion:
> The fix checks if the node was the only element before adjusting
> next_inactive_rotation. If so, it sets the pointer to NULL, preventing invalid
> access.
>
> diff --git a/kernel/bpf/bpf_lru_list.c b/kernel/bpf/bpf_lru_list.c
> index XXXXXXX..XXXXXXX 100644
> --- a/kernel/bpf/bpf_lru_list.c
> +++ b/kernel/bpf/bpf_lru_list.c
> @@ -119,8 +119,13 @@ static void __bpf_lru_node_move(struct bpf_lru_list *l,
> * move the next_inactive_rotation pointer also.
> */
> if (&node->list == l->next_inactive_rotation)
> - l->next_inactive_rotation = l->next_inactive_rotation->prev;
> -
> + {
> + if (l->next_inactive_rotation->prev == &node->list) {
> + l->next_inactive_rotation = NULL;
> + } else {
> + l->next_inactive_rotation = l->next_inactive_rotation->prev;
> + }
> + }
> list_move(&node->list, &l->lists[tgt_type]);
> }
>
> --
> 2.34.1
>
> Our knowledge of the kernel is somewhat limited, and we'd appreciate it
> if you could determine if there is such an issue. If this issue doesn't
> have an impact, please ignore it ☺.
>
> If you fix this issue, please add the following tag to the commit:
> Reported-by: Zhizhuo Tang strforexctzzchange@foxmail.com, Jianzhou Zhao
> xnxc22xnxc22@qq.com, Haoran Liu <cherest_san@163.com>
>
> Last is my report:
> vmalloc memory list_add corruption. next->prev should be prev (ffffe8ffac433e40), but was 50ffffe8ffac433e. (next=ffffe8ffac433e41).
> ------------[ cut here ]------------
> kernel BUG at lib/list_debug.c:29!
> Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
> CPU: 0 UID: 0 PID: 14524 Comm: syz.0.285 Not tainted 6.14.0-rc5-00013-g99fa936e8e4f #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> RIP: 0010:__list_add_valid_or_report+0xfc/0x1a0 lib/list_debug.c:29
I suspect that the content of lists[BPF_LRU_LIST_T_ACTIVE].next has been
corrupted, because the pointer itself should be at least 8-byte
aligned, but its value is 0xffffe8ffac433e41. Also, only the last bit of
the next pointer differs from the address of
lists[BPF_LRU_LIST_T_ACTIVE] itself (aka 0xffffe8ffac433e40).
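
The alignment argument can be checked with a small user-space sketch that
uses the two addresses from the report. The last function goes one step
further and is only a model under stated assumptions: it supposes that the
active head's prev field still points at the head itself and that the byte
at ...3e50 is 0x50 (as it would be if the following list head pointed at
itself). None of this is confirmed by the report, and all function names
here are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Addresses copied from the report above. */
#define ACTIVE_HEAD_ADDR UINT64_C(0xffffe8ffac433e40)
#define REPORTED_NEXT    UINT64_C(0xffffe8ffac433e41)

/* A struct list_head holds two pointers, so on x86-64 it is 8-byte aligned. */
bool is_8byte_aligned(uint64_t addr)
{
	return (addr & 7) == 0;
}

/* Store v at p in little-endian byte order (the x86-64 memory layout). */
static void store_le64(uint8_t *p, uint64_t v)
{
	for (size_t i = 0; i < 8; i++)
		p[i] = (uint8_t)(v >> (8 * i));
}

/* Load a little-endian 64-bit value from p, at any byte offset. */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (size_t i = 0; i < 8; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/*
 * Hypothetical model of the 24 bytes starting at ACTIVE_HEAD_ADDR.
 * ASSUMPTIONS: head.prev points at the head itself, and the next list
 * head (at ...3e50) starts with a pointer to itself.
 */
uint64_t prev_seen_through_reported_next(void)
{
	uint8_t mem[24];

	store_le64(mem + 0, REPORTED_NEXT);            /* head.next (corrupted) */
	store_le64(mem + 8, ACTIVE_HEAD_ADDR);         /* head.prev (assumed)   */
	store_le64(mem + 16, ACTIVE_HEAD_ADDR + 0x10); /* next head (assumed)   */

	/* next->prev lives 8 bytes past next, i.e. at offset 1 + 8 = 9. */
	size_t off = (size_t)(REPORTED_NEXT - ACTIVE_HEAD_ADDR) + 8;

	return load_le64(mem + off);
}
```

Under those assumptions the misaligned read yields 0x50ffffe8ffac433e, the
same value the list debug message prints for next->prev, and the read
address ...3e49 matches R14 in the register dump. That is consistent with
only the low bit of the head's next pointer having been set, but it does not
say what set it.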
> Code: 00 00 00 00 fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 a6 00 00 00 49 8b 54 24 08 4c 89 e1 48 c7 c7 c0 1f f2 8b e8 55 54 d3 fc 90 <0f> 0b 48 89 f7 48 89 34 24 e8 16 54 33 fd 48 8b 34 24 48 b8 00 00
> RSP: 0018:ffffc900033779b0 EFLAGS: 00010046
> RAX: 0000000000000075 RBX: ffffc900035777c8 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: ffffe8ffac433e40 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: ffffe8ffac433e41
> R13: ffffc900035777c8 R14: ffffe8ffac433e49 R15: ffffe8ffac433e50
> FS: 00007fef15ddd640(0000) GS:ffff88802b600000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007ffd53abb238 CR3: 00000000296f4000 CR4: 00000000000006f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> __list_add_valid include/linux/list.h:88 [inline]
> __list_add include/linux/list.h:150 [inline]
> list_add include/linux/list.h:169 [inline]
> list_move include/linux/list.h:299 [inline]
> __bpf_lru_node_move+0x21a/0x480 kernel/bpf/bpf_lru_list.c:126
> __bpf_lru_list_rotate_inactive+0x20f/0x310 kernel/bpf/bpf_lru_list.c:196
> __bpf_lru_list_rotate kernel/bpf/bpf_lru_list.c:247 [inline]
> bpf_percpu_lru_pop_free kernel/bpf/bpf_lru_list.c:417 [inline]
> bpf_lru_pop_free+0x157/0x370 kernel/bpf/bpf_lru_list.c:502
> prealloc_lru_pop+0x23/0xf0 kernel/bpf/hashtab.c:308
> htab_lru_map_update_elem+0x14c/0xbe0 kernel/bpf/hashtab.c:1251
> bpf_map_update_value+0x675/0xf50 kernel/bpf/syscall.c:289
> generic_map_update_batch+0x44a/0x5f0 kernel/bpf/syscall.c:1963
> bpf_map_do_batch+0x4be/0x610 kernel/bpf/syscall.c:5303
> __sys_bpf+0x1002/0x1630 kernel/bpf/syscall.c:5859
> __do_sys_bpf kernel/bpf/syscall.c:5902 [inline]
> __se_sys_bpf kernel/bpf/syscall.c:5900 [inline]
> __x64_sys_bpf+0x78/0xc0 kernel/bpf/syscall.c:5900
> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> do_syscall_64+0xcb/0x260 arch/x86/entry/common.c:83
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7fef14fb85ad
> Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007fef15ddcf98 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
> RAX: ffffffffffffffda RBX: 00007fef15245fa0 RCX: 00007fef14fb85ad
> RDX: 0000000000000038 RSI: 0000400000000000 RDI: 000000000000001a
> RBP: 00007fef1506a8d6 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 0000000000000000 R14: 00007fef15245fa0 R15: 00007fef15dbd000
> </TASK>
> Modules linked in:
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:__list_add_valid_or_report+0xfc/0x1a0 lib/list_debug.c:29
> Code: 00 00 00 00 fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 a6 00 00 00 49 8b 54 24 08 4c 89 e1 48 c7 c7 c0 1f f2 8b e8 55 54 d3 fc 90 <0f> 0b 48 89 f7 48 89 34 24 e8 16 54 33 fd 48 8b 34 24 48 b8 00 00
> RSP: 0018:ffffc900033779b0 EFLAGS: 00010046
> RAX: 0000000000000075 RBX: ffffc900035777c8 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: ffffe8ffac433e40 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: ffffe8ffac433e41
> R13: ffffc900035777c8 R14: ffffe8ffac433e49 R15: ffffe8ffac433e50
> FS: 00007fef15ddd640(0000) GS:ffff88802b600000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007ffd53abb238 CR3: 00000000296f4000 CR4: 00000000000006f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>
> Regards,
> Strforexc
* Re: [RESEND] Fwd: [BUG] list corruption in __bpf_lru_node_move () 【 bug found and suggestions for fixing it】
2025-03-10 2:19 ` [RESEND] Fwd: [BUG] list corruption in __bpf_lru_node_move () 【 bug found and suggestions for fixing it】 Hou Tao
@ 2025-03-11 0:46 ` Martin KaFai Lau
0 siblings, 0 replies; 2+ messages in thread
From: Martin KaFai Lau @ 2025-03-11 0:46 UTC (permalink / raw)
To: Hou Tao, Strforexc yn
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, linux-kernel,
bpf
On 3/9/25 7:19 PM, Hou Tao wrote:
> Resend due to the HTML part in the reply. Sorry for the inconvenience.
>
> Hi,
>
> On 3/5/2025 9:28 PM, Strforexc yn wrote:
>> Hi Maintainers,
>>
>> When using our customized Syzkaller to fuzz the latest Linux kernel,
>> the following crash was triggered.
>> Kernel Config : https://github.com/Strforexc/LinuxKernelbug/blob/main/.config
>>
>> A kernel BUG was reported due to list corruption during BPF LRU node movement.
>> The issue occurs when the node being moved is the sole element in its list and
>> also the next_inactive_rotation candidate. After moving, the list became empty,
>> but next_inactive_rotation incorrectly pointed to the moved node, causing later
>> operations to corrupt the list.
>
> The list pointed to by next_inactive_rotation is a doubly linked list
> (aka, struct list_head), so a non-empty list contains at least two
> nodes: the head of the list and the sole element. When the node being
> moved is the last element in the list, next_inactive_rotation will point
> to the head of the list after the move. So I don't think the analysis
> or the fix below is correct.
>>
>> Here is my fix suggestion:
>> The fix checks if the node was the only element before adjusting
>> next_inactive_rotation. If so, it sets the pointer to NULL, preventing invalid
>> access.
>>
>> diff --git a/kernel/bpf/bpf_lru_list.c b/kernel/bpf/bpf_lru_list.c
>> index XXXXXXX..XXXXXXX 100644
>> --- a/kernel/bpf/bpf_lru_list.c
>> +++ b/kernel/bpf/bpf_lru_list.c
>> @@ -119,8 +119,13 @@ static void __bpf_lru_node_move(struct bpf_lru_list *l,
>> * move the next_inactive_rotation pointer also.
>> */
>> if (&node->list == l->next_inactive_rotation)
>> - l->next_inactive_rotation = l->next_inactive_rotation->prev;
>> -
>> + {
>> + if (l->next_inactive_rotation->prev == &node->list) {
I don't think it is the right fix. I don't see how both this new "if" and the
above "if (&node->list == l->next_inactive_rotation)" can be true together. If
it did fix the issue, the root cause must be somewhere else.
I tried simulating a one-node inactive list and then rotating the node to the
active list, and I could not reproduce the problem.
Can you share the syzkaller reproducer that you used to test this fix?
Is this something that you started seeing recently, and can you bisect it?
>> + l->next_inactive_rotation = NULL;
>> + } else {
>> + l->next_inactive_rotation = l->next_inactive_rotation->prev;
>> + }
>> + }
>> list_move(&node->list, &l->lists[tgt_type]);
>> }
>>
>> --
>> 2.34.1
>>
>> Our knowledge of the kernel is somewhat limited, and we'd appreciate it
>> if you could determine if there is such an issue. If this issue doesn't
>> have an impact, please ignore it ☺.
>>
>> If you fix this issue, please add the following tag to the commit:
>> Reported-by: Zhizhuo Tang strforexctzzchange@foxmail.com, Jianzhou Zhao
>> xnxc22xnxc22@qq.com, Haoran Liu <cherest_san@163.com>
>>
>> Last is my report:
>> vmalloc memory list_add corruption. next->prev should be prev (ffffe8ffac433e40), but was 50ffffe8ffac433e. (next=ffffe8ffac433e41).
>> ------------[ cut here ]------------
>> kernel BUG at lib/list_debug.c:29!
>> Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
>> CPU: 0 UID: 0 PID: 14524 Comm: syz.0.285 Not tainted 6.14.0-rc5-00013-g99fa936e8e4f #1
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
>> RIP: 0010:__list_add_valid_or_report+0xfc/0x1a0 lib/list_debug.c:29
>
> I suspect that the content of lists[BPF_LRU_LIST_T_ACTIVE].next has been
> corrupted, because the pointer itself should be at least 8-byte
> aligned, but its value is 0xffffe8ffac433e41. Also, only the last bit of
> the next pointer differs from the address of
> lists[BPF_LRU_LIST_T_ACTIVE] itself (aka 0xffffe8ffac433e40).
This is more puzzling. It is the active list's head, not the inactive list's,
whose next pointer is corrupted in its last bit. I don't see any LRU code path
that reuses the last bit of the next pointer. It is not an hlist_nulls... We
need the syzkaller reproducer to understand it better.