* [syzbot] kernel BUG in __page_mapcount
@ 2021-05-31 0:53 syzbot
2021-12-21 17:24 ` syzbot
0 siblings, 1 reply; 8+ messages in thread
From: syzbot @ 2021-05-31 0:53 UTC (permalink / raw)
To: akpm, chinwen.chang, jannh, linux-fsdevel, linux-kernel,
syzkaller-bugs, vbabka, walken
Hello,
syzbot found the following issue on:
HEAD commit: 7ac3a1c1 Merge tag 'mtd/fixes-for-5.13-rc4' of git://git.k..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14559d5bd00000
kernel config: https://syzkaller.appspot.com/x/.config?x=f9f3fc7daa178986
dashboard link: https://syzkaller.appspot.com/bug?extid=1f52b3a18d5633fa7f82
compiler: Debian clang version 11.0.1-2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11132d5bd00000
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+1f52b3a18d5633fa7f82@syzkaller.appspotmail.com
__mmput+0x111/0x370 kernel/fork.c:1096
exit_mm+0x67e/0x7d0 kernel/exit.c:502
do_exit+0x6b9/0x23d0 kernel/exit.c:813
do_group_exit+0x168/0x2d0 kernel/exit.c:923
get_signal+0x1770/0x2180 kernel/signal.c:2835
arch_do_signal_or_restart+0x8e/0x6c0 arch/x86/kernel/signal.c:789
handle_signal_work kernel/entry/common.c:147 [inline]
exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
exit_to_user_mode_prepare+0xac/0x200 kernel/entry/common.c:208
__syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
syscall_exit_to_user_mode+0x26/0x70 kernel/entry/common.c:301
------------[ cut here ]------------
kernel BUG at include/linux/page-flags.h:686!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 10694 Comm: syz-executor.0 Not tainted 5.13.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:PageDoubleMap include/linux/page-flags.h:686 [inline]
RIP: 0010:__page_mapcount+0x2b3/0x2d0 mm/util.c:728
Code: e8 72 25 cf ff 4c 89 ff 48 c7 c6 40 fb 39 8a e8 03 4c 04 00 0f 0b e8 5c 25 cf ff 4c 89 ff 48 c7 c6 40 fc 39 8a e8 ed 4b 04 00 <0f> 0b e8 46 25 cf ff 4c 89 ff 48 c7 c6 80 fc 39 8a e8 d7 4b 04 00
RSP: 0018:ffffc90001ff7460 EFLAGS: 00010246
RAX: e8070b6faabf8b00 RBX: 00fff0000008001d RCX: ffff888047280000
RDX: 0000000000000000 RSI: 000000000000ffff RDI: 000000000000ffff
RBP: 0000000000000000 R08: ffffffff81ce2584 R09: ffffed1017363f24
R10: ffffed1017363f24 R11: 0000000000000000 R12: 1ffffd4000265001
R13: 00000000ffffffff R14: dffffc0000000000 R15: ffffea0001328000
FS: 00007f6e83636700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000568000 CR3: 000000002b559000 CR4: 00000000001506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
page_mapcount include/linux/mm.h:873 [inline]
smaps_account+0x79d/0x980 fs/proc/task_mmu.c:467
smaps_pte_entry fs/proc/task_mmu.c:533 [inline]
smaps_pte_range+0x6ed/0xfc0 fs/proc/task_mmu.c:596
walk_pmd_range mm/pagewalk.c:89 [inline]
walk_pud_range mm/pagewalk.c:160 [inline]
walk_p4d_range mm/pagewalk.c:193 [inline]
walk_pgd_range mm/pagewalk.c:229 [inline]
__walk_page_range+0xd64/0x1ad0 mm/pagewalk.c:331
walk_page_vma+0x3c2/0x500 mm/pagewalk.c:482
smap_gather_stats fs/proc/task_mmu.c:769 [inline]
show_smaps_rollup+0x49d/0xc20 fs/proc/task_mmu.c:872
seq_read_iter+0x43a/0xcf0 fs/seq_file.c:227
seq_read+0x445/0x5c0 fs/seq_file.c:159
do_loop_readv_writev fs/read_write.c:761 [inline]
do_iter_read+0x464/0x660 fs/read_write.c:803
vfs_readv fs/read_write.c:921 [inline]
do_preadv+0x1f7/0x340 fs/read_write.c:1013
do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4665d9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f6e83636188 EFLAGS: 00000246 ORIG_RAX: 0000000000000127
RAX: ffffffffffffffda RBX: 000000000056bf80 RCX: 00000000004665d9
RDX: 0000000000000001 RSI: 0000000020000780 RDI: 0000000000000004
RBP: 00000000004bfcb9 R08: 0000000000000003 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf80
R13: 00007ffeb607b5df R14: 00007f6e83636300 R15: 0000000000022000
Modules linked in:
---[ end trace e65a33e7d2bffb07 ]---
RIP: 0010:PageDoubleMap include/linux/page-flags.h:686 [inline]
RIP: 0010:__page_mapcount+0x2b3/0x2d0 mm/util.c:728
Code: e8 72 25 cf ff 4c 89 ff 48 c7 c6 40 fb 39 8a e8 03 4c 04 00 0f 0b e8 5c 25 cf ff 4c 89 ff 48 c7 c6 40 fc 39 8a e8 ed 4b 04 00 <0f> 0b e8 46 25 cf ff 4c 89 ff 48 c7 c6 80 fc 39 8a e8 d7 4b 04 00
RSP: 0018:ffffc90001ff7460 EFLAGS: 00010246
RAX: e8070b6faabf8b00 RBX: 00fff0000008001d RCX: ffff888047280000
RDX: 0000000000000000 RSI: 000000000000ffff RDI: 000000000000ffff
RBP: 0000000000000000 R08: ffffffff81ce2584 R09: ffffed1017363f24
R10: ffffed1017363f24 R11: 0000000000000000 R12: 1ffffd4000265001
R13: 00000000ffffffff R14: dffffc0000000000 R15: ffffea0001328000
FS: 00007f6e83636700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000568000 CR3: 000000002b559000 CR4: 00000000001506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches
* Re: [syzbot] kernel BUG in __page_mapcount
2021-05-31 0:53 [syzbot] kernel BUG in __page_mapcount syzbot
@ 2021-12-21 17:24 ` syzbot
2021-12-21 18:24 ` Yang Shi
0 siblings, 1 reply; 8+ messages in thread
From: syzbot @ 2021-12-21 17:24 UTC (permalink / raw)
To: akpm, apopple, chinwen.chang, fgheet255t, jannh, khlebnikov,
kirill.shutemov, kirill, linux-fsdevel, linux-kernel, linux-mm,
peterx, peterz, syzkaller-bugs, tonymarislogistics, vbabka,
walken, willy, ziy
syzbot has found a reproducer for the following issue on:
HEAD commit: 6e0567b73052 Merge tag 'for-linus' of git://git.kernel.org..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14c192b3b00000
kernel config: https://syzkaller.appspot.com/x/.config?x=ae22d1ee4fbca18
dashboard link: https://syzkaller.appspot.com/bug?extid=1f52b3a18d5633fa7f82
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=133200fdb00000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17c3102db00000
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+1f52b3a18d5633fa7f82@syzkaller.appspotmail.com
__mmput+0x122/0x4b0 kernel/fork.c:1113
mmput+0x56/0x60 kernel/fork.c:1134
exit_mm kernel/exit.c:507 [inline]
do_exit+0xb27/0x2b40 kernel/exit.c:819
do_group_exit+0x125/0x310 kernel/exit.c:929
get_signal+0x47d/0x2220 kernel/signal.c:2852
arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:868
handle_signal_work kernel/entry/common.c:148 [inline]
exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:207
__syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300
do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x44/0xae
------------[ cut here ]------------
kernel BUG at include/linux/page-flags.h:785!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 4392 Comm: syz-executor560 Not tainted 5.16.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:PageDoubleMap include/linux/page-flags.h:785 [inline]
RIP: 0010:__page_mapcount+0x2d2/0x350 mm/util.c:744
Code: e8 d3 16 d1 ff 48 c7 c6 c0 00 b6 89 48 89 ef e8 94 4e 04 00 0f 0b e8 bd 16 d1 ff 48 c7 c6 60 01 b6 89 48 89 ef e8 7e 4e 04 00 <0f> 0b e8 a7 16 d1 ff 48 c7 c6 a0 01 b6 89 4c 89 f7 e8 68 4e 04 00
RSP: 0018:ffffc90002b6f7b8 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff888019619d00 RSI: ffffffff81a68c12 RDI: 0000000000000003
RBP: ffffea0001bdc2c0 R08: 0000000000000029 R09: 00000000ffffffff
R10: ffffffff8903e29f R11: 00000000ffffffff R12: 00000000ffffffff
R13: 00000000ffffea00 R14: ffffc90002b6fb30 R15: ffffea0001bd8001
FS: 00007faa2aefd700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fff7e663318 CR3: 0000000018c6e000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
page_mapcount include/linux/mm.h:837 [inline]
smaps_account+0x470/0xb10 fs/proc/task_mmu.c:466
smaps_pte_entry fs/proc/task_mmu.c:538 [inline]
smaps_pte_range+0x611/0x1250 fs/proc/task_mmu.c:601
walk_pmd_range mm/pagewalk.c:128 [inline]
walk_pud_range mm/pagewalk.c:205 [inline]
walk_p4d_range mm/pagewalk.c:240 [inline]
walk_pgd_range mm/pagewalk.c:277 [inline]
__walk_page_range+0xe23/0x1ea0 mm/pagewalk.c:379
walk_page_vma+0x277/0x350 mm/pagewalk.c:530
smap_gather_stats.part.0+0x148/0x260 fs/proc/task_mmu.c:768
smap_gather_stats fs/proc/task_mmu.c:741 [inline]
show_smap+0xc6/0x440 fs/proc/task_mmu.c:822
seq_read_iter+0xbb0/0x1240 fs/seq_file.c:272
seq_read+0x3e0/0x5b0 fs/seq_file.c:162
vfs_read+0x1b5/0x600 fs/read_write.c:479
ksys_read+0x12d/0x250 fs/read_write.c:619
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7faa2af6c969
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 11 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007faa2aefd288 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 00007faa2aff4418 RCX: 00007faa2af6c969
RDX: 0000000000002025 RSI: 0000000020000100 RDI: 0000000000000003
RBP: 00007faa2aff4410 R08: 00007faa2aefd700 R09: 0000000000000000
R10: 00007faa2aefd700 R11: 0000000000000246 R12: 00007faa2afc20ac
R13: 00007fff7e6632bf R14: 00007faa2aefd400 R15: 0000000000022000
</TASK>
Modules linked in:
---[ end trace 24ec93ff95e4ac3d ]---
RIP: 0010:PageDoubleMap include/linux/page-flags.h:785 [inline]
RIP: 0010:__page_mapcount+0x2d2/0x350 mm/util.c:744
Code: e8 d3 16 d1 ff 48 c7 c6 c0 00 b6 89 48 89 ef e8 94 4e 04 00 0f 0b e8 bd 16 d1 ff 48 c7 c6 60 01 b6 89 48 89 ef e8 7e 4e 04 00 <0f> 0b e8 a7 16 d1 ff 48 c7 c6 a0 01 b6 89 4c 89 f7 e8 68 4e 04 00
RSP: 0018:ffffc90002b6f7b8 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff888019619d00 RSI: ffffffff81a68c12 RDI: 0000000000000003
RBP: ffffea0001bdc2c0 R08: 0000000000000029 R09: 00000000ffffffff
R10: ffffffff8903e29f R11: 00000000ffffffff R12: 00000000ffffffff
R13: 00000000ffffea00 R14: ffffc90002b6fb30 R15: ffffea0001bd8001
FS: 00007faa2aefd700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fff7e663318 CR3: 0000000018c6e000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
* Re: [syzbot] kernel BUG in __page_mapcount
2021-12-21 17:24 ` syzbot
@ 2021-12-21 18:24 ` Yang Shi
2021-12-21 18:40 ` Matthew Wilcox
0 siblings, 1 reply; 8+ messages in thread
From: Yang Shi @ 2021-12-21 18:24 UTC (permalink / raw)
To: syzbot
Cc: Andrew Morton, Alistair Popple, chinwen.chang, fgheet255t,
Jann Horn, Konstantin Khlebnikov, Kirill A. Shutemov,
Kirill A. Shutemov, Linux FS-devel Mailing List,
Linux Kernel Mailing List, Linux MM, Peter Xu, Peter Zijlstra,
syzkaller-bugs, tonymarislogistics, Vlastimil Babka, walken,
Matthew Wilcox, Zi Yan
On Tue, Dec 21, 2021 at 9:24 AM syzbot
<syzbot+1f52b3a18d5633fa7f82@syzkaller.appspotmail.com> wrote:
>
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: 6e0567b73052 Merge tag 'for-linus' of git://git.kernel.org..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=14c192b3b00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=ae22d1ee4fbca18
> dashboard link: https://syzkaller.appspot.com/bug?extid=1f52b3a18d5633fa7f82
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=133200fdb00000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17c3102db00000
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+1f52b3a18d5633fa7f82@syzkaller.appspotmail.com
>
> __mmput+0x122/0x4b0 kernel/fork.c:1113
> mmput+0x56/0x60 kernel/fork.c:1134
> exit_mm kernel/exit.c:507 [inline]
> do_exit+0xb27/0x2b40 kernel/exit.c:819
> do_group_exit+0x125/0x310 kernel/exit.c:929
> get_signal+0x47d/0x2220 kernel/signal.c:2852
> arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:868
> handle_signal_work kernel/entry/common.c:148 [inline]
> exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
> exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:207
> __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
> syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300
> do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
> entry_SYSCALL_64_after_hwframe+0x44/0xae
> ------------[ cut here ]------------
> kernel BUG at include/linux/page-flags.h:785!
> invalid opcode: 0000 [#1] PREEMPT SMP KASAN
> CPU: 1 PID: 4392 Comm: syz-executor560 Not tainted 5.16.0-rc6-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:PageDoubleMap include/linux/page-flags.h:785 [inline]
> RIP: 0010:__page_mapcount+0x2d2/0x350 mm/util.c:744
It looks like the THP is split during the smaps walk. The reproducer
calls MADV_FREE on part of a THP, which may split the huge page.
The (untested) fix below should address it.
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index ad667dbc96f5..97feb15a2448 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -433,6 +433,7 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
{
int i, nr = compound ? compound_nr(page) : 1;
unsigned long size = nr * PAGE_SIZE;
+ struct page *head = compound_head(page);
/*
* First accumulate quantities that depend only on |size| and the type
@@ -462,6 +463,11 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
locked, true);
return;
}
+
+ /* Lost the race with THP split */
+ if (!get_page_unless_zero(head))
+ return;
+
for (i = 0; i < nr; i++, page++) {
int mapcount = page_mapcount(page);
unsigned long pss = PAGE_SIZE << PSS_SHIFT;
@@ -470,6 +476,8 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
smaps_page_accumulate(mss, page, PAGE_SIZE, pss, dirty, locked,
mapcount < 2);
}
+
+ put_page(head);
}
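The guard in the patch above depends on "take a reference only if the count is not already zero" semantics. The following is a minimal user-space sketch of that behavior using C11 atomics. It is not kernel code: struct model_page, model_get_page_unless_zero() and model_put_page() are hypothetical names introduced only for illustration, and the real helpers operate on struct page.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modeled. */
struct model_page {
	atomic_int refcount;
};

/*
 * Take a reference unless the count is already zero.
 * Returns true if the reference was taken, false if the page was
 * already on its way to being freed (count == 0).
 */
static bool model_get_page_unless_zero(struct model_page *p)
{
	int old = atomic_load(&p->refcount);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference previously taken by the function above. */
static void model_put_page(struct model_page *p)
{
	int old = atomic_fetch_sub(&p->refcount, 1);

	assert(old > 0);
	(void)old;
}
```

In this model a caller that gets false back must not touch the page further, which corresponds to the early return in the patch.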
* Re: [syzbot] kernel BUG in __page_mapcount
2021-12-21 18:24 ` Yang Shi
@ 2021-12-21 18:40 ` Matthew Wilcox
2021-12-21 19:07 ` Yang Shi
2022-01-05 19:05 ` Yang Shi
0 siblings, 2 replies; 8+ messages in thread
From: Matthew Wilcox @ 2021-12-21 18:40 UTC (permalink / raw)
To: Yang Shi
Cc: syzbot, Andrew Morton, Alistair Popple, chinwen.chang, fgheet255t,
Jann Horn, Konstantin Khlebnikov, Kirill A. Shutemov,
Kirill A. Shutemov, Linux FS-devel Mailing List,
Linux Kernel Mailing List, Linux MM, Peter Xu, Peter Zijlstra,
syzkaller-bugs, tonymarislogistics, Vlastimil Babka, walken,
Zi Yan
On Tue, Dec 21, 2021 at 10:24:27AM -0800, Yang Shi wrote:
> It seems the THP is split during smaps walk. The reproducer does call
> MADV_FREE on partial THP which may split the huge page.
>
> The below fix (untested) should be able to fix it.
Did you read the rest of the thread on this? If the page is being
migrated, we should still account it ... also, you've changed the
refcount, so this:
if (page_count(page) == 1) {
smaps_page_accumulate(mss, page, size, size << PSS_SHIFT, dirty,
locked, true);
return;
}
will never trigger.
* Re: [syzbot] kernel BUG in __page_mapcount
2021-12-21 18:40 ` Matthew Wilcox
@ 2021-12-21 19:07 ` Yang Shi
2021-12-22 1:42 ` Yang Shi
2022-01-05 19:05 ` Yang Shi
1 sibling, 1 reply; 8+ messages in thread
From: Yang Shi @ 2021-12-21 19:07 UTC (permalink / raw)
To: Matthew Wilcox
Cc: syzbot, Andrew Morton, Alistair Popple, chinwen.chang, fgheet255t,
Jann Horn, Konstantin Khlebnikov, Kirill A. Shutemov,
Kirill A. Shutemov, Linux FS-devel Mailing List,
Linux Kernel Mailing List, Linux MM, Peter Xu, Peter Zijlstra,
syzkaller-bugs, tonymarislogistics, Vlastimil Babka, walken,
Zi Yan
On Tue, Dec 21, 2021 at 10:40 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Dec 21, 2021 at 10:24:27AM -0800, Yang Shi wrote:
> > It seems the THP is split during smaps walk. The reproducer does call
> > MADV_FREE on partial THP which may split the huge page.
> >
> > The below fix (untested) should be able to fix it.
>
> Did you read the rest of the thread on this? If the page is being
> migrated, we should still account it ... also, you've changed the
Yes, pages being migrated may be skipped. We should be able to add
a new flag to smaps_account() to indicate that this is a migration entry,
and then not elevate the page count.
> refcount, so this:
>
> if (page_count(page) == 1) {
> smaps_page_accumulate(mss, page, size, size << PSS_SHIFT, dirty,
> locked, true);
> return;
> }
>
> will never trigger.
The get_page_unless_zero() is called after this block.
* Re: [syzbot] kernel BUG in __page_mapcount
2021-12-21 19:07 ` Yang Shi
@ 2021-12-22 1:42 ` Yang Shi
0 siblings, 0 replies; 8+ messages in thread
From: Yang Shi @ 2021-12-22 1:42 UTC (permalink / raw)
To: Matthew Wilcox
Cc: syzbot, Andrew Morton, Alistair Popple, chinwen.chang, fgheet255t,
Jann Horn, Konstantin Khlebnikov, Kirill A. Shutemov,
Kirill A. Shutemov, Linux FS-devel Mailing List,
Linux Kernel Mailing List, Linux MM, Peter Xu, Peter Zijlstra,
syzkaller-bugs, tonymarislogistics, Vlastimil Babka, Zi Yan
On Tue, Dec 21, 2021 at 11:07 AM Yang Shi <shy828301@gmail.com> wrote:
>
> On Tue, Dec 21, 2021 at 10:40 AM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Tue, Dec 21, 2021 at 10:24:27AM -0800, Yang Shi wrote:
> > > It seems the THP is split during smaps walk. The reproducer does call
> > > MADV_FREE on partial THP which may split the huge page.
> > >
> > > The below fix (untested) should be able to fix it.
> >
> > Did you read the rest of the thread on this? If the page is being
> > migrated, we should still account it ... also, you've changed the
>
> Yes, the being migrated pages may be skipped. We should be able to add
> a new flag to smaps_account() to indicate this is a migration entry
> then don't elevate the page count.
It turns out this is not that straightforward. THP split converts PTEs
to migration entries too, so we can't tell whether it is a real
migration or just the middle of a THP split.
We just need to serialize against THP split for PTE-mapped subpages.
So for real-life workloads it might be OK to skip accounting migration
pages? Migration is typically a transient state, so the
under-accounting should be transient too. Or we could account migration
pages separately, just like swap entries?
I may revisit this after the holiday. If you have any better ideas,
please feel free to propose them.
>
> > refcount, so this:
> >
> > if (page_count(page) == 1) {
> > smaps_page_accumulate(mss, page, size, size << PSS_SHIFT, dirty,
> > locked, true);
> > return;
> > }
> >
> > will never trigger.
>
> The get_page_unless_zero() is called after this block.
* Re: [syzbot] kernel BUG in __page_mapcount
2021-12-21 18:40 ` Matthew Wilcox
2021-12-21 19:07 ` Yang Shi
@ 2022-01-05 19:05 ` Yang Shi
2022-01-11 23:14 ` Yang Shi
1 sibling, 1 reply; 8+ messages in thread
From: Yang Shi @ 2022-01-05 19:05 UTC (permalink / raw)
To: Matthew Wilcox
Cc: syzbot, Andrew Morton, Alistair Popple, chinwen.chang, fgheet255t,
Jann Horn, Konstantin Khlebnikov, Kirill A. Shutemov,
Kirill A. Shutemov, Linux FS-devel Mailing List,
Linux Kernel Mailing List, Linux MM, Peter Xu, Peter Zijlstra,
syzkaller-bugs, tonymarislogistics, Vlastimil Babka, walken,
Zi Yan
On Tue, Dec 21, 2021 at 10:40 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Dec 21, 2021 at 10:24:27AM -0800, Yang Shi wrote:
> > It seems the THP is split during smaps walk. The reproducer does call
> > MADV_FREE on partial THP which may split the huge page.
> >
> > The below fix (untested) should be able to fix it.
>
> Did you read the rest of the thread on this? If the page is being
I just revisited this. Now I see what you mean by "the rest of the
thread". My Gmail client doesn't put them in the same thread, sigh...
Yeah, try_get_compound_head() seems like the right way.
Or we could simply treat migration entries as mapcount == 1, as Kirill
suggested, or skip migration entries since they are transient, or
show migration entries separately.
> migrated, we should still account it ... also, you've changed the
> refcount, so this:
>
> if (page_count(page) == 1) {
> smaps_page_accumulate(mss, page, size, size << PSS_SHIFT, dirty,
> locked, true);
> return;
> }
>
> will never trigger.
* Re: [syzbot] kernel BUG in __page_mapcount
2022-01-05 19:05 ` Yang Shi
@ 2022-01-11 23:14 ` Yang Shi
0 siblings, 0 replies; 8+ messages in thread
From: Yang Shi @ 2022-01-11 23:14 UTC (permalink / raw)
To: Matthew Wilcox
Cc: syzbot, Andrew Morton, Alistair Popple, chinwen.chang, fgheet255t,
Jann Horn, Konstantin Khlebnikov, Kirill A. Shutemov,
Kirill A. Shutemov, Linux FS-devel Mailing List,
Linux Kernel Mailing List, Linux MM, Peter Xu, Peter Zijlstra,
syzkaller-bugs, tonymarislogistics, Vlastimil Babka, Zi Yan
On Wed, Jan 5, 2022 at 11:05 AM Yang Shi <shy828301@gmail.com> wrote:
>
> On Tue, Dec 21, 2021 at 10:40 AM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Tue, Dec 21, 2021 at 10:24:27AM -0800, Yang Shi wrote:
> > > It seems the THP is split during smaps walk. The reproducer does call
> > > MADV_FREE on partial THP which may split the huge page.
> > >
> > > The below fix (untested) should be able to fix it.
> >
> > Did you read the rest of the thread on this? If the page is being
>
> I just revisited this. Now I see what you mean about "the rest of the
> thread". My gmail client doesn't put them in the same thread, sigh...
>
> Yeah, try_get_compound_head() seems like the right way.
>
> Or we just simply treat migration entries as mapcount == 1 as Kirill
> suggested or just skip migration entries since they are transient or
> show migration entries separately.
I think Kirill's suggestion makes sense. The migration entry's
mapcount is 0, so "pss /= mapcount" is not executed at all, and the
migration entry is effectively treated as mapcount == 1. This doesn't
change the behavior. Unlike a swap entry, we can't actually tell how
many references a migration entry has.
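The arithmetic behind that observation can be sketched in user space as below. This is only a model of the per-page step quoted earlier in the thread ("if mapcount >= 2, divide"), not the kernel function: model_pss_share() is a hypothetical name, and the values 4096 for the page size and 12 for the PSS shift are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define MODEL_PAGE_SIZE 4096ULL
#define MODEL_PSS_SHIFT 12

/*
 * Fixed-point PSS contribution of one page for a given mapcount.
 * The division only happens for mapcount >= 2, so mapcount 0
 * (as seen for a migration entry) and mapcount 1 give the same value.
 */
static uint64_t model_pss_share(int mapcount)
{
	assert(mapcount >= 0);

	uint64_t pss = MODEL_PAGE_SIZE << MODEL_PSS_SHIFT;

	if (mapcount >= 2)
		pss /= (uint64_t)mapcount;
	return pss;
}
```

With this model, model_pss_share(0) and model_pss_share(1) return the same value, while model_pss_share(2) returns half of it.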
But we should handle device private entries differently, since their
mapcount is incremented when they are shared between processes. A regular
migration entry can be identified easily with is_migration_entry().
Using try_get_compound_head() seems like overkill IMHO.
I just came up with the patch below (it builds, and running the
reproducer hasn't triggered the bug for me so far). If it looks fine, I
will submit it as a formal patch with more comments.
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index ad667dbc96f5..6a48bbb51efa 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -429,7 +429,8 @@ static void smaps_page_accumulate(struct mem_size_stats *mss,
}
static void smaps_account(struct mem_size_stats *mss, struct page *page,
- bool compound, bool young, bool dirty, bool locked)
+ bool compound, bool young, bool dirty, bool locked,
+ bool migration)
{
int i, nr = compound ? compound_nr(page) : 1;
unsigned long size = nr * PAGE_SIZE;
@@ -457,7 +458,7 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
* If any subpage of the compound page mapped with PTE it would elevate
* page_count().
*/
- if (page_count(page) == 1) {
+ if ((page_count(page) == 1) || migration) {
smaps_page_accumulate(mss, page, size, size << PSS_SHIFT, dirty,
locked, true);
return;
@@ -506,6 +507,7 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
struct vm_area_struct *vma = walk->vma;
bool locked = !!(vma->vm_flags & VM_LOCKED);
struct page *page = NULL;
+ bool migration = false;
if (pte_present(*pte)) {
page = vm_normal_page(vma, addr, *pte);
@@ -525,8 +527,11 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
} else {
mss->swap_pss += (u64)PAGE_SIZE << PSS_SHIFT;
}
- } else if (is_pfn_swap_entry(swpent))
+ } else if (is_pfn_swap_entry(swpent)) {
+ if (is_migration_entry(swpent))
+ migration = true;
page = pfn_swap_entry_to_page(swpent);
+ }
} else {
smaps_pte_hole_lookup(addr, walk);
return;
@@ -535,7 +540,8 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
if (!page)
return;
- smaps_account(mss, page, false, pte_young(*pte), pte_dirty(*pte), locked);
+ smaps_account(mss, page, false, pte_young(*pte), pte_dirty(*pte),
+ locked, migration);
}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -546,6 +552,7 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
struct vm_area_struct *vma = walk->vma;
bool locked = !!(vma->vm_flags & VM_LOCKED);
struct page *page = NULL;
+ bool migration = false;
if (pmd_present(*pmd)) {
/* FOLL_DUMP will return -EFAULT on huge zero page */
@@ -553,8 +560,10 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
} else if (unlikely(thp_migration_supported() && is_swap_pmd(*pmd))) {
swp_entry_t entry = pmd_to_swp_entry(*pmd);
- if (is_migration_entry(entry))
+ if (is_migration_entry(entry)) {
+ migration = true;
page = pfn_swap_entry_to_page(entry);
+ }
}
if (IS_ERR_OR_NULL(page))
return;
@@ -566,7 +575,9 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
/* pass */;
else
mss->file_thp += HPAGE_PMD_SIZE;
- smaps_account(mss, page, true, pmd_young(*pmd), pmd_dirty(*pmd), locked);
+
+ smaps_account(mss, page, true, pmd_young(*pmd), pmd_dirty(*pmd),
+ locked, migration);
}
#else
static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
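The control flow the patch above aims for can be sketched in user space roughly as follows. This is a simplified model under stated assumptions, not the kernel implementation: struct model_stats, model_page_mapcount() and model_smaps_account() are hypothetical names, the private/shared split is collapsed into two byte counters, and the mapcount is passed in as a plain integer. The point it illustrates is that with the migration flag set, the mapcount lookup (the path that hit the BUG) is never reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified accounting state. */
struct model_stats {
	size_t private_bytes;
	size_t shared_bytes;
	int mapcount_reads;	/* how often the mapcount path was taken */
};

/* Stand-in for the mapcount lookup; counts how often it is reached. */
static int model_page_mapcount(struct model_stats *st, int mapcount)
{
	st->mapcount_reads++;
	return mapcount;
}

/*
 * Model of the patched fast path: a page with a single reference, or a
 * page reached through a migration entry, is accounted as private
 * without consulting the mapcount.
 */
static void model_smaps_account(struct model_stats *st, size_t size,
				int page_count, int mapcount, bool migration)
{
	assert(page_count >= 0);

	if (page_count == 1 || migration) {
		st->private_bytes += size;
		return;
	}

	if (model_page_mapcount(st, mapcount) < 2)
		st->private_bytes += size;
	else
		st->shared_bytes += size;
}
```

Calling model_smaps_account() with migration set to true leaves mapcount_reads at 0 regardless of page_count, whereas a present page with page_count greater than 1 goes through the mapcount path as before.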
>
>
> > migrated, we should still account it ... also, you've changed the
> > refcount, so this:
> >
> > if (page_count(page) == 1) {
> > smaps_page_accumulate(mss, page, size, size << PSS_SHIFT, dirty,
> > locked, true);
> > return;
> > }
> >
> > will never trigger.