* [syzbot] [ext4?] general protection fault in jbd2__journal_start @ 2024-01-26 9:05 syzbot 2024-01-30 14:52 ` [syzbot] [xfs?] " syzbot 0 siblings, 1 reply; 14+ messages in thread From: syzbot @ 2024-01-26 9:05 UTC (permalink / raw) To: adilger.kernel, jack, linux-ext4, linux-fsdevel, linux-kernel, syzkaller-bugs, tytso Hello, syzbot found the following issue on: HEAD commit: 7a396820222d Merge tag 'v6.8-rc-part2-smb-client' of git:/.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=15fca78fe80000 kernel config: https://syzkaller.appspot.com/x/.config?x=7059b09d0488022 dashboard link: https://syzkaller.appspot.com/bug?extid=cdee56dbcdf0096ef605 compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 Unfortunately, I don't have any reproducer for this issue yet. Downloadable assets: disk image: https://storage.googleapis.com/syzbot-assets/da73c2c8f5fe/disk-7a396820.raw.xz vmlinux: https://storage.googleapis.com/syzbot-assets/10d2d2be8831/vmlinux-7a396820.xz kernel image: https://storage.googleapis.com/syzbot-assets/939406fd4919/bzImage-7a396820.xz IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+cdee56dbcdf0096ef605@syzkaller.appspotmail.com general protection fault, probably for non-canonical address 0xdffffc000a8a4829: 0000 [#1] PREEMPT SMP KASAN KASAN: probably user-memory-access in range [0x0000000054524148-0x000000005452414f] CPU: 0 PID: 3394 Comm: syz-executor.5 Not tainted 6.7.0-syzkaller-12991-g7a396820222d #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023 RIP: 0010:jbd2__journal_start+0x87/0x5d0 fs/jbd2/transaction.c:496 Code: 74 63 48 8b 1b 48 85 db 74 79 48 89 d8 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 23 46 8f ff 48 8b 2b 48 89 e8 48 c1 e8 03 <42> 80 3c 28 00 74 08 48 89 ef e8 0a 46 8f ff 4c 39 65 00 0f 85 1a RSP: 0018:ffffc900154d65c8 EFLAGS: 00010203 RAX: 000000000a8a4829 RBX: ffff8880234e7618 RCX: 0000000000040000 RDX: ffffc9000a3a1000 RSI: 000000000000195c RDI: 000000000000195d RBP: 000000005452414e R08: 0000000000000c40 R09: 0000000000000001 R10: dffffc0000000000 R11: ffffed1005541071 R12: ffff88802aa0a000 R13: dffffc0000000000 R14: 0000000000000c40 R15: 0000000000000002 FS: 00007fbf47a2a6c0(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020020000 CR3: 0000000030c1a000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __ext4_journal_start_sb+0x215/0x5b0 fs/ext4/ext4_jbd2.c:112 __ext4_journal_start fs/ext4/ext4_jbd2.h:326 [inline] ext4_dirty_inode+0x92/0x110 fs/ext4/inode.c:5969 __mark_inode_dirty+0x305/0xda0 fs/fs-writeback.c:2452 generic_update_time fs/inode.c:1905 [inline] inode_update_time fs/inode.c:1918 [inline] __file_update_time fs/inode.c:2106 [inline] file_update_time+0x39b/0x3e0 fs/inode.c:2136 ext4_page_mkwrite+0x207/0xdf0 fs/ext4/inode.c:6090 do_page_mkwrite+0x197/0x470 mm/memory.c:2966 wp_page_shared mm/memory.c:3353 [inline] do_wp_page+0x20e3/0x4c80 mm/memory.c:3493 handle_pte_fault mm/memory.c:5160 [inline] __handle_mm_fault+0x26a3/0x72b0 mm/memory.c:5285 handle_mm_fault+0x27e/0x770 mm/memory.c:5450 do_user_addr_fault arch/x86/mm/fault.c:1415 [inline] handle_page_fault arch/x86/mm/fault.c:1507 [inline] exc_page_fault+0x2ad/0x870 arch/x86/mm/fault.c:1563 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:570 RIP: 0010:rep_movs_alternative+0x4a/0x70 arch/x86/lib/copy_user_64.S:71 Code: 75 f1 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 8b 06 48 89 07 48 83 c6 08 48 83 c7 08 83 e9 08 74 df 83 f9 08 73 e8 eb c9 <f3> a4 c3 48 89 c8 48 c1 e9 03 83 e0 07 f3 48 a5 89 c1 85 c9 75 b3 RSP: 0018:ffffc900154d70f8 EFLAGS: 00050202 RAX: ffffffff848bfd01 RBX: 0000000020020040 RCX: 0000000000000040 RDX: 0000000000000000 RSI: ffff88802cded190 RDI: 0000000020020000 RBP: 1ffff92002a9af26 R08: ffff88802cded1cf R09: 1ffff110059bda39 R10: dffffc0000000000 R11: ffffed10059bda3a R12: 00000000000000c0 R13: dffffc0000000000 R14: 000000002001ff80 R15: ffff88802cded110 copy_user_generic arch/x86/include/asm/uaccess_64.h:112 [inline] raw_copy_to_user arch/x86/include/asm/uaccess_64.h:133 [inline] _copy_to_user+0x86/0xa0 lib/usercopy.c:41 copy_to_user include/linux/uaccess.h:191 [inline] xfs_bulkstat_fmt+0x4f/0x120 fs/xfs/xfs_ioctl.c:744 xfs_bulkstat_one_int+0xd8b/0x12e0 fs/xfs/xfs_itable.c:161 xfs_bulkstat_iwalk+0x72/0xb0 fs/xfs/xfs_itable.c:239 xfs_iwalk_ag_recs+0x4c3/0x820 fs/xfs/xfs_iwalk.c:220 xfs_iwalk_run_callbacks+0x25b/0x490 fs/xfs/xfs_iwalk.c:376 xfs_iwalk_ag+0xad6/0xbd0 fs/xfs/xfs_iwalk.c:482 xfs_iwalk+0x360/0x6f0 fs/xfs/xfs_iwalk.c:584 xfs_bulkstat+0x4f8/0x6c0 fs/xfs/xfs_itable.c:308 xfs_ioc_bulkstat+0x3d0/0x450 fs/xfs/xfs_ioctl.c:867 xfs_file_ioctl+0x6a5/0x1980 fs/xfs/xfs_ioctl.c:1994 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:871 [inline] __se_sys_ioctl+0xf8/0x170 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b RIP: 0033:0x7fbf46c7cda9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fbf47a2a0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007fbf46dabf80 RCX: 00007fbf46c7cda9 RDX: 000000002001fc40 RSI: 000000008040587f RDI: 0000000000000006 RBP: 00007fbf46cc947a R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000000000b R14: 00007fbf46dabf80 R15: 00007ffee39fcd08 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- RIP: 0010:jbd2__journal_start+0x87/0x5d0 fs/jbd2/transaction.c:496 Code: 74 63 48 8b 1b 48 85 db 74 79 48 89 d8 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 23 46 8f ff 48 8b 2b 48 89 e8 48 c1 e8 03 <42> 80 3c 28 00 74 08 48 89 ef e8 0a 46 8f ff 4c 39 65 00 0f 85 1a RSP: 0018:ffffc900154d65c8 EFLAGS: 00010203 RAX: 000000000a8a4829 RBX: ffff8880234e7618 RCX: 0000000000040000 RDX: ffffc9000a3a1000 RSI: 000000000000195c RDI: 000000000000195d RBP: 000000005452414e R08: 0000000000000c40 R09: 0000000000000001 R10: dffffc0000000000 R11: ffffed1005541071 R12: ffff88802aa0a000 R13: dffffc0000000000 R14: 0000000000000c40 R15: 0000000000000002 FS: 00007fbf47a2a6c0(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020020000 CR3: 0000000030c1a000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 ---------------- Code disassembly (best guess): 0: 74 63 je 0x65 2: 48 8b 1b mov (%rbx),%rbx 5: 48 85 db test %rbx,%rbx 8: 74 79 je 0x83 a: 48 89 d8 mov %rbx,%rax d: 48 c1 e8 03 shr $0x3,%rax 11: 42 80 3c 28 00 cmpb $0x0,(%rax,%r13,1) 16: 74 08 je 0x20 18: 48 89 df mov %rbx,%rdi 1b: e8 23 46 8f ff call 0xff8f4643 20: 48 8b 2b mov (%rbx),%rbp 23: 48 89 e8 mov %rbp,%rax 26: 48 c1 e8 03 shr $0x3,%rax * 2a: 42 80 3c 28 00 cmpb $0x0,(%rax,%r13,1) <-- trapping instruction 2f: 74 08 je 0x39 31: 48 89 ef mov %rbp,%rdi 34: e8 0a 46 8f ff call 0xff8f4643 39: 4c 39 65 00 cmp %r12,0x0(%rbp) 3d: 0f .byte 0xf 3e: 85 1a test %ebx,(%rdx) --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. If the report is already addressed, let syzbot know by replying with: #syz fix: exact-commit-title If you want to overwrite report's subsystems, reply with: #syz set subsystems: new-subsystem (See the list of subsystem names on the web dashboard) If the report is a duplicate of another one, reply with: #syz dup: exact-subject-of-another-report If you want to undo deduplication, reply with: #syz undup ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start 2024-01-26 9:05 [syzbot] [ext4?] general protection fault in jbd2__journal_start syzbot @ 2024-01-30 14:52 ` syzbot 2024-01-30 23:37 ` current->journal_info got nested! (was Re: [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start) Dave Chinner ` (2 more replies) 0 siblings, 3 replies; 14+ messages in thread From: syzbot @ 2024-01-30 14:52 UTC (permalink / raw) To: adilger.kernel, chandan.babu, jack, linux-ext4, linux-fsdevel, linux-kernel, linux-xfs, syzkaller-bugs, tytso syzbot has found a reproducer for the following issue on: HEAD commit: 861c0981648f Merge tag 'jfs-6.8-rc3' of github.com:kleikam.. git tree: upstream console+strace: https://syzkaller.appspot.com/x/log.txt?x=13ca8d97e80000 kernel config: https://syzkaller.appspot.com/x/.config?x=b0b9993d7d6d1990 dashboard link: https://syzkaller.appspot.com/bug?extid=cdee56dbcdf0096ef605 compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=104393efe80000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1393b90fe80000 Downloadable assets: disk image: https://storage.googleapis.com/syzbot-assets/7c6cc521298d/disk-861c0981.raw.xz vmlinux: https://storage.googleapis.com/syzbot-assets/6203c94955db/vmlinux-861c0981.xz kernel image: https://storage.googleapis.com/syzbot-assets/17e76e12b58c/bzImage-861c0981.xz mounted in repro: https://storage.googleapis.com/syzbot-assets/d31d4eed2912/mount_3.gz IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+cdee56dbcdf0096ef605@syzkaller.appspotmail.com general protection fault, probably for non-canonical address 0xdffffc000a8a4829: 0000 [#1] PREEMPT SMP KASAN KASAN: probably user-memory-access in range [0x0000000054524148-0x000000005452414f] CPU: 1 PID: 5065 Comm: syz-executor260 Not tainted 6.8.0-rc2-syzkaller-00031-g861c0981648f #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023 RIP: 0010:jbd2__journal_start+0x87/0x5d0 fs/jbd2/transaction.c:496 Code: 74 63 48 8b 1b 48 85 db 74 79 48 89 d8 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 63 4d 8f ff 48 8b 2b 48 89 e8 48 c1 e8 03 <42> 80 3c 28 00 74 08 48 89 ef e8 4a 4d 8f ff 4c 39 65 00 0f 85 1a RSP: 0018:ffffc900043265c8 EFLAGS: 00010203 RAX: 000000000a8a4829 RBX: ffff8880205fa3a8 RCX: ffff8880235dbb80 RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff88801c1a6000 RBP: 000000005452414e R08: 0000000000000c40 R09: 0000000000000001 R10: dffffc0000000000 R11: ffffed1003834871 R12: ffff88801c1a6000 R13: dffffc0000000000 R14: 0000000000000c40 R15: 0000000000000002 FS: 0000555556f90380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020020000 CR3: 0000000021fed000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __ext4_journal_start_sb+0x215/0x5b0 fs/ext4/ext4_jbd2.c:112 __ext4_journal_start fs/ext4/ext4_jbd2.h:326 [inline] ext4_dirty_inode+0x92/0x110 fs/ext4/inode.c:5969 __mark_inode_dirty+0x305/0xda0 fs/fs-writeback.c:2452 generic_update_time fs/inode.c:1905 [inline] inode_update_time fs/inode.c:1918 [inline] __file_update_time fs/inode.c:2106 [inline] file_update_time+0x39b/0x3e0 fs/inode.c:2136 ext4_page_mkwrite+0x207/0xdf0 fs/ext4/inode.c:6090 do_page_mkwrite+0x197/0x470 mm/memory.c:2966 wp_page_shared mm/memory.c:3353 [inline] do_wp_page+0x20e3/0x4c80 mm/memory.c:3493 handle_pte_fault mm/memory.c:5160 [inline] __handle_mm_fault+0x26a3/0x72b0 mm/memory.c:5285 handle_mm_fault+0x27e/0x770 mm/memory.c:5450 do_user_addr_fault arch/x86/mm/fault.c:1415 [inline] handle_page_fault arch/x86/mm/fault.c:1507 [inline] exc_page_fault+0x2ad/0x870 arch/x86/mm/fault.c:1563 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:570 RIP: 0010:rep_movs_alternative+0x4a/0x70 arch/x86/lib/copy_user_64.S:71 Code: 75 f1 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 8b 06 48 89 07 48 83 c6 08 48 83 c7 08 83 e9 08 74 df 83 f9 08 73 e8 eb c9 <f3> a4 c3 48 89 c8 48 c1 e9 03 83 e0 07 f3 48 a5 89 c1 85 c9 75 b3 RSP: 0018:ffffc900043270f8 EFLAGS: 00050202 RAX: ffffffff848cda01 RBX: 0000000020020040 RCX: 0000000000000040 RDX: 0000000000000000 RSI: ffff8880131873b0 RDI: 0000000020020000 RBP: 1ffff92000864f26 R08: ffff8880131873ef R09: 1ffff11002630e7d R10: dffffc0000000000 R11: ffffed1002630e7e R12: 00000000000000c0 R13: dffffc0000000000 R14: 000000002001ff80 R15: ffff888013187330 copy_user_generic arch/x86/include/asm/uaccess_64.h:112 [inline] raw_copy_to_user arch/x86/include/asm/uaccess_64.h:133 [inline] _copy_to_user+0x86/0xa0 lib/usercopy.c:41 copy_to_user include/linux/uaccess.h:191 [inline] xfs_bulkstat_fmt+0x4f/0x120 fs/xfs/xfs_ioctl.c:744 xfs_bulkstat_one_int+0xd8b/0x12e0 fs/xfs/xfs_itable.c:161 xfs_bulkstat_iwalk+0x72/0xb0 fs/xfs/xfs_itable.c:239 xfs_iwalk_ag_recs+0x4c3/0x820 fs/xfs/xfs_iwalk.c:220 xfs_iwalk_run_callbacks+0x25b/0x490 fs/xfs/xfs_iwalk.c:376 xfs_iwalk_ag+0xad6/0xbd0 fs/xfs/xfs_iwalk.c:482 xfs_iwalk+0x360/0x6f0 fs/xfs/xfs_iwalk.c:584 xfs_bulkstat+0x4f8/0x6c0 fs/xfs/xfs_itable.c:308 xfs_ioc_bulkstat+0x3d0/0x450 fs/xfs/xfs_ioctl.c:867 xfs_file_ioctl+0x6a5/0x1980 fs/xfs/xfs_ioctl.c:1994 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:871 [inline] __se_sys_ioctl+0xf8/0x170 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b RIP: 0033:0x7f02d4018b59 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffdbe0deb98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0030656c69662f2e RCX: 00007f02d4018b59 RDX: 000000002001fc40 RSI: 000000008040587f RDI: 0000000000000004 RBP: 00000000000116e3 R08: 0000000000000000 R09: 0000555556f914c0 R10: 0000000020000300 R11: 0000000000000246 R12: 00007ffdbe0debc0 R13: 00007ffdbe0debac R14: 431bde82d7b634db R15: 00007f02d406103b </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- RIP: 0010:jbd2__journal_start+0x87/0x5d0 fs/jbd2/transaction.c:496 Code: 74 63 48 8b 1b 48 85 db 74 79 48 89 d8 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 63 4d 8f ff 48 8b 2b 48 89 e8 48 c1 e8 03 <42> 80 3c 28 00 74 08 48 89 ef e8 4a 4d 8f ff 4c 39 65 00 0f 85 1a RSP: 0018:ffffc900043265c8 EFLAGS: 00010203 RAX: 000000000a8a4829 RBX: ffff8880205fa3a8 RCX: ffff8880235dbb80 RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff88801c1a6000 RBP: 000000005452414e R08: 0000000000000c40 R09: 0000000000000001 R10: dffffc0000000000 R11: ffffed1003834871 R12: ffff88801c1a6000 R13: dffffc0000000000 R14: 0000000000000c40 R15: 0000000000000002 FS: 0000555556f90380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020020000 CR3: 0000000021fed000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 ---------------- Code disassembly (best guess): 0: 74 63 je 0x65 2: 48 8b 1b mov (%rbx),%rbx 5: 48 85 db test %rbx,%rbx 8: 74 79 je 0x83 a: 48 89 d8 mov %rbx,%rax d: 48 c1 e8 03 shr $0x3,%rax 11: 42 80 3c 28 00 cmpb $0x0,(%rax,%r13,1) 16: 74 08 je 0x20 18: 48 89 df mov %rbx,%rdi 1b: e8 63 4d 8f ff call 0xff8f4d83 20: 48 8b 2b mov (%rbx),%rbp 23: 48 89 e8 mov %rbp,%rax 26: 48 c1 e8 03 shr $0x3,%rax * 2a: 42 80 3c 28 00 cmpb $0x0,(%rax,%r13,1) <-- trapping instruction 2f: 74 08 je 0x39 31: 48 89 ef mov %rbp,%rdi 34: e8 4a 4d 8f ff call 0xff8f4d83 39: 4c 39 65 00 cmp %r12,0x0(%rbp) 3d: 0f .byte 0xf 3e: 85 1a test %ebx,(%rdx) --- If you want syzbot to run the reproducer, reply with: #syz test: git://repo/address.git branch-or-commit-hash If you attach or paste a git patch, syzbot will apply it before testing. ^ permalink raw reply [flat|nested] 14+ messages in thread
* current->journal_info got nested! (was Re: [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start) 2024-01-30 14:52 ` [syzbot] [xfs?] " syzbot @ 2024-01-30 23:37 ` Dave Chinner 2024-01-31 3:46 ` Darrick J. Wong ` (2 more replies) 2024-01-31 7:40 ` [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start Edward Adam Davis 2024-01-31 12:04 ` [PATCH] jbd2: user-memory-access " Edward Adam Davis 2 siblings, 3 replies; 14+ messages in thread From: Dave Chinner @ 2024-01-30 23:37 UTC (permalink / raw) To: syzbot Cc: adilger.kernel, chandan.babu, jack, linux-ext4, linux-fsdevel, linux-kernel, linux-xfs, syzkaller-bugs, tytso On Tue, Jan 30, 2024 at 06:52:21AM -0800, syzbot wrote: > syzbot has found a reproducer for the following issue on: > > HEAD commit: 861c0981648f Merge tag 'jfs-6.8-rc3' of github.com:kleikam.. > git tree: upstream > console+strace: https://syzkaller.appspot.com/x/log.txt?x=13ca8d97e80000 > kernel config: https://syzkaller.appspot.com/x/.config?x=b0b9993d7d6d1990 > dashboard link: https://syzkaller.appspot.com/bug?extid=cdee56dbcdf0096ef605 > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=104393efe80000 > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1393b90fe80000 > > Downloadable assets: > disk image: https://storage.googleapis.com/syzbot-assets/7c6cc521298d/disk-861c0981.raw.xz > vmlinux: https://storage.googleapis.com/syzbot-assets/6203c94955db/vmlinux-861c0981.xz > kernel image: https://storage.googleapis.com/syzbot-assets/17e76e12b58c/bzImage-861c0981.xz > mounted in repro: https://storage.googleapis.com/syzbot-assets/d31d4eed2912/mount_3.gz > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+cdee56dbcdf0096ef605@syzkaller.appspotmail.com > > general protection fault, probably for non-canonical address 0xdffffc000a8a4829: 0000 [#1] PREEMPT SMP KASAN > KASAN: probably user-memory-access in range [0x0000000054524148-0x000000005452414f] > CPU: 1 PID: 5065 Comm: syz-executor260 Not tainted 6.8.0-rc2-syzkaller-00031-g861c0981648f #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023 > RIP: 0010:jbd2__journal_start+0x87/0x5d0 fs/jbd2/transaction.c:496 > Code: 74 63 48 8b 1b 48 85 db 74 79 48 89 d8 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 63 4d 8f ff 48 8b 2b 48 89 e8 48 c1 e8 03 <42> 80 3c 28 00 74 08 48 89 ef e8 4a 4d 8f ff 4c 39 65 00 0f 85 1a > RSP: 0018:ffffc900043265c8 EFLAGS: 00010203 > RAX: 000000000a8a4829 RBX: ffff8880205fa3a8 RCX: ffff8880235dbb80 > RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff88801c1a6000 > RBP: 000000005452414e R08: 0000000000000c40 R09: 0000000000000001 ^^^^^^^^ Hmmmm - TRAN. That's looks suspicious, I'll come back to that. > R10: dffffc0000000000 R11: ffffed1003834871 R12: ffff88801c1a6000 > R13: dffffc0000000000 R14: 0000000000000c40 R15: 0000000000000002 > FS: 0000555556f90380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 0000000020020000 CR3: 0000000021fed000 CR4: 00000000003506f0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > <TASK> > __ext4_journal_start_sb+0x215/0x5b0 fs/ext4/ext4_jbd2.c:112 > __ext4_journal_start fs/ext4/ext4_jbd2.h:326 [inline] > ext4_dirty_inode+0x92/0x110 fs/ext4/inode.c:5969 > __mark_inode_dirty+0x305/0xda0 fs/fs-writeback.c:2452 > generic_update_time fs/inode.c:1905 [inline] > inode_update_time fs/inode.c:1918 [inline] > __file_update_time fs/inode.c:2106 [inline] > file_update_time+0x39b/0x3e0 fs/inode.c:2136 > ext4_page_mkwrite+0x207/0xdf0 fs/ext4/inode.c:6090 > do_page_mkwrite+0x197/0x470 mm/memory.c:2966 > wp_page_shared mm/memory.c:3353 [inline] > do_wp_page+0x20e3/0x4c80 mm/memory.c:3493 > handle_pte_fault mm/memory.c:5160 [inline] > __handle_mm_fault+0x26a3/0x72b0 mm/memory.c:5285 > handle_mm_fault+0x27e/0x770 mm/memory.c:5450 > do_user_addr_fault arch/x86/mm/fault.c:1415 [inline] > handle_page_fault arch/x86/mm/fault.c:1507 [inline] > exc_page_fault+0x2ad/0x870 arch/x86/mm/fault.c:1563 > asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:570 EXt4 is triggering a BUG_ON: handle_t *handle = journal_current_handle(); int err; if (!journal) return ERR_PTR(-EROFS); if (handle) { >>>>>>>>> J_ASSERT(handle->h_transaction->t_journal == journal); handle->h_ref++; return handle; } via a journal assert failure. It appears that current->journal_info isn't what it is supposed to be. It's finding something with TRAN in it, I think. I'll come back to this. What syzbot is doing is creating a file on it's root filesystem and write()ing 0x208e24b bytes (zeroes from an anonymous mmap() region, I think) to it to initialise it's contents. Then it mmap()s the ext4 file for 0xb36000 bytes and copies the raw test filesystem image in the source code into it. It then creates a memfd that it decompresses the data in the mapped ext4 file into and creates a loop device that points to that memfd. It then mounts the loop device and we get an XFS filesystem which doesn't appear to contain any corruptions in it at all. It then runs a bulkstat pass on the the XFS filesystem. This is where it gets interesting. The user buffer that XFS copies the inode data into points to a memory address inside the range of the ext4 file that the filesystem image was copied to. It does not overlap with the filesystem image. Hence when XFS goes to copy the inodes into the user buffer, it triggers write page faults on the ext4 backing file. That's this part of the trace: > RIP: 0010:rep_movs_alternative+0x4a/0x70 arch/x86/lib/copy_user_64.S:71 > Code: 75 f1 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 8b 06 48 89 07 48 83 c6 08 48 83 c7 08 83 e9 08 74 df 83 f9 08 73 e8 eb c9 <f3> a4 c3 48 89 c8 48 c1 e9 03 83 e0 07 f3 48 a5 89 c1 85 c9 75 b3 > RSP: 0018:ffffc900043270f8 EFLAGS: 00050202 > RAX: ffffffff848cda01 RBX: 0000000020020040 RCX: 0000000000000040 > RDX: 0000000000000000 RSI: ffff8880131873b0 RDI: 0000000020020000 > RBP: 1ffff92000864f26 R08: ffff8880131873ef R09: 1ffff11002630e7d > R10: dffffc0000000000 R11: ffffed1002630e7e R12: 00000000000000c0 > R13: dffffc0000000000 R14: 000000002001ff80 R15: ffff888013187330 > copy_user_generic arch/x86/include/asm/uaccess_64.h:112 [inline] > raw_copy_to_user arch/x86/include/asm/uaccess_64.h:133 [inline] > _copy_to_user+0x86/0xa0 lib/usercopy.c:41 > copy_to_user include/linux/uaccess.h:191 [inline] > xfs_bulkstat_fmt+0x4f/0x120 fs/xfs/xfs_ioctl.c:744 > xfs_bulkstat_one_int+0xd8b/0x12e0 fs/xfs/xfs_itable.c:161 > xfs_bulkstat_iwalk+0x72/0xb0 fs/xfs/xfs_itable.c:239 > xfs_iwalk_ag_recs+0x4c3/0x820 fs/xfs/xfs_iwalk.c:220 > xfs_iwalk_run_callbacks+0x25b/0x490 fs/xfs/xfs_iwalk.c:376 > xfs_iwalk_ag+0xad6/0xbd0 fs/xfs/xfs_iwalk.c:482 > xfs_iwalk+0x360/0x6f0 fs/xfs/xfs_iwalk.c:584 > xfs_bulkstat+0x4f8/0x6c0 fs/xfs/xfs_itable.c:308 > xfs_ioc_bulkstat+0x3d0/0x450 fs/xfs/xfs_ioctl.c:867 > xfs_file_ioctl+0x6a5/0x1980 fs/xfs/xfs_ioctl.c:1994 > vfs_ioctl fs/ioctl.c:51 [inline] > __do_sys_ioctl fs/ioctl.c:871 [inline] > __se_sys_ioctl+0xf8/0x170 fs/ioctl.c:857 > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83 > entry_SYSCALL_64_after_hwframe+0x63/0x6b What is interesting here is this is running in an empty XFS transaction context so that the bulkstat operation garbage collects all the buffers it accesses without us having to explicit cleanup - they all get released when we cancel the transaction context at the end of the process. But that means copy-out is running inside a transaction context, and that means current->journal_info contains a pointer to the current struct xfs_trans the bulkstat operation is using. Guess what we have as the first item in a struct xfs_trans? That's right, it's a magic number, and that magic number is: #define XFS_TRANS_HEADER_MAGIC 0x5452414e /* TRAN */ It should be obvious what has happened now - current->journal_info is not null, so ext4 thinks it owns the structure attached there and panics when it finds that it isn't an ext4 journal handle being held there. I don't think there are any clear rules as to how filesystems can and can't use current->journal_info. In general, a task can't jump from one filesystem to another inside a transaction context like this, so there's never been a serious concern about nested current->journal_info assignments like this in the past. XFS is doing nothing wrong - we're allowed to define transaction contexts however we want and use current->journal_info in this way. However, we have to acknowledge that ext4 has also done nothing wrong by assuming current->journal_info should below to it if it is not null. Indeed, XFS does the same thing. The question here is what to do about this? The obvious solution is to have save/restore semantics in the filesystem code that sets/clears current->journal_info, and then filesystems can also do the necessary "recursion into same filesystem" checks they need to ensure that they aren't nesting transactions in a way that can deadlock. Maybe there are other options - should filesystems even be allowed to trigger page faults when they have set current->journal_info? What other potential avenues are there that could cause this sort of transaction context nesting that we haven't realised exist? Does ext4 data=jounral have problems like this in the data copy-in path? What other filesystems allow page faults in transaction contexts? Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: current->journal_info got nested! (was Re: [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start) 2024-01-30 23:37 ` current->journal_info got nested! (was Re: [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start) Dave Chinner @ 2024-01-31 3:46 ` Darrick J. Wong 2024-01-31 4:58 ` Theodore Ts'o 2024-01-31 12:02 ` Jan Kara 2 siblings, 0 replies; 14+ messages in thread From: Darrick J. Wong @ 2024-01-31 3:46 UTC (permalink / raw) To: Dave Chinner Cc: syzbot, adilger.kernel, chandan.babu, jack, linux-ext4, linux-fsdevel, linux-kernel, linux-xfs, syzkaller-bugs, tytso On Wed, Jan 31, 2024 at 10:37:18AM +1100, Dave Chinner wrote: > On Tue, Jan 30, 2024 at 06:52:21AM -0800, syzbot wrote: > > syzbot has found a reproducer for the following issue on: > > > > HEAD commit: 861c0981648f Merge tag 'jfs-6.8-rc3' of github.com:kleikam.. > > git tree: upstream > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=13ca8d97e80000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=b0b9993d7d6d1990 > > dashboard link: https://syzkaller.appspot.com/bug?extid=cdee56dbcdf0096ef605 > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=104393efe80000 > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1393b90fe80000 > > > > Downloadable assets: > > disk image: https://storage.googleapis.com/syzbot-assets/7c6cc521298d/disk-861c0981.raw.xz > > vmlinux: https://storage.googleapis.com/syzbot-assets/6203c94955db/vmlinux-861c0981.xz > > kernel image: https://storage.googleapis.com/syzbot-assets/17e76e12b58c/bzImage-861c0981.xz > > mounted in repro: https://storage.googleapis.com/syzbot-assets/d31d4eed2912/mount_3.gz > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > Reported-by: syzbot+cdee56dbcdf0096ef605@syzkaller.appspotmail.com > > > > general protection fault, probably for non-canonical address 0xdffffc000a8a4829: 0000 [#1] PREEMPT SMP KASAN > > KASAN: probably user-memory-access in range [0x0000000054524148-0x000000005452414f] > > CPU: 1 PID: 5065 Comm: syz-executor260 Not tainted 6.8.0-rc2-syzkaller-00031-g861c0981648f #0 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023 > > RIP: 0010:jbd2__journal_start+0x87/0x5d0 fs/jbd2/transaction.c:496 > > Code: 74 63 48 8b 1b 48 85 db 74 79 48 89 d8 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 63 4d 8f ff 48 8b 2b 48 89 e8 48 c1 e8 03 <42> 80 3c 28 00 74 08 48 89 ef e8 4a 4d 8f ff 4c 39 65 00 0f 85 1a > > RSP: 0018:ffffc900043265c8 EFLAGS: 00010203 > > RAX: 000000000a8a4829 RBX: ffff8880205fa3a8 RCX: ffff8880235dbb80 > > RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff88801c1a6000 > > RBP: 000000005452414e R08: 0000000000000c40 R09: 0000000000000001 > ^^^^^^^^ > Hmmmm - TRAN. That's looks suspicious, I'll come back to that. > > > R10: dffffc0000000000 R11: ffffed1003834871 R12: ffff88801c1a6000 > > R13: dffffc0000000000 R14: 0000000000000c40 R15: 0000000000000002 > > FS: 0000555556f90380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 0000000020020000 CR3: 0000000021fed000 CR4: 00000000003506f0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > Call Trace: > > <TASK> > > __ext4_journal_start_sb+0x215/0x5b0 fs/ext4/ext4_jbd2.c:112 > > __ext4_journal_start fs/ext4/ext4_jbd2.h:326 [inline] > > ext4_dirty_inode+0x92/0x110 fs/ext4/inode.c:5969 > > __mark_inode_dirty+0x305/0xda0 fs/fs-writeback.c:2452 > > generic_update_time fs/inode.c:1905 [inline] > > inode_update_time fs/inode.c:1918 [inline] > > __file_update_time fs/inode.c:2106 [inline] > > file_update_time+0x39b/0x3e0 fs/inode.c:2136 > > ext4_page_mkwrite+0x207/0xdf0 fs/ext4/inode.c:6090 > > do_page_mkwrite+0x197/0x470 mm/memory.c:2966 > > wp_page_shared mm/memory.c:3353 [inline] > > do_wp_page+0x20e3/0x4c80 mm/memory.c:3493 > > handle_pte_fault mm/memory.c:5160 [inline] > > __handle_mm_fault+0x26a3/0x72b0 mm/memory.c:5285 > > handle_mm_fault+0x27e/0x770 mm/memory.c:5450 > > do_user_addr_fault arch/x86/mm/fault.c:1415 [inline] > > handle_page_fault arch/x86/mm/fault.c:1507 [inline] > > exc_page_fault+0x2ad/0x870 arch/x86/mm/fault.c:1563 > > asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:570 > > EXt4 is triggering a BUG_ON: > > handle_t *handle = journal_current_handle(); > int err; > > if (!journal) > return ERR_PTR(-EROFS); > > if (handle) { > >>>>>>>>> J_ASSERT(handle->h_transaction->t_journal == journal); > handle->h_ref++; > return handle; > } > > via a journal assert failure. It appears that current->journal_info > isn't what it is supposed to be. It's finding something with TRAN in > it, I think. I'll come back to this. > > What syzbot is doing is creating a file on it's root filesystem and > write()ing 0x208e24b bytes (zeroes from an anonymous mmap() region, > I think) to it to initialise it's contents. > > Then it mmap()s the ext4 file for 0xb36000 bytes and copies the raw > test filesystem image in the source code into it. It then creates a > memfd that it decompresses the data in the mapped ext4 file into and > creates a loop device that points to that memfd. It then mounts the > loop device and we get an XFS filesystem which doesn't appear to > contain any corruptions in it at all. It then runs a bulkstat pass > on the the XFS filesystem. > > This is where it gets interesting. The user buffer that XFS > copies the inode data into points to a memory address inside the > range of the ext4 file that the filesystem image was copied to. > It does not overlap with the filesystem image. > > Hence when XFS goes to copy the inodes into the user buffer, it > triggers write page faults on the ext4 backing file. > > That's this part of the trace: > > > > RIP: 0010:rep_movs_alternative+0x4a/0x70 arch/x86/lib/copy_user_64.S:71 > > Code: 75 f1 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 8b 06 48 89 07 48 83 c6 08 48 83 c7 08 83 e9 08 74 df 83 f9 08 73 e8 eb c9 <f3> a4 c3 48 89 c8 48 c1 e9 03 83 e0 07 f3 48 a5 89 c1 85 c9 75 b3 > > RSP: 0018:ffffc900043270f8 EFLAGS: 00050202 > > RAX: ffffffff848cda01 RBX: 0000000020020040 RCX: 0000000000000040 > > RDX: 0000000000000000 RSI: ffff8880131873b0 RDI: 0000000020020000 > > RBP: 1ffff92000864f26 R08: ffff8880131873ef R09: 1ffff11002630e7d > > R10: dffffc0000000000 R11: ffffed1002630e7e R12: 00000000000000c0 > > R13: dffffc0000000000 R14: 000000002001ff80 R15: ffff888013187330 > > copy_user_generic arch/x86/include/asm/uaccess_64.h:112 [inline] > > raw_copy_to_user arch/x86/include/asm/uaccess_64.h:133 [inline] > > _copy_to_user+0x86/0xa0 lib/usercopy.c:41 > > copy_to_user include/linux/uaccess.h:191 [inline] > > xfs_bulkstat_fmt+0x4f/0x120 fs/xfs/xfs_ioctl.c:744 > > xfs_bulkstat_one_int+0xd8b/0x12e0 fs/xfs/xfs_itable.c:161 > > xfs_bulkstat_iwalk+0x72/0xb0 fs/xfs/xfs_itable.c:239 > > xfs_iwalk_ag_recs+0x4c3/0x820 fs/xfs/xfs_iwalk.c:220 > > xfs_iwalk_run_callbacks+0x25b/0x490 fs/xfs/xfs_iwalk.c:376 > > xfs_iwalk_ag+0xad6/0xbd0 fs/xfs/xfs_iwalk.c:482 > > xfs_iwalk+0x360/0x6f0 fs/xfs/xfs_iwalk.c:584 > > xfs_bulkstat+0x4f8/0x6c0 fs/xfs/xfs_itable.c:308 > > xfs_ioc_bulkstat+0x3d0/0x450 fs/xfs/xfs_ioctl.c:867 > > xfs_file_ioctl+0x6a5/0x1980 fs/xfs/xfs_ioctl.c:1994 > > vfs_ioctl fs/ioctl.c:51 [inline] > > __do_sys_ioctl fs/ioctl.c:871 [inline] > > __se_sys_ioctl+0xf8/0x170 fs/ioctl.c:857 > > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > > do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83 > > entry_SYSCALL_64_after_hwframe+0x63/0x6b > > What is interesting here is this is running in an empty XFS > transaction context so that the bulkstat operation garbage collects > all the buffers it accesses without us having to explicit cleanup - > they all get released when we cancel the transaction context at the > end of the process. > > But that means copy-out is running inside a transaction context, and > that means current->journal_info contains a pointer to the current > struct xfs_trans the bulkstat operation is using. > > Guess what we have as the first item in a struct xfs_trans? That's > right, it's a magic number, and that magic number is: > > #define XFS_TRANS_HEADER_MAGIC 0x5452414e /* TRAN */ > > It should be obvious what has happened now - > current->journal_info is not null, so ext4 thinks it owns the > structure attached there and panics when it finds that it isn't an > ext4 journal handle being held there. > > I don't think there are any clear rules as to how filesystems can > and can't use current->journal_info. In general, a task can't jump > from one filesystem to another inside a transaction context like > this, so there's never been a serious concern about nested > current->journal_info assignments like this in the past. > > XFS is doing nothing wrong - we're allowed to define transaction > contexts however we want and use current->journal_info in this way. > However, we have to acknowledge that ext4 has also done nothing > wrong by assuming current->journal_info should below to it if it is > not null. Indeed, XFS does the same thing. Getting late here, so this will be pretty terse-- Thinking narrowly about just xfs, I think this means that the bulkstat/inumbers implementations need to allocate a bounce buffer to format records into so that it can copy_to_user without any locks held. We have no idea if the destination page is a file page or anonymous memory or whatever. Or: Do we really need to set current->journal_info for empty transactions? > The question here is what to do about this? The obvious solution is > to have save/restore semantics in the filesystem code that > sets/clears current->journal_info, and then filesystems can also do > the necessary "recursion into same filesystem" checks they need to > ensure that they aren't nesting transactions in a way that can > deadlock. We don't know what locks might be held by whatever code set journal_info. I don't see how we could push it aside sanely? > Maybe there are other options - should filesystems even be allowed to > trigger page faults when they have set current->journal_info? I wonder if we ought to be checking current->journal_info in our own page_mkwrite handler and throwing back an errno? I don't think we want to go down the rabbithele of "someone else was running a transaction, maybe we can proceed with an update anyway???". > What other potential avenues are there that could cause this sort of > transaction context nesting that we haven't realised exist? Does > ext4 data=jounral have problems like this in the data copy-in path? > What other filesystems allow page faults in transaction contexts? Ugh, whackamole. :/ --D > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: current->journal_info got nested! (was Re: [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start) 2024-01-30 23:37 ` current->journal_info got nested! (was Re: [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start) Dave Chinner 2024-01-31 3:46 ` Darrick J. Wong @ 2024-01-31 4:58 ` Theodore Ts'o 2024-01-31 5:20 ` Matthew Wilcox 2024-01-31 12:02 ` Jan Kara 2 siblings, 1 reply; 14+ messages in thread From: Theodore Ts'o @ 2024-01-31 4:58 UTC (permalink / raw) To: Dave Chinner Cc: syzbot, adilger.kernel, chandan.babu, jack, linux-ext4, linux-fsdevel, linux-kernel, linux-xfs, syzkaller-bugs On Wed, Jan 31, 2024 at 10:37:18AM +1100, Dave Chinner wrote: > It should be obvious what has happened now - > current->journal_info is not null, so ext4 thinks it owns the > structure attached there and panics when it finds that it isn't an > ext4 journal handle being held there. > > I don't think there are any clear rules as to how filesystems can > and can't use current->journal_info. In general, a task can't jump > from one filesystem to another inside a transaction context like > this, so there's never been a serious concern about nested > current->journal_info assignments like this in the past. > > XFS is doing nothing wrong - we're allowed to define transaction > contexts however we want and use current->journal_info in this way. > However, we have to acknowledge that ext4 has also done nothing > wrong by assuming current->journal_info should below to it if it is > not null. Indeed, XFS does the same thing. Nice analysis. Fundamentally the current usage of current->journal_info assumes that a process would only be calling into one file system at a time. But obviously that's not going to be true in the case of one file system writing to memory which then triggers a page fault. As far as other potential avenues that could cause this kind of nesting, the other one which comes to mind might be sendfile(2) -- although in general the reader side won't trigger a transaction since the atime update tends to be done lazily. > The question here is what to do about this? The obvious solution is > to have save/restore semantics in the filesystem code that > sets/clears current->journal_info, and then filesystems can also do > the necessary "recursion into same filesystem" checks they need to > ensure that they aren't nesting transactions in a way that can > deadlock. > > Maybe there are other options - should filesystems even be allowed to > trigger page faults when they have set current->journal_info? Hmm, could XFS pre-fault target memory buffer for the bulkstat output before starting its transaction? Alternatively, ext4 could do a save of current->journal_info before starting to process the page fault, and restore it when it is done. Both of these seem a bit hacky, and the question is indeed, are there other avenues that might cause the transaction context nesting, such that a more general solution is called for? - Ted ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: current->journal_info got nested! (was Re: [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start) 2024-01-31 4:58 ` Theodore Ts'o @ 2024-01-31 5:20 ` Matthew Wilcox 2024-01-31 5:47 ` Christoph Hellwig 2024-01-31 6:02 ` Dave Chinner 0 siblings, 2 replies; 14+ messages in thread From: Matthew Wilcox @ 2024-01-31 5:20 UTC (permalink / raw) To: Theodore Ts'o Cc: Dave Chinner, syzbot, adilger.kernel, chandan.babu, jack, linux-ext4, linux-fsdevel, linux-kernel, linux-xfs, syzkaller-bugs On Tue, Jan 30, 2024 at 11:58:22PM -0500, Theodore Ts'o wrote: > Hmm, could XFS pre-fault target memory buffer for the bulkstat output > before starting its transaction? Alternatively, ext4 could do a save > of current->journal_info before starting to process the page fault, > and restore it when it is done. Both of these seem a bit hacky, and > the question is indeed, are there other avenues that might cause the > transaction context nesting, such that a more general solution is > called for? I'd suggest that saving off current->journal_info is risky because it might cover a real problem where you've taken a pagefault inside a transaction (eg ext4 faulting while in the middle of a transaction on the same filesystem that contains the faulting file). Seems to me that we shouldn't be writing to userspace while in the middle of a transaction. We could even assert that in copy_to_user()? ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: current->journal_info got nested! (was Re: [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start) 2024-01-31 5:20 ` Matthew Wilcox @ 2024-01-31 5:47 ` Christoph Hellwig 2024-01-31 6:02 ` Dave Chinner 1 sibling, 0 replies; 14+ messages in thread From: Christoph Hellwig @ 2024-01-31 5:47 UTC (permalink / raw) To: Matthew Wilcox Cc: Theodore Ts'o, Dave Chinner, syzbot, adilger.kernel, chandan.babu, jack, linux-ext4, linux-fsdevel, linux-kernel, linux-xfs, syzkaller-bugs On Wed, Jan 31, 2024 at 05:20:10AM +0000, Matthew Wilcox wrote: > I'd suggest that saving off current->journal_info is risky because > it might cover a real problem where you've taken a pagefault inside > a transaction (eg ext4 faulting while in the middle of a transaction on > the same filesystem that contains the faulting file). Agreed. > Seems to me that we shouldn't be writing to userspace while in the > middle of a transaction. We could even assert that in copy_to_user()? That sounds useful, but also rather expensive. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: current->journal_info got nested! (was Re: [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start) 2024-01-31 5:20 ` Matthew Wilcox 2024-01-31 5:47 ` Christoph Hellwig @ 2024-01-31 6:02 ` Dave Chinner 2024-01-31 6:17 ` Christoph Hellwig 1 sibling, 1 reply; 14+ messages in thread From: Dave Chinner @ 2024-01-31 6:02 UTC (permalink / raw) To: Matthew Wilcox Cc: Theodore Ts'o, syzbot, adilger.kernel, chandan.babu, jack, linux-ext4, linux-fsdevel, linux-kernel, linux-xfs, syzkaller-bugs On Wed, Jan 31, 2024 at 05:20:10AM +0000, Matthew Wilcox wrote: > On Tue, Jan 30, 2024 at 11:58:22PM -0500, Theodore Ts'o wrote: > > Hmm, could XFS pre-fault target memory buffer for the bulkstat output > > before starting its transaction? Alternatively, ext4 could do a save > > of current->journal_info before starting to process the page fault, > > and restore it when it is done. Both of these seem a bit hacky, and > > the question is indeed, are there other avenues that might cause the > > transaction context nesting, such that a more general solution is > > called for? > > I'd suggest that saving off current->journal_info is risky because > it might cover a real problem where you've taken a pagefault inside > a transaction (eg ext4 faulting while in the middle of a transaction on > the same filesystem that contains the faulting file). Depends. Look at it from the POV of XFS: 1. we can identify current->journal_info as belonging to XFS because of the TRAN magic number in the first 32 bits of the structure. 2. the struct xfs_trans has a pointer to the xfs_mount which points to the VFS superblock, which means we can identify exactly which filesystem instance the transaction belongs to. 3. We can determine if the transaction has a journal reservation from the log ticket the transaction holds. From these three things, we can identify if we are about to recurse into the same filesystem, and if so, determine if it is safe to run new transactions from within that context. > Seems to me that we shouldn't be writing to userspace while in the > middle of a transaction. We could even assert that in copy_to_user()? On further thinking I don't think the problem is taking page faults within a filesystem transaction context is the problem here. We're using the transaction context for automated garbage collection so we never have to care about leaking object references in the code, not because we are running a modification and holding a journal reservation. It's the journals reservations that cannot be allowed to nest in XFS, not the transactions themselves. IOWs, the real problem is that multiple filesystems use the same task field for their own individual purposes, and that then leads these sorts of recursion problems when a task jumps from one filesystem context to another and the task field is then mis-interpretted. I've had a look at the XFS usage of current->journal_info, and I think we can just remove it. It's used for warning that we are running a writepages operations from within transaction context (which should never happen from XFS nor memory reclaim). it's also used in the ->destroy_inode path to determine if we are running in a context where we cannot block on filesystem locks or transaction reservations. The former we can remove with no loss, the latter we can simply check PF_MEMALLOC_NOFS as that is set by transaction contexts. IOWs, I think that, in general, page faults in user buffers are fine in empty XFS transaction contexts as long as we aren't using some structure that some other filesystem might then to process the page fault.... This may not be true for other filesystems, but I don't think we can really say "page faults in any filesystem transaction are unsafe and should be banned".... -Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: current->journal_info got nested! (was Re: [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start) 2024-01-31 6:02 ` Dave Chinner @ 2024-01-31 6:17 ` Christoph Hellwig 0 siblings, 0 replies; 14+ messages in thread From: Christoph Hellwig @ 2024-01-31 6:17 UTC (permalink / raw) To: Dave Chinner Cc: Matthew Wilcox, Theodore Ts'o, syzbot, adilger.kernel, chandan.babu, jack, linux-ext4, linux-fsdevel, linux-kernel, linux-xfs, syzkaller-bugs On Wed, Jan 31, 2024 at 05:02:25PM +1100, Dave Chinner wrote: > This may not be true for other filesystems, but I don't think we > can really say "page faults in any filesystem transaction are unsafe > and should be banned".... I think the point is page faults with current->journal_info set is unsafe, as the can recurse into another file system using it. If we don't set current->journal_info (and your ideas for that sound sensible to me), the question of page faults in transactions is decoupled from that. We just need to ensure we never recurse into a transaction in the same or a dependent fs, which ot me suggest we'd better avoid it if we easily can. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: current->journal_info got nested! (was Re: [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start) 2024-01-30 23:37 ` current->journal_info got nested! (was Re: [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start) Dave Chinner 2024-01-31 3:46 ` Darrick J. Wong 2024-01-31 4:58 ` Theodore Ts'o @ 2024-01-31 12:02 ` Jan Kara 2 siblings, 0 replies; 14+ messages in thread From: Jan Kara @ 2024-01-31 12:02 UTC (permalink / raw) To: Dave Chinner Cc: syzbot, adilger.kernel, chandan.babu, jack, linux-ext4, linux-fsdevel, linux-kernel, linux-xfs, syzkaller-bugs, tytso On Wed 31-01-24 10:37:18, Dave Chinner wrote: > On Tue, Jan 30, 2024 at 06:52:21AM -0800, syzbot wrote: > > syzbot has found a reproducer for the following issue on: > > > > HEAD commit: 861c0981648f Merge tag 'jfs-6.8-rc3' of github.com:kleikam.. > > git tree: upstream > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=13ca8d97e80000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=b0b9993d7d6d1990 > > dashboard link: https://syzkaller.appspot.com/bug?extid=cdee56dbcdf0096ef605 > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=104393efe80000 > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1393b90fe80000 > > > > Downloadable assets: > > disk image: https://storage.googleapis.com/syzbot-assets/7c6cc521298d/disk-861c0981.raw.xz > > vmlinux: https://storage.googleapis.com/syzbot-assets/6203c94955db/vmlinux-861c0981.xz > > kernel image: https://storage.googleapis.com/syzbot-assets/17e76e12b58c/bzImage-861c0981.xz > > mounted in repro: https://storage.googleapis.com/syzbot-assets/d31d4eed2912/mount_3.gz > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > Reported-by: syzbot+cdee56dbcdf0096ef605@syzkaller.appspotmail.com > > > > general protection fault, probably for non-canonical address 0xdffffc000a8a4829: 0000 [#1] PREEMPT SMP KASAN > > KASAN: probably user-memory-access in range [0x0000000054524148-0x000000005452414f] > > CPU: 1 PID: 5065 Comm: syz-executor260 Not tainted 6.8.0-rc2-syzkaller-00031-g861c0981648f #0 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023 > > RIP: 0010:jbd2__journal_start+0x87/0x5d0 fs/jbd2/transaction.c:496 > > Code: 74 63 48 8b 1b 48 85 db 74 79 48 89 d8 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 63 4d 8f ff 48 8b 2b 48 89 e8 48 c1 e8 03 <42> 80 3c 28 00 74 08 48 89 ef e8 4a 4d 8f ff 4c 39 65 00 0f 85 1a > > RSP: 0018:ffffc900043265c8 EFLAGS: 00010203 > > RAX: 000000000a8a4829 RBX: ffff8880205fa3a8 RCX: ffff8880235dbb80 > > RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff88801c1a6000 > > RBP: 000000005452414e R08: 0000000000000c40 R09: 0000000000000001 > ^^^^^^^^ > Hmmmm - TRAN. That's looks suspicious, I'll come back to that. Indeed, thanks for the great analysis. <snip analysis> > The question here is what to do about this? The obvious solution is > to have save/restore semantics in the filesystem code that > sets/clears current->journal_info, and then filesystems can also do > the necessary "recursion into same filesystem" checks they need to > ensure that they aren't nesting transactions in a way that can > deadlock. As others have mentioned, this seems potentially dangerous because that just hides potential deadlocks. E.g. for ext4 taking a page fault while having a transaction started violates lock ordering requirements (mapping->invalidate_lock > transaction start). OTOH we have lockdep tracking for this anyway so I guess we don't care too much for ext4. > Maybe there are other options - should filesystems even be allowed to > trigger page faults when they have set current->journal_info? For ext4 it would definitely be a bug if this happens and it is not only about usage of current->journal_info as I wrote above. > What other potential avenues are there that could cause this sort of > transaction context nesting that we haven't realised exist? Does > ext4 data=jounral have problems like this in the data copy-in path? > What other filesystems allow page faults in transaction contexts? So I'm reasonably confident we aren't hitting any such path in ext4 as lockdep would tell us about it (we treat transaction start as lock acquisition in jbd2 and tell lockdep about it). For the write path, we are relying on VFS prefaulting pages before calling ->write_begin (where we start a transaction) and then doing atomic copy. For the read path we don't start any transaction (except for possible atime update but that's just a tiny transaction on the side after the read completes). So ext4 on its own should be fine. But we have also btrfs, ceph, gfs2, nilfs2, ocfs2, and reiserfs using current->journal_info so overall chances for interaction are ... non-trivial. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start 2024-01-30 14:52 ` [syzbot] [xfs?] " syzbot 2024-01-30 23:37 ` current->journal_info got nested! (was Re: [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start) Dave Chinner @ 2024-01-31 7:40 ` Edward Adam Davis 2024-01-31 11:17 ` syzbot 2024-01-31 12:04 ` [PATCH] jbd2: user-memory-access " Edward Adam Davis 2 siblings, 1 reply; 14+ messages in thread From: Edward Adam Davis @ 2024-01-31 7:40 UTC (permalink / raw) To: syzbot+cdee56dbcdf0096ef605; +Cc: linux-kernel, syzkaller-bugs please test general protection fault in jbd2__journal_start #syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c index cb0b8d6fc0c6..702312cd5392 100644 --- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -493,6 +493,9 @@ handle_t *jbd2__journal_start(journal_t *journal, int nblocks, int rsv_blocks, return ERR_PTR(-EROFS); if (handle) { + if (handle->saved_alloc_context & ~PF_MEMALLOC_NOFS) + return ERR_PTR(-EBUSY); + J_ASSERT(handle->h_transaction->t_journal == journal); handle->h_ref++; return handle; ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start 2024-01-31 7:40 ` [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start Edward Adam Davis @ 2024-01-31 11:17 ` syzbot 0 siblings, 0 replies; 14+ messages in thread From: syzbot @ 2024-01-31 11:17 UTC (permalink / raw) To: eadavis, linux-kernel, syzkaller-bugs Hello, syzbot has tested the proposed patch and the reproducer did not trigger any issue: Reported-and-tested-by: syzbot+cdee56dbcdf0096ef605@syzkaller.appspotmail.com Tested on: commit: 1bbb19b6 Merge tag 'erofs-for-6.8-rc3-fixes' of git://.. git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master console output: https://syzkaller.appspot.com/x/log.txt?x=16f6172fe80000 kernel config: https://syzkaller.appspot.com/x/.config?x=b0b9993d7d6d1990 dashboard link: https://syzkaller.appspot.com/bug?extid=cdee56dbcdf0096ef605 compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 patch: https://syzkaller.appspot.com/x/patch.diff?x=17477cb0180000 Note: testing is done by a robot and is best-effort only. ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] jbd2: user-memory-access in jbd2__journal_start 2024-01-30 14:52 ` [syzbot] [xfs?] " syzbot 2024-01-30 23:37 ` current->journal_info got nested! (was Re: [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start) Dave Chinner 2024-01-31 7:40 ` [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start Edward Adam Davis @ 2024-01-31 12:04 ` Edward Adam Davis 2024-01-31 15:41 ` Jan Kara 2 siblings, 1 reply; 14+ messages in thread From: Edward Adam Davis @ 2024-01-31 12:04 UTC (permalink / raw) To: syzbot+cdee56dbcdf0096ef605 Cc: adilger.kernel, chandan.babu, jack, linux-ext4, linux-fsdevel, linux-kernel, linux-xfs, syzkaller-bugs, tytso Before reusing the handle, it is necessary to confirm that the transaction is ready. Reported-and-tested-by: syzbot+cdee56dbcdf0096ef605@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> --- fs/jbd2/transaction.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c index cb0b8d6fc0c6..702312cd5392 100644 --- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -493,6 +493,9 @@ handle_t *jbd2__journal_start(journal_t *journal, int nblocks, int rsv_blocks, return ERR_PTR(-EROFS); if (handle) { + if (handle->saved_alloc_context & ~PF_MEMALLOC_NOFS) + return ERR_PTR(-EBUSY); + J_ASSERT(handle->h_transaction->t_journal == journal); handle->h_ref++; return handle; -- 2.43.0 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] jbd2: user-memory-access in jbd2__journal_start 2024-01-31 12:04 ` [PATCH] jbd2: user-memory-access " Edward Adam Davis @ 2024-01-31 15:41 ` Jan Kara 0 siblings, 0 replies; 14+ messages in thread From: Jan Kara @ 2024-01-31 15:41 UTC (permalink / raw) To: Edward Adam Davis Cc: syzbot+cdee56dbcdf0096ef605, adilger.kernel, chandan.babu, jack, linux-ext4, linux-fsdevel, linux-kernel, linux-xfs, syzkaller-bugs, tytso On Wed 31-01-24 20:04:27, Edward Adam Davis wrote: > Before reusing the handle, it is necessary to confirm that the transaction is > ready. > > Reported-and-tested-by: syzbot+cdee56dbcdf0096ef605@syzkaller.appspotmail.com > Signed-off-by: Edward Adam Davis <eadavis@qq.com> Sorry but no. Dave found a way to fix this particular problem in XFS and your patch would not really improve anything because we'd just crash when dereferencing handle->saved_alloc_context. Honza > diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c > index cb0b8d6fc0c6..702312cd5392 100644 > --- a/fs/jbd2/transaction.c > +++ b/fs/jbd2/transaction.c > @@ -493,6 +493,9 @@ handle_t *jbd2__journal_start(journal_t *journal, int nblocks, int rsv_blocks, > return ERR_PTR(-EROFS); > > if (handle) { > + if (handle->saved_alloc_context & ~PF_MEMALLOC_NOFS) > + return ERR_PTR(-EBUSY); > + > J_ASSERT(handle->h_transaction->t_journal == journal); > handle->h_ref++; > return handle; > -- > 2.43.0 > -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2024-01-31 15:41 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-01-26 9:05 [syzbot] [ext4?] general protection fault in jbd2__journal_start syzbot 2024-01-30 14:52 ` [syzbot] [xfs?] " syzbot 2024-01-30 23:37 ` current->journal_info got nested! (was Re: [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start) Dave Chinner 2024-01-31 3:46 ` Darrick J. Wong 2024-01-31 4:58 ` Theodore Ts'o 2024-01-31 5:20 ` Matthew Wilcox 2024-01-31 5:47 ` Christoph Hellwig 2024-01-31 6:02 ` Dave Chinner 2024-01-31 6:17 ` Christoph Hellwig 2024-01-31 12:02 ` Jan Kara 2024-01-31 7:40 ` [syzbot] [xfs?] [ext4?] general protection fault in jbd2__journal_start Edward Adam Davis 2024-01-31 11:17 ` syzbot 2024-01-31 12:04 ` [PATCH] jbd2: user-memory-access " Edward Adam Davis 2024-01-31 15:41 ` Jan Kara
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox