linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [syzbot] [ext4?] KASAN: out-of-bounds Read in ext4_xattr_set_entry
@ 2024-09-22  0:16 syzbot
  2024-09-22  5:46 ` Qianqiang Liu
  0 siblings, 1 reply; 18+ messages in thread
From: syzbot @ 2024-09-22  0:16 UTC (permalink / raw)
  To: adilger.kernel, linux-ext4, linux-kernel, syzkaller-bugs, tytso

Hello,

syzbot found the following issue on:

HEAD commit:    a940d9a43e62 Merge tag 'soc-arm-6.12' of git://git.kernel...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16689207980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=1653f803fffa3848
dashboard link: https://syzkaller.appspot.com/bug?extid=f792df426ff0f5ceb8d1
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11689207980000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=104d469f980000

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7bc7510fe41f/non_bootable_disk-a940d9a4.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/371e11b6a9e5/vmlinux-a940d9a4.xz
kernel image: https://storage.googleapis.com/syzbot-assets/920eb0c53785/bzImage-a940d9a4.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/4146c0b2e744/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+f792df426ff0f5ceb8d1@syzkaller.appspotmail.com

EXT4-fs (loop0): 1 truncate cleaned up
EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000 r/w without journal. Quota mode: writeback.
==================================================================
BUG: KASAN: out-of-bounds in ext4_xattr_set_entry+0x8ce/0x1f60 fs/ext4/xattr.c:1781
Read of size 18446744073709551572 at addr ffff888036426850 by task syz-executor264/5095

CPU: 0 UID: 0 PID: 5095 Comm: syz-executor264 Not tainted 6.11.0-syzkaller-03917-ga940d9a43e62 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:93 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
 print_address_description mm/kasan/report.c:377 [inline]
 print_report+0x169/0x550 mm/kasan/report.c:488
 kasan_report+0x143/0x180 mm/kasan/report.c:601
 kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
 __asan_memmove+0x29/0x70 mm/kasan/shadow.c:94
 ext4_xattr_set_entry+0x8ce/0x1f60 fs/ext4/xattr.c:1781
 ext4_xattr_block_set+0xa39/0x3980 fs/ext4/xattr.c:2028
 ext4_xattr_move_to_block fs/ext4/xattr.c:2669 [inline]
 ext4_xattr_make_inode_space fs/ext4/xattr.c:2744 [inline]
 ext4_expand_extra_isize_ea+0x12d7/0x1cf0 fs/ext4/xattr.c:2836
 __ext4_expand_extra_isize+0x2fb/0x3e0 fs/ext4/inode.c:5858
 ext4_try_to_expand_extra_isize fs/ext4/inode.c:5901 [inline]
 __ext4_mark_inode_dirty+0x524/0x880 fs/ext4/inode.c:5979
 __ext4_unlink+0x6c2/0xb50 fs/ext4/namei.c:3289
 ext4_unlink+0x1bf/0x5a0 fs/ext4/namei.c:3318
 vfs_unlink+0x365/0x650 fs/namei.c:4469
 do_unlinkat+0x4ae/0x830 fs/namei.c:4533
 __do_sys_unlink fs/namei.c:4581 [inline]
 __se_sys_unlink fs/namei.c:4579 [inline]
 __x64_sys_unlink+0x47/0x50 fs/namei.c:4579
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f2549cabe99
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff914df8a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000057
RAX: ffffffffffffffda RBX: 0031656c69662f2e RCX: 00007f2549cabe99
RDX: 00007f2549cabe99 RSI: 00007f2549cabe99 RDI: 0000000020000180
RBP: 0032656c69662f2e R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff914df8e0
R13: 00007fff914dfb08 R14: 431bde82d7b634db R15: 00007f2549cf503b
 </TASK>

Allocated by task 5095:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
 poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
 __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387
 kasan_kmalloc include/linux/kasan.h:211 [inline]
 __do_kmalloc_node mm/slub.c:4159 [inline]
 __kmalloc_node_track_caller_noprof+0x225/0x440 mm/slub.c:4178
 kmemdup_noprof+0x2a/0x60 mm/util.c:133
 ext4_xattr_block_set+0x88b/0x3980 fs/ext4/xattr.c:1976
 ext4_xattr_move_to_block fs/ext4/xattr.c:2669 [inline]
 ext4_xattr_make_inode_space fs/ext4/xattr.c:2744 [inline]
 ext4_expand_extra_isize_ea+0x12d7/0x1cf0 fs/ext4/xattr.c:2836
 __ext4_expand_extra_isize+0x2fb/0x3e0 fs/ext4/inode.c:5858
 ext4_try_to_expand_extra_isize fs/ext4/inode.c:5901 [inline]
 __ext4_mark_inode_dirty+0x524/0x880 fs/ext4/inode.c:5979
 __ext4_unlink+0x6c2/0xb50 fs/ext4/namei.c:3289
 ext4_unlink+0x1bf/0x5a0 fs/ext4/namei.c:3318
 vfs_unlink+0x365/0x650 fs/namei.c:4469
 do_unlinkat+0x4ae/0x830 fs/namei.c:4533
 __do_sys_unlink fs/namei.c:4581 [inline]
 __se_sys_unlink fs/namei.c:4579 [inline]
 __x64_sys_unlink+0x47/0x50 fs/namei.c:4579
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

The buggy address belongs to the object at ffff888036426800
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 80 bytes inside of
 1024-byte region [ffff888036426800, ffff888036426c00)

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x36424
head: order:2 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0x4fff00000000040(head|node=1|zone=1|lastcpupid=0x7ff)
page_type: 0xfdffffff(slab)
raw: 04fff00000000040 ffff88801ac41dc0 dead000000000100 dead000000000122
raw: 0000000000000000 0000000080080008 00000001fdffffff 0000000000000000
head: 04fff00000000040 ffff88801ac41dc0 dead000000000100 dead000000000122
head: 0000000000000000 0000000080080008 00000001fdffffff 0000000000000000
head: 04fff00000000002 ffffea0000d90901 ffffffffffffffff 0000000000000000
head: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 5049, tgid 5049 (dhcpcd), ts 74832155643, free_ts 73695396465
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1500
 prep_new_page mm/page_alloc.c:1508 [inline]
 get_page_from_freelist+0x2e4c/0x2f10 mm/page_alloc.c:3446
 __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4702
 __alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
 alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
 alloc_slab_page+0x5f/0x120 mm/slub.c:2319
 allocate_slab+0x5a/0x2f0 mm/slub.c:2482
 new_slab mm/slub.c:2535 [inline]
 ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3721
 __slab_alloc+0x58/0xa0 mm/slub.c:3811
 __slab_alloc_node mm/slub.c:3864 [inline]
 slab_alloc_node mm/slub.c:4026 [inline]
 __do_kmalloc_node mm/slub.c:4158 [inline]
 __kmalloc_noprof+0x25a/0x400 mm/slub.c:4171
 kmalloc_noprof include/linux/slab.h:694 [inline]
 load_elf_phdrs+0x162/0x260 fs/binfmt_elf.c:526
 load_elf_binary+0x920/0x2680 fs/binfmt_elf.c:955
 search_binary_handler fs/exec.c:1820 [inline]
 exec_binprm fs/exec.c:1862 [inline]
 bprm_execve+0xaf8/0x1770 fs/exec.c:1913
 do_execveat_common+0x55f/0x6f0 fs/exec.c:2020
 do_execve fs/exec.c:2094 [inline]
 __do_sys_execve fs/exec.c:2170 [inline]
 __se_sys_execve fs/exec.c:2165 [inline]
 __x64_sys_execve+0x92/0xb0 fs/exec.c:2165
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
page last free pid 5032 tgid 5032 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 free_pages_prepare mm/page_alloc.c:1101 [inline]
 free_unref_page+0xd22/0xea0 mm/page_alloc.c:2619
 __slab_free+0x31b/0x3d0 mm/slub.c:4385
 qlink_free mm/kasan/quarantine.c:163 [inline]
 qlist_free_all+0x9e/0x140 mm/kasan/quarantine.c:179
 kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286
 __kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:322
 kasan_slab_alloc include/linux/kasan.h:201 [inline]
 slab_post_alloc_hook mm/slub.c:3989 [inline]
 slab_alloc_node mm/slub.c:4038 [inline]
 __do_kmalloc_node mm/slub.c:4158 [inline]
 __kmalloc_noprof+0x1a6/0x400 mm/slub.c:4171
 kmalloc_noprof include/linux/slab.h:694 [inline]
 tomoyo_realpath_from_path+0xcf/0x5e0 security/tomoyo/realpath.c:251
 tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
 tomoyo_path_perm+0x2b7/0x740 security/tomoyo/file.c:822
 security_inode_getattr+0x130/0x330 security/security.c:2371
 vfs_getattr+0x45/0x430 fs/stat.c:204
 vfs_fstat fs/stat.c:229 [inline]
 vfs_fstatat+0xe4/0x190 fs/stat.c:338
 __do_sys_newfstatat fs/stat.c:505 [inline]
 __se_sys_newfstatat fs/stat.c:499 [inline]
 __x64_sys_newfstatat+0x11d/0x1a0 fs/stat.c:499
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Memory state around the buggy address:
 ffff888036426700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888036426780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888036426800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
                                                 ^
 ffff888036426880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff888036426900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [syzbot] [ext4?] KASAN: out-of-bounds Read in ext4_xattr_set_entry
  2024-09-22  0:16 [syzbot] [ext4?] KASAN: out-of-bounds Read in ext4_xattr_set_entry syzbot
@ 2024-09-22  5:46 ` Qianqiang Liu
  2024-09-22  6:35   ` syzbot
  0 siblings, 1 reply; 18+ messages in thread
From: Qianqiang Liu @ 2024-09-22  5:46 UTC (permalink / raw)
  To: syzbot; +Cc: adilger.kernel, linux-ext4, linux-kernel, syzkaller-bugs, tytso

On Sat, Sep 21, 2024 at 05:16:22PM -0700, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    a940d9a43e62 Merge tag 'soc-arm-6.12' of git://git.kernel...
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=16689207980000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=1653f803fffa3848
> dashboard link: https://syzkaller.appspot.com/bug?extid=f792df426ff0f5ceb8d1
> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11689207980000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=104d469f980000
> 
> Downloadable assets:
> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7bc7510fe41f/non_bootable_disk-a940d9a4.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/371e11b6a9e5/vmlinux-a940d9a4.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/920eb0c53785/bzImage-a940d9a4.xz
> mounted in repro: https://storage.googleapis.com/syzbot-assets/4146c0b2e744/mount_0.gz
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+f792df426ff0f5ceb8d1@syzkaller.appspotmail.com
> 
> EXT4-fs (loop0): 1 truncate cleaned up
> EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000 r/w without journal. Quota mode: writeback.
> ==================================================================
> BUG: KASAN: out-of-bounds in ext4_xattr_set_entry+0x8ce/0x1f60 fs/ext4/xattr.c:1781
> Read of size 18446744073709551572 at addr ffff888036426850 by task syz-executor264/5095
> 
> CPU: 0 UID: 0 PID: 5095 Comm: syz-executor264 Not tainted 6.11.0-syzkaller-03917-ga940d9a43e62 #0
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> Call Trace:
>  <TASK>
>  __dump_stack lib/dump_stack.c:93 [inline]
>  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
>  print_address_description mm/kasan/report.c:377 [inline]
>  print_report+0x169/0x550 mm/kasan/report.c:488
>  kasan_report+0x143/0x180 mm/kasan/report.c:601
>  kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
>  __asan_memmove+0x29/0x70 mm/kasan/shadow.c:94
>  ext4_xattr_set_entry+0x8ce/0x1f60 fs/ext4/xattr.c:1781
>  ext4_xattr_block_set+0xa39/0x3980 fs/ext4/xattr.c:2028
>  ext4_xattr_move_to_block fs/ext4/xattr.c:2669 [inline]
>  ext4_xattr_make_inode_space fs/ext4/xattr.c:2744 [inline]
>  ext4_expand_extra_isize_ea+0x12d7/0x1cf0 fs/ext4/xattr.c:2836
>  __ext4_expand_extra_isize+0x2fb/0x3e0 fs/ext4/inode.c:5858
>  ext4_try_to_expand_extra_isize fs/ext4/inode.c:5901 [inline]
>  __ext4_mark_inode_dirty+0x524/0x880 fs/ext4/inode.c:5979
>  __ext4_unlink+0x6c2/0xb50 fs/ext4/namei.c:3289
>  ext4_unlink+0x1bf/0x5a0 fs/ext4/namei.c:3318
>  vfs_unlink+0x365/0x650 fs/namei.c:4469
>  do_unlinkat+0x4ae/0x830 fs/namei.c:4533
>  __do_sys_unlink fs/namei.c:4581 [inline]
>  __se_sys_unlink fs/namei.c:4579 [inline]
>  __x64_sys_unlink+0x47/0x50 fs/namei.c:4579
>  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f2549cabe99
> Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007fff914df8a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000057
> RAX: ffffffffffffffda RBX: 0031656c69662f2e RCX: 00007f2549cabe99
> RDX: 00007f2549cabe99 RSI: 00007f2549cabe99 RDI: 0000000020000180
> RBP: 0032656c69662f2e R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff914df8e0
> R13: 00007fff914dfb08 R14: 431bde82d7b634db R15: 00007f2549cf503b
>  </TASK>
> 
> Allocated by task 5095:
>  kasan_save_stack mm/kasan/common.c:47 [inline]
>  kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
>  poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
>  __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387
>  kasan_kmalloc include/linux/kasan.h:211 [inline]
>  __do_kmalloc_node mm/slub.c:4159 [inline]
>  __kmalloc_node_track_caller_noprof+0x225/0x440 mm/slub.c:4178
>  kmemdup_noprof+0x2a/0x60 mm/util.c:133
>  ext4_xattr_block_set+0x88b/0x3980 fs/ext4/xattr.c:1976
>  ext4_xattr_move_to_block fs/ext4/xattr.c:2669 [inline]
>  ext4_xattr_make_inode_space fs/ext4/xattr.c:2744 [inline]
>  ext4_expand_extra_isize_ea+0x12d7/0x1cf0 fs/ext4/xattr.c:2836
>  __ext4_expand_extra_isize+0x2fb/0x3e0 fs/ext4/inode.c:5858
>  ext4_try_to_expand_extra_isize fs/ext4/inode.c:5901 [inline]
>  __ext4_mark_inode_dirty+0x524/0x880 fs/ext4/inode.c:5979
>  __ext4_unlink+0x6c2/0xb50 fs/ext4/namei.c:3289
>  ext4_unlink+0x1bf/0x5a0 fs/ext4/namei.c:3318
>  vfs_unlink+0x365/0x650 fs/namei.c:4469
>  do_unlinkat+0x4ae/0x830 fs/namei.c:4533
>  __do_sys_unlink fs/namei.c:4581 [inline]
>  __se_sys_unlink fs/namei.c:4579 [inline]
>  __x64_sys_unlink+0x47/0x50 fs/namei.c:4579
>  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> The buggy address belongs to the object at ffff888036426800
>  which belongs to the cache kmalloc-1k of size 1024
> The buggy address is located 80 bytes inside of
>  1024-byte region [ffff888036426800, ffff888036426c00)
> 
> The buggy address belongs to the physical page:
> page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x36424
> head: order:2 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
> flags: 0x4fff00000000040(head|node=1|zone=1|lastcpupid=0x7ff)
> page_type: 0xfdffffff(slab)
> raw: 04fff00000000040 ffff88801ac41dc0 dead000000000100 dead000000000122
> raw: 0000000000000000 0000000080080008 00000001fdffffff 0000000000000000
> head: 04fff00000000040 ffff88801ac41dc0 dead000000000100 dead000000000122
> head: 0000000000000000 0000000080080008 00000001fdffffff 0000000000000000
> head: 04fff00000000002 ffffea0000d90901 ffffffffffffffff 0000000000000000
> head: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000
> page dumped because: kasan: bad access detected
> page_owner tracks the page as allocated
> page last allocated via order 2, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 5049, tgid 5049 (dhcpcd), ts 74832155643, free_ts 73695396465
>  set_page_owner include/linux/page_owner.h:32 [inline]
>  post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1500
>  prep_new_page mm/page_alloc.c:1508 [inline]
>  get_page_from_freelist+0x2e4c/0x2f10 mm/page_alloc.c:3446
>  __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4702
>  __alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
>  alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
>  alloc_slab_page+0x5f/0x120 mm/slub.c:2319
>  allocate_slab+0x5a/0x2f0 mm/slub.c:2482
>  new_slab mm/slub.c:2535 [inline]
>  ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3721
>  __slab_alloc+0x58/0xa0 mm/slub.c:3811
>  __slab_alloc_node mm/slub.c:3864 [inline]
>  slab_alloc_node mm/slub.c:4026 [inline]
>  __do_kmalloc_node mm/slub.c:4158 [inline]
>  __kmalloc_noprof+0x25a/0x400 mm/slub.c:4171
>  kmalloc_noprof include/linux/slab.h:694 [inline]
>  load_elf_phdrs+0x162/0x260 fs/binfmt_elf.c:526
>  load_elf_binary+0x920/0x2680 fs/binfmt_elf.c:955
>  search_binary_handler fs/exec.c:1820 [inline]
>  exec_binprm fs/exec.c:1862 [inline]
>  bprm_execve+0xaf8/0x1770 fs/exec.c:1913
>  do_execveat_common+0x55f/0x6f0 fs/exec.c:2020
>  do_execve fs/exec.c:2094 [inline]
>  __do_sys_execve fs/exec.c:2170 [inline]
>  __se_sys_execve fs/exec.c:2165 [inline]
>  __x64_sys_execve+0x92/0xb0 fs/exec.c:2165
>  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> page last free pid 5032 tgid 5032 stack trace:
>  reset_page_owner include/linux/page_owner.h:25 [inline]
>  free_pages_prepare mm/page_alloc.c:1101 [inline]
>  free_unref_page+0xd22/0xea0 mm/page_alloc.c:2619
>  __slab_free+0x31b/0x3d0 mm/slub.c:4385
>  qlink_free mm/kasan/quarantine.c:163 [inline]
>  qlist_free_all+0x9e/0x140 mm/kasan/quarantine.c:179
>  kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286
>  __kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:322
>  kasan_slab_alloc include/linux/kasan.h:201 [inline]
>  slab_post_alloc_hook mm/slub.c:3989 [inline]
>  slab_alloc_node mm/slub.c:4038 [inline]
>  __do_kmalloc_node mm/slub.c:4158 [inline]
>  __kmalloc_noprof+0x1a6/0x400 mm/slub.c:4171
>  kmalloc_noprof include/linux/slab.h:694 [inline]
>  tomoyo_realpath_from_path+0xcf/0x5e0 security/tomoyo/realpath.c:251
>  tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
>  tomoyo_path_perm+0x2b7/0x740 security/tomoyo/file.c:822
>  security_inode_getattr+0x130/0x330 security/security.c:2371
>  vfs_getattr+0x45/0x430 fs/stat.c:204
>  vfs_fstat fs/stat.c:229 [inline]
>  vfs_fstatat+0xe4/0x190 fs/stat.c:338
>  __do_sys_newfstatat fs/stat.c:505 [inline]
>  __se_sys_newfstatat fs/stat.c:499 [inline]
>  __x64_sys_newfstatat+0x11d/0x1a0 fs/stat.c:499
>  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> Memory state around the buggy address:
>  ffff888036426700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>  ffff888036426780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >ffff888036426800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>                                                  ^
>  ffff888036426880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  ffff888036426900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ==================================================================
> 
> 
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
> 
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> 
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
> 
> If you want syzbot to run the reproducer, reply with:
> #syz test: git://repo/address.git branch-or-commit-hash
> If you attach or paste a git patch, syzbot will apply it before testing.
> 
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
> 
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
> 
> If you want to undo deduplication, reply with:
> #syz undup
> 

#syz test

diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 46ce2f21fef9..336badb46246 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1776,7 +1776,14 @@ static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
 	} else if (s->not_found) {
 		/* Insert new name. */
 		size_t size = EXT4_XATTR_LEN(name_len);
-		size_t rest = (void *)last - (void *)here + sizeof(__u32);
+		size_t rest;
+
+		if (last < here) {
+			ret = -ENOSPC;
+			goto out;
+		} else {
+			rest = (void *)last - (void *)here + sizeof(__u32);
+		}
 
 		memmove((void *)here + size, here, rest);
 		memset(here, 0, size);
-- 
Best,
Qianqiang Liu


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [syzbot] [ext4?] KASAN: out-of-bounds Read in ext4_xattr_set_entry
  2024-09-22  5:46 ` Qianqiang Liu
@ 2024-09-22  6:35   ` syzbot
  2024-09-22  6:42     ` [PATCH] ext4: fix out-of-bounds issue " Qianqiang Liu
  2024-10-02  6:31     ` [syzbot] [ext4?] KASAN: out-of-bounds Read " Qianqiang Liu
  0 siblings, 2 replies; 18+ messages in thread
From: syzbot @ 2024-09-22  6:35 UTC (permalink / raw)
  To: adilger.kernel, linux-ext4, linux-kernel, qianqiang.liu,
	syzkaller-bugs, tytso

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+f792df426ff0f5ceb8d1@syzkaller.appspotmail.com
Tested-by: syzbot+f792df426ff0f5ceb8d1@syzkaller.appspotmail.com

Tested on:

commit:         88264981 Merge tag 'sched_ext-for-6.12' of git://git.k..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=17c78e07980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=5b5c53071a819d59
dashboard link: https://syzkaller.appspot.com/bug?extid=f792df426ff0f5ceb8d1
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=16538e07980000

Note: testing is done by a robot and is best-effort only.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] ext4: fix out-of-bounds issue in ext4_xattr_set_entry
  2024-09-22  6:35   ` syzbot
@ 2024-09-22  6:42     ` Qianqiang Liu
  2024-10-01  9:41       ` Ojaswin Mujoo
  2024-10-08  7:40       ` Baokun Li
  2024-10-02  6:31     ` [syzbot] [ext4?] KASAN: out-of-bounds Read " Qianqiang Liu
  1 sibling, 2 replies; 18+ messages in thread
From: Qianqiang Liu @ 2024-09-22  6:42 UTC (permalink / raw)
  To: tytso
  Cc: adilger.kernel, syzbot, linux-ext4, linux-kernel, syzkaller-bugs,
	qianqiang.liu

syzbot has found an out-of-bounds issue in ext4_xattr_set_entry:

==================================================================
BUG: KASAN: out-of-bounds in ext4_xattr_set_entry+0x8ce/0x1f60 fs/ext4/xattr.c:1781
Read of size 18446744073709551572 at addr ffff888036426850 by task syz-executor264/5095

CPU: 0 UID: 0 PID: 5095 Comm: syz-executor264 Not tainted 6.11.0-syzkaller-03917-ga940d9a43e62 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:93 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
 print_address_description mm/kasan/report.c:377 [inline]
 print_report+0x169/0x550 mm/kasan/report.c:488
 kasan_report+0x143/0x180 mm/kasan/report.c:601
 kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
 __asan_memmove+0x29/0x70 mm/kasan/shadow.c:94
 ext4_xattr_set_entry+0x8ce/0x1f60 fs/ext4/xattr.c:1781
[...]
==================================================================

This issue is caused by a negative size in memmove.
We need to check for this.

Fixes: dec214d00e0d ("ext4: xattr inode deduplication")
Reported-by: syzbot+f792df426ff0f5ceb8d1@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f792df426ff0f5ceb8d1
Tested-by: syzbot+f792df426ff0f5ceb8d1@syzkaller.appspotmail.com
Signed-off-by: Qianqiang Liu <qianqiang.liu@163.com>
---
 fs/ext4/xattr.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 46ce2f21fef9..336badb46246 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1776,7 +1776,14 @@ static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
 	} else if (s->not_found) {
 		/* Insert new name. */
 		size_t size = EXT4_XATTR_LEN(name_len);
-		size_t rest = (void *)last - (void *)here + sizeof(__u32);
+		size_t rest;
+
+		if (last < here) {
+			ret = -ENOSPC;
+			goto out;
+		} else {
+			rest = (void *)last - (void *)here + sizeof(__u32);
+		}
 
 		memmove((void *)here + size, here, rest);
 		memset(here, 0, size);
-- 
2.34.1

-- 
Best,
Qianqiang Liu


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] ext4: fix out-of-bounds issue in ext4_xattr_set_entry
  2024-09-22  6:42     ` [PATCH] ext4: fix out-of-bounds issue " Qianqiang Liu
@ 2024-10-01  9:41       ` Ojaswin Mujoo
  2024-10-01 10:15         ` Qianqiang Liu
  2024-10-02  6:27         ` Qianqiang Liu
  2024-10-08  7:40       ` Baokun Li
  1 sibling, 2 replies; 18+ messages in thread
From: Ojaswin Mujoo @ 2024-10-01  9:41 UTC (permalink / raw)
  To: Qianqiang Liu
  Cc: tytso, adilger.kernel, syzbot, linux-ext4, linux-kernel,
	syzkaller-bugs

On Sun, Sep 22, 2024 at 02:42:49PM +0800, Qianqiang Liu wrote:
> syzbot has found an out-of-bounds issue in ext4_xattr_set_entry:
> 
> ==================================================================
> BUG: KASAN: out-of-bounds in ext4_xattr_set_entry+0x8ce/0x1f60 fs/ext4/xattr.c:1781
> Read of size 18446744073709551572 at addr ffff888036426850 by task syz-executor264/5095
> 
> CPU: 0 UID: 0 PID: 5095 Comm: syz-executor264 Not tainted 6.11.0-syzkaller-03917-ga940d9a43e62 #0
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> Call Trace:
>  <TASK>
>  __dump_stack lib/dump_stack.c:93 [inline]
>  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
>  print_address_description mm/kasan/report.c:377 [inline]
>  print_report+0x169/0x550 mm/kasan/report.c:488
>  kasan_report+0x143/0x180 mm/kasan/report.c:601
>  kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
>  __asan_memmove+0x29/0x70 mm/kasan/shadow.c:94
>  ext4_xattr_set_entry+0x8ce/0x1f60 fs/ext4/xattr.c:1781
> [...]
> ==================================================================
> 
> This issue is caused by a negative size in memmove.
> We need to check for this.
> 
> Fixes: dec214d00e0d ("ext4: xattr inode deduplication")
> Reported-by: syzbot+f792df426ff0f5ceb8d1@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=f792df426ff0f5ceb8d1
> Tested-by: syzbot+f792df426ff0f5ceb8d1@syzkaller.appspotmail.com
> Signed-off-by: Qianqiang Liu <qianqiang.liu@163.com>
> ---
>  fs/ext4/xattr.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
> index 46ce2f21fef9..336badb46246 100644
> --- a/fs/ext4/xattr.c
> +++ b/fs/ext4/xattr.c
> @@ -1776,7 +1776,14 @@ static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
>  	} else if (s->not_found) {
>  		/* Insert new name. */
>  		size_t size = EXT4_XATTR_LEN(name_len);
> -		size_t rest = (void *)last - (void *)here + sizeof(__u32);
> +		size_t rest;
> +
> +		if (last < here) {
> +			ret = -ENOSPC;
> +			goto out;
> +		} else {
> +			rest = (void *)last - (void *)here + sizeof(__u32);
> +		}

Hey Qianqiang,

Thanks for the patch. I'm still reviewing this codepath but I do have
some questions around the patch. So I understand that xattrs are
arranged in the following format:

 *   +------------------+
 *   | header           |
 *   | entry 1          | 
 *   | entry 2          | 
 *   | entry 3          | 
 *   | four null bytes  | <--- last
 *   | . . .            | 
 *   | . . .            | <--- here
 *   | . . .            | 
 *   | value 1          | 
 *   | value 3          | 
 *   | value 2          | 
 *   +------------------+

Now, in this error, my understanding is that we are actually ending up
in a case where "here" ie the place where the new xattr entry will go is
beyond the "last" ie the last slot for xattr entry and that is causing
an underflow, something like the above diagram.

My only concern is that why were we not able to detect this in the logic
near the start of the function where we explcity check if we have enough
space? 

Perhaps we should be fixing the logic in that if {..} instead
since the comment a few lines above your fix:

	/* No failures allowed past this point. */

does suggest that we can't error out below that point, so ideally all
the checks would have been done before that.

I'm still going through the issue, will update here if needed.

Regards,
ojaswin

>  
>  		memmove((void *)here + size, here, rest);
>  		memset(here, 0, size);
> -- 
> 2.34.1
> 
> -- 
> Best,
> Qianqiang Liu
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] ext4: fix out-of-bounds issue in ext4_xattr_set_entry
  2024-10-01  9:41       ` Ojaswin Mujoo
@ 2024-10-01 10:15         ` Qianqiang Liu
  2024-10-02  6:27         ` Qianqiang Liu
  1 sibling, 0 replies; 18+ messages in thread
From: Qianqiang Liu @ 2024-10-01 10:15 UTC (permalink / raw)
  To: Ojaswin Mujoo
  Cc: tytso, adilger.kernel, syzbot, linux-ext4, linux-kernel,
	syzkaller-bugs

Hi Ojaswin,

On Tue, Oct 01, 2024 at 03:11:44PM +0530, Ojaswin Mujoo wrote:
> 
> Hey Qianqiang,
> 
> Thanks for the patch. I'm still reviewing this codepath but I do have
> some questions around the patch. So I understand that xattrs are
> arranged in the following format:
> 
>  *   +------------------+
>  *   | header           |
>  *   | entry 1          | 
>  *   | entry 2          | 
>  *   | entry 3          | 
>  *   | four null bytes  | <--- last
>  *   | . . .            | 
>  *   | . . .            | <--- here
>  *   | . . .            | 
>  *   | value 1          | 
>  *   | value 3          | 
>  *   | value 2          | 
>  *   +------------------+
> 
> Now, in this error, my understanding is that we are actually ending up
> in a case where "here" ie the place where the new xattr entry will go is
> beyond the "last" ie the last slot for xattr entry and that is causing
> an underflow, something like the above diagram.
> 
> My only concern is that why were we not able to detect this in the logic
> near the start of the function where we explcity check if we have enough
> space? 
> 
> Perhaps we should be fixing the logic in that if {..} instead
> since the comment a few lines above your fix:
> 
> 	/* No failures allowed past this point. */
> 
> does suggest that we can't error out below that point, so ideally all
> the checks would have been done before that.
> 
> I'm still going through the issue, will update here if needed.
> 
> Regards,
> ojaswin
> 

Thank you for your suggestions.
I will investigate this issue further. If there are any updates, I will
let you know.

-- 
Best,
Qianqiang Liu


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] ext4: fix out-of-bounds issue in ext4_xattr_set_entry
  2024-10-01  9:41       ` Ojaswin Mujoo
  2024-10-01 10:15         ` Qianqiang Liu
@ 2024-10-02  6:27         ` Qianqiang Liu
  1 sibling, 0 replies; 18+ messages in thread
From: Qianqiang Liu @ 2024-10-02  6:27 UTC (permalink / raw)
  To: Ojaswin Mujoo
  Cc: tytso, adilger.kernel, syzbot, linux-ext4, linux-kernel,
	syzkaller-bugs

Hi Ojaswin,

On Tue, Oct 01, 2024 at 03:11:44PM +0530, Ojaswin Mujoo wrote:
> 
> Hey Qianqiang,
> 
> Thanks for the patch. I'm still reviewing this codepath but I do have
> some questions around the patch. So I understand that xattrs are
> arranged in the following format:
> 
>  *   +------------------+
>  *   | header           |
>  *   | entry 1          | 
>  *   | entry 2          | 
>  *   | entry 3          | 
>  *   | four null bytes  | <--- last
>  *   | . . .            | 
>  *   | . . .            | <--- here
>  *   | . . .            | 
>  *   | value 1          | 
>  *   | value 3          | 
>  *   | value 2          | 
>  *   +------------------+
> 
> Now, in this error, my understanding is that we are actually ending up
> in a case where "here" ie the place where the new xattr entry will go is
> beyond the "last" ie the last slot for xattr entry and that is causing
> an underflow, something like the above diagram.
> 
> My only concern is that why were we not able to detect this in the logic
> near the start of the function where we explcity check if we have enough
> space? 
> 
> Perhaps we should be fixing the logic in that if {..} instead
> since the comment a few lines above your fix:
> 
> 	/* No failures allowed past this point. */
> 
> does suggest that we can't error out below that point, so ideally all
> the checks would have been done before that.
> 
> I'm still going through the issue, will update here if needed.
> 
> Regards,
> ojaswin
> 

I reviewed the codepath, and here is the backtrace when the error occurs:

=> vfs_unlink
=> ext4_unlink
=> __ext4_unlink
=> __ext4_mark_inode_dirty
=> ext4_try_to_expand_extra_isize
=> __ext4_expand_extra_isize
=> ext4_expand_extra_isize_ea
=> ext4_xattr_make_inode_space
=> ext4_xattr_move_to_block -> ext4_xattr_block_find -> xattr_find_entry
=> ext4_xattr_block_set
=> ext4_xattr_set_entry
=> memmove((void *)here + size, here, rest);

The xattr_find_entry function return -ENODATA, but beacuase of the
following code, the error does not be returned to caller:

static int
ext4_xattr_block_find(struct inode *inode, struct ext4_xattr_info *i,
		      struct ext4_xattr_block_find *bs)
{
	...
	error = xattr_find_entry(inode, &bs->s.here, bs->s.end,
				 i->name_index, i->name, 1);
	if (error && error != -ENODATA)
		return error;
	...
}

So, perhaps we could modify the code as follows:

diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index e0e1956dcdd3..649b278d4c1f 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1884,7 +1884,7 @@ ext4_xattr_block_find(struct inode *inode, struct ext4_xattr_info *i,
 		bs->s.here = bs->s.first;
 		error = xattr_find_entry(inode, &bs->s.here, bs->s.end,
 					 i->name_index, i->name, 1);
-		if (error && error != -ENODATA)
+		if (error)
 			return error;
 		bs->s.not_found = error;
 	}

Or, we could check if s->not_found is -ENODATA in the ext4_xattr_set_entry function.

Any suggestions?

-- 
Best,
Qianqiang Liu


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [syzbot] [ext4?] KASAN: out-of-bounds Read in ext4_xattr_set_entry
  2024-09-22  6:35   ` syzbot
  2024-09-22  6:42     ` [PATCH] ext4: fix out-of-bounds issue " Qianqiang Liu
@ 2024-10-02  6:31     ` Qianqiang Liu
  2024-10-02  6:54       ` syzbot
  1 sibling, 1 reply; 18+ messages in thread
From: Qianqiang Liu @ 2024-10-02  6:31 UTC (permalink / raw)
  To: syzbot
  Cc: ojaswin, adilger.kernel, tytso, linux-ext4, linux-kernel,
	syzkaller-bugs

#syz test

diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index e0e1956dcdd3..649b278d4c1f 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1884,7 +1884,7 @@ ext4_xattr_block_find(struct inode *inode, struct ext4_xattr_info *i,
 		bs->s.here = bs->s.first;
 		error = xattr_find_entry(inode, &bs->s.here, bs->s.end,
 					 i->name_index, i->name, 1);
-		if (error && error != -ENODATA)
+		if (error)
 			return error;
 		bs->s.not_found = error;
 	}

-- 
Best,
Qianqiang Liu


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [syzbot] [ext4?] KASAN: out-of-bounds Read in ext4_xattr_set_entry
  2024-10-02  6:31     ` [syzbot] [ext4?] KASAN: out-of-bounds Read " Qianqiang Liu
@ 2024-10-02  6:54       ` syzbot
  0 siblings, 0 replies; 18+ messages in thread
From: syzbot @ 2024-10-02  6:54 UTC (permalink / raw)
  To: adilger.kernel, linux-ext4, linux-kernel, ojaswin, qianqiang.liu,
	syzkaller-bugs, tytso

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+f792df426ff0f5ceb8d1@syzkaller.appspotmail.com
Tested-by: syzbot+f792df426ff0f5ceb8d1@syzkaller.appspotmail.com

Tested on:

commit:         e32cde8d Merge tag 'sched_ext-for-6.12-rc1-fixes-1' of..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13657dd0580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=286b31f2cf1c36b5
dashboard link: https://syzkaller.appspot.com/bug?extid=f792df426ff0f5ceb8d1
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=122da580580000

Note: testing is done by a robot and is best-effort only.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] ext4: fix out-of-bounds issue in ext4_xattr_set_entry
  2024-09-22  6:42     ` [PATCH] ext4: fix out-of-bounds issue " Qianqiang Liu
  2024-10-01  9:41       ` Ojaswin Mujoo
@ 2024-10-08  7:40       ` Baokun Li
  2024-10-09 15:50         ` Jan Kara
  1 sibling, 1 reply; 18+ messages in thread
From: Baokun Li @ 2024-10-08  7:40 UTC (permalink / raw)
  To: Qianqiang Liu, tytso, Jan Kara
  Cc: adilger.kernel, syzbot, linux-ext4, linux-kernel, syzkaller-bugs,
	Yang Erkun

On 2024/9/22 14:42, Qianqiang Liu wrote:
> syzbot has found an out-of-bounds issue in ext4_xattr_set_entry:
>
> ==================================================================
> BUG: KASAN: out-of-bounds in ext4_xattr_set_entry+0x8ce/0x1f60 fs/ext4/xattr.c:1781
> Read of size 18446744073709551572 at addr ffff888036426850 by task syz-executor264/5095
>
> CPU: 0 UID: 0 PID: 5095 Comm: syz-executor264 Not tainted 6.11.0-syzkaller-03917-ga940d9a43e62 #0
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> Call Trace:
>   <TASK>
>   __dump_stack lib/dump_stack.c:93 [inline]
>   dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
>   print_address_description mm/kasan/report.c:377 [inline]
>   print_report+0x169/0x550 mm/kasan/report.c:488
>   kasan_report+0x143/0x180 mm/kasan/report.c:601
>   kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
>   __asan_memmove+0x29/0x70 mm/kasan/shadow.c:94
>   ext4_xattr_set_entry+0x8ce/0x1f60 fs/ext4/xattr.c:1781
> [...]
> ==================================================================
>
> This issue is caused by a negative size in memmove.
> We need to check for this.
>
> Fixes: dec214d00e0d ("ext4: xattr inode deduplication")
> Reported-by: syzbot+f792df426ff0f5ceb8d1@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=f792df426ff0f5ceb8d1
> Tested-by: syzbot+f792df426ff0f5ceb8d1@syzkaller.appspotmail.com
> Signed-off-by: Qianqiang Liu <qianqiang.liu@163.com>
> ---
>   fs/ext4/xattr.c | 9 ++++++++-
>   1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
> index 46ce2f21fef9..336badb46246 100644
> --- a/fs/ext4/xattr.c
> +++ b/fs/ext4/xattr.c
> @@ -1776,7 +1776,14 @@ static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
>   	} else if (s->not_found) {
>   		/* Insert new name. */
>   		size_t size = EXT4_XATTR_LEN(name_len);
> -		size_t rest = (void *)last - (void *)here + sizeof(__u32);
> +		size_t rest;
> +
> +		if (last < here) {
> +			ret = -ENOSPC;
> +			goto out;
> +		} else {
> +			rest = (void *)last - (void *)here + sizeof(__u32);
> +		}
>   
>   		memmove((void *)here + size, here, rest);
>   		memset(here, 0, size);
This change just passes syzbot's test cases without fixing the real
problem.

The root cause of the problem is that the inode's xattr block is marked as
free in the block bitmap, so that block is allocated to the ea inode
resulting in the data in the xattr block being overwritten, and the last of
the second lookups changing resulting in out-of-bounds access.

The stack that triggers the problem is as follows:

// An inode with an xattr block of 33.
__ext4_mark_inode_dirty
  __ext4_expand_extra_isize
   ext4_expand_extra_isize_ea
    ext4_xattr_make_inode_space
     // Move xattr from inode to block
     ext4_xattr_move_to_block
      // Find out if the xattr exists in the block
      ext4_xattr_block_find
       // If xattr does not exist, here == last
       xattr_find_entry
      // Add a new xattr to the block
      ext4_xattr_block_set
       |// xattr is too long, needs an ea inode
       |ext4_xattr_inode_lookup_create
       | ext4_xattr_inode_create
       | ext4_xattr_inode_write
       |  ext4_map_blocks
       |   // xattr block 33 is assigned to the new ea inode
       |  memcpy(bh->b_data, buf, csize)
       |   // The value of xattr overwrites the data in the xattr block.
       |ext4_xattr_set_entry
        // Since the contents of the xattr block have changed,
        // now here == last does not hold, so it is possible to
        // have last < here and trigger an out-of-bounds access.

So I think we should probably add a helper function ext4_mb_block_inuse()
that checks if xattr block is free with the block bitmap in check_xattrs().

Or go one step further and add a mechanism like xfs Reverse-Mapping, which
makes sure that allocated blocks do point to the target inode, which could
replace the current block_validity, and could also be used in future online
fscks.

Ted, Honza, do you have any thoughts?

Regards,
Baokun

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] ext4: fix out-of-bounds issue in ext4_xattr_set_entry
  2024-10-08  7:40       ` Baokun Li
@ 2024-10-09 15:50         ` Jan Kara
  2024-10-11  2:18           ` Baokun Li
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Kara @ 2024-10-09 15:50 UTC (permalink / raw)
  To: Baokun Li
  Cc: Qianqiang Liu, tytso, Jan Kara, adilger.kernel, syzbot,
	linux-ext4, linux-kernel, syzkaller-bugs, Yang Erkun

On Tue 08-10-24 15:40:39, Baokun Li wrote:
> On 2024/9/22 14:42, Qianqiang Liu wrote:
> > syzbot has found an out-of-bounds issue in ext4_xattr_set_entry:
> > 
> > ==================================================================
> > BUG: KASAN: out-of-bounds in ext4_xattr_set_entry+0x8ce/0x1f60 fs/ext4/xattr.c:1781
> > Read of size 18446744073709551572 at addr ffff888036426850 by task syz-executor264/5095
> > 
> > CPU: 0 UID: 0 PID: 5095 Comm: syz-executor264 Not tainted 6.11.0-syzkaller-03917-ga940d9a43e62 #0
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> > Call Trace:
> >   <TASK>
> >   __dump_stack lib/dump_stack.c:93 [inline]
> >   dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
> >   print_address_description mm/kasan/report.c:377 [inline]
> >   print_report+0x169/0x550 mm/kasan/report.c:488
> >   kasan_report+0x143/0x180 mm/kasan/report.c:601
> >   kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
> >   __asan_memmove+0x29/0x70 mm/kasan/shadow.c:94
> >   ext4_xattr_set_entry+0x8ce/0x1f60 fs/ext4/xattr.c:1781
> > [...]
> > ==================================================================
> > 
> > This issue is caused by a negative size in memmove.
> > We need to check for this.
> > 
> > Fixes: dec214d00e0d ("ext4: xattr inode deduplication")
> > Reported-by: syzbot+f792df426ff0f5ceb8d1@syzkaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=f792df426ff0f5ceb8d1
> > Tested-by: syzbot+f792df426ff0f5ceb8d1@syzkaller.appspotmail.com
> > Signed-off-by: Qianqiang Liu <qianqiang.liu@163.com>
> > ---
> >   fs/ext4/xattr.c | 9 ++++++++-
> >   1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
> > index 46ce2f21fef9..336badb46246 100644
> > --- a/fs/ext4/xattr.c
> > +++ b/fs/ext4/xattr.c
> > @@ -1776,7 +1776,14 @@ static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
> >   	} else if (s->not_found) {
> >   		/* Insert new name. */
> >   		size_t size = EXT4_XATTR_LEN(name_len);
> > -		size_t rest = (void *)last - (void *)here + sizeof(__u32);
> > +		size_t rest;
> > +
> > +		if (last < here) {
> > +			ret = -ENOSPC;
> > +			goto out;
> > +		} else {
> > +			rest = (void *)last - (void *)here + sizeof(__u32);
> > +		}
> >   		memmove((void *)here + size, here, rest);
> >   		memset(here, 0, size);
> This change just passes syzbot's test cases without fixing the real
> problem.
> 
> The root cause of the problem is that the inode's xattr block is marked as
> free in the block bitmap, so that block is allocated to the ea inode
> resulting in the data in the xattr block being overwritten, and the last of
> the second lookups changing resulting in out-of-bounds access.
> 
> The stack that triggers the problem is as follows:
> 
> // An inode with an xattr block of 33.
> __ext4_mark_inode_dirty
>  __ext4_expand_extra_isize
>   ext4_expand_extra_isize_ea
>    ext4_xattr_make_inode_space
>     // Move xattr from inode to block
>     ext4_xattr_move_to_block
>      // Find out if the xattr exists in the block
>      ext4_xattr_block_find
>       // If xattr does not exist, here == last
>       xattr_find_entry
>      // Add a new xattr to the block
>      ext4_xattr_block_set
>       |// xattr is too long, needs an ea inode
>       |ext4_xattr_inode_lookup_create
>       | ext4_xattr_inode_create
>       | ext4_xattr_inode_write
>       |  ext4_map_blocks
>       |   // xattr block 33 is assigned to the new ea inode
>       |  memcpy(bh->b_data, buf, csize)
>       |   // The value of xattr overwrites the data in the xattr block.
>       |ext4_xattr_set_entry
>        // Since the contents of the xattr block have changed,
>        // now here == last does not hold, so it is possible to
>        // have last < here and trigger an out-of-bounds access.
> 
> So I think we should probably add a helper function ext4_mb_block_inuse()
> that checks if xattr block is free with the block bitmap in check_xattrs().

Well, even that would be a relatively narrow fix. You could have e.g.
file reference the xattr block as one of its data blocks and then corrupt
xattr contents at unfortunate moment. That will not get fixed by checking
whether the block is allocated. These multiply claimed blocks (as e2fsck
calls it) are very hard to detect inside the kernel.

> Or go one step further and add a mechanism like xfs Reverse-Mapping, which
> makes sure that allocated blocks do point to the target inode, which could
> replace the current block_validity, and could also be used in future online
> fscks.

Well, that is a rather big change. It requires significant on-disk format
change and also performance cost when to maintain. Furthermore for xattr
blocks which can be shared by many inodes it is not even clear how to
implement this... So I'm not sure we really want to do this either.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] ext4: fix out-of-bounds issue in ext4_xattr_set_entry
  2024-10-09 15:50         ` Jan Kara
@ 2024-10-11  2:18           ` Baokun Li
  2024-10-14 16:31             ` Jan Kara
  0 siblings, 1 reply; 18+ messages in thread
From: Baokun Li @ 2024-10-11  2:18 UTC (permalink / raw)
  To: Jan Kara
  Cc: Qianqiang Liu, tytso, adilger.kernel, syzbot, linux-ext4,
	linux-kernel, syzkaller-bugs, Yang Erkun

On 2024/10/9 23:50, Jan Kara wrote:
> On Tue 08-10-24 15:40:39, Baokun Li wrote:
>> On 2024/9/22 14:42, Qianqiang Liu wrote:
>>> syzbot has found an out-of-bounds issue in ext4_xattr_set_entry:
>>>
>>> ==================================================================
>>> BUG: KASAN: out-of-bounds in ext4_xattr_set_entry+0x8ce/0x1f60 fs/ext4/xattr.c:1781
>>> Read of size 18446744073709551572 at addr ffff888036426850 by task syz-executor264/5095
>>>
>>> CPU: 0 UID: 0 PID: 5095 Comm: syz-executor264 Not tainted 6.11.0-syzkaller-03917-ga940d9a43e62 #0
>>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
>>> Call Trace:
>>>    <TASK>
>>>    __dump_stack lib/dump_stack.c:93 [inline]
>>>    dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
>>>    print_address_description mm/kasan/report.c:377 [inline]
>>>    print_report+0x169/0x550 mm/kasan/report.c:488
>>>    kasan_report+0x143/0x180 mm/kasan/report.c:601
>>>    kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
>>>    __asan_memmove+0x29/0x70 mm/kasan/shadow.c:94
>>>    ext4_xattr_set_entry+0x8ce/0x1f60 fs/ext4/xattr.c:1781
>>> [...]
>>> ==================================================================
>>>
>>> This issue is caused by a negative size in memmove.
>>> We need to check for this.
>>>
>>> Fixes: dec214d00e0d ("ext4: xattr inode deduplication")
>>> Reported-by: syzbot+f792df426ff0f5ceb8d1@syzkaller.appspotmail.com
>>> Closes: https://syzkaller.appspot.com/bug?extid=f792df426ff0f5ceb8d1
>>> Tested-by: syzbot+f792df426ff0f5ceb8d1@syzkaller.appspotmail.com
>>> Signed-off-by: Qianqiang Liu <qianqiang.liu@163.com>
>>> ---
>>>    fs/ext4/xattr.c | 9 ++++++++-
>>>    1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
>>> index 46ce2f21fef9..336badb46246 100644
>>> --- a/fs/ext4/xattr.c
>>> +++ b/fs/ext4/xattr.c
>>> @@ -1776,7 +1776,14 @@ static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
>>>    	} else if (s->not_found) {
>>>    		/* Insert new name. */
>>>    		size_t size = EXT4_XATTR_LEN(name_len);
>>> -		size_t rest = (void *)last - (void *)here + sizeof(__u32);
>>> +		size_t rest;
>>> +
>>> +		if (last < here) {
>>> +			ret = -ENOSPC;
>>> +			goto out;
>>> +		} else {
>>> +			rest = (void *)last - (void *)here + sizeof(__u32);
>>> +		}
>>>    		memmove((void *)here + size, here, rest);
>>>    		memset(here, 0, size);
>> This change just passes syzbot's test cases without fixing the real
>> problem.
>>
>> The root cause of the problem is that the inode's xattr block is marked as
>> free in the block bitmap, so that block is allocated to the ea inode
>> resulting in the data in the xattr block being overwritten, and the last of
>> the second lookups changing resulting in out-of-bounds access.
>>
>> The stack that triggers the problem is as follows:
>>
>> // An inode with an xattr block of 33.
>> __ext4_mark_inode_dirty
>>   __ext4_expand_extra_isize
>>    ext4_expand_extra_isize_ea
>>     ext4_xattr_make_inode_space
>>      // Move xattr from inode to block
>>      ext4_xattr_move_to_block
>>       // Find out if the xattr exists in the block
>>       ext4_xattr_block_find
>>        // If xattr does not exist, here == last
>>        xattr_find_entry
>>       // Add a new xattr to the block
>>       ext4_xattr_block_set
>>        |// xattr is too long, needs an ea inode
>>        |ext4_xattr_inode_lookup_create
>>        | ext4_xattr_inode_create
>>        | ext4_xattr_inode_write
>>        |  ext4_map_blocks
>>        |   // xattr block 33 is assigned to the new ea inode
>>        |  memcpy(bh->b_data, buf, csize)
>>        |   // The value of xattr overwrites the data in the xattr block.
>>        |ext4_xattr_set_entry
>>         // Since the contents of the xattr block have changed,
>>         // now here == last does not hold, so it is possible to
>>         // have last < here and trigger an out-of-bounds access.
>>
>> So I think we should probably add a helper function ext4_mb_block_inuse()
>> that checks if xattr block is free with the block bitmap in check_xattrs().
Hi Honza,

Thanks so much for your thoughts and feedback!
> Well, even that would be a relatively narrow fix. You could have e.g.
> file reference the xattr block as one of its data blocks and then corrupt
> xattr contents at unfortunate moment. That will not get fixed by checking
> whether the block is allocated. These multiply claimed blocks (as e2fsck
> calls it) are very hard to detect inside the kernel.
Yes, after locating the issue, the first thought was to just get the buffer
lock and check xattr magic and xattr block checksum. However, if the block
is allocated as an xattr block to another file, the issue may still occur.

Therefore we have to make sure that the block has been allocated to the
current file. With the block bitmap we can verify that the current block
is allocated, but as you pointed out we cannot verify that it is only
allocated to the current file.

That means we need some means to find the owner of the block by block,
and then I came up with xfs Reverse-Mapping.
>> Or go one step further and add a mechanism like xfs Reverse-Mapping, which
>> makes sure that allocated blocks do point to the target inode, which could
>> replace the current block_validity, and could also be used in future online
>> fscks.
> Well, that is a rather big change. It requires significant on-disk format
> change and also performance cost when to maintain. Furthermore for xattr
> blocks which can be shared by many inodes it is not even clear how to
> implement this... So I'm not sure we really want to do this either.
>
> 								Honza
Yes, there can be a lot of work involved.

  * Perhaps we could create an rmap file to store the rmap tree to avoid
    on-disk format changes.
  * The performance impact of maintaining rmap really needs to be evaluated,
    perhaps by writing a simple DEMO to test it.
  * XFS supports shared blocks(A.K.A. reflink.), so even if the physical
    blocks are the same, but the inodes are different or the logical blocks
    are different, they will be recorded multiple times in the tree. So the
    shared xattr block can be handled similarly.

We have plans to support online fsck in the future, and implementing rmap
is one of the steps. Perhaps one can wait until rmap is implemented to
assess whether it is worth a strict check here.

Implementing rmap may take some time, until then we can avoid the problem
as much as possible by checking the magic and xattr block csum.
Maybe something like this?

diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 7647e9f6e190..cd3ae1e3371c 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1676,6 +1676,13 @@ static int ext4_xattr_set_entry(struct 
ext4_xattr_info *i,
                 }
         }

+       if (WARN_ON_ONCE(last < here)) {
+               EXT4_ERROR_INODE(inode, "corrupted xattr entries in %s",
+                                       is_block ? "block" : "ibody");
+               ret = -EFSCORRUPTED;
+               goto out;
+       }
+
         /* Check whether we have enough space. */
         if (i->value) {
                 size_t free;
@@ -1923,6 +1930,7 @@ ext4_xattr_block_set(handle_t *handle, struct 
inode *inode,
         }

         if (s->base) {
+               struct ext4_xattr_header *hdr;
                 int offset = (char *)s->here - bs->bh->b_data;

                 BUFFER_TRACE(bs->bh, "get_write_access");
@@ -1932,6 +1940,16 @@ ext4_xattr_block_set(handle_t *handle, struct 
inode *inode,
                         goto cleanup;

                 lock_buffer(bs->bh);
+               hdr = header(s->base);
+
+               if (hdr->h_magic != cpu_to_le32(EXT4_XATTR_MAGIC) ||
+                   (ext4_has_metadata_csum(inode->i_sb) &&
+                    (ext4_xattr_block_csum(inode, bs->bh->b_blocknr, 
hdr) !=
+                     hdr->h_checksum))) {
+                       unlock_buffer(bs->bh);
+                       error = -EFSCORRUPTED;
+                       goto bad_block;
+               }

                 if (header(s->base)->h_refcount == cpu_to_le32(1)) {
                         __u32 hash = le32_to_cpu(BHDR(bs->bh)->h_hash);

--
Thanks,
Baokun

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] ext4: fix out-of-bounds issue in ext4_xattr_set_entry
  2024-10-11  2:18           ` Baokun Li
@ 2024-10-14 16:31             ` Jan Kara
  2024-10-16  8:02               ` Baokun Li
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Kara @ 2024-10-14 16:31 UTC (permalink / raw)
  To: Baokun Li
  Cc: Jan Kara, Qianqiang Liu, tytso, adilger.kernel, syzbot,
	linux-ext4, linux-kernel, syzkaller-bugs, Yang Erkun

On Fri 11-10-24 10:18:04, Baokun Li wrote:
> On 2024/10/9 23:50, Jan Kara wrote:
> > > Or go one step further and add a mechanism like xfs Reverse-Mapping, which
> > > makes sure that allocated blocks do point to the target inode, which could
> > > replace the current block_validity, and could also be used in future online
> > > fscks.
> > Well, that is a rather big change. It requires significant on-disk format
> > change and also performance cost when to maintain. Furthermore for xattr
> > blocks which can be shared by many inodes it is not even clear how to
> > implement this... So I'm not sure we really want to do this either.
> 
> Yes, there can be a lot of work involved.
> 
>  * Perhaps we could create an rmap file to store the rmap tree to avoid
>    on-disk format changes.
>  * The performance impact of maintaining rmap really needs to be evaluated,
>    perhaps by writing a simple DEMO to test it.
>  * XFS supports shared blocks(A.K.A. reflink.), so even if the physical
>    blocks are the same, but the inodes are different or the logical blocks
>    are different, they will be recorded multiple times in the tree. So the
>    shared xattr block can be handled similarly.
> 
> We have plans to support online fsck in the future, and implementing rmap
> is one of the steps. Perhaps one can wait until rmap is implemented to
> assess whether it is worth a strict check here.

Yes, we could implement something like this be as you wrote, it's going to
be a lot of work. We've briefly discussed this with Ted on ext4 call and we
came to a conclusion that this is a type of corruption ext4 may never
protect agaist. You simply should not mount arbitrarily corrupted
filesystems... But if you want to try, sure go ahead :)

One relatively easy solution to similar class of problems would be to store
the type of metadata buffer inside the buffer_head when we are verifying
checksum, clear the info when freeing the block in __ext4_forget(), and
fail with EFSCORRUPTED error when one type -> another type transition would
happen.

> Implementing rmap may take some time, until then we can avoid the problem
> as much as possible by checking the magic and xattr block csum.
> Maybe something like this?
> 
> diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
> index 7647e9f6e190..cd3ae1e3371c 100644
> --- a/fs/ext4/xattr.c
> +++ b/fs/ext4/xattr.c
> @@ -1676,6 +1676,13 @@ static int ext4_xattr_set_entry(struct
> ext4_xattr_info *i,
>                 }
>         }
> 
> +       if (WARN_ON_ONCE(last < here)) {
> +               EXT4_ERROR_INODE(inode, "corrupted xattr entries in %s",
> +                                       is_block ? "block" : "ibody");
> +               ret = -EFSCORRUPTED;
> +               goto out;
> +       }
> +
>         /* Check whether we have enough space. */
>         if (i->value) {
>                 size_t free;
> @@ -1923,6 +1930,7 @@ ext4_xattr_block_set(handle_t *handle, struct inode
> *inode,
>         }
> 
>         if (s->base) {
> +               struct ext4_xattr_header *hdr;
>                 int offset = (char *)s->here - bs->bh->b_data;
> 
>                 BUFFER_TRACE(bs->bh, "get_write_access");
> @@ -1932,6 +1940,16 @@ ext4_xattr_block_set(handle_t *handle, struct inode
> *inode,
>                         goto cleanup;
> 
>                 lock_buffer(bs->bh);
> +               hdr = header(s->base);
> +
> +               if (hdr->h_magic != cpu_to_le32(EXT4_XATTR_MAGIC) ||
> +                   (ext4_has_metadata_csum(inode->i_sb) &&
> +                    (ext4_xattr_block_csum(inode, bs->bh->b_blocknr, hdr)
> !=
> +                     hdr->h_checksum))) {
> +                       unlock_buffer(bs->bh);
> +                       error = -EFSCORRUPTED;
> +                       goto bad_block;
> +               }
> 
>                 if (header(s->base)->h_refcount == cpu_to_le32(1)) {
>                         __u32 hash = le32_to_cpu(BHDR(bs->bh)->h_hash);

Hum, there are more places in xattr code that access a buffer that could
have been modified. So why do you add check into this place? Is it somehow
special?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] ext4: fix out-of-bounds issue in ext4_xattr_set_entry
  2024-10-14 16:31             ` Jan Kara
@ 2024-10-16  8:02               ` Baokun Li
  2024-10-16 20:47                 ` Theodore Ts'o
  0 siblings, 1 reply; 18+ messages in thread
From: Baokun Li @ 2024-10-16  8:02 UTC (permalink / raw)
  To: Jan Kara
  Cc: Qianqiang Liu, tytso, adilger.kernel, syzbot, linux-ext4,
	linux-kernel, syzkaller-bugs, Yang Erkun, Baokun Li

On 2024/10/15 0:31, Jan Kara wrote:
> On Fri 11-10-24 10:18:04, Baokun Li wrote:
>> On 2024/10/9 23:50, Jan Kara wrote:
>>>> Or go one step further and add a mechanism like xfs Reverse-Mapping, which
>>>> makes sure that allocated blocks do point to the target inode, which could
>>>> replace the current block_validity, and could also be used in future online
>>>> fscks.
>>> Well, that is a rather big change. It requires significant on-disk format
>>> change and also performance cost when to maintain. Furthermore for xattr
>>> blocks which can be shared by many inodes it is not even clear how to
>>> implement this... So I'm not sure we really want to do this either.
>> Yes, there can be a lot of work involved.
>>
>>   * Perhaps we could create an rmap file to store the rmap tree to avoid
>>     on-disk format changes.
>>   * The performance impact of maintaining rmap really needs to be evaluated,
>>     perhaps by writing a simple DEMO to test it.
>>   * XFS supports shared blocks(A.K.A. reflink.), so even if the physical
>>     blocks are the same, but the inodes are different or the logical blocks
>>     are different, they will be recorded multiple times in the tree. So the
>>     shared xattr block can be handled similarly.
>>
>> We have plans to support online fsck in the future, and implementing rmap
>> is one of the steps. Perhaps one can wait until rmap is implemented to
>> assess whether it is worth a strict check here.
> Yes, we could implement something like this be as you wrote, it's going to
> be a lot of work. We've briefly discussed this with Ted on ext4 call and we
> came to a conclusion that this is a type of corruption ext4 may never
> protect agaist. You simply should not mount arbitrarily corrupted
> filesystems...

Thank you for discussing this with Ted.

This kind of problem is really tricky.

>   But if you want to try, sure go ahead :)
As server clusters get larger and larger, server maintenance becomes very
difficult. Therefore, timely detection of problems (i.e. online scanning,
similar to e2fsck -fn) and timely and non-stop fixing of problems (i.e.
online fsck, similar to e2fsck -a) have always been the requirements of
our customers. Thus online fsck has been on our TODO list, and it's really
time to start doing it. 😀
> One relatively easy solution to similar class of problems would be to store
> the type of metadata buffer inside the buffer_head when we are verifying
> checksum, clear the info when freeing the block in __ext4_forget(), and
> fail with EFSCORRUPTED error when one type -> another type transition would
> happen.
This solution looks great. If we save checksum further, we can sense not
only the change of block type, but also the change of block data.

But now journal_head takes up bh->b_private, so to record information,
we need to add new fields to buffer_head (e.g. b_info), or modify the logic
of journal_head (e.g. make bh->b_private hold ext4_bufdata, and include the
journal_head in ext4_bufdata. ocfs2 needs a similar adaptation).
>> Implementing rmap may take some time, until then we can avoid the problem
>> as much as possible by checking the magic and xattr block csum.
>> Maybe something like this?
>>
>> diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
>> index 7647e9f6e190..cd3ae1e3371c 100644
>> --- a/fs/ext4/xattr.c
>> +++ b/fs/ext4/xattr.c
>> @@ -1676,6 +1676,13 @@ static int ext4_xattr_set_entry(struct
>> ext4_xattr_info *i,
>>                  }
>>          }
>>
>> +       if (WARN_ON_ONCE(last < here)) {
>> +               EXT4_ERROR_INODE(inode, "corrupted xattr entries in %s",
>> +                                       is_block ? "block" : "ibody");
>> +               ret = -EFSCORRUPTED;
>> +               goto out;
>> +       }
>> +
>>          /* Check whether we have enough space. */
>>          if (i->value) {
>>                  size_t free;
>> @@ -1923,6 +1930,7 @@ ext4_xattr_block_set(handle_t *handle, struct inode
>> *inode,
>>          }
>>
>>          if (s->base) {
>> +               struct ext4_xattr_header *hdr;
>>                  int offset = (char *)s->here - bs->bh->b_data;
>>
>>                  BUFFER_TRACE(bs->bh, "get_write_access");
>> @@ -1932,6 +1940,16 @@ ext4_xattr_block_set(handle_t *handle, struct inode
>> *inode,
>>                          goto cleanup;
>>
>>                  lock_buffer(bs->bh);
>> +               hdr = header(s->base);
>> +
>> +               if (hdr->h_magic != cpu_to_le32(EXT4_XATTR_MAGIC) ||
>> +                   (ext4_has_metadata_csum(inode->i_sb) &&
>> +                    (ext4_xattr_block_csum(inode, bs->bh->b_blocknr, hdr)
>> !=
>> +                     hdr->h_checksum))) {
>> +                       unlock_buffer(bs->bh);
>> +                       error = -EFSCORRUPTED;
>> +                       goto bad_block;
>> +               }
>>
>>                  if (header(s->base)->h_refcount == cpu_to_le32(1)) {
>>                          __u32 hash = le32_to_cpu(BHDR(bs->bh)->h_hash);
> Hum, there are more places in xattr code that access a buffer that could
> have been modified. So why do you add check into this place? Is it somehow
> special?
>
> 								Honza

The out-of-bounds access occurs here because the last dentry obtained
in ext4_xattr_set_entry() is not the same as the last dentry obtained
in ext4_xattr_block_find().

When we modify an xattr, we always hold the inode lock, so the ibody
case is not considered. Check_xattrs() is called to do a full check
when looking up an xattr, so it's not necessary to consider that either.

When inserting an xattr into an xattr block, there are three cases:

  * xattr block is newly allocated, and the block is allocated only
    after the entry is set in memory.
  * xattr block is unshared, insert the new xattr directly.
  * xattr block is shared, copy the xattr block and insert the new xattr.

Only in the last two cases can a multiply claimed xattr block result in
out-of-bounds access, so the buffer lock is held to verify that the data
is correct.
(It looks like kmemdup should be moved to the buffer lock here as well.🤔)

Thanks again!

Regards,
Baokun


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] ext4: fix out-of-bounds issue in ext4_xattr_set_entry
  2024-10-16  8:02               ` Baokun Li
@ 2024-10-16 20:47                 ` Theodore Ts'o
  2024-10-17 12:42                   ` Baokun Li
  0 siblings, 1 reply; 18+ messages in thread
From: Theodore Ts'o @ 2024-10-16 20:47 UTC (permalink / raw)
  To: Baokun Li
  Cc: Jan Kara, Qianqiang Liu, adilger.kernel, syzbot, linux-ext4,
	linux-kernel, syzkaller-bugs, Yang Erkun

On Wed, Oct 16, 2024 at 04:02:40PM +0800, Baokun Li wrote:
> As server clusters get larger and larger, server maintenance becomes very
> difficult. Therefore, timely detection of problems (i.e. online scanning,
> similar to e2fsck -fn) and timely and non-stop fixing of problems (i.e.
> online fsck, similar to e2fsck -a) have always been the requirements of
> our customers. Thus online fsck has been on our TODO list, and it's really
> time to start doing it. 😀

As far as online scaning is concerned, if you are using LVM, we can
use a combination of dm-snapshot and e2fsck -fn --- that is what the
e2scrub command automates.

Online fsck is much harder, since it would require back pointers to do
this efficienctly.  To do this, a general way of solving this would
involve a generalized some kind of b-tree or b+tree where changes are
managed via jbd2.  This could be used so that (for example) if we had
a tree which maps block ranges to an inode number, then given a block
number, we can figure out which inode "owns" that block.  The harder
part is those objects that have multiple forward pointers --- for
example an inode might have multiple hard links to multiple
directories, so we need to handle this somehow.

If we had the jbd2-aware b+tree, we could also use this add support
for reflink/clone, which would also be cool.

If this is something that your team really weants to work on, what I'd
suggest is to create a rough design of what the journaled b+tree would
look like, and then implement it first, since this is the prerequisite
for a huge number of advanced file system features.  Implementation
should be done in a way that makes it easy for the code to be usable
both in the kernel and in e2fsprogs, since life will be much easier if
we have e2fsck and debugfs support for the new file system data
structures from the very beginning of the development.

If your company is willing to invest in the engineering effort to do
this, great!  But I have to point out that an alternative approach
that you should consider is whether XFS might be a closer match for
some of your customers' needs.  The advantage of ext4 is that it is
much simpler and easier to understand that XFS.  But as we add these
new features, ext4 will get more complex.  And so one of the design
considerations we should keep in mind is to keep ext4 as simple and
miantainable as possible, even as we add new functionality.

Cheers,

						- Ted

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] ext4: fix out-of-bounds issue in ext4_xattr_set_entry
  2024-10-16 20:47                 ` Theodore Ts'o
@ 2024-10-17 12:42                   ` Baokun Li
  2024-10-17 14:47                     ` Theodore Ts'o
  0 siblings, 1 reply; 18+ messages in thread
From: Baokun Li @ 2024-10-17 12:42 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Jan Kara, Qianqiang Liu, adilger.kernel, syzbot, linux-ext4,
	linux-kernel, syzkaller-bugs, Yang Erkun, Baokun Li

On 2024/10/17 4:47, Theodore Ts'o wrote:
> On Wed, Oct 16, 2024 at 04:02:40PM +0800, Baokun Li wrote:
>> As server clusters get larger and larger, server maintenance becomes very
>> difficult. Therefore, timely detection of problems (i.e. online scanning,
>> similar to e2fsck -fn) and timely and non-stop fixing of problems (i.e.
>> online fsck, similar to e2fsck -a) have always been the requirements of
>> our customers. Thus online fsck has been on our TODO list, and it's really
>> time to start doing it. 😀
> As far as online scaning is concerned, if you are using LVM, we can
> use a combination of dm-snapshot and e2fsck -fn --- that is what the
> e2scrub command automates.
Yes, e2scrub is very nice, but it has too many limitations. We have some
customers who don't use lvm.
> Online fsck is much harder, since it would require back pointers to do
> this efficienctly.
Indeed, our rough plan is to first implement isolation of abnormal file
system resources, so that the system can continue to run normally even
when there is an error; then implement online scanning, so that the
maintainer can see the health report at any time; and finally implement
the most difficult online repair.
>   To do this, a general way of solving this would
> involve a generalized some kind of b-tree or b+tree where changes are
> managed via jbd2.  This could be used so that (for example) if we had
> a tree which maps block ranges to an inode number, then given a block
> number, we can figure out which inode "owns" that block.  The harder
> part is those objects that have multiple forward pointers --- for
> example an inode might have multiple hard links to multiple
> directories, so we need to handle this somehow.
We do need to establish the mapping of physical blocks to inodes and inodes
to parent dir. By tree managed by jbd2 do you mean updating the tree when
committing to journal? Or are updates to the tree logged to journal?
>
> If we had the jbd2-aware b+tree, we could also use this add support
> for reflink/clone, which would also be cool.
Yeah, reflink is pretty cool, we can try it out when the others are done.
>
> If this is something that your team really weants to work on, what I'd
> suggest is to create a rough design of what the journaled b+tree would
> look like, and then implement it first, since this is the prerequisite
> for a huge number of advanced file system features.  Implementation
> should be done in a way that makes it easy for the code to be usable
> both in the kernel and in e2fsprogs, since life will be much easier if
> we have e2fsck and debugfs support for the new file system data
> structures from the very beginning of the development.
Thank you for your suggestion! This is really key to the development. We'll
discuss the overall design internally before consulting the community.
> If your company is willing to invest in the engineering effort to do
> this, great!  But I have to point out that an alternative approach
> that you should consider is whether XFS might be a closer match for
> some of your customers' needs.  The advantage of ext4 is that it is
> much simpler and easier to understand that XFS.
The XFS maintainability enhancement is something my colleague is working
on. But we have a fair number of downstream customers who prefer ext4, so
it's worth investing the time to do that.
> But as we add these
> new features, ext4 will get more complex.  And so one of the design
> considerations we should keep in mind is to keep ext4 as simple and
> miantainable as possible, even as we add new functionality.
>
> Cheers,
>
> 						- Ted
Of course! we will keep the code as simple and maintainable as possible.

Thanks again for your input! 😉

Cheers,
Baokun

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] ext4: fix out-of-bounds issue in ext4_xattr_set_entry
  2024-10-17 12:42                   ` Baokun Li
@ 2024-10-17 14:47                     ` Theodore Ts'o
  2024-10-18  3:44                       ` Baokun Li
  0 siblings, 1 reply; 18+ messages in thread
From: Theodore Ts'o @ 2024-10-17 14:47 UTC (permalink / raw)
  To: Baokun Li
  Cc: Jan Kara, Qianqiang Liu, adilger.kernel, syzbot, linux-ext4,
	linux-kernel, syzkaller-bugs, Yang Erkun

On Thu, Oct 17, 2024 at 08:42:59PM +0800, Baokun Li wrote:
> Indeed, our rough plan is to first implement isolation of abnormal file
> system resources, so that the system can continue to run normally even
> when there is an error; then implement online scanning, so that the
> maintainer can see the health report at any time; and finally implement
> the most difficult online repair.

We have some of this already; if a block group has obvious
inconsistencies --- for example, if there is an attempt to mark a
block or inode as free, but it's already marked as free as the
allocation bitmap, we can mark the block group as inconsistent, and
then avoid allocating from the block group.  That's easy because it's
the kind of inconsistency which can be detected locally.

The problem comes with those inconsistencies which require a global
examination of the file system data structures.  For example, is the
refcount of an inode correct?  Or is a block claimed by more than one
inode?  The e2scrub approach requires creating a read-only snapshot
(which is why we need LVM) and then running e2fsck in userspace,
because it does a global examination of all file system data
structures.

> We do need to establish the mapping of physical blocks to inodes and
> inodes to parent dir. By tree managed by jbd2 do you mean updating
> the tree when committing to journal? Or are updates to the tree
> logged to journal?

When we allocate a block, we need to journal the changes to the
allocation bitmap.  If we are going to also update the reverse mapping
data structure, that needs to be journalled also, so that after a
crash, the data structures are consistent.

						- Ted

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] ext4: fix out-of-bounds issue in ext4_xattr_set_entry
  2024-10-17 14:47                     ` Theodore Ts'o
@ 2024-10-18  3:44                       ` Baokun Li
  0 siblings, 0 replies; 18+ messages in thread
From: Baokun Li @ 2024-10-18  3:44 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Jan Kara, Qianqiang Liu, adilger.kernel, syzbot, linux-ext4,
	linux-kernel, syzkaller-bugs, Yang Erkun, Baokun Li

On 2024/10/17 22:47, Theodore Ts'o wrote:
> On Thu, Oct 17, 2024 at 08:42:59PM +0800, Baokun Li wrote:
>> Indeed, our rough plan is to first implement isolation of abnormal file
>> system resources, so that the system can continue to run normally even
>> when there is an error; then implement online scanning, so that the
>> maintainer can see the health report at any time; and finally implement
>> the most difficult online repair.
> We have some of this already; if a block group has obvious
> inconsistencies --- for example, if there is an attempt to mark a
> block or inode as free, but it's already marked as free as the
> allocation bitmap, we can mark the block group as inconsistent, and
> then avoid allocating from the block group.  That's easy because it's
> the kind of inconsistency which can be detected locally.
Yes, there is now block group level isolation. Our goal is to further
reduce the scope of isolation to minimise the impact of isolation.

The rough idea is to isolate resources where errors are reported,
and throw errors when isolation is not possible. This may be a bit
crude, but after implementing inline scanning we can achieve precise
fine-grained isolation.
> The problem comes with those inconsistencies which require a global
> examination of the file system data structures.  For example, is the
> refcount of an inode correct?  Or is a block claimed by more than one
> inode?  The e2scrub approach requires creating a read-only snapshot
> (which is why we need LVM) and then running e2fsck in userspace,
> because it does a global examination of all file system data
> structures.
Indeed, consistency is a tricky issue, and we'll focus on that piece of
logic.
>> We do need to establish the mapping of physical blocks to inodes and
>> inodes to parent dir. By tree managed by jbd2 do you mean updating
>> the tree when committing to journal? Or are updates to the tree
>> logged to journal?
> When we allocate a block, we need to journal the changes to the
> allocation bitmap.  If we are going to also update the reverse mapping
> data structure, that needs to be journalled also, so that after a
> crash, the data structures are consistent.
>
> 						- Ted
>
Of course, we have to make sure that the metadata modification and the tree
update are in the same transaction, otherwise there is no guarantee that
the metadata is consistent.

Thank you for your input!

Regards,
Baokun


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2024-10-18  3:44 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-22  0:16 [syzbot] [ext4?] KASAN: out-of-bounds Read in ext4_xattr_set_entry syzbot
2024-09-22  5:46 ` Qianqiang Liu
2024-09-22  6:35   ` syzbot
2024-09-22  6:42     ` [PATCH] ext4: fix out-of-bounds issue " Qianqiang Liu
2024-10-01  9:41       ` Ojaswin Mujoo
2024-10-01 10:15         ` Qianqiang Liu
2024-10-02  6:27         ` Qianqiang Liu
2024-10-08  7:40       ` Baokun Li
2024-10-09 15:50         ` Jan Kara
2024-10-11  2:18           ` Baokun Li
2024-10-14 16:31             ` Jan Kara
2024-10-16  8:02               ` Baokun Li
2024-10-16 20:47                 ` Theodore Ts'o
2024-10-17 12:42                   ` Baokun Li
2024-10-17 14:47                     ` Theodore Ts'o
2024-10-18  3:44                       ` Baokun Li
2024-10-02  6:31     ` [syzbot] [ext4?] KASAN: out-of-bounds Read " Qianqiang Liu
2024-10-02  6:54       ` syzbot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).