[syzbot] [mm?] KASAN: slab-use-after-free Read in hugetlb

All of lore.kernel.org
 help / color / mirror / Atom feed

* [syzbot] [mm?] KASAN: slab-use-after-free Read in hugetlb_fault (2)
@ 2024-09-08 18:23 syzbot
  2024-09-09  9:57 ` Muchun Song
  0 siblings, 1 reply; 7+ messages in thread
From: syzbot @ 2024-09-08 18:23 UTC (permalink / raw)
  To: akpm, linux-kernel, linux-mm, muchun.song, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    88fac17500f4 Merge tag 'fuse-fixes-6.11-rc7' of git://git...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13291d97980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=660f6eb11f9c7dc5
dashboard link: https://syzkaller.appspot.com/bug?extid=2dab93857ee95f2eeb08
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/6dfa1c637f53/disk-88fac175.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/7a322b491698/vmlinux-88fac175.xz
kernel image: https://storage.googleapis.com/syzbot-assets/edc9184a3a97/bzImage-88fac175.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+2dab93857ee95f2eeb08@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: slab-use-after-free in __vma_shareable_lock include/linux/hugetlb.h:1278 [inline]
BUG: KASAN: slab-use-after-free in hugetlb_vma_unlock_read mm/hugetlb.c:281 [inline]
BUG: KASAN: slab-use-after-free in hugetlb_no_page mm/hugetlb.c:6380 [inline]
BUG: KASAN: slab-use-after-free in hugetlb_fault+0xfaf/0x3770 mm/hugetlb.c:6485
Read of size 8 at addr ffff88807c17f9d0 by task syz.0.4558/26998

CPU: 1 UID: 0 PID: 26998 Comm: syz.0.4558 Not tainted 6.11.0-rc6-syzkaller-00026-g88fac17500f4 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:93 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
 print_address_description mm/kasan/report.c:377 [inline]
 print_report+0x169/0x550 mm/kasan/report.c:488
 kasan_report+0x143/0x180 mm/kasan/report.c:601
 __vma_shareable_lock include/linux/hugetlb.h:1278 [inline]
 hugetlb_vma_unlock_read mm/hugetlb.c:281 [inline]
 hugetlb_no_page mm/hugetlb.c:6380 [inline]
 hugetlb_fault+0xfaf/0x3770 mm/hugetlb.c:6485
 handle_mm_fault+0x1901/0x1bc0 mm/memory.c:5830
 do_user_addr_fault arch/x86/mm/fault.c:1338 [inline]
 handle_page_fault arch/x86/mm/fault.c:1481 [inline]
 exc_page_fault+0x459/0x8c0 arch/x86/mm/fault.c:1539
 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
RIP: 0033:0x7f2b63744998
Code: fc 89 37 c3 c5 fa 6f 06 c5 fa 6f 4c 16 f0 c5 fa 7f 07 c5 fa 7f 4c 17 f0 c3 66 0f 1f 84 00 00 00 00 00 48 8b 4c 16 f8 48 8b 36 <48> 89 37 48 89 4c 17 f8 c3 c5 fe 6f 54 16 e0 c5 fe 6f 5c 16 c0 c5
RSP: 002b:00007f2b63a5fb88 EFLAGS: 00010206
RAX: 00000000200002c0 RBX: 0000000000000004 RCX: 00676e7277682f76
RDX: 000000000000000b RSI: 7277682f7665642f RDI: 00000000200002c0
RBP: 00007f2b63937a80 R08: 00007f2b63600000 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000009 R12: 000000000014aa5e
R13: 00007f2b63a5fc90 R14: 0000000000000032 R15: fffffffffffffffe
 </TASK>

Allocated by task 27000:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
 unpoison_slab_object mm/kasan/common.c:312 [inline]
 __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:338
 kasan_slab_alloc include/linux/kasan.h:201 [inline]
 slab_post_alloc_hook mm/slub.c:3988 [inline]
 slab_alloc_node mm/slub.c:4037 [inline]
 kmem_cache_alloc_noprof+0x135/0x2a0 mm/slub.c:4044
 vm_area_alloc+0x24/0x1d0 kernel/fork.c:471
 mmap_region+0xc3d/0x2090 mm/mmap.c:2944
 do_mmap+0x8f9/0x1010 mm/mmap.c:1468
 vm_mmap_pgoff+0x1dd/0x3d0 mm/util.c:588
 ksys_mmap_pgoff+0x544/0x720 mm/mmap.c:1514
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Freed by task 26255:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
 kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
 poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
 __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
 kasan_slab_free include/linux/kasan.h:184 [inline]
 slab_free_hook mm/slub.c:2252 [inline]
 slab_free mm/slub.c:4473 [inline]
 kmem_cache_free+0x145/0x350 mm/slub.c:4548
 rcu_do_batch kernel/rcu/tree.c:2569 [inline]
 rcu_core+0xafd/0x1830 kernel/rcu/tree.c:2843
 handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
 do_softirq+0x11b/0x1e0 kernel/softirq.c:455
 __local_bh_enable_ip+0x1bb/0x200 kernel/softirq.c:382
 spin_unlock_bh include/linux/spinlock.h:396 [inline]
 __fib6_clean_all+0x327/0x4b0 net/ipv6/ip6_fib.c:2277
 rt6_sync_down_dev net/ipv6/route.c:4908 [inline]
 rt6_disable_ip+0x164/0x7e0 net/ipv6/route.c:4913
 addrconf_ifdown+0x15d/0x1bd0 net/ipv6/addrconf.c:3856
 addrconf_notify+0x3cb/0x1020
 notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
 call_netdevice_notifiers_extack net/core/dev.c:2032 [inline]
 call_netdevice_notifiers net/core/dev.c:2046 [inline]
 dev_close_many+0x33c/0x4c0 net/core/dev.c:1587
 unregister_netdevice_many_notify+0x50b/0x1c40 net/core/dev.c:11327
 unregister_netdevice_many net/core/dev.c:11414 [inline]
 default_device_exit_batch+0xa0f/0xa90 net/core/dev.c:11897
 ops_exit_list net/core/net_namespace.c:178 [inline]
 cleanup_net+0x89d/0xcc0 net/core/net_namespace.c:640
 process_one_work kernel/workqueue.c:3231 [inline]
 process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
 worker_thread+0x86d/0xd10 kernel/workqueue.c:3389
 kthread+0x2f0/0x390 kernel/kthread.c:389
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

Last potentially related work creation:
 kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47
 __kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:541
 __call_rcu_common kernel/rcu/tree.c:3106 [inline]
 call_rcu+0x167/0xa70 kernel/rcu/tree.c:3210
 remove_vma mm/mmap.c:189 [inline]
 remove_mt mm/mmap.c:2415 [inline]
 do_vmi_align_munmap+0x155c/0x18c0 mm/mmap.c:2758
 do_vmi_munmap+0x261/0x2f0 mm/mmap.c:2830
 mmap_region+0x72f/0x2090 mm/mmap.c:2881
 do_mmap+0x8f9/0x1010 mm/mmap.c:1468
 vm_mmap_pgoff+0x1dd/0x3d0 mm/util.c:588
 ksys_mmap_pgoff+0x544/0x720 mm/mmap.c:1514
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

The buggy address belongs to the object at ffff88807c17f9b0
 which belongs to the cache vm_area_struct of size 184
The buggy address is located 32 bytes inside of
 freed 184-byte region [ffff88807c17f9b0, ffff88807c17fa68)

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7c17f
memcg:ffff888028997401
anon flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xfdffffff(slab)
raw: 00fff00000000000 ffff88801bafdb40 ffffea0001f89e00 000000000000000d
raw: 0000000000000000 0000000000100010 00000001fdffffff ffff888028997401
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x152cc0(GFP_USER|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 26741, tgid 26741 (dhcpcd-run-hook), ts 1341391347767, free_ts 1341166373745
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1493
 prep_new_page mm/page_alloc.c:1501 [inline]
 get_page_from_freelist+0x2e4c/0x2f10 mm/page_alloc.c:3439
 __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4695
 __alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
 alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
 alloc_slab_page+0x5f/0x120 mm/slub.c:2321
 allocate_slab+0x5a/0x2f0 mm/slub.c:2484
 new_slab mm/slub.c:2537 [inline]
 ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3723
 __slab_alloc+0x58/0xa0 mm/slub.c:3813
 __slab_alloc_node mm/slub.c:3866 [inline]
 slab_alloc_node mm/slub.c:4025 [inline]
 kmem_cache_alloc_noprof+0x1c1/0x2a0 mm/slub.c:4044
 vm_area_dup+0x27/0x290 kernel/fork.c:486
 dup_mmap kernel/fork.c:695 [inline]
 dup_mm kernel/fork.c:1672 [inline]
 copy_mm+0xc7b/0x1f30 kernel/fork.c:1721
 copy_process+0x187a/0x3dc0 kernel/fork.c:2374
 kernel_clone+0x226/0x8f0 kernel/fork.c:2781
 __do_sys_clone kernel/fork.c:2924 [inline]
 __se_sys_clone kernel/fork.c:2908 [inline]
 __x64_sys_clone+0x258/0x2a0 kernel/fork.c:2908
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
page last free pid 26730 tgid 26718 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 free_pages_prepare mm/page_alloc.c:1094 [inline]
 free_unref_page+0xd22/0xea0 mm/page_alloc.c:2612
 __folio_put+0x2c8/0x440 mm/swap.c:128
 migrate_folio_move mm/migrate.c:1330 [inline]
 migrate_pages_batch+0x2a76/0x3560 mm/migrate.c:1818
 migrate_pages_sync mm/migrate.c:1884 [inline]
 migrate_pages+0x1f59/0x3460 mm/migrate.c:1993
 do_mbind mm/mempolicy.c:1388 [inline]
 kernel_mbind mm/mempolicy.c:1531 [inline]
 __do_sys_mbind mm/mempolicy.c:1605 [inline]
 __se_sys_mbind+0x1490/0x19f0 mm/mempolicy.c:1601
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Memory state around the buggy address:
 ffff88807c17f880: fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00 00
 ffff88807c17f900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc
>ffff88807c17f980: fc fc fc fc fc fc fa fb fb fb fb fb fb fb fb fb
                                                 ^
 ffff88807c17fa00: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
 ffff88807c17fa80: fc fc fc fc fc 00 00 00 00 00 00 00 00 00 00 00
==================================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [syzbot] [mm?] KASAN: slab-use-after-free Read in hugetlb_fault (2)
  2024-09-08 18:23 [syzbot] [mm?] KASAN: slab-use-after-free Read in hugetlb_fault (2) syzbot
@ 2024-09-09  9:57 ` Muchun Song
  2024-09-09 21:06   ` Vishal Moola
  0 siblings, 1 reply; 7+ messages in thread
From: Muchun Song @ 2024-09-09  9:57 UTC (permalink / raw)
  To: syzbot
  Cc: Andrew Morton, LKML, Linux Memory Management List, syzkaller-bugs,
	Vishal Moola



> On Sep 9, 2024, at 02:23, syzbot <syzbot+2dab93857ee95f2eeb08@syzkaller.appspotmail.com> wrote:
> 
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    88fac17500f4 Merge tag 'fuse-fixes-6.11-rc7' of git://git...
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=13291d97980000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=660f6eb11f9c7dc5
> dashboard link: https://syzkaller.appspot.com/bug?extid=2dab93857ee95f2eeb08
> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> 
> Unfortunately, I don't have any reproducer for this issue yet.
> 
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/6dfa1c637f53/disk-88fac175.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/7a322b491698/vmlinux-88fac175.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/edc9184a3a97/bzImage-88fac175.xz
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+2dab93857ee95f2eeb08@syzkaller.appspotmail.com
> 
> ==================================================================
> BUG: KASAN: slab-use-after-free in __vma_shareable_lock include/linux/hugetlb.h:1278 [inline]

This is accessing vma structure.

> BUG: KASAN: slab-use-after-free in hugetlb_vma_unlock_read mm/hugetlb.c:281 [inline]
> BUG: KASAN: slab-use-after-free in hugetlb_no_page mm/hugetlb.c:6380 [inline]
> BUG: KASAN: slab-use-after-free in hugetlb_fault+0xfaf/0x3770 mm/hugetlb.c:6485
> Read of size 8 at addr ffff88807c17f9d0 by task syz.0.4558/26998
> 
> CPU: 1 UID: 0 PID: 26998 Comm: syz.0.4558 Not tainted 6.11.0-rc6-syzkaller-00026-g88fac17500f4 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
> Call Trace:
> <TASK>
> __dump_stack lib/dump_stack.c:93 [inline]
> dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
> print_address_description mm/kasan/report.c:377 [inline]
> print_report+0x169/0x550 mm/kasan/report.c:488
> kasan_report+0x143/0x180 mm/kasan/report.c:601
> __vma_shareable_lock include/linux/hugetlb.h:1278 [inline]
> hugetlb_vma_unlock_read mm/hugetlb.c:281 [inline]

I think vma is freed before this call of hugetlb_vma_unlock_read()
but after hugetlb_vma_lock_read() in hugetlb_fault(). I found a
possible scenario to cause this problem.

hugetlb_no_page()
	ret = vmf_anon_prepare()
		if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
			if (!mmap_read_trylock(vma->vm_mm)) {
				vma_end_read(vma);
				// VMA lock is released, which could be freed before the call of hugetlb_vma_unlock_read().
				return VM_FAULT_RETRY;
			}
		}
	if (unlikely(ret))
		goto out;
out:
	hugetlb_vma_unlock_read(vma); // UAF of VMA

The culprit commit should be
	
	7c43a553792a1 ("hugetlb: allow faults to be handled under the VMA lock").

I will take a closer look at the solution tomorrow. And Cc the author of the
above commit, maybe have some comments on this.

Muchun,
Thanks.

> hugetlb_no_page mm/hugetlb.c:6380 [inline]
> hugetlb_fault+0xfaf/0x3770 mm/hugetlb.c:6485
> handle_mm_fault+0x1901/0x1bc0 mm/memory.c:5830
> do_user_addr_fault arch/x86/mm/fault.c:1338 [inline]
> handle_page_fault arch/x86/mm/fault.c:1481 [inline]
> exc_page_fault+0x459/0x8c0 arch/x86/mm/fault.c:1539
> asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
> RIP: 0033:0x7f2b63744998
> Code: fc 89 37 c3 c5 fa 6f 06 c5 fa 6f 4c 16 f0 c5 fa 7f 07 c5 fa 7f 4c 17 f0 c3 66 0f 1f 84 00 00 00 00 00 48 8b 4c 16 f8 48 8b 36 <48> 89 37 48 89 4c 17 f8 c3 c5 fe 6f 54 16 e0 c5 fe 6f 5c 16 c0 c5
> RSP: 002b:00007f2b63a5fb88 EFLAGS: 00010206
> RAX: 00000000200002c0 RBX: 0000000000000004 RCX: 00676e7277682f76
> RDX: 000000000000000b RSI: 7277682f7665642f RDI: 00000000200002c0
> RBP: 00007f2b63937a80 R08: 00007f2b63600000 R09: 0000000000000001
> R10: 0000000000000001 R11: 0000000000000009 R12: 000000000014aa5e
> R13: 00007f2b63a5fc90 R14: 0000000000000032 R15: fffffffffffffffe
> </TASK>
> 
> Allocated by task 27000:
> kasan_save_stack mm/kasan/common.c:47 [inline]
> kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
> unpoison_slab_object mm/kasan/common.c:312 [inline]
> __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:338
> kasan_slab_alloc include/linux/kasan.h:201 [inline]
> slab_post_alloc_hook mm/slub.c:3988 [inline]
> slab_alloc_node mm/slub.c:4037 [inline]
> kmem_cache_alloc_noprof+0x135/0x2a0 mm/slub.c:4044
> vm_area_alloc+0x24/0x1d0 kernel/fork.c:471
> mmap_region+0xc3d/0x2090 mm/mmap.c:2944
> do_mmap+0x8f9/0x1010 mm/mmap.c:1468
> vm_mmap_pgoff+0x1dd/0x3d0 mm/util.c:588
> ksys_mmap_pgoff+0x544/0x720 mm/mmap.c:1514
> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> Freed by task 26255:
> kasan_save_stack mm/kasan/common.c:47 [inline]
> kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
> kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
> poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
> __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
> kasan_slab_free include/linux/kasan.h:184 [inline]
> slab_free_hook mm/slub.c:2252 [inline]
> slab_free mm/slub.c:4473 [inline]
> kmem_cache_free+0x145/0x350 mm/slub.c:4548
> rcu_do_batch kernel/rcu/tree.c:2569 [inline]
> rcu_core+0xafd/0x1830 kernel/rcu/tree.c:2843

VMA structure is freed via rcu, so it is really a UAF problem.

> handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
> do_softirq+0x11b/0x1e0 kernel/softirq.c:455
> __local_bh_enable_ip+0x1bb/0x200 kernel/softirq.c:382
> spin_unlock_bh include/linux/spinlock.h:396 [inline]
> __fib6_clean_all+0x327/0x4b0 net/ipv6/ip6_fib.c:2277
> rt6_sync_down_dev net/ipv6/route.c:4908 [inline]
> rt6_disable_ip+0x164/0x7e0 net/ipv6/route.c:4913
> addrconf_ifdown+0x15d/0x1bd0 net/ipv6/addrconf.c:3856
> addrconf_notify+0x3cb/0x1020
> notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
> call_netdevice_notifiers_extack net/core/dev.c:2032 [inline]
> call_netdevice_notifiers net/core/dev.c:2046 [inline]
> dev_close_many+0x33c/0x4c0 net/core/dev.c:1587
> unregister_netdevice_many_notify+0x50b/0x1c40 net/core/dev.c:11327
> unregister_netdevice_many net/core/dev.c:11414 [inline]
> default_device_exit_batch+0xa0f/0xa90 net/core/dev.c:11897
> ops_exit_list net/core/net_namespace.c:178 [inline]
> cleanup_net+0x89d/0xcc0 net/core/net_namespace.c:640
> process_one_work kernel/workqueue.c:3231 [inline]
> process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
> worker_thread+0x86d/0xd10 kernel/workqueue.c:3389
> kthread+0x2f0/0x390 kernel/kthread.c:389
> ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> 
> Last potentially related work creation:
> kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47
> __kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:541
> __call_rcu_common kernel/rcu/tree.c:3106 [inline]
> call_rcu+0x167/0xa70 kernel/rcu/tree.c:3210
> remove_vma mm/mmap.c:189 [inline]
> remove_mt mm/mmap.c:2415 [inline]
> do_vmi_align_munmap+0x155c/0x18c0 mm/mmap.c:2758
> do_vmi_munmap+0x261/0x2f0 mm/mmap.c:2830
> mmap_region+0x72f/0x2090 mm/mmap.c:2881
> do_mmap+0x8f9/0x1010 mm/mmap.c:1468
> vm_mmap_pgoff+0x1dd/0x3d0 mm/util.c:588
> ksys_mmap_pgoff+0x544/0x720 mm/mmap.c:1514
> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> The buggy address belongs to the object at ffff88807c17f9b0
> which belongs to the cache vm_area_struct of size 184
> The buggy address is located 32 bytes inside of
> freed 184-byte region [ffff88807c17f9b0, ffff88807c17fa68)
> 
> The buggy address belongs to the physical page:
> page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7c17f
> memcg:ffff888028997401
> anon flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
> page_type: 0xfdffffff(slab)
> raw: 00fff00000000000 ffff88801bafdb40 ffffea0001f89e00 000000000000000d
> raw: 0000000000000000 0000000000100010 00000001fdffffff ffff888028997401
> page dumped because: kasan: bad access detected
> page_owner tracks the page as allocated
> page last allocated via order 0, migratetype Unmovable, gfp_mask 0x152cc0(GFP_USER|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 26741, tgid 26741 (dhcpcd-run-hook), ts 1341391347767, free_ts 1341166373745
> set_page_owner include/linux/page_owner.h:32 [inline]
> post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1493
> prep_new_page mm/page_alloc.c:1501 [inline]
> get_page_from_freelist+0x2e4c/0x2f10 mm/page_alloc.c:3439
> __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4695
> __alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
> alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
> alloc_slab_page+0x5f/0x120 mm/slub.c:2321
> allocate_slab+0x5a/0x2f0 mm/slub.c:2484
> new_slab mm/slub.c:2537 [inline]
> ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3723
> __slab_alloc+0x58/0xa0 mm/slub.c:3813
> __slab_alloc_node mm/slub.c:3866 [inline]
> slab_alloc_node mm/slub.c:4025 [inline]
> kmem_cache_alloc_noprof+0x1c1/0x2a0 mm/slub.c:4044
> vm_area_dup+0x27/0x290 kernel/fork.c:486
> dup_mmap kernel/fork.c:695 [inline]
> dup_mm kernel/fork.c:1672 [inline]
> copy_mm+0xc7b/0x1f30 kernel/fork.c:1721
> copy_process+0x187a/0x3dc0 kernel/fork.c:2374
> kernel_clone+0x226/0x8f0 kernel/fork.c:2781
> __do_sys_clone kernel/fork.c:2924 [inline]
> __se_sys_clone kernel/fork.c:2908 [inline]
> __x64_sys_clone+0x258/0x2a0 kernel/fork.c:2908
> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> page last free pid 26730 tgid 26718 stack trace:
> reset_page_owner include/linux/page_owner.h:25 [inline]
> free_pages_prepare mm/page_alloc.c:1094 [inline]
> free_unref_page+0xd22/0xea0 mm/page_alloc.c:2612
> __folio_put+0x2c8/0x440 mm/swap.c:128
> migrate_folio_move mm/migrate.c:1330 [inline]
> migrate_pages_batch+0x2a76/0x3560 mm/migrate.c:1818
> migrate_pages_sync mm/migrate.c:1884 [inline]
> migrate_pages+0x1f59/0x3460 mm/migrate.c:1993
> do_mbind mm/mempolicy.c:1388 [inline]
> kernel_mbind mm/mempolicy.c:1531 [inline]
> __do_sys_mbind mm/mempolicy.c:1605 [inline]
> __se_sys_mbind+0x1490/0x19f0 mm/mempolicy.c:1601
> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> Memory state around the buggy address:
> ffff88807c17f880: fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00 00
> ffff88807c17f900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc
>> ffff88807c17f980: fc fc fc fc fc fc fa fb fb fb fb fb fb fb fb fb
>                                                 ^
> ffff88807c17fa00: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
> ffff88807c17fa80: fc fc fc fc fc 00 00 00 00 00 00 00 00 00 00 00
> ==================================================================
> 
> 
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
> 
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> 
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
> 
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
> 
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
> 
> If you want to undo deduplication, reply with:
> #syz undup



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [syzbot] [mm?] KASAN: slab-use-after-free Read in hugetlb_fault (2)
  2024-09-09  9:57 ` Muchun Song
@ 2024-09-09 21:06   ` Vishal Moola
  2024-09-10 19:27     ` Vishal Moola
  0 siblings, 1 reply; 7+ messages in thread
From: Vishal Moola @ 2024-09-09 21:06 UTC (permalink / raw)
  To: Muchun Song
  Cc: syzbot, Andrew Morton, LKML, Linux Memory Management List,
	syzkaller-bugs

On Mon, Sep 09, 2024 at 05:57:52PM +0800, Muchun Song wrote:
> 
> 
> > On Sep 9, 2024, at 02:23, syzbot <syzbot+2dab93857ee95f2eeb08@syzkaller.appspotmail.com> wrote:
> > 
> > Hello,
> > 
> > syzbot found the following issue on:
> > 
> > HEAD commit:    88fac17500f4 Merge tag 'fuse-fixes-6.11-rc7' of git://git...
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=13291d97980000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=660f6eb11f9c7dc5
> > dashboard link: https://syzkaller.appspot.com/bug?extid=2dab93857ee95f2eeb08
> > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > 
> > Unfortunately, I don't have any reproducer for this issue yet.
> > 
> > Downloadable assets:
> > disk image: https://storage.googleapis.com/syzbot-assets/6dfa1c637f53/disk-88fac175.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/7a322b491698/vmlinux-88fac175.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/edc9184a3a97/bzImage-88fac175.xz
> > 
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+2dab93857ee95f2eeb08@syzkaller.appspotmail.com
> > 
> > ==================================================================
> > BUG: KASAN: slab-use-after-free in __vma_shareable_lock include/linux/hugetlb.h:1278 [inline]
> 
> This is accessing vma structure.
> 
> > BUG: KASAN: slab-use-after-free in hugetlb_vma_unlock_read mm/hugetlb.c:281 [inline]
> > BUG: KASAN: slab-use-after-free in hugetlb_no_page mm/hugetlb.c:6380 [inline]
> > BUG: KASAN: slab-use-after-free in hugetlb_fault+0xfaf/0x3770 mm/hugetlb.c:6485
> > Read of size 8 at addr ffff88807c17f9d0 by task syz.0.4558/26998
> > 
> > CPU: 1 UID: 0 PID: 26998 Comm: syz.0.4558 Not tainted 6.11.0-rc6-syzkaller-00026-g88fac17500f4 #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
> > Call Trace:
> > <TASK>
> > __dump_stack lib/dump_stack.c:93 [inline]
> > dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
> > print_address_description mm/kasan/report.c:377 [inline]
> > print_report+0x169/0x550 mm/kasan/report.c:488
> > kasan_report+0x143/0x180 mm/kasan/report.c:601
> > __vma_shareable_lock include/linux/hugetlb.h:1278 [inline]
> > hugetlb_vma_unlock_read mm/hugetlb.c:281 [inline]
> 
> I think vma is freed before this call of hugetlb_vma_unlock_read()
> but after hugetlb_vma_lock_read() in hugetlb_fault(). I found a
> possible scenario to cause this problem.
> 
> hugetlb_no_page()
> 	ret = vmf_anon_prepare()
> 		if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
> 			if (!mmap_read_trylock(vma->vm_mm)) {
> 				vma_end_read(vma);
> 				// VMA lock is released, which could be freed before the call of hugetlb_vma_unlock_read().
> 				return VM_FAULT_RETRY;
> 			}
> 		}
> 	if (unlikely(ret))
> 		goto out;
> out:
> 	hugetlb_vma_unlock_read(vma); // UAF of VMA

Thanks for catching this, it indeed looks like the problem. I don't
think its easy to reproduce since we would have to unmap the vma while
a fault is being handled (and failing).

This same issue should be present in hugetlb_wp() as well, so I'm thinking
the best fix would be to make another function similar to
vmf_anon_prepare() that doesn't release the vma lock. Then wait to drop
the lock until hugetlb_vma_unlock_read() is called.

I'll have that fix out tomorrow.

> The culprit commit should be
> 	
> 	7c43a553792a1 ("hugetlb: allow faults to be handled under the VMA lock").
> 
> I will take a closer look at the solution tomorrow. And Cc the author of the
> above commit, maybe have some comments on this.
> 
> Muchun,
> Thanks.
> 
> > hugetlb_no_page mm/hugetlb.c:6380 [inline]
> > hugetlb_fault+0xfaf/0x3770 mm/hugetlb.c:6485
> > handle_mm_fault+0x1901/0x1bc0 mm/memory.c:5830
> > do_user_addr_fault arch/x86/mm/fault.c:1338 [inline]
> > handle_page_fault arch/x86/mm/fault.c:1481 [inline]
> > exc_page_fault+0x459/0x8c0 arch/x86/mm/fault.c:1539
> > asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
> > RIP: 0033:0x7f2b63744998
> > Code: fc 89 37 c3 c5 fa 6f 06 c5 fa 6f 4c 16 f0 c5 fa 7f 07 c5 fa 7f 4c 17 f0 c3 66 0f 1f 84 00 00 00 00 00 48 8b 4c 16 f8 48 8b 36 <48> 89 37 48 89 4c 17 f8 c3 c5 fe 6f 54 16 e0 c5 fe 6f 5c 16 c0 c5
> > RSP: 002b:00007f2b63a5fb88 EFLAGS: 00010206
> > RAX: 00000000200002c0 RBX: 0000000000000004 RCX: 00676e7277682f76
> > RDX: 000000000000000b RSI: 7277682f7665642f RDI: 00000000200002c0
> > RBP: 00007f2b63937a80 R08: 00007f2b63600000 R09: 0000000000000001
> > R10: 0000000000000001 R11: 0000000000000009 R12: 000000000014aa5e
> > R13: 00007f2b63a5fc90 R14: 0000000000000032 R15: fffffffffffffffe
> > </TASK>
> > 
> > Allocated by task 27000:
> > kasan_save_stack mm/kasan/common.c:47 [inline]
> > kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
> > unpoison_slab_object mm/kasan/common.c:312 [inline]
> > __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:338
> > kasan_slab_alloc include/linux/kasan.h:201 [inline]
> > slab_post_alloc_hook mm/slub.c:3988 [inline]
> > slab_alloc_node mm/slub.c:4037 [inline]
> > kmem_cache_alloc_noprof+0x135/0x2a0 mm/slub.c:4044
> > vm_area_alloc+0x24/0x1d0 kernel/fork.c:471
> > mmap_region+0xc3d/0x2090 mm/mmap.c:2944
> > do_mmap+0x8f9/0x1010 mm/mmap.c:1468
> > vm_mmap_pgoff+0x1dd/0x3d0 mm/util.c:588
> > ksys_mmap_pgoff+0x544/0x720 mm/mmap.c:1514
> > do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> > entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > 
> > Freed by task 26255:
> > kasan_save_stack mm/kasan/common.c:47 [inline]
> > kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
> > kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
> > poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
> > __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
> > kasan_slab_free include/linux/kasan.h:184 [inline]
> > slab_free_hook mm/slub.c:2252 [inline]
> > slab_free mm/slub.c:4473 [inline]
> > kmem_cache_free+0x145/0x350 mm/slub.c:4548
> > rcu_do_batch kernel/rcu/tree.c:2569 [inline]
> > rcu_core+0xafd/0x1830 kernel/rcu/tree.c:2843
> 
> VMA structure is freed via rcu, so it is really a UAF problem.
> 
> > handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
> > do_softirq+0x11b/0x1e0 kernel/softirq.c:455
> > __local_bh_enable_ip+0x1bb/0x200 kernel/softirq.c:382
> > spin_unlock_bh include/linux/spinlock.h:396 [inline]
> > __fib6_clean_all+0x327/0x4b0 net/ipv6/ip6_fib.c:2277
> > rt6_sync_down_dev net/ipv6/route.c:4908 [inline]
> > rt6_disable_ip+0x164/0x7e0 net/ipv6/route.c:4913
> > addrconf_ifdown+0x15d/0x1bd0 net/ipv6/addrconf.c:3856
> > addrconf_notify+0x3cb/0x1020
> > notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
> > call_netdevice_notifiers_extack net/core/dev.c:2032 [inline]
> > call_netdevice_notifiers net/core/dev.c:2046 [inline]
> > dev_close_many+0x33c/0x4c0 net/core/dev.c:1587
> > unregister_netdevice_many_notify+0x50b/0x1c40 net/core/dev.c:11327
> > unregister_netdevice_many net/core/dev.c:11414 [inline]
> > default_device_exit_batch+0xa0f/0xa90 net/core/dev.c:11897
> > ops_exit_list net/core/net_namespace.c:178 [inline]
> > cleanup_net+0x89d/0xcc0 net/core/net_namespace.c:640
> > process_one_work kernel/workqueue.c:3231 [inline]
> > process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
> > worker_thread+0x86d/0xd10 kernel/workqueue.c:3389
> > kthread+0x2f0/0x390 kernel/kthread.c:389
> > ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
> > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> > 
> > Last potentially related work creation:
> > kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47
> > __kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:541
> > __call_rcu_common kernel/rcu/tree.c:3106 [inline]
> > call_rcu+0x167/0xa70 kernel/rcu/tree.c:3210
> > remove_vma mm/mmap.c:189 [inline]
> > remove_mt mm/mmap.c:2415 [inline]
> > do_vmi_align_munmap+0x155c/0x18c0 mm/mmap.c:2758
> > do_vmi_munmap+0x261/0x2f0 mm/mmap.c:2830
> > mmap_region+0x72f/0x2090 mm/mmap.c:2881
> > do_mmap+0x8f9/0x1010 mm/mmap.c:1468
> > vm_mmap_pgoff+0x1dd/0x3d0 mm/util.c:588
> > ksys_mmap_pgoff+0x544/0x720 mm/mmap.c:1514
> > do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> > entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > 
> > The buggy address belongs to the object at ffff88807c17f9b0
> > which belongs to the cache vm_area_struct of size 184
> > The buggy address is located 32 bytes inside of
> > freed 184-byte region [ffff88807c17f9b0, ffff88807c17fa68)
> > 
> > The buggy address belongs to the physical page:
> > page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7c17f
> > memcg:ffff888028997401
> > anon flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
> > page_type: 0xfdffffff(slab)
> > raw: 00fff00000000000 ffff88801bafdb40 ffffea0001f89e00 000000000000000d
> > raw: 0000000000000000 0000000000100010 00000001fdffffff ffff888028997401
> > page dumped because: kasan: bad access detected
> > page_owner tracks the page as allocated
> > page last allocated via order 0, migratetype Unmovable, gfp_mask 0x152cc0(GFP_USER|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 26741, tgid 26741 (dhcpcd-run-hook), ts 1341391347767, free_ts 1341166373745
> > set_page_owner include/linux/page_owner.h:32 [inline]
> > post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1493
> > prep_new_page mm/page_alloc.c:1501 [inline]
> > get_page_from_freelist+0x2e4c/0x2f10 mm/page_alloc.c:3439
> > __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4695
> > __alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
> > alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
> > alloc_slab_page+0x5f/0x120 mm/slub.c:2321
> > allocate_slab+0x5a/0x2f0 mm/slub.c:2484
> > new_slab mm/slub.c:2537 [inline]
> > ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3723
> > __slab_alloc+0x58/0xa0 mm/slub.c:3813
> > __slab_alloc_node mm/slub.c:3866 [inline]
> > slab_alloc_node mm/slub.c:4025 [inline]
> > kmem_cache_alloc_noprof+0x1c1/0x2a0 mm/slub.c:4044
> > vm_area_dup+0x27/0x290 kernel/fork.c:486
> > dup_mmap kernel/fork.c:695 [inline]
> > dup_mm kernel/fork.c:1672 [inline]
> > copy_mm+0xc7b/0x1f30 kernel/fork.c:1721
> > copy_process+0x187a/0x3dc0 kernel/fork.c:2374
> > kernel_clone+0x226/0x8f0 kernel/fork.c:2781
> > __do_sys_clone kernel/fork.c:2924 [inline]
> > __se_sys_clone kernel/fork.c:2908 [inline]
> > __x64_sys_clone+0x258/0x2a0 kernel/fork.c:2908
> > do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> > entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > page last free pid 26730 tgid 26718 stack trace:
> > reset_page_owner include/linux/page_owner.h:25 [inline]
> > free_pages_prepare mm/page_alloc.c:1094 [inline]
> > free_unref_page+0xd22/0xea0 mm/page_alloc.c:2612
> > __folio_put+0x2c8/0x440 mm/swap.c:128
> > migrate_folio_move mm/migrate.c:1330 [inline]
> > migrate_pages_batch+0x2a76/0x3560 mm/migrate.c:1818
> > migrate_pages_sync mm/migrate.c:1884 [inline]
> > migrate_pages+0x1f59/0x3460 mm/migrate.c:1993
> > do_mbind mm/mempolicy.c:1388 [inline]
> > kernel_mbind mm/mempolicy.c:1531 [inline]
> > __do_sys_mbind mm/mempolicy.c:1605 [inline]
> > __se_sys_mbind+0x1490/0x19f0 mm/mempolicy.c:1601
> > do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> > entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > 
> > Memory state around the buggy address:
> > ffff88807c17f880: fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00 00
> > ffff88807c17f900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc
> >> ffff88807c17f980: fc fc fc fc fc fc fa fb fb fb fb fb fb fb fb fb
> >                                                 ^
> > ffff88807c17fa00: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
> > ffff88807c17fa80: fc fc fc fc fc 00 00 00 00 00 00 00 00 00 00 00
> > ==================================================================
> > 
> > 
> > ---
> > This report is generated by a bot. It may contain errors.
> > See https://goo.gl/tpsmEJ for more information about syzbot.
> > syzbot engineers can be reached at syzkaller@googlegroups.com.
> > 
> > syzbot will keep track of this issue. See:
> > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> > 
> > If the report is already addressed, let syzbot know by replying with:
> > #syz fix: exact-commit-title
> > 
> > If you want to overwrite report's subsystems, reply with:
> > #syz set subsystems: new-subsystem
> > (See the list of subsystem names on the web dashboard)
> > 
> > If the report is a duplicate of another one, reply with:
> > #syz dup: exact-subject-of-another-report
> > 
> > If you want to undo deduplication, reply with:
> > #syz undup
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [syzbot] [mm?] KASAN: slab-use-after-free Read in hugetlb_fault (2)
  2024-09-09 21:06   ` Vishal Moola
@ 2024-09-10 19:27     ` Vishal Moola
  2024-09-14  5:50       ` Muchun Song
  0 siblings, 1 reply; 7+ messages in thread
From: Vishal Moola @ 2024-09-10 19:27 UTC (permalink / raw)
  To: Muchun Song
  Cc: syzbot, Andrew Morton, LKML, Linux Memory Management List,
	syzkaller-bugs

[-- Attachment #1: Type: text/plain, Size: 13359 bytes --]

On Mon, Sep 09, 2024 at 02:06:13PM -0700, Vishal Moola wrote:
> On Mon, Sep 09, 2024 at 05:57:52PM +0800, Muchun Song wrote:
> > 
> > 
> > > On Sep 9, 2024, at 02:23, syzbot <syzbot+2dab93857ee95f2eeb08@syzkaller.appspotmail.com> wrote:
> > > 
> > > Hello,
> > > 
> > > syzbot found the following issue on:
> > > 
> > > HEAD commit:    88fac17500f4 Merge tag 'fuse-fixes-6.11-rc7' of git://git...
> > > git tree:       upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=13291d97980000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=660f6eb11f9c7dc5
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=2dab93857ee95f2eeb08
> > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > 
> > > Unfortunately, I don't have any reproducer for this issue yet.
> > > 
> > > Downloadable assets:
> > > disk image: https://storage.googleapis.com/syzbot-assets/6dfa1c637f53/disk-88fac175.raw.xz
> > > vmlinux: https://storage.googleapis.com/syzbot-assets/7a322b491698/vmlinux-88fac175.xz
> > > kernel image: https://storage.googleapis.com/syzbot-assets/edc9184a3a97/bzImage-88fac175.xz
> > > 
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+2dab93857ee95f2eeb08@syzkaller.appspotmail.com
> > > 
> > > ==================================================================
> > > BUG: KASAN: slab-use-after-free in __vma_shareable_lock include/linux/hugetlb.h:1278 [inline]
> > 
> > This is accessing vma structure.
> > 
> > > BUG: KASAN: slab-use-after-free in hugetlb_vma_unlock_read mm/hugetlb.c:281 [inline]
> > > BUG: KASAN: slab-use-after-free in hugetlb_no_page mm/hugetlb.c:6380 [inline]
> > > BUG: KASAN: slab-use-after-free in hugetlb_fault+0xfaf/0x3770 mm/hugetlb.c:6485
> > > Read of size 8 at addr ffff88807c17f9d0 by task syz.0.4558/26998
> > > 
> > > CPU: 1 UID: 0 PID: 26998 Comm: syz.0.4558 Not tainted 6.11.0-rc6-syzkaller-00026-g88fac17500f4 #0
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
> > > Call Trace:
> > > <TASK>
> > > __dump_stack lib/dump_stack.c:93 [inline]
> > > dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
> > > print_address_description mm/kasan/report.c:377 [inline]
> > > print_report+0x169/0x550 mm/kasan/report.c:488
> > > kasan_report+0x143/0x180 mm/kasan/report.c:601
> > > __vma_shareable_lock include/linux/hugetlb.h:1278 [inline]
> > > hugetlb_vma_unlock_read mm/hugetlb.c:281 [inline]
> > 
> > I think vma is freed before this call of hugetlb_vma_unlock_read()
> > but after hugetlb_vma_lock_read() in hugetlb_fault(). I found a
> > possible scenario to cause this problem.
> > 
> > hugetlb_no_page()
> > 	ret = vmf_anon_prepare()
> > 		if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
> > 			if (!mmap_read_trylock(vma->vm_mm)) {
> > 				vma_end_read(vma);
> > 				// VMA lock is released, which could be freed before the call of hugetlb_vma_unlock_read().
> > 				return VM_FAULT_RETRY;
> > 			}
> > 		}
> > 	if (unlikely(ret))
> > 		goto out;
> > out:
> > 	hugetlb_vma_unlock_read(vma); // UAF of VMA
> 
> Thanks for catching this, it indeed looks like the problem. I don't
> think its easy to reproduce since we would have to unmap the vma while
> a fault is being handled (and failing).
> 
> This same issue should be present in hugetlb_wp() as well, so I'm thinking
> the best fix would be to make another function similar to
> vmf_anon_prepare() that doesn't release the vma lock. Then wait to drop
> the lock until hugetlb_vma_unlock_read() is called.
> 
> I'll have that fix out tomorrow.

The 2 attached patches should fix this.

> > The culprit commit should be
> > 	
> > 	7c43a553792a1 ("hugetlb: allow faults to be handled under the VMA lock").
> > 
> > I will take a closer look at the solution tomorrow. And Cc the author of the
> > above commit, maybe have some comments on this.
> > 
> > Muchun,
> > Thanks.
> > 
> > > hugetlb_no_page mm/hugetlb.c:6380 [inline]
> > > hugetlb_fault+0xfaf/0x3770 mm/hugetlb.c:6485
> > > handle_mm_fault+0x1901/0x1bc0 mm/memory.c:5830
> > > do_user_addr_fault arch/x86/mm/fault.c:1338 [inline]
> > > handle_page_fault arch/x86/mm/fault.c:1481 [inline]
> > > exc_page_fault+0x459/0x8c0 arch/x86/mm/fault.c:1539
> > > asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
> > > RIP: 0033:0x7f2b63744998
> > > Code: fc 89 37 c3 c5 fa 6f 06 c5 fa 6f 4c 16 f0 c5 fa 7f 07 c5 fa 7f 4c 17 f0 c3 66 0f 1f 84 00 00 00 00 00 48 8b 4c 16 f8 48 8b 36 <48> 89 37 48 89 4c 17 f8 c3 c5 fe 6f 54 16 e0 c5 fe 6f 5c 16 c0 c5
> > > RSP: 002b:00007f2b63a5fb88 EFLAGS: 00010206
> > > RAX: 00000000200002c0 RBX: 0000000000000004 RCX: 00676e7277682f76
> > > RDX: 000000000000000b RSI: 7277682f7665642f RDI: 00000000200002c0
> > > RBP: 00007f2b63937a80 R08: 00007f2b63600000 R09: 0000000000000001
> > > R10: 0000000000000001 R11: 0000000000000009 R12: 000000000014aa5e
> > > R13: 00007f2b63a5fc90 R14: 0000000000000032 R15: fffffffffffffffe
> > > </TASK>
> > > 
> > > Allocated by task 27000:
> > > kasan_save_stack mm/kasan/common.c:47 [inline]
> > > kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
> > > unpoison_slab_object mm/kasan/common.c:312 [inline]
> > > __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:338
> > > kasan_slab_alloc include/linux/kasan.h:201 [inline]
> > > slab_post_alloc_hook mm/slub.c:3988 [inline]
> > > slab_alloc_node mm/slub.c:4037 [inline]
> > > kmem_cache_alloc_noprof+0x135/0x2a0 mm/slub.c:4044
> > > vm_area_alloc+0x24/0x1d0 kernel/fork.c:471
> > > mmap_region+0xc3d/0x2090 mm/mmap.c:2944
> > > do_mmap+0x8f9/0x1010 mm/mmap.c:1468
> > > vm_mmap_pgoff+0x1dd/0x3d0 mm/util.c:588
> > > ksys_mmap_pgoff+0x544/0x720 mm/mmap.c:1514
> > > do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > > do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> > > entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > > 
> > > Freed by task 26255:
> > > kasan_save_stack mm/kasan/common.c:47 [inline]
> > > kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
> > > kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
> > > poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
> > > __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
> > > kasan_slab_free include/linux/kasan.h:184 [inline]
> > > slab_free_hook mm/slub.c:2252 [inline]
> > > slab_free mm/slub.c:4473 [inline]
> > > kmem_cache_free+0x145/0x350 mm/slub.c:4548
> > > rcu_do_batch kernel/rcu/tree.c:2569 [inline]
> > > rcu_core+0xafd/0x1830 kernel/rcu/tree.c:2843
> > 
> > VMA structure is freed via rcu, so it is really a UAF problem.
> > 
> > > handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
> > > do_softirq+0x11b/0x1e0 kernel/softirq.c:455
> > > __local_bh_enable_ip+0x1bb/0x200 kernel/softirq.c:382
> > > spin_unlock_bh include/linux/spinlock.h:396 [inline]
> > > __fib6_clean_all+0x327/0x4b0 net/ipv6/ip6_fib.c:2277
> > > rt6_sync_down_dev net/ipv6/route.c:4908 [inline]
> > > rt6_disable_ip+0x164/0x7e0 net/ipv6/route.c:4913
> > > addrconf_ifdown+0x15d/0x1bd0 net/ipv6/addrconf.c:3856
> > > addrconf_notify+0x3cb/0x1020
> > > notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
> > > call_netdevice_notifiers_extack net/core/dev.c:2032 [inline]
> > > call_netdevice_notifiers net/core/dev.c:2046 [inline]
> > > dev_close_many+0x33c/0x4c0 net/core/dev.c:1587
> > > unregister_netdevice_many_notify+0x50b/0x1c40 net/core/dev.c:11327
> > > unregister_netdevice_many net/core/dev.c:11414 [inline]
> > > default_device_exit_batch+0xa0f/0xa90 net/core/dev.c:11897
> > > ops_exit_list net/core/net_namespace.c:178 [inline]
> > > cleanup_net+0x89d/0xcc0 net/core/net_namespace.c:640
> > > process_one_work kernel/workqueue.c:3231 [inline]
> > > process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
> > > worker_thread+0x86d/0xd10 kernel/workqueue.c:3389
> > > kthread+0x2f0/0x390 kernel/kthread.c:389
> > > ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
> > > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> > > 
> > > Last potentially related work creation:
> > > kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47
> > > __kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:541
> > > __call_rcu_common kernel/rcu/tree.c:3106 [inline]
> > > call_rcu+0x167/0xa70 kernel/rcu/tree.c:3210
> > > remove_vma mm/mmap.c:189 [inline]
> > > remove_mt mm/mmap.c:2415 [inline]
> > > do_vmi_align_munmap+0x155c/0x18c0 mm/mmap.c:2758
> > > do_vmi_munmap+0x261/0x2f0 mm/mmap.c:2830
> > > mmap_region+0x72f/0x2090 mm/mmap.c:2881
> > > do_mmap+0x8f9/0x1010 mm/mmap.c:1468
> > > vm_mmap_pgoff+0x1dd/0x3d0 mm/util.c:588
> > > ksys_mmap_pgoff+0x544/0x720 mm/mmap.c:1514
> > > do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > > do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> > > entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > > 
> > > The buggy address belongs to the object at ffff88807c17f9b0
> > > which belongs to the cache vm_area_struct of size 184
> > > The buggy address is located 32 bytes inside of
> > > freed 184-byte region [ffff88807c17f9b0, ffff88807c17fa68)
> > > 
> > > The buggy address belongs to the physical page:
> > > page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7c17f
> > > memcg:ffff888028997401
> > > anon flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
> > > page_type: 0xfdffffff(slab)
> > > raw: 00fff00000000000 ffff88801bafdb40 ffffea0001f89e00 000000000000000d
> > > raw: 0000000000000000 0000000000100010 00000001fdffffff ffff888028997401
> > > page dumped because: kasan: bad access detected
> > > page_owner tracks the page as allocated
> > > page last allocated via order 0, migratetype Unmovable, gfp_mask 0x152cc0(GFP_USER|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 26741, tgid 26741 (dhcpcd-run-hook), ts 1341391347767, free_ts 1341166373745
> > > set_page_owner include/linux/page_owner.h:32 [inline]
> > > post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1493
> > > prep_new_page mm/page_alloc.c:1501 [inline]
> > > get_page_from_freelist+0x2e4c/0x2f10 mm/page_alloc.c:3439
> > > __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4695
> > > __alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
> > > alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
> > > alloc_slab_page+0x5f/0x120 mm/slub.c:2321
> > > allocate_slab+0x5a/0x2f0 mm/slub.c:2484
> > > new_slab mm/slub.c:2537 [inline]
> > > ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3723
> > > __slab_alloc+0x58/0xa0 mm/slub.c:3813
> > > __slab_alloc_node mm/slub.c:3866 [inline]
> > > slab_alloc_node mm/slub.c:4025 [inline]
> > > kmem_cache_alloc_noprof+0x1c1/0x2a0 mm/slub.c:4044
> > > vm_area_dup+0x27/0x290 kernel/fork.c:486
> > > dup_mmap kernel/fork.c:695 [inline]
> > > dup_mm kernel/fork.c:1672 [inline]
> > > copy_mm+0xc7b/0x1f30 kernel/fork.c:1721
> > > copy_process+0x187a/0x3dc0 kernel/fork.c:2374
> > > kernel_clone+0x226/0x8f0 kernel/fork.c:2781
> > > __do_sys_clone kernel/fork.c:2924 [inline]
> > > __se_sys_clone kernel/fork.c:2908 [inline]
> > > __x64_sys_clone+0x258/0x2a0 kernel/fork.c:2908
> > > do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > > do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> > > entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > > page last free pid 26730 tgid 26718 stack trace:
> > > reset_page_owner include/linux/page_owner.h:25 [inline]
> > > free_pages_prepare mm/page_alloc.c:1094 [inline]
> > > free_unref_page+0xd22/0xea0 mm/page_alloc.c:2612
> > > __folio_put+0x2c8/0x440 mm/swap.c:128
> > > migrate_folio_move mm/migrate.c:1330 [inline]
> > > migrate_pages_batch+0x2a76/0x3560 mm/migrate.c:1818
> > > migrate_pages_sync mm/migrate.c:1884 [inline]
> > > migrate_pages+0x1f59/0x3460 mm/migrate.c:1993
> > > do_mbind mm/mempolicy.c:1388 [inline]
> > > kernel_mbind mm/mempolicy.c:1531 [inline]
> > > __do_sys_mbind mm/mempolicy.c:1605 [inline]
> > > __se_sys_mbind+0x1490/0x19f0 mm/mempolicy.c:1601
> > > do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > > do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> > > entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > > 
> > > Memory state around the buggy address:
> > > ffff88807c17f880: fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00 00
> > > ffff88807c17f900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc
> > >> ffff88807c17f980: fc fc fc fc fc fc fa fb fb fb fb fb fb fb fb fb
> > >                                                 ^
> > > ffff88807c17fa00: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
> > > ffff88807c17fa80: fc fc fc fc fc 00 00 00 00 00 00 00 00 00 00 00
> > > ==================================================================
> > > 
> > > 
> > > ---
> > > This report is generated by a bot. It may contain errors.
> > > See https://goo.gl/tpsmEJ for more information about syzbot.
> > > syzbot engineers can be reached at syzkaller@googlegroups.com.
> > > 
> > > syzbot will keep track of this issue. See:
> > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> > > 
> > > If the report is already addressed, let syzbot know by replying with:
> > > #syz fix: exact-commit-title
> > > 
> > > If you want to overwrite report's subsystems, reply with:
> > > #syz set subsystems: new-subsystem
> > > (See the list of subsystem names on the web dashboard)
> > > 
> > > If the report is a duplicate of another one, reply with:
> > > #syz dup: exact-subject-of-another-report
> > > 
> > > If you want to undo deduplication, reply with:
> > > #syz undup
> > 

[-- Attachment #2: 0001-mm-Change-vmf_anon_prepare-to-__vmf_anon_prepare.patch --]
[-- Type: text/plain, Size: 2743 bytes --]

From 734dde34151c2951b86f16cc554e0eed671d340d Mon Sep 17 00:00:00 2001
From: "Vishal Moola (Oracle)" <vishal.moola@gmail.com>
Date: Tue, 10 Sep 2024 10:39:46 -0700
Subject: [PATCH 1/2] mm: Change vmf_anon_prepare() to __vmf_anon_prepare()

Some callers of vmf_anon_prepare() may not want us to release the
per-VMA lock ourselves. Rename vmf_anon_prepare() to
__vmf_anon_prepare() and let the callers drop the lock when desired.

Also, make vmf_anon_prepare() a wrapper that releases the per-VMA lock
itself for any callers that don't care.

This is in preparation to fix this bug reported by syzbot:
https://lore.kernel.org/linux-mm/00000000000067c20b06219fbc26@google.com/

Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
---
 mm/internal.h | 11 ++++++++++-
 mm/memory.c   |  8 +++-----
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 44c8dec1f0d7..93083bbeeefa 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -320,7 +320,16 @@ static inline void wake_throttle_isolated(pg_data_t *pgdat)
 		wake_up(wqh);
 }
 
-vm_fault_t vmf_anon_prepare(struct vm_fault *vmf);
+vm_fault_t __vmf_anon_prepare(struct vm_fault *vmf);
+static inline vm_fault_t vmf_anon_prepare(struct vm_fault *vmf)
+{
+	vm_fault_t ret = __vmf_anon_prepare(vmf);
+
+	if (unlikely(ret & VM_FAULT_RETRY))
+		vma_end_read(vmf->vma);
+	return ret;
+}
+
 vm_fault_t do_swap_page(struct vm_fault *vmf);
 void folio_rotate_reclaimable(struct folio *folio);
 bool __folio_end_writeback(struct folio *folio);
diff --git a/mm/memory.c b/mm/memory.c
index 36f655eb66c4..d564737255f8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3262,7 +3262,7 @@ static inline vm_fault_t vmf_can_call_fault(const struct vm_fault *vmf)
 }
 
 /**
- * vmf_anon_prepare - Prepare to handle an anonymous fault.
+ * __vmf_anon_prepare - Prepare to handle an anonymous fault.
  * @vmf: The vm_fault descriptor passed from the fault handler.
  *
  * When preparing to insert an anonymous page into a VMA from a
@@ -3276,7 +3276,7 @@ static inline vm_fault_t vmf_can_call_fault(const struct vm_fault *vmf)
  * Return: 0 if fault handling can proceed.  Any other value should be
  * returned to the caller.
  */
-vm_fault_t vmf_anon_prepare(struct vm_fault *vmf)
+vm_fault_t __vmf_anon_prepare(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	vm_fault_t ret = 0;
@@ -3284,10 +3284,8 @@ vm_fault_t vmf_anon_prepare(struct vm_fault *vmf)
 	if (likely(vma->anon_vma))
 		return 0;
 	if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
-		if (!mmap_read_trylock(vma->vm_mm)) {
-			vma_end_read(vma);
+		if (!mmap_read_trylock(vma->vm_mm))
 			return VM_FAULT_RETRY;
-		}
 	}
 	if (__anon_vma_prepare(vma))
 		ret = VM_FAULT_OOM;
-- 
2.45.0


[-- Attachment #3: 0002-mm-hugetlb.c-Fix-UAF-of-vma-in-hugetlb-fault-pathway.patch --]
[-- Type: text/plain, Size: 2596 bytes --]

From f3a9fd823fa187c57ddd4482d3f089911c912e5c Mon Sep 17 00:00:00 2001
From: "Vishal Moola (Oracle)" <vishal.moola@gmail.com>
Date: Tue, 10 Sep 2024 10:24:24 -0700
Subject: [PATCH 2/2] mm/hugetlb.c: Fix UAF of vma in hugetlb fault pathway

Syzbot reports a UAF in hugetlb_fault(). This happens because
vmf_anon_prepare() could drop the per-VMA lock and allow the current VMA
to be freed before hugetlb_vma_unlock_read() is called.

We can fix this by using a modified version of vmf_anon_prepare() that
doesn't release the VMA lock on failure, and then release it ourselves
after hugetlb_vma_unlock_read().

Reported-by: syzbot+2dab93857ee95f2eeb08@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-mm/00000000000067c20b06219fbc26@google.com/
Fixes: 9acad7ba3e25 ("hugetlb: use vmf_anon_prepare() instead of anon_vma_prepare()")
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: <stable@vger.kernel.org>
---
 mm/hugetlb.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5c77defad295..190fa05635f4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5915,7 +5915,7 @@ static vm_fault_t hugetlb_wp(struct folio *pagecache_folio,
 	 * When the original hugepage is shared one, it does not have
 	 * anon_vma prepared.
 	 */
-	ret = vmf_anon_prepare(vmf);
+	ret = __vmf_anon_prepare(vmf);
 	if (unlikely(ret))
 		goto out_release_all;
 
@@ -6114,7 +6114,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
 		}
 
 		if (!(vma->vm_flags & VM_MAYSHARE)) {
-			ret = vmf_anon_prepare(vmf);
+			ret = __vmf_anon_prepare(vmf);
 			if (unlikely(ret))
 				goto out;
 		}
@@ -6245,6 +6245,14 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
 	folio_unlock(folio);
 out:
 	hugetlb_vma_unlock_read(vma);
+
+	/*
+	 * We must check to release the per-VMA lock. __vmf_anon_prepare() is
+	 * the only way ret can be set to VM_FAULT_RETRY.
+	 */
+	if (unlikely(ret & VM_FAULT_RETRY))
+		vma_end_read(vma);
+
 	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 	return ret;
 
@@ -6466,6 +6474,14 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	}
 out_mutex:
 	hugetlb_vma_unlock_read(vma);
+
+	/*
+	 * We must check to release the per-VMA lock. __vmf_anon_prepare() in
+	 * hugetlb_wp() is the only way ret can be set to VM_FAULT_RETRY.
+	 */
+	if (unlikely(ret & VM_FAULT_RETRY))
+		vma_end_read(vma);
+
 	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 	/*
 	 * Generally it's safe to hold refcount during waiting page lock. But
-- 
2.45.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [syzbot] [mm?] KASAN: slab-use-after-free Read in hugetlb_fault (2)
  2024-09-10 19:27     ` Vishal Moola
@ 2024-09-14  5:50       ` Muchun Song
  2024-09-14 19:41         ` [PATCH 1/2] mm: Change vmf_anon_prepare() to __vmf_anon_prepare() Vishal Moola (Oracle)
  0 siblings, 1 reply; 7+ messages in thread
From: Muchun Song @ 2024-09-14  5:50 UTC (permalink / raw)
  To: Vishal Moola
  Cc: syzbot, Andrew Morton, LKML, Linux Memory Management List,
	syzkaller-bugs



> On Sep 11, 2024, at 03:27, Vishal Moola <vishal.moola@gmail.com> wrote:
> 
> On Mon, Sep 09, 2024 at 02:06:13PM -0700, Vishal Moola wrote:
>> On Mon, Sep 09, 2024 at 05:57:52PM +0800, Muchun Song wrote:
>>> 
>>> 
>>>> On Sep 9, 2024, at 02:23, syzbot <syzbot+2dab93857ee95f2eeb08@syzkaller.appspotmail.com> wrote:
>>>> 
>>>> Hello,
>>>> 
>>>> syzbot found the following issue on:
>>>> 
>>>> HEAD commit:    88fac17500f4 Merge tag 'fuse-fixes-6.11-rc7' of git://git...
>>>> git tree:       upstream
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=13291d97980000
>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=660f6eb11f9c7dc5
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=2dab93857ee95f2eeb08
>>>> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
>>>> 
>>>> Unfortunately, I don't have any reproducer for this issue yet.
>>>> 
>>>> Downloadable assets:
>>>> disk image: https://storage.googleapis.com/syzbot-assets/6dfa1c637f53/disk-88fac175.raw.xz
>>>> vmlinux: https://storage.googleapis.com/syzbot-assets/7a322b491698/vmlinux-88fac175.xz
>>>> kernel image: https://storage.googleapis.com/syzbot-assets/edc9184a3a97/bzImage-88fac175.xz
>>>> 
>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>> Reported-by: syzbot+2dab93857ee95f2eeb08@syzkaller.appspotmail.com
>>>> 
>>>> ==================================================================
>>>> BUG: KASAN: slab-use-after-free in __vma_shareable_lock include/linux/hugetlb.h:1278 [inline]
>>> 
>>> This is accessing vma structure.
>>> 
>>>> BUG: KASAN: slab-use-after-free in hugetlb_vma_unlock_read mm/hugetlb.c:281 [inline]
>>>> BUG: KASAN: slab-use-after-free in hugetlb_no_page mm/hugetlb.c:6380 [inline]
>>>> BUG: KASAN: slab-use-after-free in hugetlb_fault+0xfaf/0x3770 mm/hugetlb.c:6485
>>>> Read of size 8 at addr ffff88807c17f9d0 by task syz.0.4558/26998
>>>> 
>>>> CPU: 1 UID: 0 PID: 26998 Comm: syz.0.4558 Not tainted 6.11.0-rc6-syzkaller-00026-g88fac17500f4 #0
>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
>>>> Call Trace:
>>>> <TASK>
>>>> __dump_stack lib/dump_stack.c:93 [inline]
>>>> dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
>>>> print_address_description mm/kasan/report.c:377 [inline]
>>>> print_report+0x169/0x550 mm/kasan/report.c:488
>>>> kasan_report+0x143/0x180 mm/kasan/report.c:601
>>>> __vma_shareable_lock include/linux/hugetlb.h:1278 [inline]
>>>> hugetlb_vma_unlock_read mm/hugetlb.c:281 [inline]
>>> 
>>> I think vma is freed before this call of hugetlb_vma_unlock_read()
>>> but after hugetlb_vma_lock_read() in hugetlb_fault(). I found a
>>> possible scenario to cause this problem.
>>> 
>>> hugetlb_no_page()
>>> ret = vmf_anon_prepare()
>>> if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
>>> if (!mmap_read_trylock(vma->vm_mm)) {
>>> vma_end_read(vma);
>>> // VMA lock is released, which could be freed before the call of hugetlb_vma_unlock_read().
>>> return VM_FAULT_RETRY;
>>> }
>>> }
>>> if (unlikely(ret))
>>> goto out;
>>> out:
>>> hugetlb_vma_unlock_read(vma); // UAF of VMA
>> 
>> Thanks for catching this, it indeed looks like the problem. I don't
>> think its easy to reproduce since we would have to unmap the vma while
>> a fault is being handled (and failing).
>> 
>> This same issue should be present in hugetlb_wp() as well, so I'm thinking
>> the best fix would be to make another function similar to
>> vmf_anon_prepare() that doesn't release the vma lock. Then wait to drop
>> the lock until hugetlb_vma_unlock_read() is called.
>> 
>> I'll have that fix out tomorrow.
> 
> The 2 attached patches should fix this.

Hi Vishal,

Would you mind sending it as a separated patch instead of an
attachment?

Thanks.

> 
>>> The culprit commit should be
>>> 
>>> 7c43a553792a1 ("hugetlb: allow faults to be handled under the VMA lock").
>>> 
>>> I will take a closer look at the solution tomorrow. And Cc the author of the
>>> above commit, maybe have some comments on this.
>>> 
>>> Muchun,
>>> Thanks.
>>> 
>>>> hugetlb_no_page mm/hugetlb.c:6380 [inline]
>>>> hugetlb_fault+0xfaf/0x3770 mm/hugetlb.c:6485
>>>> handle_mm_fault+0x1901/0x1bc0 mm/memory.c:5830
>>>> do_user_addr_fault arch/x86/mm/fault.c:1338 [inline]
>>>> handle_page_fault arch/x86/mm/fault.c:1481 [inline]
>>>> exc_page_fault+0x459/0x8c0 arch/x86/mm/fault.c:1539
>>>> asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
>>>> RIP: 0033:0x7f2b63744998
>>>> Code: fc 89 37 c3 c5 fa 6f 06 c5 fa 6f 4c 16 f0 c5 fa 7f 07 c5 fa 7f 4c 17 f0 c3 66 0f 1f 84 00 00 00 00 00 48 8b 4c 16 f8 48 8b 36 <48> 89 37 48 89 4c 17 f8 c3 c5 fe 6f 54 16 e0 c5 fe 6f 5c 16 c0 c5
>>>> RSP: 002b:00007f2b63a5fb88 EFLAGS: 00010206
>>>> RAX: 00000000200002c0 RBX: 0000000000000004 RCX: 00676e7277682f76
>>>> RDX: 000000000000000b RSI: 7277682f7665642f RDI: 00000000200002c0
>>>> RBP: 00007f2b63937a80 R08: 00007f2b63600000 R09: 0000000000000001
>>>> R10: 0000000000000001 R11: 0000000000000009 R12: 000000000014aa5e
>>>> R13: 00007f2b63a5fc90 R14: 0000000000000032 R15: fffffffffffffffe
>>>> </TASK>
>>>> 
>>>> Allocated by task 27000:
>>>> kasan_save_stack mm/kasan/common.c:47 [inline]
>>>> kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
>>>> unpoison_slab_object mm/kasan/common.c:312 [inline]
>>>> __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:338
>>>> kasan_slab_alloc include/linux/kasan.h:201 [inline]
>>>> slab_post_alloc_hook mm/slub.c:3988 [inline]
>>>> slab_alloc_node mm/slub.c:4037 [inline]
>>>> kmem_cache_alloc_noprof+0x135/0x2a0 mm/slub.c:4044
>>>> vm_area_alloc+0x24/0x1d0 kernel/fork.c:471
>>>> mmap_region+0xc3d/0x2090 mm/mmap.c:2944
>>>> do_mmap+0x8f9/0x1010 mm/mmap.c:1468
>>>> vm_mmap_pgoff+0x1dd/0x3d0 mm/util.c:588
>>>> ksys_mmap_pgoff+0x544/0x720 mm/mmap.c:1514
>>>> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>>>> do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>>>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>>> 
>>>> Freed by task 26255:
>>>> kasan_save_stack mm/kasan/common.c:47 [inline]
>>>> kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
>>>> kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
>>>> poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
>>>> __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
>>>> kasan_slab_free include/linux/kasan.h:184 [inline]
>>>> slab_free_hook mm/slub.c:2252 [inline]
>>>> slab_free mm/slub.c:4473 [inline]
>>>> kmem_cache_free+0x145/0x350 mm/slub.c:4548
>>>> rcu_do_batch kernel/rcu/tree.c:2569 [inline]
>>>> rcu_core+0xafd/0x1830 kernel/rcu/tree.c:2843
>>> 
>>> VMA structure is freed via rcu, so it is really a UAF problem.
>>> 
>>>> handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
>>>> do_softirq+0x11b/0x1e0 kernel/softirq.c:455
>>>> __local_bh_enable_ip+0x1bb/0x200 kernel/softirq.c:382
>>>> spin_unlock_bh include/linux/spinlock.h:396 [inline]
>>>> __fib6_clean_all+0x327/0x4b0 net/ipv6/ip6_fib.c:2277
>>>> rt6_sync_down_dev net/ipv6/route.c:4908 [inline]
>>>> rt6_disable_ip+0x164/0x7e0 net/ipv6/route.c:4913
>>>> addrconf_ifdown+0x15d/0x1bd0 net/ipv6/addrconf.c:3856
>>>> addrconf_notify+0x3cb/0x1020
>>>> notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
>>>> call_netdevice_notifiers_extack net/core/dev.c:2032 [inline]
>>>> call_netdevice_notifiers net/core/dev.c:2046 [inline]
>>>> dev_close_many+0x33c/0x4c0 net/core/dev.c:1587
>>>> unregister_netdevice_many_notify+0x50b/0x1c40 net/core/dev.c:11327
>>>> unregister_netdevice_many net/core/dev.c:11414 [inline]
>>>> default_device_exit_batch+0xa0f/0xa90 net/core/dev.c:11897
>>>> ops_exit_list net/core/net_namespace.c:178 [inline]
>>>> cleanup_net+0x89d/0xcc0 net/core/net_namespace.c:640
>>>> process_one_work kernel/workqueue.c:3231 [inline]
>>>> process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
>>>> worker_thread+0x86d/0xd10 kernel/workqueue.c:3389
>>>> kthread+0x2f0/0x390 kernel/kthread.c:389
>>>> ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
>>>> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
>>>> 
>>>> Last potentially related work creation:
>>>> kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47
>>>> __kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:541
>>>> __call_rcu_common kernel/rcu/tree.c:3106 [inline]
>>>> call_rcu+0x167/0xa70 kernel/rcu/tree.c:3210
>>>> remove_vma mm/mmap.c:189 [inline]
>>>> remove_mt mm/mmap.c:2415 [inline]
>>>> do_vmi_align_munmap+0x155c/0x18c0 mm/mmap.c:2758
>>>> do_vmi_munmap+0x261/0x2f0 mm/mmap.c:2830
>>>> mmap_region+0x72f/0x2090 mm/mmap.c:2881
>>>> do_mmap+0x8f9/0x1010 mm/mmap.c:1468
>>>> vm_mmap_pgoff+0x1dd/0x3d0 mm/util.c:588
>>>> ksys_mmap_pgoff+0x544/0x720 mm/mmap.c:1514
>>>> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>>>> do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>>>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>>> 
>>>> The buggy address belongs to the object at ffff88807c17f9b0
>>>> which belongs to the cache vm_area_struct of size 184
>>>> The buggy address is located 32 bytes inside of
>>>> freed 184-byte region [ffff88807c17f9b0, ffff88807c17fa68)
>>>> 
>>>> The buggy address belongs to the physical page:
>>>> page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7c17f
>>>> memcg:ffff888028997401
>>>> anon flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
>>>> page_type: 0xfdffffff(slab)
>>>> raw: 00fff00000000000 ffff88801bafdb40 ffffea0001f89e00 000000000000000d
>>>> raw: 0000000000000000 0000000000100010 00000001fdffffff ffff888028997401
>>>> page dumped because: kasan: bad access detected
>>>> page_owner tracks the page as allocated
>>>> page last allocated via order 0, migratetype Unmovable, gfp_mask 0x152cc0(GFP_USER|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 26741, tgid 26741 (dhcpcd-run-hook), ts 1341391347767, free_ts 1341166373745
>>>> set_page_owner include/linux/page_owner.h:32 [inline]
>>>> post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1493
>>>> prep_new_page mm/page_alloc.c:1501 [inline]
>>>> get_page_from_freelist+0x2e4c/0x2f10 mm/page_alloc.c:3439
>>>> __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4695
>>>> __alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
>>>> alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
>>>> alloc_slab_page+0x5f/0x120 mm/slub.c:2321
>>>> allocate_slab+0x5a/0x2f0 mm/slub.c:2484
>>>> new_slab mm/slub.c:2537 [inline]
>>>> ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3723
>>>> __slab_alloc+0x58/0xa0 mm/slub.c:3813
>>>> __slab_alloc_node mm/slub.c:3866 [inline]
>>>> slab_alloc_node mm/slub.c:4025 [inline]
>>>> kmem_cache_alloc_noprof+0x1c1/0x2a0 mm/slub.c:4044
>>>> vm_area_dup+0x27/0x290 kernel/fork.c:486
>>>> dup_mmap kernel/fork.c:695 [inline]
>>>> dup_mm kernel/fork.c:1672 [inline]
>>>> copy_mm+0xc7b/0x1f30 kernel/fork.c:1721
>>>> copy_process+0x187a/0x3dc0 kernel/fork.c:2374
>>>> kernel_clone+0x226/0x8f0 kernel/fork.c:2781
>>>> __do_sys_clone kernel/fork.c:2924 [inline]
>>>> __se_sys_clone kernel/fork.c:2908 [inline]
>>>> __x64_sys_clone+0x258/0x2a0 kernel/fork.c:2908
>>>> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>>>> do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>>>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>>> page last free pid 26730 tgid 26718 stack trace:
>>>> reset_page_owner include/linux/page_owner.h:25 [inline]
>>>> free_pages_prepare mm/page_alloc.c:1094 [inline]
>>>> free_unref_page+0xd22/0xea0 mm/page_alloc.c:2612
>>>> __folio_put+0x2c8/0x440 mm/swap.c:128
>>>> migrate_folio_move mm/migrate.c:1330 [inline]
>>>> migrate_pages_batch+0x2a76/0x3560 mm/migrate.c:1818
>>>> migrate_pages_sync mm/migrate.c:1884 [inline]
>>>> migrate_pages+0x1f59/0x3460 mm/migrate.c:1993
>>>> do_mbind mm/mempolicy.c:1388 [inline]
>>>> kernel_mbind mm/mempolicy.c:1531 [inline]
>>>> __do_sys_mbind mm/mempolicy.c:1605 [inline]
>>>> __se_sys_mbind+0x1490/0x19f0 mm/mempolicy.c:1601
>>>> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>>>> do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>>>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>>> 
>>>> Memory state around the buggy address:
>>>> ffff88807c17f880: fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00 00
>>>> ffff88807c17f900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc
>>>>> ffff88807c17f980: fc fc fc fc fc fc fa fb fb fb fb fb fb fb fb fb
>>>>                                                ^
>>>> ffff88807c17fa00: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
>>>> ffff88807c17fa80: fc fc fc fc fc 00 00 00 00 00 00 00 00 00 00 00
>>>> ==================================================================
>>>> 
>>>> 
>>>> ---
>>>> This report is generated by a bot. It may contain errors.
>>>> See https://goo.gl/tpsmEJ for more information about syzbot.
>>>> syzbot engineers can be reached at syzkaller@googlegroups.com.
>>>> 
>>>> syzbot will keep track of this issue. See:
>>>> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>>>> 
>>>> If the report is already addressed, let syzbot know by replying with:
>>>> #syz fix: exact-commit-title
>>>> 
>>>> If you want to overwrite report's subsystems, reply with:
>>>> #syz set subsystems: new-subsystem
>>>> (See the list of subsystem names on the web dashboard)
>>>> 
>>>> If the report is a duplicate of another one, reply with:
>>>> #syz dup: exact-subject-of-another-report
>>>> 
>>>> If you want to undo deduplication, reply with:
>>>> #syz undup
>>> 
> <0001-mm-Change-vmf_anon_prepare-to-__vmf_anon_prepare.patch><0002-mm-hugetlb.c-Fix-UAF-of-vma-in-hugetlb-fault-pathway.patch>



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] mm: Change vmf_anon_prepare() to __vmf_anon_prepare()
  2024-09-14  5:50       ` Muchun Song
@ 2024-09-14 19:41         ` Vishal Moola (Oracle)
  2024-09-14 19:41           ` [PATCH 2/2] mm/hugetlb.c: Fix UAF of vma in hugetlb fault pathway Vishal Moola (Oracle)
  0 siblings, 1 reply; 7+ messages in thread
From: Vishal Moola (Oracle) @ 2024-09-14 19:41 UTC (permalink / raw)
  To: Muchun Song, syzbot, Andrew Morton, LKML,
	Linux Memory Management List, syzkaller-bugs
  Cc: Vishal Moola (Oracle)

Some callers of vmf_anon_prepare() may not want us to release the
per-VMA lock ourselves. Rename vmf_anon_prepare() to
__vmf_anon_prepare() and let the callers drop the lock when desired.

Also, make vmf_anon_prepare() a wrapper that releases the per-VMA lock
itself for any callers that don't care.

This is in preparation to fix this bug reported by syzbot:
https://lore.kernel.org/linux-mm/00000000000067c20b06219fbc26@google.com/

Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
---
 mm/internal.h | 11 ++++++++++-
 mm/memory.c   |  8 +++-----
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 44c8dec1f0d7..93083bbeeefa 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -320,7 +320,16 @@ static inline void wake_throttle_isolated(pg_data_t *pgdat)
 		wake_up(wqh);
 }
 
-vm_fault_t vmf_anon_prepare(struct vm_fault *vmf);
+vm_fault_t __vmf_anon_prepare(struct vm_fault *vmf);
+static inline vm_fault_t vmf_anon_prepare(struct vm_fault *vmf)
+{
+	vm_fault_t ret = __vmf_anon_prepare(vmf);
+
+	if (unlikely(ret & VM_FAULT_RETRY))
+		vma_end_read(vmf->vma);
+	return ret;
+}
+
 vm_fault_t do_swap_page(struct vm_fault *vmf);
 void folio_rotate_reclaimable(struct folio *folio);
 bool __folio_end_writeback(struct folio *folio);
diff --git a/mm/memory.c b/mm/memory.c
index 36f655eb66c4..d564737255f8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3262,7 +3262,7 @@ static inline vm_fault_t vmf_can_call_fault(const struct vm_fault *vmf)
 }
 
 /**
- * vmf_anon_prepare - Prepare to handle an anonymous fault.
+ * __vmf_anon_prepare - Prepare to handle an anonymous fault.
  * @vmf: The vm_fault descriptor passed from the fault handler.
  *
  * When preparing to insert an anonymous page into a VMA from a
@@ -3276,7 +3276,7 @@ static inline vm_fault_t vmf_can_call_fault(const struct vm_fault *vmf)
  * Return: 0 if fault handling can proceed.  Any other value should be
  * returned to the caller.
  */
-vm_fault_t vmf_anon_prepare(struct vm_fault *vmf)
+vm_fault_t __vmf_anon_prepare(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	vm_fault_t ret = 0;
@@ -3284,10 +3284,8 @@ vm_fault_t vmf_anon_prepare(struct vm_fault *vmf)
 	if (likely(vma->anon_vma))
 		return 0;
 	if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
-		if (!mmap_read_trylock(vma->vm_mm)) {
-			vma_end_read(vma);
+		if (!mmap_read_trylock(vma->vm_mm))
 			return VM_FAULT_RETRY;
-		}
 	}
 	if (__anon_vma_prepare(vma))
 		ret = VM_FAULT_OOM;
-- 
2.45.0



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] mm/hugetlb.c: Fix UAF of vma in hugetlb fault pathway
  2024-09-14 19:41         ` [PATCH 1/2] mm: Change vmf_anon_prepare() to __vmf_anon_prepare() Vishal Moola (Oracle)
@ 2024-09-14 19:41           ` Vishal Moola (Oracle)
  0 siblings, 0 replies; 7+ messages in thread
From: Vishal Moola (Oracle) @ 2024-09-14 19:41 UTC (permalink / raw)
  To: Muchun Song, syzbot, Andrew Morton, LKML,
	Linux Memory Management List, syzkaller-bugs
  Cc: Vishal Moola (Oracle), stable

Syzbot reports a UAF in hugetlb_fault(). This happens because
vmf_anon_prepare() could drop the per-VMA lock and allow the current VMA
to be freed before hugetlb_vma_unlock_read() is called.

We can fix this by using a modified version of vmf_anon_prepare() that
doesn't release the VMA lock on failure, and then release it ourselves
after hugetlb_vma_unlock_read().

Reported-by: syzbot+2dab93857ee95f2eeb08@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-mm/00000000000067c20b06219fbc26@google.com/
Fixes: 9acad7ba3e25 ("hugetlb: use vmf_anon_prepare() instead of anon_vma_prepare()")
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: <stable@vger.kernel.org>
---
 mm/hugetlb.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5c77defad295..190fa05635f4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5915,7 +5915,7 @@ static vm_fault_t hugetlb_wp(struct folio *pagecache_folio,
 	 * When the original hugepage is shared one, it does not have
 	 * anon_vma prepared.
 	 */
-	ret = vmf_anon_prepare(vmf);
+	ret = __vmf_anon_prepare(vmf);
 	if (unlikely(ret))
 		goto out_release_all;
 
@@ -6114,7 +6114,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
 		}
 
 		if (!(vma->vm_flags & VM_MAYSHARE)) {
-			ret = vmf_anon_prepare(vmf);
+			ret = __vmf_anon_prepare(vmf);
 			if (unlikely(ret))
 				goto out;
 		}
@@ -6245,6 +6245,14 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
 	folio_unlock(folio);
 out:
 	hugetlb_vma_unlock_read(vma);
+
+	/*
+	 * We must check to release the per-VMA lock. __vmf_anon_prepare() is
+	 * the only way ret can be set to VM_FAULT_RETRY.
+	 */
+	if (unlikely(ret & VM_FAULT_RETRY))
+		vma_end_read(vma);
+
 	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 	return ret;
 
@@ -6466,6 +6474,14 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	}
 out_mutex:
 	hugetlb_vma_unlock_read(vma);
+
+	/*
+	 * We must check to release the per-VMA lock. __vmf_anon_prepare() in
+	 * hugetlb_wp() is the only way ret can be set to VM_FAULT_RETRY.
+	 */
+	if (unlikely(ret & VM_FAULT_RETRY))
+		vma_end_read(vma);
+
 	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 	/*
 	 * Generally it's safe to hold refcount during waiting page lock. But
-- 
2.45.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-09-14 19:43 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-08 18:23 [syzbot] [mm?] KASAN: slab-use-after-free Read in hugetlb_fault (2) syzbot
2024-09-09  9:57 ` Muchun Song
2024-09-09 21:06   ` Vishal Moola
2024-09-10 19:27     ` Vishal Moola
2024-09-14  5:50       ` Muchun Song
2024-09-14 19:41         ` [PATCH 1/2] mm: Change vmf_anon_prepare() to __vmf_anon_prepare() Vishal Moola (Oracle)
2024-09-14 19:41           ` [PATCH 2/2] mm/hugetlb.c: Fix UAF of vma in hugetlb fault pathway Vishal Moola (Oracle)

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.