* KASAN: use-after-free Read in get_mem_cgroup_from_mm
@ 2018-11-07  1:52 syzbot
  2018-12-04 15:43 ` syzbot
  2019-03-22  9:36 ` syzbot
  0 siblings, 2 replies; 26+ messages in thread
From: syzbot @ 2018-11-07  1:52 UTC (permalink / raw)
  To: cgroups, hannes, linux-kernel, linux-mm, mhocko, syzkaller-bugs,
	vdavydov.dev
Hello,
syzbot found the following crash on:
HEAD commit:    83650fd58a93 Merge tag 'arm64-upstream' of git://git.kerne..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=12ce682b400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=9384ecb1c973baed
dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
Unfortunately, I don't have any reproducer for this crash yet.
IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+cbb52e396df3e565ab02@syzkaller.appspotmail.com
FAT-fs (loop1): Unrecognized mount option "\a" or missing value
F2FS-fs (loop5): Magic Mismatch, valid(0xf2f52010) - read(0x0)
==================================================================
BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 [inline]
BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline]
BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 [inline]
BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
Read of size 8 at addr ffff8801c635d210 by task syz-executor0/14887
CPU: 0 PID: 14887 Comm: syz-executor0 Not tainted 4.19.0+ #318
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x244/0x39d lib/dump_stack.c:113
  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
  kasan_report_error mm/kasan/report.c:354 [inline]
  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
  __read_once_size include/linux/compiler.h:182 [inline]
  task_css include/linux/cgroup.h:477 [inline]
  mem_cgroup_from_task mm/memcontrol.c:815 [inline]
  get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
  get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
  mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
  mcopy_atomic_pte mm/userfaultfd.c:69 [inline]
  mfill_atomic_pte mm/userfaultfd.c:385 [inline]
  __mcopy_atomic mm/userfaultfd.c:529 [inline]
  mcopy_atomic+0xae9/0x2aa0 mm/userfaultfd.c:579
  userfaultfd_copy fs/userfaultfd.c:1690 [inline]
  userfaultfd_ioctl+0x213d/0x54a0 fs/userfaultfd.c:1836
  vfs_ioctl fs/ioctl.c:46 [inline]
  file_ioctl fs/ioctl.c:509 [inline]
  do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
  __do_sys_ioctl fs/ioctl.c:720 [inline]
  __se_sys_ioctl fs/ioctl.c:718 [inline]
  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457569
Code: fd b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f6dd22acc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457569
RDX: 0000000020000100 RSI: 00000000c028aa03 RDI: 0000000000000004
RBP: 000000000072bfa0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f6dd22ad6d4
R13: 00000000004c142b R14: 00000000004d22a8 R15: 00000000ffffffff
Allocated by task 14881:
  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
  set_track mm/kasan/kasan.c:460 [inline]
  kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
  kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644
  alloc_task_struct_node kernel/fork.c:158 [inline]
  dup_task_struct kernel/fork.c:843 [inline]
  copy_process+0x2026/0x87a0 kernel/fork.c:1751
  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
  __do_sys_clone kernel/fork.c:2323 [inline]
  __se_sys_clone kernel/fork.c:2317 [inline]
  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 14881:
  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
  set_track mm/kasan/kasan.c:460 [inline]
  __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
  __cache_free mm/slab.c:3498 [inline]
  kmem_cache_free+0x83/0x290 mm/slab.c:3760
  free_task_struct kernel/fork.c:163 [inline]
  free_task+0x16e/0x1f0 kernel/fork.c:457
  copy_process+0x1dcc/0x87a0 kernel/fork.c:2148
  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
  __do_sys_clone kernel/fork.c:2323 [inline]
  __se_sys_clone kernel/fork.c:2317 [inline]
  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
The buggy address belongs to the object at ffff8801c635c140
  which belongs to the cache task_struct(17:syz0) of size 6080
The buggy address is located 4304 bytes inside of
  6080-byte region [ffff8801c635c140, ffff8801c635d900)
The buggy address belongs to the page:
page:ffffea000718d700 count:1 mapcount:0 mapping:ffff8801ccef9800 index:0x0 compound_mapcount: 0
flags: 0x2fffc0000010200(slab|head)
raw: 02fffc0000010200 ffffea000573d508 ffffea0006fc6088 ffff8801ccef9800
raw: 0000000000000000 ffff8801c635c140 0000000100000001 ffff880188008ec0
page dumped because: kasan: bad access detected
page->mem_cgroup:ffff880188008ec0
Memory state around the buggy address:
  ffff8801c635d100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8801c635d180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff8801c635d200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                          ^
  ffff8801c635d280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8801c635d300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
FAT-fs (loop2): bogus number of FAT structure
FAT-fs (loop2): Can't find a valid FAT filesystem
F2FS-fs (loop5): Can't find valid F2FS filesystem in 1th superblock
---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot.
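The offset and region bounds printed in the report above can be cross-checked with plain address arithmetic. The following is a minimal sketch, not part of the original report: the helper names are hypothetical, the constants are copied from the report, and the shadow-row calculation assumes generic KASAN's granularity of 8 bytes of memory per shadow byte (so each 16-byte row of the "Memory state" dump covers 128 bytes).

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the first report in this thread. */
#define OBJ_START 0xffff8801c635c140ULL /* start of the freed task_struct */
#define OBJ_SIZE  6080ULL               /* object size of the task_struct cache */
#define BAD_ADDR  0xffff8801c635d210ULL /* address of the faulting 8-byte read */

/* Byte offset of addr inside the object that starts at obj_start. */
static inline uint64_t offset_in_object(uint64_t addr, uint64_t obj_start)
{
	assert(addr >= obj_start);
	return addr - obj_start;
}

/* Exclusive end address of the object, as printed in "[start, end)". */
static inline uint64_t object_end(uint64_t obj_start, uint64_t size)
{
	return obj_start + size;
}

/*
 * Column (0..15) of the shadow byte for addr within one row of the
 * "Memory state" dump. Assumption: 8 bytes of memory per shadow byte,
 * 16 shadow bytes per row, so one row describes 128 bytes.
 */
static inline unsigned int shadow_column(uint64_t addr)
{
	return (unsigned int)((addr % 128) / 8);
}
```

With the constants above, offset_in_object(BAD_ADDR, OBJ_START) gives 4304 and object_end(OBJ_START, OBJ_SIZE) gives 0xffff8801c635d900, matching the report. shadow_column(BAD_ADDR) gives 2, the third shadow byte of the row marked with ">", which is where the caret points.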
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2018-11-07  1:52 KASAN: use-after-free Read in get_mem_cgroup_from_mm syzbot
@ 2018-12-04 15:43 ` syzbot
  2019-03-03 16:19   ` zhong jiang
  2019-03-22  9:36 ` syzbot
  1 sibling, 1 reply; 26+ messages in thread
From: syzbot @ 2018-12-04 15:43 UTC (permalink / raw)
  To: cgroups, hannes, linux-kernel, linux-mm, mhocko, syzkaller-bugs,
	vdavydov.dev
syzbot has found a reproducer for the following crash on:
HEAD commit:    0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a3400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12835e25400000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a3400000
IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+cbb52e396df3e565ab02@syzkaller.appspotmail.com
cgroup: fork rejected by pids controller in /syz2
==================================================================
BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 [inline]
BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline]
BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 [inline]
BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
Read of size 8 at addr ffff8881b72af310 by task syz-executor198/9332
CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x244/0x39d lib/dump_stack.c:113
  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
  kasan_report_error mm/kasan/report.c:354 [inline]
  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
  __read_once_size include/linux/compiler.h:182 [inline]
  task_css include/linux/cgroup.h:477 [inline]
  mem_cgroup_from_task mm/memcontrol.c:815 [inline]
  get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
  get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
  mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
  mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
  mfill_atomic_pte mm/userfaultfd.c:418 [inline]
  __mcopy_atomic mm/userfaultfd.c:559 [inline]
  mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
  userfaultfd_copy fs/userfaultfd.c:1705 [inline]
  userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
  vfs_ioctl fs/ioctl.c:46 [inline]
  file_ioctl fs/ioctl.c:509 [inline]
  do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
  __do_sys_ioctl fs/ioctl.c:720 [inline]
  __se_sys_ioctl fs/ioctl.c:718 [inline]
  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x44c7e9
Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f906b69fdb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000006e4a08 RCX: 000000000044c7e9
RDX: 0000000020000100 RSI: 00000000c028aa03 RDI: 0000000000000004
RBP: 00000000006e4a00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006e4a0c
R13: 00007ffdfd47813f R14: 00007f906b6a09c0 R15: 000000000000002d
Allocated by task 9325:
  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
  set_track mm/kasan/kasan.c:460 [inline]
  kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
  kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644
  alloc_task_struct_node kernel/fork.c:158 [inline]
  dup_task_struct kernel/fork.c:843 [inline]
  copy_process+0x2026/0x87a0 kernel/fork.c:1751
  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
  __do_sys_clone kernel/fork.c:2323 [inline]
  __se_sys_clone kernel/fork.c:2317 [inline]
  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 9325:
  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
  set_track mm/kasan/kasan.c:460 [inline]
  __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
  __cache_free mm/slab.c:3498 [inline]
  kmem_cache_free+0x83/0x290 mm/slab.c:3760
  free_task_struct kernel/fork.c:163 [inline]
  free_task+0x16e/0x1f0 kernel/fork.c:457
  copy_process+0x1dcc/0x87a0 kernel/fork.c:2148
  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
  __do_sys_clone kernel/fork.c:2323 [inline]
  __se_sys_clone kernel/fork.c:2317 [inline]
  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
The buggy address belongs to the object at ffff8881b72ae240
  which belongs to the cache task_struct(81:syz2) of size 6080
The buggy address is located 4304 bytes inside of
  6080-byte region [ffff8881b72ae240, ffff8881b72afa00)
The buggy address belongs to the page:
page:ffffea0006dcab80 count:1 mapcount:0 mapping:ffff8881d2dce0c0 index:0x0 compound_mapcount: 0
flags: 0x2fffc0000010200(slab|head)
raw: 02fffc0000010200 ffffea00074a1f88 ffffea0006ebbb88 ffff8881d2dce0c0
raw: 0000000000000000 ffff8881b72ae240 0000000100000001 ffff8881d87fe580
page dumped because: kasan: bad access detected
page->mem_cgroup:ffff8881d87fe580
Memory state around the buggy address:
  ffff8881b72af200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8881b72af280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff8881b72af300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                          ^
  ffff8881b72af380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8881b72af400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
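The register dump in the report above can be tied back to the call trace by decoding the ioctl request number in RSI (00000000c028aa03). The following is a minimal sketch, not part of the original report: the struct and function names are hypothetical, and the bit layout assumed is the asm-generic one used on x86-64 (8 bits nr, 8 bits type, 14 bits size, 2 bits direction).

```c
#include <stdint.h>

/* Decoded fields of an ioctl request number (hypothetical helper type). */
struct ioctl_fields {
	unsigned int dir;  /* bits 30-31: 1 = write, 2 = read, 3 = both */
	unsigned int size; /* bits 16-29: size of the argument struct in bytes */
	unsigned int type; /* bits 8-15: subsystem "magic" number */
	unsigned int nr;   /* bits 0-7: command number within the subsystem */
};

/* Split a request number into its fields, assuming the asm-generic layout. */
static inline struct ioctl_fields decode_ioctl(uint32_t cmd)
{
	struct ioctl_fields f;

	f.nr = cmd & 0xff;
	f.type = (cmd >> 8) & 0xff;
	f.size = (cmd >> 16) & 0x3fff;
	f.dir = (cmd >> 30) & 0x3;
	return f;
}
```

For 0xc028aa03 this yields dir 3 (read and write), size 40, type 0xaa and nr 3. Type 0xaa with nr 3 and a 40-byte argument corresponds to UFFDIO_COPY, which is consistent with userfaultfd_copy appearing in the call trace and with ORIG_RAX 0x10 (syscall 16, ioctl, on x86-64).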
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2018-12-04 15:43 ` syzbot
@ 2019-03-03 16:19   ` zhong jiang
  2019-03-04  7:40     ` Dmitry Vyukov
  2019-03-04 21:51     ` Matthew Wilcox
  0 siblings, 2 replies; 26+ messages in thread
From: zhong jiang @ 2019-03-03 16:19 UTC (permalink / raw)
  To: syzbot, mhocko, Andrea Arcangeli
  Cc: cgroups, hannes, linux-kernel, linux-mm, syzkaller-bugs,
	vdavydov.dev, David Rientjes, Hugh Dickins, Matthew Wilcox,
	Mel Gorman, Vlastimil Babka
Hi, guys
I also hit the following issue, but I failed to reproduce it from the log.
It seems that we access mm->owner, and dereferencing it results in the UAF.
But it should not be possible for an incomplete process to be set as mm->owner.
Any thoughts?
Thanks,
zhong jiang
On 2018/12/4 23:43, syzbot wrote:
> syzbot has found a reproducer for the following crash on:
>
> HEAD commit:    0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a3400000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
> dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12835e25400000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a3400000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+cbb52e396df3e565ab02@syzkaller.appspotmail.com
>
> cgroup: fork rejected by pids controller in /syz2
> ==================================================================
> BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 [inline]
> BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline]
> BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 [inline]
> BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
> Read of size 8 at addr ffff8881b72af310 by task syz-executor198/9332
>
> CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x244/0x39d lib/dump_stack.c:113
>  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
>  kasan_report_error mm/kasan/report.c:354 [inline]
>  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
>  __read_once_size include/linux/compiler.h:182 [inline]
>  task_css include/linux/cgroup.h:477 [inline]
>  mem_cgroup_from_task mm/memcontrol.c:815 [inline]
>  get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
>  get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
>  mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
>  mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
>  mfill_atomic_pte mm/userfaultfd.c:418 [inline]
>  __mcopy_atomic mm/userfaultfd.c:559 [inline]
>  mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
>  userfaultfd_copy fs/userfaultfd.c:1705 [inline]
>  userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
>  vfs_ioctl fs/ioctl.c:46 [inline]
>  file_ioctl fs/ioctl.c:509 [inline]
>  do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
>  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
>  __do_sys_ioctl fs/ioctl.c:720 [inline]
>  __se_sys_ioctl fs/ioctl.c:718 [inline]
>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x44c7e9
> Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:00007f906b69fdb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 00000000006e4a08 RCX: 000000000044c7e9
> RDX: 0000000020000100 RSI: 00000000c028aa03 RDI: 0000000000000004
> RBP: 00000000006e4a00 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006e4a0c
> R13: 00007ffdfd47813f R14: 00007f906b6a09c0 R15: 000000000000002d
>
> Allocated by task 9325:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
>  set_track mm/kasan/kasan.c:460 [inline]
>  kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
>  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
>  kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644
>  alloc_task_struct_node kernel/fork.c:158 [inline]
>  dup_task_struct kernel/fork.c:843 [inline]
>  copy_process+0x2026/0x87a0 kernel/fork.c:1751
>  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
>  __do_sys_clone kernel/fork.c:2323 [inline]
>  __se_sys_clone kernel/fork.c:2317 [inline]
>  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> Freed by task 9325:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
>  set_track mm/kasan/kasan.c:460 [inline]
>  __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
>  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
>  __cache_free mm/slab.c:3498 [inline]
>  kmem_cache_free+0x83/0x290 mm/slab.c:3760
>  free_task_struct kernel/fork.c:163 [inline]
>  free_task+0x16e/0x1f0 kernel/fork.c:457
>  copy_process+0x1dcc/0x87a0 kernel/fork.c:2148
>  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
>  __do_sys_clone kernel/fork.c:2323 [inline]
>  __se_sys_clone kernel/fork.c:2317 [inline]
>  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> The buggy address belongs to the object at ffff8881b72ae240
>  which belongs to the cache task_struct(81:syz2) of size 6080
> The buggy address is located 4304 bytes inside of
>  6080-byte region [ffff8881b72ae240, ffff8881b72afa00)
> The buggy address belongs to the page:
> page:ffffea0006dcab80 count:1 mapcount:0 mapping:ffff8881d2dce0c0 index:0x0 compound_mapcount: 0
> flags: 0x2fffc0000010200(slab|head)
> raw: 02fffc0000010200 ffffea00074a1f88 ffffea0006ebbb88 ffff8881d2dce0c0
> raw: 0000000000000000 ffff8881b72ae240 0000000100000001 ffff8881d87fe580
> page dumped because: kasan: bad access detected
> page->mem_cgroup:ffff8881d87fe580
>
> Memory state around the buggy address:
>  ffff8881b72af200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff8881b72af280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> ffff8881b72af300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>                          ^
>  ffff8881b72af380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff8881b72af400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==================================================================
>
>
> .
>
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-03 16:19   ` zhong jiang
@ 2019-03-04  7:40     ` Dmitry Vyukov
  2019-03-04 14:00       ` zhong jiang
  2019-03-04 21:51     ` Matthew Wilcox
  1 sibling, 1 reply; 26+ messages in thread
From: Dmitry Vyukov @ 2019-03-04  7:40 UTC (permalink / raw)
  To: zhong jiang
  Cc: syzbot, Michal Hocko, Andrea Arcangeli, cgroups, Johannes Weiner,
	LKML, Linux-MM, syzkaller-bugs, Vladimir Davydov, David Rientjes,
	Hugh Dickins, Matthew Wilcox, Mel Gorman, Vlastimil Babka
On Sun, Mar 3, 2019 at 5:19 PM zhong jiang <zhongjiang@huawei.com> wrote:
>
> Hi, guys
>
> I also hit the following issue, but I failed to reproduce it from the log.
>
> It seems that we access mm->owner, and dereferencing it results in the UAF.
> But it should not be possible for an incomplete process to be set as mm->owner.
>
> Any thoughts?
FWIW syzbot was able to reproduce this with this reproducer.
This looks like a very subtle race (a threaded reproducer that runs
repeatedly in multiple processes), so most likely we are looking for
something like an inconsistency window of a few instructions.
> Thanks,
> zhong jiang
>
> On 2018/12/4 23:43, syzbot wrote:
> > syzbot has found a reproducer for the following crash on:
> >
> > HEAD commit:    0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a3400000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
> > dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
> > compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12835e25400000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a3400000
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+cbb52e396df3e565ab02@syzkaller.appspotmail.com
> >
> > cgroup: fork rejected by pids controller in /syz2
> > ==================================================================
> > BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 [inline]
> > BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline]
> > BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 [inline]
> > BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
> > Read of size 8 at addr ffff8881b72af310 by task syz-executor198/9332
> >
> > CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > Call Trace:
> >  __dump_stack lib/dump_stack.c:77 [inline]
> >  dump_stack+0x244/0x39d lib/dump_stack.c:113
> >  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
> >  kasan_report_error mm/kasan/report.c:354 [inline]
> >  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
> >  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
> >  __read_once_size include/linux/compiler.h:182 [inline]
> >  task_css include/linux/cgroup.h:477 [inline]
> >  mem_cgroup_from_task mm/memcontrol.c:815 [inline]
> >  get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
> >  get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
> >  mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
> >  mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
> >  mfill_atomic_pte mm/userfaultfd.c:418 [inline]
> >  __mcopy_atomic mm/userfaultfd.c:559 [inline]
> >  mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
> >  userfaultfd_copy fs/userfaultfd.c:1705 [inline]
> >  userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
> >  vfs_ioctl fs/ioctl.c:46 [inline]
> >  file_ioctl fs/ioctl.c:509 [inline]
> >  do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
> >  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
> >  __do_sys_ioctl fs/ioctl.c:720 [inline]
> >  __se_sys_ioctl fs/ioctl.c:718 [inline]
> >  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
> >  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > RIP: 0033:0x44c7e9
> > Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> > RSP: 002b:00007f906b69fdb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > RAX: ffffffffffffffda RBX: 00000000006e4a08 RCX: 000000000044c7e9
> > RDX: 0000000020000100 RSI: 00000000c028aa03 RDI: 0000000000000004
> > RBP: 00000000006e4a00 R08: 0000000000000000 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006e4a0c
> > R13: 00007ffdfd47813f R14: 00007f906b6a09c0 R15: 000000000000002d
> >
> > Allocated by task 9325:
> >  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
> >  set_track mm/kasan/kasan.c:460 [inline]
> >  kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
> >  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
> >  kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644
> >  alloc_task_struct_node kernel/fork.c:158 [inline]
> >  dup_task_struct kernel/fork.c:843 [inline]
> >  copy_process+0x2026/0x87a0 kernel/fork.c:1751
> >  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
> >  __do_sys_clone kernel/fork.c:2323 [inline]
> >  __se_sys_clone kernel/fork.c:2317 [inline]
> >  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
> >  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >
> > Freed by task 9325:
> >  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
> >  set_track mm/kasan/kasan.c:460 [inline]
> >  __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
> >  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
> >  __cache_free mm/slab.c:3498 [inline]
> >  kmem_cache_free+0x83/0x290 mm/slab.c:3760
> >  free_task_struct kernel/fork.c:163 [inline]
> >  free_task+0x16e/0x1f0 kernel/fork.c:457
> >  copy_process+0x1dcc/0x87a0 kernel/fork.c:2148
> >  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
> >  __do_sys_clone kernel/fork.c:2323 [inline]
> >  __se_sys_clone kernel/fork.c:2317 [inline]
> >  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
> >  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >
> > The buggy address belongs to the object at ffff8881b72ae240
> >  which belongs to the cache task_struct(81:syz2) of size 6080
> > The buggy address is located 4304 bytes inside of
> >  6080-byte region [ffff8881b72ae240, ffff8881b72afa00)
> > The buggy address belongs to the page:
> > page:ffffea0006dcab80 count:1 mapcount:0 mapping:ffff8881d2dce0c0 index:0x0 compound_mapcount: 0
> > flags: 0x2fffc0000010200(slab|head)
> > raw: 02fffc0000010200 ffffea00074a1f88 ffffea0006ebbb88 ffff8881d2dce0c0
> > raw: 0000000000000000 ffff8881b72ae240 0000000100000001 ffff8881d87fe580
> > page dumped because: kasan: bad access detected
> > page->mem_cgroup:ffff8881d87fe580
> >
> > Memory state around the buggy address:
> >  ffff8881b72af200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >  ffff8881b72af280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >> ffff8881b72af300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >                          ^
> >  ffff8881b72af380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >  ffff8881b72af400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> > ==================================================================
> >
> >
> > .
> >
>
>
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-04  7:40     ` Dmitry Vyukov
@ 2019-03-04 14:00       ` zhong jiang
  2019-03-04 14:11         ` Dmitry Vyukov
  0 siblings, 1 reply; 26+ messages in thread
From: zhong jiang @ 2019-03-04 14:00 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzbot, Michal Hocko, Andrea Arcangeli, cgroups, Johannes Weiner,
	LKML, Linux-MM, syzkaller-bugs, Vladimir Davydov, David Rientjes,
	Hugh Dickins, Matthew Wilcox, Mel Gorman, Vlastimil Babka
On 2019/3/4 15:40, Dmitry Vyukov wrote:
> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang <zhongjiang@huawei.com> wrote:
>> Hi, guys
>>
>> I also hit the following issue, but I failed to reproduce it from the log.
>>
>> It seems that we access mm->owner, and dereferencing it results in the UAF.
>> But it should not be possible for an incomplete process to be set as mm->owner.
>>
>> Any thoughts?
> FWIW syzbot was able to reproduce this with this reproducer.
> This looks like a very subtle race (a threaded reproducer that runs
> repeatedly in multiple processes), so most likely we are looking for
> something like an inconsistency window of a few instructions.
>
I am a little doubtful about the instruction inconsistency window.
I guess you mean that some SMP memory barriers should be taken into account. :-)
Because, IMO, a locking problem should not be what causes this issue.
Thanks,
zhong jiang
>> Thanks,
>> zhong jiang
>>
>> On 2018/12/4 23:43, syzbot wrote:
>>> syzbot has found a reproducer for the following crash on:
>>>
>>> HEAD commit:    0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel..
>>> git tree:       upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a3400000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12835e25400000
>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a3400000
>>>
>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>> Reported-by: syzbot+cbb52e396df3e565ab02@syzkaller.appspotmail.com
>>>
>>> cgroup: fork rejected by pids controller in /syz2
>>> ==================================================================
>>> BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 [inline]
>>> BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline]
>>> BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 [inline]
>>> BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
>>> Read of size 8 at addr ffff8881b72af310 by task syz-executor198/9332
>>>
>>> CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>> Call Trace:
>>>  __dump_stack lib/dump_stack.c:77 [inline]
>>>  dump_stack+0x244/0x39d lib/dump_stack.c:113
>>>  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
>>>  kasan_report_error mm/kasan/report.c:354 [inline]
>>>  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
>>>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
>>>  __read_once_size include/linux/compiler.h:182 [inline]
>>>  task_css include/linux/cgroup.h:477 [inline]
>>>  mem_cgroup_from_task mm/memcontrol.c:815 [inline]
>>>  get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
>>>  get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
>>>  mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
>>>  mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
>>>  mfill_atomic_pte mm/userfaultfd.c:418 [inline]
>>>  __mcopy_atomic mm/userfaultfd.c:559 [inline]
>>>  mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
>>>  userfaultfd_copy fs/userfaultfd.c:1705 [inline]
>>>  userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
>>>  vfs_ioctl fs/ioctl.c:46 [inline]
>>>  file_ioctl fs/ioctl.c:509 [inline]
>>>  do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
>>>  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
>>>  __do_sys_ioctl fs/ioctl.c:720 [inline]
>>>  __se_sys_ioctl fs/ioctl.c:718 [inline]
>>>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
>>>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>> RIP: 0033:0x44c7e9
>>> Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00
>>> RSP: 002b:00007f906b69fdb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>>> RAX: ffffffffffffffda RBX: 00000000006e4a08 RCX: 000000000044c7e9
>>> RDX: 0000000020000100 RSI: 00000000c028aa03 RDI: 0000000000000004
>>> RBP: 00000000006e4a00 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006e4a0c
>>> R13: 00007ffdfd47813f R14: 00007f906b6a09c0 R15: 000000000000002d
>>>
>>> Allocated by task 9325:
>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
>>>  set_track mm/kasan/kasan.c:460 [inline]
>>>  kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
>>>  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
>>>  kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644
>>>  alloc_task_struct_node kernel/fork.c:158 [inline]
>>>  dup_task_struct kernel/fork.c:843 [inline]
>>>  copy_process+0x2026/0x87a0 kernel/fork.c:1751
>>>  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
>>>  __do_sys_clone kernel/fork.c:2323 [inline]
>>>  __se_sys_clone kernel/fork.c:2317 [inline]
>>>  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
>>>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>>
>>> Freed by task 9325:
>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
>>>  set_track mm/kasan/kasan.c:460 [inline]
>>>  __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
>>>  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
>>>  __cache_free mm/slab.c:3498 [inline]
>>>  kmem_cache_free+0x83/0x290 mm/slab.c:3760
>>>  free_task_struct kernel/fork.c:163 [inline]
>>>  free_task+0x16e/0x1f0 kernel/fork.c:457
>>>  copy_process+0x1dcc/0x87a0 kernel/fork.c:2148
>>>  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
>>>  __do_sys_clone kernel/fork.c:2323 [inline]
>>>  __se_sys_clone kernel/fork.c:2317 [inline]
>>>  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
>>>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>>
>>> The buggy address belongs to the object at ffff8881b72ae240
>>>  which belongs to the cache task_struct(81:syz2) of size 6080
>>> The buggy address is located 4304 bytes inside of
>>>  6080-byte region [ffff8881b72ae240, ffff8881b72afa00)
>>> The buggy address belongs to the page:
>>> page:ffffea0006dcab80 count:1 mapcount:0 mapping:ffff8881d2dce0c0 index:0x0 compound_mapcount: 0
>>> flags: 0x2fffc0000010200(slab|head)
>>> raw: 02fffc0000010200 ffffea00074a1f88 ffffea0006ebbb88 ffff8881d2dce0c0
>>> raw: 0000000000000000 ffff8881b72ae240 0000000100000001 ffff8881d87fe580
>>> page dumped because: kasan: bad access detected
>>> page->mem_cgroup:ffff8881d87fe580
>>>
>>> Memory state around the buggy address:
>>>  ffff8881b72af200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>>  ffff8881b72af280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>>> ffff8881b72af300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>>                          ^
>>>  ffff8881b72af380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>>  ffff8881b72af400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>> ==================================================================
>>>
>>>
>>> .
>>>
>>
> .
>
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-04 14:00       ` zhong jiang
@ 2019-03-04 14:11         ` Dmitry Vyukov
  2019-03-04 15:32           ` zhong jiang
  0 siblings, 1 reply; 26+ messages in thread
From: Dmitry Vyukov @ 2019-03-04 14:11 UTC (permalink / raw)
  To: zhong jiang
  Cc: syzbot, Michal Hocko, Andrea Arcangeli, cgroups, Johannes Weiner,
	LKML, Linux-MM, syzkaller-bugs, Vladimir Davydov, David Rientjes,
	Hugh Dickins, Matthew Wilcox, Mel Gorman, Vlastimil Babka
On Mon, Mar 4, 2019 at 3:00 PM zhong jiang <zhongjiang@huawei.com> wrote:
>
> On 2019/3/4 15:40, Dmitry Vyukov wrote:
> > On Sun, Mar 3, 2019 at 5:19 PM zhong jiang <zhongjiang@huawei.com> wrote:
> >> Hi, guys
> >>
> >> I also hit the following issue, but I failed to reproduce it from the log.
> >>
> >> It seems that we access mm->owner and dereferencing it results in the UAF.
> >> But it should not be possible for an incomplete process to be set as mm->owner.
> >>
> >> Any thoughts?
> > FWIW syzbot was able to reproduce this with this reproducer.
> > This looks like a very subtle race (threaded reproducer that runs
> > repeatedly in multiple processes), so most likely we are looking for
> > something like a few-instruction inconsistency window.
> >
>
> I am a little doubtful about the few-instruction inconsistency window.
>
> I guess you mean that some smp barriers should be taken into account. :-)
>
> Because IMO, it should not be a locking problem that results in the issue.
Since the crash was triggered on x86, _most likely_ this is not a
missed barrier. What I meant is that one thread needs to execute some
code while another thread is stopped within a window of a few instructions.
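To make the kind of window I mean concrete, here is a minimal userspace sketch. It is not kernel code: win_obj, win_slot and all the win_* helpers are hypothetical names invented for illustration, and the "live" flag stands in for the memory actually being freed. The reader works in two steps (load the shared pointer, then use it), and the outcome depends only on whether the writer runs between those two steps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; "live" models whether the memory is still allocated. */
struct win_obj { bool live; };
struct win_slot { struct win_obj *ptr; };   /* a shared pointer, in the spirit of mm->owner */

/* Reader step 1: take a snapshot of the shared pointer. */
struct win_obj *win_reader_load(const struct win_slot *slot)
{
    return slot->ptr;
}

/* Reader step 2: report whether using the snapshot would touch a released object. */
bool win_reader_use_is_uaf(const struct win_obj *snap)
{
    return snap != NULL && !snap->live;
}

/* Writer: release the object and unpublish it from the slot. */
void win_writer_release(struct win_slot *slot)
{
    if (slot->ptr != NULL)
        slot->ptr->live = false;            /* stands in for freeing the object */
    slot->ptr = NULL;
}

/*
 * Run one interleaving. If writer_in_window is true, the writer runs
 * between the reader's load and use; otherwise it runs before the load.
 * Returns true if the reader ends up using a released object.
 */
bool win_run(bool writer_in_window)
{
    struct win_obj obj = { .live = true };
    struct win_slot slot = { .ptr = &obj };
    struct win_obj *snap;

    if (!writer_in_window)
        win_writer_release(&slot);

    snap = win_reader_load(&slot);

    if (writer_in_window)
        win_writer_release(&slot);

    return win_reader_use_is_uaf(snap);
}
```

Calling win_run(true) models the bad interleaving (the reader holds a snapshot that has gone stale), while win_run(false) models the harmless order where the reader simply sees NULL. No barrier is involved in either case; only the ordering of the steps matters.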
> Thanks,
> zhong jiang
> >> Thanks,
> >> zhong jiang
> >>
> >> On 2018/12/4 23:43, syzbot wrote:
> >>> syzbot has found a reproducer for the following crash on:
> >>>
> >>> HEAD commit:    0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel..
> >>> git tree:       upstream
> >>> console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a3400000
> >>> kernel config:  https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
> >>> dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
> >>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> >>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12835e25400000
> >>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a3400000
> >>>
> >>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >>> Reported-by: syzbot+cbb52e396df3e565ab02@syzkaller.appspotmail.com
> >>>
> >>> cgroup: fork rejected by pids controller in /syz2
> >>> ==================================================================
> >>> BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 [inline]
> >>> BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline]
> >>> BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 [inline]
> >>> BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
> >>> Read of size 8 at addr ffff8881b72af310 by task syz-executor198/9332
> >>>
> >>> CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
> >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> >>> Call Trace:
> >>>  __dump_stack lib/dump_stack.c:77 [inline]
> >>>  dump_stack+0x244/0x39d lib/dump_stack.c:113
> >>>  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
> >>>  kasan_report_error mm/kasan/report.c:354 [inline]
> >>>  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
> >>>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
> >>>  __read_once_size include/linux/compiler.h:182 [inline]
> >>>  task_css include/linux/cgroup.h:477 [inline]
> >>>  mem_cgroup_from_task mm/memcontrol.c:815 [inline]
> >>>  get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
> >>>  get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
> >>>  mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
> >>>  mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
> >>>  mfill_atomic_pte mm/userfaultfd.c:418 [inline]
> >>>  __mcopy_atomic mm/userfaultfd.c:559 [inline]
> >>>  mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
> >>>  userfaultfd_copy fs/userfaultfd.c:1705 [inline]
> >>>  userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
> >>>  vfs_ioctl fs/ioctl.c:46 [inline]
> >>>  file_ioctl fs/ioctl.c:509 [inline]
> >>>  do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
> >>>  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
> >>>  __do_sys_ioctl fs/ioctl.c:720 [inline]
> >>>  __se_sys_ioctl fs/ioctl.c:718 [inline]
> >>>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
> >>>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> >>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >>> RIP: 0033:0x44c7e9
> >>> Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> >>> RSP: 002b:00007f906b69fdb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> >>> RAX: ffffffffffffffda RBX: 00000000006e4a08 RCX: 000000000044c7e9
> >>> RDX: 0000000020000100 RSI: 00000000c028aa03 RDI: 0000000000000004
> >>> RBP: 00000000006e4a00 R08: 0000000000000000 R09: 0000000000000000
> >>> R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006e4a0c
> >>> R13: 00007ffdfd47813f R14: 00007f906b6a09c0 R15: 000000000000002d
> >>>
> >>> Allocated by task 9325:
> >>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
> >>>  set_track mm/kasan/kasan.c:460 [inline]
> >>>  kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
> >>>  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
> >>>  kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644
> >>>  alloc_task_struct_node kernel/fork.c:158 [inline]
> >>>  dup_task_struct kernel/fork.c:843 [inline]
> >>>  copy_process+0x2026/0x87a0 kernel/fork.c:1751
> >>>  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
> >>>  __do_sys_clone kernel/fork.c:2323 [inline]
> >>>  __se_sys_clone kernel/fork.c:2317 [inline]
> >>>  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
> >>>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> >>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >>>
> >>> Freed by task 9325:
> >>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
> >>>  set_track mm/kasan/kasan.c:460 [inline]
> >>>  __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
> >>>  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
> >>>  __cache_free mm/slab.c:3498 [inline]
> >>>  kmem_cache_free+0x83/0x290 mm/slab.c:3760
> >>>  free_task_struct kernel/fork.c:163 [inline]
> >>>  free_task+0x16e/0x1f0 kernel/fork.c:457
> >>>  copy_process+0x1dcc/0x87a0 kernel/fork.c:2148
> >>>  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
> >>>  __do_sys_clone kernel/fork.c:2323 [inline]
> >>>  __se_sys_clone kernel/fork.c:2317 [inline]
> >>>  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
> >>>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> >>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >>>
> >>> The buggy address belongs to the object at ffff8881b72ae240
> >>>  which belongs to the cache task_struct(81:syz2) of size 6080
> >>> The buggy address is located 4304 bytes inside of
> >>>  6080-byte region [ffff8881b72ae240, ffff8881b72afa00)
> >>> The buggy address belongs to the page:
> >>> page:ffffea0006dcab80 count:1 mapcount:0 mapping:ffff8881d2dce0c0 index:0x0 compound_mapcount: 0
> >>> flags: 0x2fffc0000010200(slab|head)
> >>> raw: 02fffc0000010200 ffffea00074a1f88 ffffea0006ebbb88 ffff8881d2dce0c0
> >>> raw: 0000000000000000 ffff8881b72ae240 0000000100000001 ffff8881d87fe580
> >>> page dumped because: kasan: bad access detected
> >>> page->mem_cgroup:ffff8881d87fe580
> >>>
> >>> Memory state around the buggy address:
> >>>  ffff8881b72af200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >>>  ffff8881b72af280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >>>> ffff8881b72af300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >>>                          ^
> >>>  ffff8881b72af380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >>>  ffff8881b72af400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >>> ==================================================================
> >>>
> >>>
> >>> .
> >>>
> >>
> > .
> >
>
>
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-04 14:11         ` Dmitry Vyukov
@ 2019-03-04 15:32           ` zhong jiang
  2019-03-05  6:26             ` Dmitry Vyukov
  0 siblings, 1 reply; 26+ messages in thread
From: zhong jiang @ 2019-03-04 15:32 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzbot, Michal Hocko, Andrea Arcangeli, cgroups, Johannes Weiner,
	LKML, Linux-MM, syzkaller-bugs, Vladimir Davydov, David Rientjes,
	Hugh Dickins, Matthew Wilcox, Mel Gorman, Vlastimil Babka
On 2019/3/4 22:11, Dmitry Vyukov wrote:
> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang <zhongjiang@huawei.com> wrote:
>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang <zhongjiang@huawei.com> wrote:
>>>> Hi, guys
>>>>
>>>> I also hit the following issue, but I failed to reproduce it from the log.
>>>>
>>>> It seems that we access mm->owner and dereferencing it results in the UAF.
>>>> But it should not be possible for an incomplete process to be set as mm->owner.
>>>>
>>>> Any thoughts?
>>> FWIW syzbot was able to reproduce this with this reproducer.
>>> This looks like a very subtle race (threaded reproducer that runs
>>> repeatedly in multiple processes), so most likely we are looking for
>>> something like a few-instruction inconsistency window.
>>>
>> I am a little doubtful about the few-instruction inconsistency window.
>>
>> I guess you mean that some smp barriers should be taken into account. :-)
>>
>> Because IMO, it should not be a locking problem that results in the issue.
>
> Since the crash was triggered on x86, _most likely_ this is not a
> missed barrier. What I meant is that one thread needs to execute some
> code while another thread is stopped within a window of a few instructions.
>
>
It is weird, and I cannot see how what you described relates to the issue. :-(
The cause is that mm->owner has been freed while we still dereference it.
According to the "Freed by task" call trace, the free happened because process creation failed.
Am I missing something, or did I misunderstand you? Please correct me.
Thanks,
zhong jiang
>> Thanks,
>> zhong jiang
>>>> Thanks,
>>>> zhong jiang
>>>>
>>>> On 2018/12/4 23:43, syzbot wrote:
>>>>> syzbot has found a reproducer for the following crash on:
>>>>>
>>>>> HEAD commit:    0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel..
>>>>> git tree:       upstream
>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a3400000
>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
>>>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12835e25400000
>>>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a3400000
>>>>>
>>>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>>>> Reported-by: syzbot+cbb52e396df3e565ab02@syzkaller.appspotmail.com
>>>>>
>>>>> cgroup: fork rejected by pids controller in /syz2
>>>>> ==================================================================
>>>>> BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 [inline]
>>>>> BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline]
>>>>> BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 [inline]
>>>>> BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
>>>>> Read of size 8 at addr ffff8881b72af310 by task syz-executor198/9332
>>>>>
>>>>> CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
>>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>>>> Call Trace:
>>>>>  __dump_stack lib/dump_stack.c:77 [inline]
>>>>>  dump_stack+0x244/0x39d lib/dump_stack.c:113
>>>>>  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
>>>>>  kasan_report_error mm/kasan/report.c:354 [inline]
>>>>>  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
>>>>>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
>>>>>  __read_once_size include/linux/compiler.h:182 [inline]
>>>>>  task_css include/linux/cgroup.h:477 [inline]
>>>>>  mem_cgroup_from_task mm/memcontrol.c:815 [inline]
>>>>>  get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
>>>>>  get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
>>>>>  mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
>>>>>  mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
>>>>>  mfill_atomic_pte mm/userfaultfd.c:418 [inline]
>>>>>  __mcopy_atomic mm/userfaultfd.c:559 [inline]
>>>>>  mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
>>>>>  userfaultfd_copy fs/userfaultfd.c:1705 [inline]
>>>>>  userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
>>>>>  vfs_ioctl fs/ioctl.c:46 [inline]
>>>>>  file_ioctl fs/ioctl.c:509 [inline]
>>>>>  do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
>>>>>  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
>>>>>  __do_sys_ioctl fs/ioctl.c:720 [inline]
>>>>>  __se_sys_ioctl fs/ioctl.c:718 [inline]
>>>>>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
>>>>>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>>>>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>>>> RIP: 0033:0x44c7e9
>>>>> Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00
>>>>> RSP: 002b:00007f906b69fdb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>>>>> RAX: ffffffffffffffda RBX: 00000000006e4a08 RCX: 000000000044c7e9
>>>>> RDX: 0000000020000100 RSI: 00000000c028aa03 RDI: 0000000000000004
>>>>> RBP: 00000000006e4a00 R08: 0000000000000000 R09: 0000000000000000
>>>>> R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006e4a0c
>>>>> R13: 00007ffdfd47813f R14: 00007f906b6a09c0 R15: 000000000000002d
>>>>>
>>>>> Allocated by task 9325:
>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
>>>>>  set_track mm/kasan/kasan.c:460 [inline]
>>>>>  kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
>>>>>  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
>>>>>  kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644
>>>>>  alloc_task_struct_node kernel/fork.c:158 [inline]
>>>>>  dup_task_struct kernel/fork.c:843 [inline]
>>>>>  copy_process+0x2026/0x87a0 kernel/fork.c:1751
>>>>>  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
>>>>>  __do_sys_clone kernel/fork.c:2323 [inline]
>>>>>  __se_sys_clone kernel/fork.c:2317 [inline]
>>>>>  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
>>>>>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>>>>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>>>>
>>>>> Freed by task 9325:
>>>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
>>>>>  set_track mm/kasan/kasan.c:460 [inline]
>>>>>  __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
>>>>>  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
>>>>>  __cache_free mm/slab.c:3498 [inline]
>>>>>  kmem_cache_free+0x83/0x290 mm/slab.c:3760
>>>>>  free_task_struct kernel/fork.c:163 [inline]
>>>>>  free_task+0x16e/0x1f0 kernel/fork.c:457
>>>>>  copy_process+0x1dcc/0x87a0 kernel/fork.c:2148
>>>>>  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
>>>>>  __do_sys_clone kernel/fork.c:2323 [inline]
>>>>>  __se_sys_clone kernel/fork.c:2317 [inline]
>>>>>  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
>>>>>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>>>>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>>>>
>>>>> The buggy address belongs to the object at ffff8881b72ae240
>>>>>  which belongs to the cache task_struct(81:syz2) of size 6080
>>>>> The buggy address is located 4304 bytes inside of
>>>>>  6080-byte region [ffff8881b72ae240, ffff8881b72afa00)
>>>>> The buggy address belongs to the page:
>>>>> page:ffffea0006dcab80 count:1 mapcount:0 mapping:ffff8881d2dce0c0 index:0x0 compound_mapcount: 0
>>>>> flags: 0x2fffc0000010200(slab|head)
>>>>> raw: 02fffc0000010200 ffffea00074a1f88 ffffea0006ebbb88 ffff8881d2dce0c0
>>>>> raw: 0000000000000000 ffff8881b72ae240 0000000100000001 ffff8881d87fe580
>>>>> page dumped because: kasan: bad access detected
>>>>> page->mem_cgroup:ffff8881d87fe580
>>>>>
>>>>> Memory state around the buggy address:
>>>>>  ffff8881b72af200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>>>>  ffff8881b72af280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>>>>> ffff8881b72af300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>>>>                          ^
>>>>>  ffff8881b72af380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>>>>  ffff8881b72af400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>>>> ==================================================================
>>>>>
>>>>>
>>>>> .
>>>>>
>>> .
>>>
>>
> .
>
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-03 16:19   ` zhong jiang
  2019-03-04  7:40     ` Dmitry Vyukov
@ 2019-03-04 21:51     ` Matthew Wilcox
  2019-03-05  3:09       ` zhong jiang
  1 sibling, 1 reply; 26+ messages in thread
From: Matthew Wilcox @ 2019-03-04 21:51 UTC (permalink / raw)
  To: zhong jiang
  Cc: syzbot, mhocko, Andrea Arcangeli, cgroups, hannes, linux-kernel,
	linux-mm, syzkaller-bugs, vdavydov.dev, David Rientjes,
	Hugh Dickins, Mel Gorman, Vlastimil Babka
On Mon, Mar 04, 2019 at 12:19:32AM +0800, zhong jiang wrote:
> I also hit the following issue, but I failed to reproduce it from the log.
> 
> It seems that we access mm->owner and dereferencing it results in the UAF.
> But it should not be possible for an incomplete process to be set as mm->owner.
OK, so we've got thread 9325 calling fork() and failing due to the PID
controller saying "no".  9325 calls free_task(), but somehow thread 9332
has a reference to the struct task_struct.  There are two possibilities
here: one is that 9332 really did manage to get a reference to the larval
child of 9325, and the other is that 9332 has a stale reference to some
memory which was reallocated to 9325's child.
Andrea, is there any way for a UFFD thread to get access to the child's
task_struct during the copy_process() call?  If so, I think copy_process()
needs to call mm_update_next_owner().
If there's no way for that to happen, then we have quite a bug-hunt ahead
of us looking for who is missing a call to mm_update_next_owner().
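For illustration only, the first possibility can be modelled in userspace as below. This is a toy sketch, not the kernel's code and not a proposed patch: toy_task, toy_mm and the toy_* functions are hypothetical names, the "live" flag stands in for free_task() releasing the memory, and the hand_back_owner switch only loosely mimics the effect one would want from a call like mm_update_next_owner() on the failure path. How the larval child could become the owner in the first place is exactly the open question, so the sketch simply assumes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_task;

/* Toy stand-in for mm_struct; only the owner pointer is modelled. */
struct toy_mm {
    struct toy_task *owner;
};

/* Toy stand-in for task_struct; "live" is false once the task is freed. */
struct toy_task {
    int pid;
    bool live;
    struct toy_mm *mm;
};

/*
 * Models the failure path of a fork: the child is torn down.
 * If hand_back_owner is set, ownership of the mm is first moved
 * back to the parent (a hypothetical repair step).
 */
void toy_fork_fail(struct toy_task *parent, struct toy_task *child,
                   bool hand_back_owner)
{
    if (hand_back_owner && child->mm != NULL && child->mm->owner == child)
        child->mm->owner = parent;
    child->live = false;                    /* stands in for free_task() */
}

/* True if the mm's owner points at a task that has already been freed. */
bool toy_owner_is_stale(const struct toy_mm *mm)
{
    return mm->owner != NULL && !mm->owner->live;
}

/* Runs the whole scenario and reports whether a stale owner is left behind. */
bool toy_scenario(bool hand_back_owner)
{
    struct toy_mm mm = { .owner = NULL };
    struct toy_task parent = { .pid = 9325, .live = true, .mm = &mm };
    struct toy_task child = { .pid = -1, .live = true, .mm = &mm };  /* larval, no pid yet */

    mm.owner = &parent;
    mm.owner = &child;      /* ASSUMPTION: the larval child somehow became the owner */

    toy_fork_fail(&parent, &child, hand_back_owner);
    return toy_owner_is_stale(&mm);
}
```

With toy_scenario(false) the mm is left with an owner pointing at freed memory, which is the state a later reader such as 9332 would trip over; with toy_scenario(true) the owner is valid again before the child goes away.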
> On 2018/12/4 23:43, syzbot wrote:
> > syzbot has found a reproducer for the following crash on:
> >
> > HEAD commit:    0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a3400000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
> > dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
> > compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12835e25400000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a3400000
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+cbb52e396df3e565ab02@syzkaller.appspotmail.com
> >
> > cgroup: fork rejected by pids controller in /syz2
> > ==================================================================
> > BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 [inline]
> > BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline]
> > BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 [inline]
> > BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
> > Read of size 8 at addr ffff8881b72af310 by task syz-executor198/9332
> >
> > CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > Call Trace:
> >  __dump_stack lib/dump_stack.c:77 [inline]
> >  dump_stack+0x244/0x39d lib/dump_stack.c:113
> >  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
> >  kasan_report_error mm/kasan/report.c:354 [inline]
> >  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
> >  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
> >  __read_once_size include/linux/compiler.h:182 [inline]
> >  task_css include/linux/cgroup.h:477 [inline]
> >  mem_cgroup_from_task mm/memcontrol.c:815 [inline]
> >  get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
> >  get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
> >  mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
> >  mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
> >  mfill_atomic_pte mm/userfaultfd.c:418 [inline]
> >  __mcopy_atomic mm/userfaultfd.c:559 [inline]
> >  mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
> >  userfaultfd_copy fs/userfaultfd.c:1705 [inline]
> >  userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
> >  vfs_ioctl fs/ioctl.c:46 [inline]
> >  file_ioctl fs/ioctl.c:509 [inline]
> >  do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
> >  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
> >  __do_sys_ioctl fs/ioctl.c:720 [inline]
> >  __se_sys_ioctl fs/ioctl.c:718 [inline]
> >  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
> >  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > RIP: 0033:0x44c7e9
> > Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> > RSP: 002b:00007f906b69fdb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > RAX: ffffffffffffffda RBX: 00000000006e4a08 RCX: 000000000044c7e9
> > RDX: 0000000020000100 RSI: 00000000c028aa03 RDI: 0000000000000004
> > RBP: 00000000006e4a00 R08: 0000000000000000 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006e4a0c
> > R13: 00007ffdfd47813f R14: 00007f906b6a09c0 R15: 000000000000002d
> >
> > Allocated by task 9325:
> >  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
> >  set_track mm/kasan/kasan.c:460 [inline]
> >  kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
> >  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
> >  kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644
> >  alloc_task_struct_node kernel/fork.c:158 [inline]
> >  dup_task_struct kernel/fork.c:843 [inline]
> >  copy_process+0x2026/0x87a0 kernel/fork.c:1751
> >  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
> >  __do_sys_clone kernel/fork.c:2323 [inline]
> >  __se_sys_clone kernel/fork.c:2317 [inline]
> >  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
> >  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >
> > Freed by task 9325:
> >  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
> >  set_track mm/kasan/kasan.c:460 [inline]
> >  __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
> >  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
> >  __cache_free mm/slab.c:3498 [inline]
> >  kmem_cache_free+0x83/0x290 mm/slab.c:3760
> >  free_task_struct kernel/fork.c:163 [inline]
> >  free_task+0x16e/0x1f0 kernel/fork.c:457
> >  copy_process+0x1dcc/0x87a0 kernel/fork.c:2148
> >  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
> >  __do_sys_clone kernel/fork.c:2323 [inline]
> >  __se_sys_clone kernel/fork.c:2317 [inline]
> >  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
> >  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >
> > The buggy address belongs to the object at ffff8881b72ae240
> >  which belongs to the cache task_struct(81:syz2) of size 6080
> > The buggy address is located 4304 bytes inside of
> >  6080-byte region [ffff8881b72ae240, ffff8881b72afa00)
> > The buggy address belongs to the page:
> > page:ffffea0006dcab80 count:1 mapcount:0 mapping:ffff8881d2dce0c0 index:0x0 compound_mapcount: 0
> > flags: 0x2fffc0000010200(slab|head)
> > raw: 02fffc0000010200 ffffea00074a1f88 ffffea0006ebbb88 ffff8881d2dce0c0
> > raw: 0000000000000000 ffff8881b72ae240 0000000100000001 ffff8881d87fe580
> > page dumped because: kasan: bad access detected
> > page->mem_cgroup:ffff8881d87fe580
> >
> > Memory state around the buggy address:
> >  ffff8881b72af200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >  ffff8881b72af280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >> ffff8881b72af300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >                          ^
> >  ffff8881b72af380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >  ffff8881b72af400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> > ==================================================================
> >
> >
> > .
> >
> 
> 
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-04 21:51     ` Matthew Wilcox
@ 2019-03-05  3:09       ` zhong jiang
  0 siblings, 0 replies; 26+ messages in thread
From: zhong jiang @ 2019-03-05  3:09 UTC (permalink / raw)
  To: Matthew Wilcox, Andrea Arcangeli
  Cc: syzbot, mhocko, cgroups, hannes, linux-kernel, linux-mm,
	syzkaller-bugs, vdavydov.dev, David Rientjes, Hugh Dickins,
	Mel Gorman, Vlastimil Babka
On 2019/3/5 5:51, Matthew Wilcox wrote:
> On Mon, Mar 04, 2019 at 12:19:32AM +0800, zhong jiang wrote:
>> I also hit the following issue, but I failed to reproduce it from the log.
>>
>> It seems that we access mm->owner and dereferencing it results in the UAF.
>> But it should not be possible for an incomplete process to be set as mm->owner.
> OK, so we've got thread 9325 calling fork() and failing due to the PID
> controller saying "no".  9325 calls free_task(), but somehow thread 9332
> has a reference to the struct task_struct.  There are two possibilities
> here: one is that 9332 really did manage to get a reference to the larval
> child of 9325, and the other is that 9332 has a stale reference to some
> memory which was reallocated to 9325's child.
Good guess and analysis. IMO, 9332 cannot touch the task_struct directly in this code path,
but it can get a reference to the mm_struct. Maybe I am missing something important.
> Andrea, is there any way for a UFFD thread to get access to the child's
> task_struct during the copy_process() call?  If so, I think copy_process()
> needs to call mm_update_next_owner().
Yep, I hope Andrea has time to look at this.
Thanks,
zhong jiang
> If there's no way for that to happen, then we have quite a bug-hunt ahead
> of us looking for who is missing a call to mm_update_next_owner().
>> On 2018/12/4 23:43, syzbot wrote:
>>> syzbot has found a reproducer for the following crash on:
>>>
>>> HEAD commit:    0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel..
>>> git tree:       upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a3400000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12835e25400000
>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a3400000
>>>
>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>> Reported-by: syzbot+cbb52e396df3e565ab02@syzkaller.appspotmail.com
>>>
>>> cgroup: fork rejected by pids controller in /syz2
>>> ==================================================================
>>> BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 [inline]
>>> BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline]
>>> BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 [inline]
>>> BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
>>> Read of size 8 at addr ffff8881b72af310 by task syz-executor198/9332
>>>
>>> CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>> Call Trace:
>>>  __dump_stack lib/dump_stack.c:77 [inline]
>>>  dump_stack+0x244/0x39d lib/dump_stack.c:113
>>>  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
>>>  kasan_report_error mm/kasan/report.c:354 [inline]
>>>  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
>>>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
>>>  __read_once_size include/linux/compiler.h:182 [inline]
>>>  task_css include/linux/cgroup.h:477 [inline]
>>>  mem_cgroup_from_task mm/memcontrol.c:815 [inline]
>>>  get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
>>>  get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
>>>  mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
>>>  mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
>>>  mfill_atomic_pte mm/userfaultfd.c:418 [inline]
>>>  __mcopy_atomic mm/userfaultfd.c:559 [inline]
>>>  mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
>>>  userfaultfd_copy fs/userfaultfd.c:1705 [inline]
>>>  userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
>>>  vfs_ioctl fs/ioctl.c:46 [inline]
>>>  file_ioctl fs/ioctl.c:509 [inline]
>>>  do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
>>>  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
>>>  __do_sys_ioctl fs/ioctl.c:720 [inline]
>>>  __se_sys_ioctl fs/ioctl.c:718 [inline]
>>>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
>>>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>> RIP: 0033:0x44c7e9
>>> Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00
>>> RSP: 002b:00007f906b69fdb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>>> RAX: ffffffffffffffda RBX: 00000000006e4a08 RCX: 000000000044c7e9
>>> RDX: 0000000020000100 RSI: 00000000c028aa03 RDI: 0000000000000004
>>> RBP: 00000000006e4a00 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006e4a0c
>>> R13: 00007ffdfd47813f R14: 00007f906b6a09c0 R15: 000000000000002d
>>>
>>> Allocated by task 9325:
>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
>>>  set_track mm/kasan/kasan.c:460 [inline]
>>>  kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
>>>  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
>>>  kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644
>>>  alloc_task_struct_node kernel/fork.c:158 [inline]
>>>  dup_task_struct kernel/fork.c:843 [inline]
>>>  copy_process+0x2026/0x87a0 kernel/fork.c:1751
>>>  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
>>>  __do_sys_clone kernel/fork.c:2323 [inline]
>>>  __se_sys_clone kernel/fork.c:2317 [inline]
>>>  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
>>>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>>
>>> Freed by task 9325:
>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
>>>  set_track mm/kasan/kasan.c:460 [inline]
>>>  __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
>>>  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
>>>  __cache_free mm/slab.c:3498 [inline]
>>>  kmem_cache_free+0x83/0x290 mm/slab.c:3760
>>>  free_task_struct kernel/fork.c:163 [inline]
>>>  free_task+0x16e/0x1f0 kernel/fork.c:457
>>>  copy_process+0x1dcc/0x87a0 kernel/fork.c:2148
>>>  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
>>>  __do_sys_clone kernel/fork.c:2323 [inline]
>>>  __se_sys_clone kernel/fork.c:2317 [inline]
>>>  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
>>>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>>
>>> The buggy address belongs to the object at ffff8881b72ae240
>>>  which belongs to the cache task_struct(81:syz2) of size 6080
>>> The buggy address is located 4304 bytes inside of
>>>  6080-byte region [ffff8881b72ae240, ffff8881b72afa00)
>>> The buggy address belongs to the page:
>>> page:ffffea0006dcab80 count:1 mapcount:0 mapping:ffff8881d2dce0c0 index:0x0 compound_mapcount: 0
>>> flags: 0x2fffc0000010200(slab|head)
>>> raw: 02fffc0000010200 ffffea00074a1f88 ffffea0006ebbb88 ffff8881d2dce0c0
>>> raw: 0000000000000000 ffff8881b72ae240 0000000100000001 ffff8881d87fe580
>>> page dumped because: kasan: bad access detected
>>> page->mem_cgroup:ffff8881d87fe580
>>>
>>> Memory state around the buggy address:
>>>  ffff8881b72af200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>>  ffff8881b72af280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>>> ffff8881b72af300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>>                          ^
>>>  ffff8881b72af380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>>  ffff8881b72af400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>> ==================================================================
>>>
>>>
>>> .
>>>
>>
> .
>
^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-04 15:32           ` zhong jiang
@ 2019-03-05  6:26             ` Dmitry Vyukov
  2019-03-05  6:42               ` zhong jiang
  0 siblings, 1 reply; 26+ messages in thread
From: Dmitry Vyukov @ 2019-03-05  6:26 UTC (permalink / raw)
  To: zhong jiang
  Cc: syzbot, Michal Hocko, Andrea Arcangeli, cgroups, Johannes Weiner,
	LKML, Linux-MM, syzkaller-bugs, Vladimir Davydov, David Rientjes,
	Hugh Dickins, Matthew Wilcox, Mel Gorman, Vlastimil Babka
On Mon, Mar 4, 2019 at 4:32 PM zhong jiang <zhongjiang@huawei.com> wrote:
>
> On 2019/3/4 22:11, Dmitry Vyukov wrote:
> > On Mon, Mar 4, 2019 at 3:00 PM zhong jiang <zhongjiang@huawei.com> wrote:
> >> On 2019/3/4 15:40, Dmitry Vyukov wrote:
> >>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang <zhongjiang@huawei.com> wrote:
> >>>> Hi, guys
> >>>>
> >>>> I also hit the following issue. but it fails to reproduce the issue by the log.
> >>>>
> >>>> it seems to the case that we access the mm->owner and deference it will result in the UAF.
> >>>> But it should not be possible that we specify the incomplete process to be the mm->owner.
> >>>>
> >>>> Any thoughts?
> >>> FWIW syzbot was able to reproduce this with this reproducer.
> >>> This looks like a very subtle race (threaded reproducer that runs
> >>> repeatedly in multiple processes), so most likely we are looking for
> >>> something like few instructions inconsistency window.
> >>>
> >> I has a little doubtful about the instrustions inconsistency window.
> >>
> >> I guess that you mean some smb barriers should be taken into account.:-)
> >>
> >> Because IMO, It should not be the lock case to result in the issue.
> >
> > Since the crash was triggered on x86 _most likley_ this is not a
> > missed barrier. What I meant is that one thread needs to executed some
> > code, while another thread is stopped within few instructions.
> >
> >
> It is weird and I can not find any relationship you had said with the issue.:-(
>
> Because It is the cause that mm->owner has been freed, whereas we still deference it.
>
> From the lastest freed task call trace, It fails to create process.
>
> Am I miss something or I misunderstand your meaning. Please correct me.
Your analysis looks correct. I am just saying that the root cause of
this use-after-free seems to be a race condition.
^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-05  6:26             ` Dmitry Vyukov
@ 2019-03-05  6:42               ` zhong jiang
  2019-03-06  2:05                 ` Andrea Arcangeli
  0 siblings, 1 reply; 26+ messages in thread
From: zhong jiang @ 2019-03-05  6:42 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzbot, Michal Hocko, Andrea Arcangeli, cgroups, Johannes Weiner,
	LKML, Linux-MM, syzkaller-bugs, Vladimir Davydov, David Rientjes,
	Hugh Dickins, Matthew Wilcox, Mel Gorman, Vlastimil Babka
On 2019/3/5 14:26, Dmitry Vyukov wrote:
> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang <zhongjiang@huawei.com> wrote:
>> On 2019/3/4 22:11, Dmitry Vyukov wrote:
>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang <zhongjiang@huawei.com> wrote:
>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang <zhongjiang@huawei.com> wrote:
>>>>>> Hi, guys
>>>>>>
>>>>>> I also hit the following issue. but it fails to reproduce the issue by the log.
>>>>>>
>>>>>> it seems to the case that we access the mm->owner and deference it will result in the UAF.
>>>>>> But it should not be possible that we specify the incomplete process to be the mm->owner.
>>>>>>
>>>>>> Any thoughts?
>>>>> FWIW syzbot was able to reproduce this with this reproducer.
>>>>> This looks like a very subtle race (threaded reproducer that runs
>>>>> repeatedly in multiple processes), so most likely we are looking for
>>>>> something like few instructions inconsistency window.
>>>>>
>>>> I has a little doubtful about the instrustions inconsistency window.
>>>>
>>>> I guess that you mean some smb barriers should be taken into account.:-)
>>>>
>>>> Because IMO, It should not be the lock case to result in the issue.
>>> Since the crash was triggered on x86 _most likley_ this is not a
>>> missed barrier. What I meant is that one thread needs to executed some
>>> code, while another thread is stopped within few instructions.
>>>
>>>
>> It is weird and I can not find any relationship you had said with the issue.:-(
>>
>> Because It is the cause that mm->owner has been freed, whereas we still deference it.
>>
>> From the lastest freed task call trace, It fails to create process.
>>
>> Am I miss something or I misunderstand your meaning. Please correct me.
> Your analysis looks correct. I am just saying that the root cause of
> this use-after-free seems to be a race condition.
>
>
>
Yep, indeed. I cannot figure out how the race works. I will dig further.
Thanks,
zhong jiang
>
^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-05  6:42               ` zhong jiang
@ 2019-03-06  2:05                 ` Andrea Arcangeli
  2019-03-06  5:53                   ` zhong jiang
  2019-03-08  7:10                   ` zhong jiang
  0 siblings, 2 replies; 26+ messages in thread
From: Andrea Arcangeli @ 2019-03-06  2:05 UTC (permalink / raw)
  To: zhong jiang
  Cc: Dmitry Vyukov, syzbot, Michal Hocko, cgroups, Johannes Weiner,
	LKML, Linux-MM, syzkaller-bugs, Vladimir Davydov, David Rientjes,
	Hugh Dickins, Matthew Wilcox, Mel Gorman, Vlastimil Babka,
	Mike Rapoport, Peter Xu
Hello everyone,
[ CC'ed Mike and Peter ]
On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote:
> On 2019/3/5 14:26, Dmitry Vyukov wrote:
> > On Mon, Mar 4, 2019 at 4:32 PM zhong jiang <zhongjiang@huawei.com> wrote:
> >> On 2019/3/4 22:11, Dmitry Vyukov wrote:
> >>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang <zhongjiang@huawei.com> wrote:
> >>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
> >>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang <zhongjiang@huawei.com> wrote:
> >>>>>> Hi, guys
> >>>>>>
> >>>>>> I also hit the following issue. but it fails to reproduce the issue by the log.
> >>>>>>
> >>>>>> it seems to the case that we access the mm->owner and deference it will result in the UAF.
> >>>>>> But it should not be possible that we specify the incomplete process to be the mm->owner.
> >>>>>>
> >>>>>> Any thoughts?
> >>>>> FWIW syzbot was able to reproduce this with this reproducer.
> >>>>> This looks like a very subtle race (threaded reproducer that runs
> >>>>> repeatedly in multiple processes), so most likely we are looking for
> >>>>> something like few instructions inconsistency window.
> >>>>>
> >>>> I has a little doubtful about the instrustions inconsistency window.
> >>>>
> >>>> I guess that you mean some smb barriers should be taken into account.:-)
> >>>>
> >>>> Because IMO, It should not be the lock case to result in the issue.
> >>> Since the crash was triggered on x86 _most likley_ this is not a
> >>> missed barrier. What I meant is that one thread needs to executed some
> >>> code, while another thread is stopped within few instructions.
> >>>
> >>>
> >> It is weird and I can not find any relationship you had said with the issue.:-(
> >>
> >> Because It is the cause that mm->owner has been freed, whereas we still deference it.
> >>
> >> From the lastest freed task call trace, It fails to create process.
> >>
> >> Am I miss something or I misunderstand your meaning. Please correct me.
> > Your analysis looks correct. I am just saying that the root cause of
> > this use-after-free seems to be a race condition.
> >
> >
> >
> Yep, Indeed,  I can not figure out how the race works. I will dig up further.
Yes it's a race condition.
We were aware that the non-cooperative fork userfaultfd feature
creates userfaultfd file descriptors that get reported to the parent
uffd even though they belong to an mm created by a failed fork:
https://www.spinics.net/lists/linux-mm/msg136357.html
The fork failure in my testcase happened because a pending signal
interrupted fork after the failed-fork uffd context had already been
pushed to the userfaultfd reader/monitor. CRIU then takes care of
filtering out the failed fork cases, so we didn't want to make the fork
code more complicated just for userfaultfd.
In reality, if MEMCG is enabled at build time, the mm->owner maintenance
code now creates a race condition in the above case, with any fork
failure.
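To make that sequence concrete, here is a minimal userspace sketch of
it. This is not kernel code: every toy_* name and the pid are made up
for illustration, and the "live" flag only stands in for the slab
object being freed so that the model itself never touches freed
memory. It assumes, as in dup_mm(), that the new mm names the
not-yet-running child as its owner.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; all names are hypothetical. */
struct toy_task {
	int pid;
	bool live;		/* false once the "slab object" has been freed */
};

struct toy_mm {
	struct toy_task *owner;	/* plays the role of mm->owner under MEMCG */
};

/* Fork path: the new mm initially names the larval child as its owner. */
static void toy_fork_begin(struct toy_mm *mm, struct toy_task *child, int pid)
{
	child->pid = pid;
	child->live = true;
	mm->owner = child;
}

/* Failure path as in the report: child is freed, mm->owner is untouched. */
static void toy_fork_fail(struct toy_task *child)
{
	child->live = false;
}

/* What a uffd ioctl on the leftover mm sees when it follows owner. */
static bool toy_owner_is_stale(const struct toy_mm *mm)
{
	return mm->owner != NULL && !mm->owner->live;
}

/* Runs the whole sequence; true means the reader would hit freed memory. */
static bool toy_race_demo(void)
{
	struct toy_task child;
	struct toy_mm mm = { .owner = NULL };

	toy_fork_begin(&mm, &child, 9333);	/* pid is arbitrary */
	toy_fork_fail(&child);
	return toy_owner_is_stale(&mm);
}
```

In the real kernel the reader is get_mem_cgroup_from_mm() running
under rcu_read_lock() from the UFFDIO_COPY path, as the KASAN trace
shows.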
I pinged Mike yesterday to ask if my theory could be true for this bug
and one solution he suggested is to do the userfaultfd_dup at a point
where fork cannot fail anymore. That's precisely what we were
considering back then to avoid reporting failed forks to the
non-cooperative uffd monitor.
That will also solve the false positive deliveries that the CRIU manager
currently filters out. From a theoretical standpoint it's also
quite strange to even allow any uffd ioctl to run on an otherwise long
gone mm created for a process that in the end wasn't even created (the
mm got temporarily fully created, but no child task ever really used
it). However, that mm is on its way to exit_mmap as soon as the
ioctl returns, and this only ever happens during race conditions, so
given the way the CRIU monitor works there wasn't anything fundamentally
concerning about this detail, even though it's remarkably "strange". Our
priority was to keep the fork code as simple as possible and keep
userfaultfd as non-intrusive as possible.
One alternative solution I'm wondering about for this memcg issue is
to free the task struct with RCU also when fork has failed and to add
the mm_update_next_owner before mmput. That will still report failed
forks to the uffd monitor, so it's not the ideal fix, but since it's
probably simpler I'm posting it below. Also I couldn't reproduce the
problem with the testcase here yet.
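As a rough userspace illustration of the ordering this alternative
relies on (again not kernel code: the toy_* names are hypothetical
and the "grace period" is modelled by an explicit function call
rather than real RCU), the failed-fork cleanup first drops the owner
pointer if it still names the child and only then queues the child
for release:

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_task {
	bool live;		/* false once the object is really gone */
	bool free_pending;	/* queued for release after the grace period */
};

struct toy_mm {
	struct toy_task *owner;
};

/* Same idea as mm_clear_owner(): reset owner only if it is this task. */
static void toy_mm_clear_owner(struct toy_mm *mm, struct toy_task *p)
{
	if (mm->owner == p)
		mm->owner = NULL;
}

/* Stands in for call_rcu(): mark the task for release, do not free yet. */
static void toy_delayed_free_task(struct toy_task *p)
{
	p->free_pending = true;
}

/* Stands in for the end of a grace period: now the object goes away. */
static void toy_grace_period_end(struct toy_task *p)
{
	if (p->free_pending) {
		p->live = false;
		p->free_pending = false;
	}
}

/* Failed-fork cleanup: clear the owner first, then defer the free. */
static void toy_fork_fail_fixed(struct toy_mm *mm, struct toy_task *child)
{
	toy_mm_clear_owner(mm, child);
	toy_delayed_free_task(child);
}
```

A reader that sampled the old owner before it was cleared still sees a
live object until the grace period ends, and a reader that comes later
sees NULL instead of a dangling pointer.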
From 6cbf9d377b705476e5226704422357176f79e32c Mon Sep 17 00:00:00 2001
From: Andrea Arcangeli <aarcange@redhat.com>
Date: Tue, 5 Mar 2019 19:21:37 -0500
Subject: [PATCH 1/1] userfaultfd: use RCU to free the task struct when fork
 fails if MEMCG
MEMCG depends on the task structure not to be freed under
rcu_read_lock() in get_mem_cgroup_from_mm() after it dereferences
mm->owner.
A better fix would be to avoid registering forked vmas in userfaultfd
contexts reported to the monitor, in case fork ends up failing.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 kernel/fork.c | 34 ++++++++++++++++++++++++++++++++--
 1 file changed, 32 insertions(+), 2 deletions(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index eb9953c82104..3bcbb361ffbc 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -953,6 +953,15 @@ static void mm_init_aio(struct mm_struct *mm)
 #endif
 }
 
+static __always_inline void mm_clear_owner(struct mm_struct *mm,
+					   struct task_struct *p)
+{
+#ifdef CONFIG_MEMCG
+	if (mm->owner == p)
+		mm->owner = NULL;
+#endif
+}
+
 static void mm_init_owner(struct mm_struct *mm, struct task_struct *p)
 {
 #ifdef CONFIG_MEMCG
@@ -1345,6 +1354,7 @@ static struct mm_struct *dup_mm(struct task_struct *tsk)
 free_pt:
 	/* don't put binfmt in mmput, we haven't got module yet */
 	mm->binfmt = NULL;
+	mm_init_owner(mm, NULL);
 	mmput(mm);
 
 fail_nomem:
@@ -1676,6 +1686,24 @@ static inline void rcu_copy_process(struct task_struct *p)
 #endif /* #ifdef CONFIG_TASKS_RCU */
 }
 
+#ifdef CONFIG_MEMCG
+static void __delayed_free_task(struct rcu_head *rhp)
+{
+	struct task_struct *tsk = container_of(rhp, struct task_struct, rcu);
+
+	free_task(tsk);
+}
+#endif /* CONFIG_MEMCG */
+
+static __always_inline void delayed_free_task(struct task_struct *tsk)
+{
+#ifdef CONFIG_MEMCG
+	call_rcu(&tsk->rcu, __delayed_free_task);
+#else /* CONFIG_MEMCG */
+	free_task(tsk);
+#endif /* CONFIG_MEMCG */
+}
+
 /*
  * This creates a new process as a copy of the old one,
  * but does not actually start it yet.
@@ -2137,8 +2165,10 @@ static __latent_entropy struct task_struct *copy_process(
 bad_fork_cleanup_namespaces:
 	exit_task_namespaces(p);
 bad_fork_cleanup_mm:
-	if (p->mm)
+	if (p->mm) {
+		mm_clear_owner(p->mm, p);
 		mmput(p->mm);
+	}
 bad_fork_cleanup_signal:
 	if (!(clone_flags & CLONE_THREAD))
 		free_signal_struct(p->signal);
@@ -2169,7 +2199,7 @@ static __latent_entropy struct task_struct *copy_process(
 bad_fork_free:
 	p->state = TASK_DEAD;
 	put_task_stack(p);
-	free_task(p);
+	delayed_free_task(p);
 fork_out:
 	spin_lock_irq(&current->sighand->siglock);
 	hlist_del_init(&delayed.node);
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-06  2:05                 ` Andrea Arcangeli
@ 2019-03-06  5:53                   ` zhong jiang
  2019-03-06  6:26                     ` Mike Rapoport
  2019-03-08  7:10                   ` zhong jiang
  1 sibling, 1 reply; 26+ messages in thread
From: zhong jiang @ 2019-03-06  5:53 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Dmitry Vyukov, syzbot, Michal Hocko, cgroups, Johannes Weiner,
	LKML, Linux-MM, syzkaller-bugs, Vladimir Davydov, David Rientjes,
	Hugh Dickins, Matthew Wilcox, Mel Gorman, Vlastimil Babka,
	Mike Rapoport, Peter Xu
On 2019/3/6 10:05, Andrea Arcangeli wrote:
> Hello everyone,
>
> [ CC'ed Mike and Peter ]
>
> On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote:
>> On 2019/3/5 14:26, Dmitry Vyukov wrote:
>>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang <zhongjiang@huawei.com> wrote:
>>>> On 2019/3/4 22:11, Dmitry Vyukov wrote:
>>>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang <zhongjiang@huawei.com> wrote:
>>>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
>>>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang <zhongjiang@huawei.com> wrote:
>>>>>>>> Hi, guys
>>>>>>>>
>>>>>>>> I also hit the following issue. but it fails to reproduce the issue by the log.
>>>>>>>>
>>>>>>>> it seems to the case that we access the mm->owner and deference it will result in the UAF.
>>>>>>>> But it should not be possible that we specify the incomplete process to be the mm->owner.
>>>>>>>>
>>>>>>>> Any thoughts?
>>>>>>> FWIW syzbot was able to reproduce this with this reproducer.
>>>>>>> This looks like a very subtle race (threaded reproducer that runs
>>>>>>> repeatedly in multiple processes), so most likely we are looking for
>>>>>>> something like few instructions inconsistency window.
>>>>>>>
>>>>>> I has a little doubtful about the instrustions inconsistency window.
>>>>>>
>>>>>> I guess that you mean some smb barriers should be taken into account.:-)
>>>>>>
>>>>>> Because IMO, It should not be the lock case to result in the issue.
>>>>> Since the crash was triggered on x86 _most likley_ this is not a
>>>>> missed barrier. What I meant is that one thread needs to executed some
>>>>> code, while another thread is stopped within few instructions.
>>>>>
>>>>>
>>>> It is weird and I can not find any relationship you had said with the issue.:-(
>>>>
>>>> Because It is the cause that mm->owner has been freed, whereas we still deference it.
>>>>
>>>> From the lastest freed task call trace, It fails to create process.
>>>>
>>>> Am I miss something or I misunderstand your meaning. Please correct me.
>>> Your analysis looks correct. I am just saying that the root cause of
>>> this use-after-free seems to be a race condition.
>>>
>>>
>>>
>> Yep, Indeed,  I can not figure out how the race works. I will dig up further.
> Yes it's a race condition.
>
> We were aware about the non-cooperative fork userfaultfd feature
> creating userfaultfd file descriptor that gets reported to the parent
> uffd, despite they belong to mm created by failed forks.
>
> https://www.spinics.net/lists/linux-mm/msg136357.html
>
> The fork failure in my testcase happened because of signal pending
> that interrupted fork after the failed-fork uffd context, was already
> pushed to the userfaultfd reader/monitor. CRIU then takes care of
> filtering the failed fork cases so we didn't want to make the fork
> code more complicated just for userfaultfd.
>
> In reality if MEMCG is enabled at build time, mm->owner maintainance
> code now creates a race condition in the above case, with any fork
> failure.
>
> I pinged Mike yesterday to ask if my theory could be true for this bug
> and one solution he suggested is to do the userfaultfd_dup at a point
> where fork cannot fail anymore. That's precisely what we were
> wondering to do back then to avoid the failed fork reports to the
> non cooperative uffd monitor.
>
> That will solve the false positive deliveries that CRIU manager
> currently filters out too. From a theoretical standpoint it's also
> quite strange to even allow any uffd ioctl to run on a otherwise long
> gone mm created for a process that in the end wasn't even created (the
> mm got temporarily fully created, but no child task really ever used
> such mm). However that mm is on its way to exit_mmap as soon as the
> ioctl returns and this only ever happens during race conditions, so
> given the way the CRIU monitor works there wasn't anything fundamentally
> concerning about this detail, despite it being remarkably "strange". Our
> priority was to keep the fork code as simple as possible and keep
> userfaultfd as non intrusive as possible.
Hi, Andrea
I am still not clear why the uffd ioctl can use the incomplete process as the mm->owner,
and how to trigger the race.
From your explanation above, my understanding is that the process handling do_execve
will have a temporary mm, which will be used by the UFFD ioctl.
Thanks,
zhong jiang
> One alternative solution I'm wondering about for this memcg issue is
> to free the task struct with RCU also when fork has failed and to add
> the mm_update_next_owner before mmput. That will still report failed
> forks to the uffd monitor, so it's not the ideal fix, but since it's
> probably simpler I'm posting it below. Also I couldn't reproduce the
> problem with the testcase here yet.
>
> From 6cbf9d377b705476e5226704422357176f79e32c Mon Sep 17 00:00:00 2001
> From: Andrea Arcangeli <aarcange@redhat.com>
> Date: Tue, 5 Mar 2019 19:21:37 -0500
> Subject: [PATCH 1/1] userfaultfd: use RCU to free the task struct when fork
>  fails if MEMCG
>
> MEMCG depends on the task structure not to be freed under
> rcu_read_lock() in get_mem_cgroup_from_mm() after it dereferences
> mm->owner.
>
> A better fix would be to avoid registering forked vmas in userfaultfd
> contexts reported to the monitor, in case fork ends up failing.
>
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> ---
>  kernel/fork.c | 34 ++++++++++++++++++++++++++++++++--
>  1 file changed, 32 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index eb9953c82104..3bcbb361ffbc 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -953,6 +953,15 @@ static void mm_init_aio(struct mm_struct *mm)
>  #endif
>  }
>  
> +static __always_inline void mm_clear_owner(struct mm_struct *mm,
> +					   struct task_struct *p)
> +{
> +#ifdef CONFIG_MEMCG
> +	if (mm->owner == p)
> +		mm->owner = NULL;
> +#endif
> +}
> +
>  static void mm_init_owner(struct mm_struct *mm, struct task_struct *p)
>  {
>  #ifdef CONFIG_MEMCG
> @@ -1345,6 +1354,7 @@ static struct mm_struct *dup_mm(struct task_struct *tsk)
>  free_pt:
>  	/* don't put binfmt in mmput, we haven't got module yet */
>  	mm->binfmt = NULL;
> +	mm_init_owner(mm, NULL);
>  	mmput(mm);
>  
>  fail_nomem:
> @@ -1676,6 +1686,24 @@ static inline void rcu_copy_process(struct task_struct *p)
>  #endif /* #ifdef CONFIG_TASKS_RCU */
>  }
>  
> +#ifdef CONFIG_MEMCG
> +static void __delayed_free_task(struct rcu_head *rhp)
> +{
> +	struct task_struct *tsk = container_of(rhp, struct task_struct, rcu);
> +
> +	free_task(tsk);
> +}
> +#endif /* CONFIG_MEMCG */
> +
> +static __always_inline void delayed_free_task(struct task_struct *tsk)
> +{
> +#ifdef CONFIG_MEMCG
> +	call_rcu(&tsk->rcu, __delayed_free_task);
> +#else /* CONFIG_MEMCG */
> +	free_task(tsk);
> +#endif /* CONFIG_MEMCG */
> +}
> +
>  /*
>   * This creates a new process as a copy of the old one,
>   * but does not actually start it yet.
> @@ -2137,8 +2165,10 @@ static __latent_entropy struct task_struct *copy_process(
>  bad_fork_cleanup_namespaces:
>  	exit_task_namespaces(p);
>  bad_fork_cleanup_mm:
> -	if (p->mm)
> +	if (p->mm) {
> +		mm_clear_owner(p->mm, p);
>  		mmput(p->mm);
> +	}
>  bad_fork_cleanup_signal:
>  	if (!(clone_flags & CLONE_THREAD))
>  		free_signal_struct(p->signal);
> @@ -2169,7 +2199,7 @@ static __latent_entropy struct task_struct *copy_process(
>  bad_fork_free:
>  	p->state = TASK_DEAD;
>  	put_task_stack(p);
> -	free_task(p);
> +	delayed_free_task(p);
>  fork_out:
>  	spin_lock_irq(&current->sighand->siglock);
>  	hlist_del_init(&delayed.node);
>
> .
>
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-06  5:53                   ` zhong jiang
@ 2019-03-06  6:26                     ` Mike Rapoport
  2019-03-06  7:41                       ` zhong jiang
  0 siblings, 1 reply; 26+ messages in thread
From: Mike Rapoport @ 2019-03-06  6:26 UTC (permalink / raw)
  To: zhong jiang
  Cc: Andrea Arcangeli, Dmitry Vyukov, syzbot, Michal Hocko, cgroups,
	Johannes Weiner, LKML, Linux-MM, syzkaller-bugs, Vladimir Davydov,
	David Rientjes, Hugh Dickins, Matthew Wilcox, Mel Gorman,
	Vlastimil Babka, Mike Rapoport, Peter Xu
Hi,
On Wed, Mar 06, 2019 at 01:53:12PM +0800, zhong jiang wrote:
> On 2019/3/6 10:05, Andrea Arcangeli wrote:
> > Hello everyone,
> >
> > [ CC'ed Mike and Peter ]
> >
> > On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote:
> >> On 2019/3/5 14:26, Dmitry Vyukov wrote:
> >>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang <zhongjiang@huawei.com> wrote:
> >>>> On 2019/3/4 22:11, Dmitry Vyukov wrote:
> >>>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang <zhongjiang@huawei.com> wrote:
> >>>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
> >>>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang <zhongjiang@huawei.com> wrote:
> >>>>>>>> Hi, guys
> >>>>>>>>
> >>>>>>>> I also hit the following issue. but it fails to reproduce the issue by the log.
> >>>>>>>>
> >>>>>>>> it seems to the case that we access the mm->owner and deference it will result in the UAF.
> >>>>>>>> But it should not be possible that we specify the incomplete process to be the mm->owner.
> >>>>>>>>
> >>>>>>>> Any thoughts?
> >>>>>>> FWIW syzbot was able to reproduce this with this reproducer.
> >>>>>>> This looks like a very subtle race (threaded reproducer that runs
> >>>>>>> repeatedly in multiple processes), so most likely we are looking for
> >>>>>>> something like few instructions inconsistency window.
> >>>>>>>
> >>>>>> I has a little doubtful about the instrustions inconsistency window.
> >>>>>>
> >>>>>> I guess that you mean some smb barriers should be taken into account.:-)
> >>>>>>
> >>>>>> Because IMO, It should not be the lock case to result in the issue.
> >>>>> Since the crash was triggered on x86 _most likley_ this is not a
> >>>>> missed barrier. What I meant is that one thread needs to executed some
> >>>>> code, while another thread is stopped within few instructions.
> >>>>>
> >>>>>
> >>>> It is weird and I can not find any relationship you had said with the issue.:-(
> >>>>
> >>>> Because It is the cause that mm->owner has been freed, whereas we still deference it.
> >>>>
> >>>> From the lastest freed task call trace, It fails to create process.
> >>>>
> >>>> Am I miss something or I misunderstand your meaning. Please correct me.
> >>> Your analysis looks correct. I am just saying that the root cause of
> >>> this use-after-free seems to be a race condition.
> >>>
> >>>
> >>>
> >> Yep, Indeed,  I can not figure out how the race works. I will dig up further.
> > Yes it's a race condition.
> >
> > We were aware about the non-cooperative fork userfaultfd feature
> > creating userfaultfd file descriptor that gets reported to the parent
> > uffd, despite they belong to mm created by failed forks.
> >
> > https://www.spinics.net/lists/linux-mm/msg136357.html
> >
> 
> Hi, Andrea
> 
> I still not clear why uffd ioctl can use the incomplete process as the mm->owner.
> and how to produce the race.
There is a C reproducer in the syzkaller report:
https://syzkaller.appspot.com/x/repro.c?x=172fa5a3400000
 
> From your above explainations,   My underdtanding is that the process handling do_exexve
> will have a temporary mm,  which will be used by the UUFD ioctl.
The race is between userfaultfd operation and fork() failure:
forking thread                  | userfaultfd monitor thread
--------------------------------+-------------------------------
fork()                          |
  dup_mmap()                    |
    dup_userfaultfd()           |
    dup_userfaultfd_complete()  |
                                |  read(UFFD_EVENT_FORK)
                                |  uffdio_copy()
                                |    mmget_not_zero()
    goto bad_fork_something     |
    ...                         |
bad_fork_free:                  |
      free_task()               |
                                |  mem_cgroup_from_task()
                                |       /* access stale mm->owner */
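
To make the interleaving above concrete, here is a minimal userspace sketch
that replays it on a single thread. It is only a model under stated
assumptions: struct task, struct mm, mm_get_not_zero() and
replay_failed_fork() are hypothetical stand-ins, not kernel code, and the
"freed" task is only flagged rather than really freed so that the stale
access can be detected safely. mm_clear_owner() mirrors the helper of the
same name in Andrea's patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for task_struct and mm_struct. */
struct task {
	bool freed;		/* flagged instead of really freed */
};

struct mm {
	int users;		/* models mm_users */
	struct task *owner;	/* models mm->owner (CONFIG_MEMCG) */
};

/* Models mmget_not_zero(): pin the mm only if it is still in use. */
static bool mm_get_not_zero(struct mm *mm)
{
	if (mm->users == 0)
		return false;
	mm->users++;
	return true;
}

/* Models mm_clear_owner() from the patch: forget a child that never ran. */
static void mm_clear_owner(struct mm *mm, struct task *p)
{
	if (mm->owner == p)
		mm->owner = NULL;
}

/*
 * Replays the interleaving from the diagram on one thread.
 * Returns true if the monitor ends up looking at a freed owner.
 */
static bool replay_failed_fork(bool clear_owner)
{
	struct task child = { .freed = false };
	struct mm child_mm = { .users = 1, .owner = &child };

	/* monitor: read(UFFD_EVENT_FORK), uffdio_copy(), mmget_not_zero() */
	if (!mm_get_not_zero(&child_mm))
		return false;

	/* forking thread: goto bad_fork_*, drop its own mm reference */
	if (clear_owner)
		mm_clear_owner(&child_mm, &child);
	child_mm.users--;

	/* The monitor's reference keeps the child's mm alive. */
	assert(child_mm.users == 1);

	/* forking thread: bad_fork_free, free_task() */
	child.freed = true;

	/* monitor: mem_cgroup_from_task(mm->owner) */
	return child_mm.owner != NULL && child_mm.owner->freed;
}
```

Under these assumptions, replay_failed_fork(false) reports a stale owner,
while replay_failed_fork(true) does not, because the owner was cleared
before the task was released.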
 
> Thanks,
> zhong jiang
-- 
Sincerely yours,
Mike.
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-06  6:26                     ` Mike Rapoport
@ 2019-03-06  7:41                       ` zhong jiang
  2019-03-06  8:12                         ` Peter Xu
  2019-03-06  8:20                         ` Mike Rapoport
  0 siblings, 2 replies; 26+ messages in thread
From: zhong jiang @ 2019-03-06  7:41 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Andrea Arcangeli, Dmitry Vyukov, syzbot, Michal Hocko, cgroups,
	Johannes Weiner, LKML, Linux-MM, syzkaller-bugs, Vladimir Davydov,
	David Rientjes, Hugh Dickins, Matthew Wilcox, Mel Gorman,
	Vlastimil Babka, Mike Rapoport, Peter Xu
On 2019/3/6 14:26, Mike Rapoport wrote:
> Hi,
>
> On Wed, Mar 06, 2019 at 01:53:12PM +0800, zhong jiang wrote:
>> On 2019/3/6 10:05, Andrea Arcangeli wrote:
>>> Hello everyone,
>>>
>>> [ CC'ed Mike and Peter ]
>>>
>>> On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote:
>>>> On 2019/3/5 14:26, Dmitry Vyukov wrote:
>>>>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang <zhongjiang@huawei.com> wrote:
>>>>>> On 2019/3/4 22:11, Dmitry Vyukov wrote:
>>>>>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang <zhongjiang@huawei.com> wrote:
>>>>>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
>>>>>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang <zhongjiang@huawei.com> wrote:
>>>>>>>>>> Hi, guys
>>>>>>>>>>
>>>>>>>>>> I also hit the following issue. but it fails to reproduce the issue by the log.
>>>>>>>>>>
>>>>>>>>>> it seems to the case that we access the mm->owner and deference it will result in the UAF.
>>>>>>>>>> But it should not be possible that we specify the incomplete process to be the mm->owner.
>>>>>>>>>>
>>>>>>>>>> Any thoughts?
>>>>>>>>> FWIW syzbot was able to reproduce this with this reproducer.
>>>>>>>>> This looks like a very subtle race (threaded reproducer that runs
>>>>>>>>> repeatedly in multiple processes), so most likely we are looking for
>>>>>>>>> something like few instructions inconsistency window.
>>>>>>>>>
>>>>>>>> I has a little doubtful about the instrustions inconsistency window.
>>>>>>>>
>>>>>>>> I guess that you mean some smb barriers should be taken into account.:-)
>>>>>>>>
>>>>>>>> Because IMO, It should not be the lock case to result in the issue.
>>>>>>> Since the crash was triggered on x86 _most likley_ this is not a
>>>>>>> missed barrier. What I meant is that one thread needs to executed some
>>>>>>> code, while another thread is stopped within few instructions.
>>>>>>>
>>>>>>>
>>>>>> It is weird and I can not find any relationship you had said with the issue.:-(
>>>>>>
>>>>>> Because It is the cause that mm->owner has been freed, whereas we still deference it.
>>>>>>
>>>>>> From the lastest freed task call trace, It fails to create process.
>>>>>>
>>>>>> Am I miss something or I misunderstand your meaning. Please correct me.
>>>>> Your analysis looks correct. I am just saying that the root cause of
>>>>> this use-after-free seems to be a race condition.
>>>>>
>>>>>
>>>>>
>>>> Yep, Indeed,  I can not figure out how the race works. I will dig up further.
>>> Yes it's a race condition.
>>>
>>> We were aware about the non-cooperative fork userfaultfd feature
>>> creating userfaultfd file descriptor that gets reported to the parent
>>> uffd, despite they belong to mm created by failed forks.
>>>
>>> https://www.spinics.net/lists/linux-mm/msg136357.html
>>>
>> Hi, Andrea
>>
>> I still not clear why uffd ioctl can use the incomplete process as the mm->owner.
>> and how to produce the race.
> There is a C reproducer in  the syzcaller report:
>
> https://syzkaller.appspot.com/x/repro.c?x=172fa5a3400000
>  
>> From your above explainations,   My underdtanding is that the process handling do_exexve
>> will have a temporary mm,  which will be used by the UUFD ioctl.
> The race is between userfaultfd operation and fork() failure:
>
> forking thread                  | userfaultfd monitor thread
> --------------------------------+-------------------------------
> fork()                          |
>   dup_mmap()                    |
>     dup_userfaultfd()           |
>     dup_userfaultfd_complete()  |
>                                 |  read(UFFD_EVENT_FORK)
>                                 |  uffdio_copy()
>                                 |    mmget_not_zero()
>     goto bad_fork_something     |
>     ...                         |
> bad_fork_free:                  |
>       free_task()               |
>                                 |  mem_cgroup_from_task()
>                                 |       /* access stale mm->owner */
>  
Hi, Mike
The forking thread fails to create the process, and then frees the allocated task struct.
The userfaultfd monitor thread should not access the stale mm->owner.
The parent process and child process do not share the mm struct. The userfaultfd monitor thread's
mm->owner should not point to the freed child task_struct.
And due to the existence of tasklist_lock, we cannot set mm->owner to a freed task_struct.
Am I missing something? =-O
Thanks,
zhong jiang
>> Thanks,
>> zhong jiang
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-06  7:41                       ` zhong jiang
@ 2019-03-06  8:12                         ` Peter Xu
  2019-03-06 13:07                           ` zhong jiang
  2019-03-06  8:20                         ` Mike Rapoport
  1 sibling, 1 reply; 26+ messages in thread
From: Peter Xu @ 2019-03-06  8:12 UTC (permalink / raw)
  To: zhong jiang
  Cc: Mike Rapoport, Andrea Arcangeli, Dmitry Vyukov, syzbot,
	Michal Hocko, cgroups, Johannes Weiner, LKML, Linux-MM,
	syzkaller-bugs, Vladimir Davydov, David Rientjes, Hugh Dickins,
	Matthew Wilcox, Mel Gorman, Vlastimil Babka, Mike Rapoport
On Wed, Mar 06, 2019 at 03:41:06PM +0800, zhong jiang wrote:
> On 2019/3/6 14:26, Mike Rapoport wrote:
> > Hi,
> >
> > On Wed, Mar 06, 2019 at 01:53:12PM +0800, zhong jiang wrote:
> >> On 2019/3/6 10:05, Andrea Arcangeli wrote:
> >>> Hello everyone,
> >>>
> >>> [ CC'ed Mike and Peter ]
> >>>
> >>> On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote:
> >>>> On 2019/3/5 14:26, Dmitry Vyukov wrote:
> >>>>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang <zhongjiang@huawei.com> wrote:
> >>>>>> On 2019/3/4 22:11, Dmitry Vyukov wrote:
> >>>>>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang <zhongjiang@huawei.com> wrote:
> >>>>>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
> >>>>>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang <zhongjiang@huawei.com> wrote:
> >>>>>>>>>> Hi, guys
> >>>>>>>>>>
> >>>>>>>>>> I also hit the following issue. but it fails to reproduce the issue by the log.
> >>>>>>>>>>
> >>>>>>>>>> it seems to the case that we access the mm->owner and deference it will result in the UAF.
> >>>>>>>>>> But it should not be possible that we specify the incomplete process to be the mm->owner.
> >>>>>>>>>>
> >>>>>>>>>> Any thoughts?
> >>>>>>>>> FWIW syzbot was able to reproduce this with this reproducer.
> >>>>>>>>> This looks like a very subtle race (threaded reproducer that runs
> >>>>>>>>> repeatedly in multiple processes), so most likely we are looking for
> >>>>>>>>> something like few instructions inconsistency window.
> >>>>>>>>>
> >>>>>>>> I has a little doubtful about the instrustions inconsistency window.
> >>>>>>>>
> >>>>>>>> I guess that you mean some smb barriers should be taken into account.:-)
> >>>>>>>>
> >>>>>>>> Because IMO, It should not be the lock case to result in the issue.
> >>>>>>> Since the crash was triggered on x86 _most likley_ this is not a
> >>>>>>> missed barrier. What I meant is that one thread needs to executed some
> >>>>>>> code, while another thread is stopped within few instructions.
> >>>>>>>
> >>>>>>>
> >>>>>> It is weird and I can not find any relationship you had said with the issue.:-(
> >>>>>>
> >>>>>> Because It is the cause that mm->owner has been freed, whereas we still deference it.
> >>>>>>
> >>>>>> From the lastest freed task call trace, It fails to create process.
> >>>>>>
> >>>>>> Am I miss something or I misunderstand your meaning. Please correct me.
> >>>>> Your analysis looks correct. I am just saying that the root cause of
> >>>>> this use-after-free seems to be a race condition.
> >>>>>
> >>>>>
> >>>>>
> >>>> Yep, Indeed,  I can not figure out how the race works. I will dig up further.
> >>> Yes it's a race condition.
> >>>
> >>> We were aware about the non-cooperative fork userfaultfd feature
> >>> creating userfaultfd file descriptor that gets reported to the parent
> >>> uffd, despite they belong to mm created by failed forks.
> >>>
> >>> https://www.spinics.net/lists/linux-mm/msg136357.html
> >>>
> >> Hi, Andrea
> >>
> >> I still not clear why uffd ioctl can use the incomplete process as the mm->owner.
> >> and how to produce the race.
> > There is a C reproducer in  the syzcaller report:
> >
> > https://syzkaller.appspot.com/x/repro.c?x=172fa5a3400000
> >  
> >> From your above explainations,   My underdtanding is that the process handling do_exexve
> >> will have a temporary mm,  which will be used by the UUFD ioctl.
> > The race is between userfaultfd operation and fork() failure:
> >
> > forking thread                  | userfaultfd monitor thread
> > --------------------------------+-------------------------------
> > fork()                          |
> >   dup_mmap()                    |
> >     dup_userfaultfd()           |
> >     dup_userfaultfd_complete()  |
> >                                 |  read(UFFD_EVENT_FORK)
> >                                 |  uffdio_copy()
> >                                 |    mmget_not_zero()
> >     goto bad_fork_something     |
> >     ...                         |
> > bad_fork_free:                  |
> >       free_task()               |
> >                                 |  mem_cgroup_from_task()
> >                                 |       /* access stale mm->owner */
> >  
> Hi, Mike
Hi, Zhong,
> 
> forking thread fails to create the process ,and then free the allocated task struct.
> Other userfaultfd monitor thread should not access the stale mm->owner.
> 
> The parent process and child process do not share the mm struct.  Userfaultfd monitor thread's
> mm->owner should not point to the freed child task_struct.
IIUC the problem is that the mm above (the one whose mm->owner is read) is the child
process's mm rather than the uffd monitor's.  When
dup_userfaultfd_complete() is called, a new userfaultfd context, which
is linked to the child process's mm, is sent to the uffd monitor
thread, and if the monitor thread does UFFDIO_COPY on the newly
received userfaultfd it will operate on that new mm too.
> 
> and due to the existence of tasklist_lock,  we can not specify the mm->owner to freed task_struct.
> 
> I miss something,=-O
> 
> Thanks,
> zhong jiang
> >> Thanks,
> >> zhong jiang
> 
> 
Regards,
-- 
Peter Xu
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-06  7:41                       ` zhong jiang
  2019-03-06  8:12                         ` Peter Xu
@ 2019-03-06  8:20                         ` Mike Rapoport
  1 sibling, 0 replies; 26+ messages in thread
From: Mike Rapoport @ 2019-03-06  8:20 UTC (permalink / raw)
  To: zhong jiang
  Cc: Andrea Arcangeli, Dmitry Vyukov, syzbot, Michal Hocko, cgroups,
	Johannes Weiner, LKML, Linux-MM, syzkaller-bugs, Vladimir Davydov,
	David Rientjes, Hugh Dickins, Matthew Wilcox, Mel Gorman,
	Vlastimil Babka, Mike Rapoport, Peter Xu
On Wed, Mar 06, 2019 at 03:41:06PM +0800, zhong jiang wrote:
> On 2019/3/6 14:26, Mike Rapoport wrote:
> > Hi,
> >
> > On Wed, Mar 06, 2019 at 01:53:12PM +0800, zhong jiang wrote:
> >> On 2019/3/6 10:05, Andrea Arcangeli wrote:
> >>> Hello everyone,
> >>>
> >>> [ CC'ed Mike and Peter ]
> >>>
> >>> On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote:
> >>>> On 2019/3/5 14:26, Dmitry Vyukov wrote:
> >>>>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang <zhongjiang@huawei.com> wrote:
> >>>>>> On 2019/3/4 22:11, Dmitry Vyukov wrote:
> >>>>>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang <zhongjiang@huawei.com> wrote:
> >>>>>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
> >>>>>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang <zhongjiang@huawei.com> wrote:
> >>>>>>>>>> Hi, guys
> >>>>>>>>>>
> >>>>>>>>>> I also hit the following issue. but it fails to reproduce the issue by the log.
> >>>>>>>>>>
> >>>>>>>>>> it seems to the case that we access the mm->owner and deference it will result in the UAF.
> >>>>>>>>>> But it should not be possible that we specify the incomplete process to be the mm->owner.
> >>>>>>>>>>
> >>>>>>>>>> Any thoughts?
> >>>>>>>>> FWIW syzbot was able to reproduce this with this reproducer.
> >>>>>>>>> This looks like a very subtle race (threaded reproducer that runs
> >>>>>>>>> repeatedly in multiple processes), so most likely we are looking for
> >>>>>>>>> something like few instructions inconsistency window.
> >>>>>>>>>
> >>>>>>>> I has a little doubtful about the instrustions inconsistency window.
> >>>>>>>>
> >>>>>>>> I guess that you mean some smb barriers should be taken into account.:-)
> >>>>>>>>
> >>>>>>>> Because IMO, It should not be the lock case to result in the issue.
> >>>>>>> Since the crash was triggered on x86 _most likley_ this is not a
> >>>>>>> missed barrier. What I meant is that one thread needs to executed some
> >>>>>>> code, while another thread is stopped within few instructions.
> >>>>>>>
> >>>>>>>
> >>>>>> It is weird and I can not find any relationship you had said with the issue.:-(
> >>>>>>
> >>>>>> Because It is the cause that mm->owner has been freed, whereas we still deference it.
> >>>>>>
> >>>>>> From the lastest freed task call trace, It fails to create process.
> >>>>>>
> >>>>>> Am I miss something or I misunderstand your meaning. Please correct me.
> >>>>> Your analysis looks correct. I am just saying that the root cause of
> >>>>> this use-after-free seems to be a race condition.
> >>>>>
> >>>>>
> >>>>>
> >>>> Yep, Indeed,  I can not figure out how the race works. I will dig up further.
> >>> Yes it's a race condition.
> >>>
> >>> We were aware about the non-cooperative fork userfaultfd feature
> >>> creating userfaultfd file descriptor that gets reported to the parent
> >>> uffd, despite they belong to mm created by failed forks.
> >>>
> >>> https://www.spinics.net/lists/linux-mm/msg136357.html
> >>>
> >> Hi, Andrea
> >>
> >> I still not clear why uffd ioctl can use the incomplete process as the mm->owner.
> >> and how to produce the race.
> > There is a C reproducer in  the syzcaller report:
> >
> > https://syzkaller.appspot.com/x/repro.c?x=172fa5a3400000
> >  
> >> From your above explainations,   My underdtanding is that the process handling do_exexve
> >> will have a temporary mm,  which will be used by the UUFD ioctl.
> > The race is between userfaultfd operation and fork() failure:
> >
> > forking thread                  | userfaultfd monitor thread
> > --------------------------------+-------------------------------
> > fork()                          |
> >   dup_mmap()                    |
> >     dup_userfaultfd()           |
> >     dup_userfaultfd_complete()  |
> >                                 |  read(UFFD_EVENT_FORK)
> >                                 |  uffdio_copy()
> >                                 |    mmget_not_zero()
> >     goto bad_fork_something     |
> >     ...                         |
> > bad_fork_free:                  |
> >       free_task()               |
> >                                 |  mem_cgroup_from_task()
> >                                 |       /* access stale mm->owner */
> >  
> Hi, Mike
> 
> forking thread fails to create the process ,and then free the allocated task struct.
> Other userfaultfd monitor thread should not access the stale mm->owner.
> 
> The parent process and child process do not share the mm struct.  Userfaultfd monitor thread's
> mm->owner should not point to the freed child task_struct.
Userfaultfd can monitor remote mm's [1]. In this case, dup_userfaultfd() and
dup_userfaultfd_complete() create a uffd context for the new process and
notify the userspace uffd monitor about this new context. The uffd monitor
can then perform uffd operations on the new context.
On the right side of the diagram, mmget_not_zero() takes a reference on the mm
of the newly created process.
[1] https://www.kernel.org/doc/html/latest/admin-guide/mm/userfaultfd.html#non-cooperative-userfaultfd
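
As a rough illustration of the "take a reference unless the count already
dropped to zero" semantics that mmget_not_zero() provides, here is a small
userspace sketch using C11 atomics. struct mm_model and both helpers are
hypothetical names made up for this example; the kernel uses its own atomic
primitives on mm->mm_users, so treat this as a model of the idea rather than
the actual implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the mm_users counter in mm_struct. */
struct mm_model {
	atomic_int users;
};

/*
 * Take a reference only if the count is still non-zero.
 * Returns true on success, false if the object is already going away.
 */
static bool mm_model_get_not_zero(struct mm_model *mm)
{
	int old = atomic_load(&mm->users);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&mm->users, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool mm_model_put(struct mm_model *mm)
{
	int old = atomic_fetch_sub(&mm->users, 1);

	assert(old > 0);	/* dropping a reference that was never taken is a bug */
	return old == 1;
}
```

Starting from a count of 1, a successful get raises it to 2; once both
references are dropped, a further get fails and the count stays at 0. Note
that pinning the mm this way says nothing about the lifetime of the task
that mm->owner points to, which is the gap this bug falls into.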
 
> and due to the existence of tasklist_lock,  we can not specify the mm->owner to freed task_struct.
> 
> I miss something,=-O
> 
> Thanks,
> zhong jiang
> >> Thanks,
> >> zhong jiang
> 
> 
-- 
Sincerely yours,
Mike.
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-06  8:12                         ` Peter Xu
@ 2019-03-06 13:07                           ` zhong jiang
  2019-03-06 18:29                             ` Andrea Arcangeli
  0 siblings, 1 reply; 26+ messages in thread
From: zhong jiang @ 2019-03-06 13:07 UTC (permalink / raw)
  To: Peter Xu, Mike Rapoport, Andrea Arcangeli
  Cc: Dmitry Vyukov, syzbot, Michal Hocko, cgroups, Johannes Weiner,
	LKML, Linux-MM, syzkaller-bugs, Vladimir Davydov, David Rientjes,
	Hugh Dickins, Matthew Wilcox, Mel Gorman, Vlastimil Babka,
	Mike Rapoport
On 2019/3/6 16:12, Peter Xu wrote:
> On Wed, Mar 06, 2019 at 03:41:06PM +0800, zhong jiang wrote:
>> On 2019/3/6 14:26, Mike Rapoport wrote:
>>> Hi,
>>>
>>> On Wed, Mar 06, 2019 at 01:53:12PM +0800, zhong jiang wrote:
>>>> On 2019/3/6 10:05, Andrea Arcangeli wrote:
>>>>> Hello everyone,
>>>>>
>>>>> [ CC'ed Mike and Peter ]
>>>>>
>>>>> On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote:
>>>>>> On 2019/3/5 14:26, Dmitry Vyukov wrote:
>>>>>>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang <zhongjiang@huawei.com> wrote:
>>>>>>>> On 2019/3/4 22:11, Dmitry Vyukov wrote:
>>>>>>>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang <zhongjiang@huawei.com> wrote:
>>>>>>>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
>>>>>>>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang <zhongjiang@huawei.com> wrote:
>>>>>>>>>>>> Hi, guys
>>>>>>>>>>>>
>>>>>>>>>>>> I also hit the following issue. but it fails to reproduce the issue by the log.
>>>>>>>>>>>>
>>>>>>>>>>>> it seems to the case that we access the mm->owner and deference it will result in the UAF.
>>>>>>>>>>>> But it should not be possible that we specify the incomplete process to be the mm->owner.
>>>>>>>>>>>>
>>>>>>>>>>>> Any thoughts?
>>>>>>>>>>> FWIW syzbot was able to reproduce this with this reproducer.
>>>>>>>>>>> This looks like a very subtle race (threaded reproducer that runs
>>>>>>>>>>> repeatedly in multiple processes), so most likely we are looking for
>>>>>>>>>>> something like few instructions inconsistency window.
>>>>>>>>>>>
>>>>>>>>>> I has a little doubtful about the instrustions inconsistency window.
>>>>>>>>>>
>>>>>>>>>> I guess that you mean some smb barriers should be taken into account.:-)
>>>>>>>>>>
>>>>>>>>>> Because IMO, It should not be the lock case to result in the issue.
>>>>>>>>> Since the crash was triggered on x86 _most likley_ this is not a
>>>>>>>>> missed barrier. What I meant is that one thread needs to executed some
>>>>>>>>> code, while another thread is stopped within few instructions.
>>>>>>>>>
>>>>>>>>>
>>>>>>>> It is weird and I can not find any relationship you had said with the issue.:-(
>>>>>>>>
>>>>>>>> Because It is the cause that mm->owner has been freed, whereas we still deference it.
>>>>>>>>
>>>>>>>> From the lastest freed task call trace, It fails to create process.
>>>>>>>>
>>>>>>>> Am I miss something or I misunderstand your meaning. Please correct me.
>>>>>>> Your analysis looks correct. I am just saying that the root cause of
>>>>>>> this use-after-free seems to be a race condition.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> Yep, Indeed,  I can not figure out how the race works. I will dig up further.
>>>>> Yes it's a race condition.
>>>>>
>>>>> We were aware about the non-cooperative fork userfaultfd feature
>>>>> creating userfaultfd file descriptor that gets reported to the parent
>>>>> uffd, despite they belong to mm created by failed forks.
>>>>>
>>>>> https://www.spinics.net/lists/linux-mm/msg136357.html
>>>>>
>>>> Hi, Andrea
>>>>
>>>> I still not clear why uffd ioctl can use the incomplete process as the mm->owner.
>>>> and how to produce the race.
>>> There is a C reproducer in  the syzcaller report:
>>>
>>> https://syzkaller.appspot.com/x/repro.c?x=172fa5a3400000
>>>  
>>>> From your above explainations,   My underdtanding is that the process handling do_exexve
>>>> will have a temporary mm,  which will be used by the UUFD ioctl.
>>> The race is between userfaultfd operation and fork() failure:
>>>
>>> forking thread                  | userfaultfd monitor thread
>>> --------------------------------+-------------------------------
>>> fork()                          |
>>>   dup_mmap()                    |
>>>     dup_userfaultfd()           |
>>>     dup_userfaultfd_complete()  |
>>>                                 |  read(UFFD_EVENT_FORK)
>>>                                 |  uffdio_copy()
>>>                                 |    mmget_not_zero()
>>>     goto bad_fork_something     |
>>>     ...                         |
>>> bad_fork_free:                  |
>>>       free_task()               |
>>>                                 |  mem_cgroup_from_task()
>>>                                 |       /* access stale mm->owner */
>>>  
>> Hi, Mike
> Hi, Zhong,
>
>> forking thread fails to create the process ,and then free the allocated task struct.
>> Other userfaultfd monitor thread should not access the stale mm->owner.
>>
>> The parent process and child process do not share the mm struct.  Userfaultfd monitor thread's
>> mm->owner should not point to the freed child task_struct.
> IIUC the problem is that above mm (of the mm->owner) is the child
> process's mm rather than the uffd monitor's.  When
> dup_userfaultfd_complete() is called there will be a new userfaultfd
> context sent to the uffd monitor thread which linked to the chlid
> process's mm, and if the monitor thread do UFFDIO_COPY upon the newly
> received userfaultfd it'll operate on that new mm too.
Thanks to Mike and Peter for the further explanation. I get it now.
Yes, the race will indeed result in the issue.
But I still have a small concern about the patch Andrea has posted.
The patch uses call_rcu to delay freeing the task_struct, but it is still possible for the
task_struct to be freed ahead of get_mem_cgroup_from_mm. Is that right?
Thanks,
zhong jiang
>> and due to the existence of tasklist_lock,  we can not specify the mm->owner to freed task_struct.
>>
>> I miss something,=-O
>>
>> Thanks,
>> zhong jiang
>>>> Thanks,
>>>> zhong jiang
>>
> Regards,
>
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-06 13:07                           ` zhong jiang
@ 2019-03-06 18:29                             ` Andrea Arcangeli
  2019-03-07  7:58                               ` zhong jiang
  0 siblings, 1 reply; 26+ messages in thread
From: Andrea Arcangeli @ 2019-03-06 18:29 UTC (permalink / raw)
  To: zhong jiang
  Cc: Peter Xu, Mike Rapoport, Dmitry Vyukov, syzbot, Michal Hocko,
	cgroups, Johannes Weiner, LKML, Linux-MM, syzkaller-bugs,
	Vladimir Davydov, David Rientjes, Hugh Dickins, Matthew Wilcox,
	Mel Gorman, Vlastimil Babka, Mike Rapoport
Hello Zhong,
On Wed, Mar 06, 2019 at 09:07:00PM +0800, zhong jiang wrote:
> The patch use call_rcu to delay free the task_struct, but It is possible to free the task_struct
> ahead of get_mem_cgroup_from_mm. is it right?
Yes, it can be freed before get_mem_cgroup_from_mm, but if it is
freed before get_mem_cgroup_from_mm takes rcu_read_lock,
rcu_dereference(mm->owner) will return NULL and there will be no
problem.
The simple fix also clears the mm->owner of the failed-fork mm before
doing the call_rcu. The call_rcu delays the freeing until no other CPU
is still running between rcu_read_lock and rcu_read_unlock. That
guarantees that those critical sections will see mm->owner == NULL if
the freeing of the task struct has already happened.
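The reader side of that guarantee can be sketched in user space. This is only
a model under stated assumptions: every name ending in _model is hypothetical,
the structures keep just the fields needed here, and the plain pointer read
stands in for rcu_dereference(mm->owner) under rcu_read_lock(). It shows that
once the failed-fork path has cleared the owner, a later lookup falls back to
the root group instead of touching the child task.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct memcg_model { int id; };
struct task_model { struct memcg_model *memcg; };
struct mm_model { struct task_model *owner; };

static struct memcg_model root_memcg_model = { 0 };

/*
 * Reader side: take one snapshot of mm->owner and treat NULL as
 * "no owner", falling back to the root group instead of dereferencing.
 */
static struct memcg_model *lookup_memcg_model(struct mm_model *mm)
{
	struct task_model *owner;

	if (!mm)
		return &root_memcg_model;

	/* In the kernel this read happens under rcu_read_lock(). */
	owner = mm->owner;
	if (!owner)
		return &root_memcg_model;	/* owner already cleared */

	return owner->memcg;
}

/* Writer side, modelled on mm_clear_owner() from the patch. */
static void clear_owner_model(struct mm_model *mm, struct task_model *p)
{
	if (mm->owner == p)
		mm->owner = NULL;
}
```

A lookup that runs after clear_owner_model() returns the root group. A lookup
that took its snapshot before the clear still holds the old pointer, which is
the case the call_rcu covers.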
The solution Mike suggested for this, and that we had considered
ideal in the past for the signal issue too, is to move the uffd
delivery to a point where fork is guaranteed to succeed. We should
probably try that too, to see what it looks like and whether it can
be done in a non-intrusive way, but the simple fix that uses RCU
should work too.
Rolling back in case of errors inside fork itself isn't easily doable:
the moment we push the uffd ctx to the other side of the uffd pipe
there's no coming back, as that information can reach the userland of
the uffd monitor/reader thread immediately after. The rollback is
really the other thread eventually failing at mmget_not_zero. It's the
userland that has to roll back in that case, when it gets a -ESRCH
retval.
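As a rough illustration of that userland rollback, a monitor could map the
errno of a failed uffd ioctl to an action. This is a sketch, not CRIU's actual
code: the enum, the function name and the retry-on-EAGAIN policy are
assumptions made for the example; only the ESRCH case comes from the
discussion above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical actions a monitor might take after a failed uffd ioctl. */
enum uffd_ctx_action {
	UFFD_CTX_RETRY,		/* transient condition: re-issue the ioctl */
	UFFD_CTX_DROP,		/* the mm is gone: close the fd, forget the context */
	UFFD_CTX_FATAL,		/* anything else: report the error */
};

/* err is the errno value left by the failed ioctl, for example UFFDIO_COPY. */
static enum uffd_ctx_action uffd_ctx_on_error(int err)
{
	switch (err) {
	case ESRCH:
		/* mmget_not_zero() failed: the fork never completed or the mm exited. */
		return UFFD_CTX_DROP;
	case EAGAIN:
		/* Assumed policy: treat as transient and retry. */
		return UFFD_CTX_RETRY;
	default:
		return UFFD_CTX_FATAL;
	}
}
```

On UFFD_CTX_DROP the monitor would close the userfaultfd it received with the
fork event and discard whatever state it had created for that context.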
Note that this fork feature is only ever needed in the non-cooperative
case; these things never need to happen when userfaultfd is used by an
app (or a lib) that is aware that it is using userfaultfd.
Thanks,
Andrea
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-06 18:29                             ` Andrea Arcangeli
@ 2019-03-07  7:58                               ` zhong jiang
  0 siblings, 0 replies; 26+ messages in thread
From: zhong jiang @ 2019-03-07  7:58 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Peter Xu, Mike Rapoport, Dmitry Vyukov, syzbot, Michal Hocko,
	cgroups, Johannes Weiner, LKML, Linux-MM, syzkaller-bugs,
	Vladimir Davydov, David Rientjes, Hugh Dickins, Matthew Wilcox,
	Mel Gorman, Vlastimil Babka, Mike Rapoport
On 2019/3/7 2:29, Andrea Arcangeli wrote:
> Hello Zhong,
>
> On Wed, Mar 06, 2019 at 09:07:00PM +0800, zhong jiang wrote:
>> The patch use call_rcu to delay free the task_struct, but It is possible to free the task_struct
>> ahead of get_mem_cgroup_from_mm. is it right?
> Yes it is possible to free before get_mem_cgroup_from_mm, but if it's
> freed before get_mem_cgroup_from_mm rcu_read_lock,
> rcu_dereference(mm->owner) will return NULL in such case and there
> will be no problem.
Yes
> The simple fix also clears the mm->owner of the failed-fork-mm before
> doing the call_rcu. The call_rcu delays the freeing after no other CPU
> runs in between rcu_read_lock/unlock anymore. That guarantees that
> those critical section will see mm->owner == NULL if the freeing of
> the task strut already happened.
We have set mm->owner to NULL when the child process fails to fork, ahead of freeing
the task struct.
Do those critical sections still have a chance to see an mm->owner that is not NULL?
I have tested the patch. No oops or panic has appeared so far.
Thanks,
zhong jiang
> The solution Mike suggested for this and that we were wondering as
> ideal in the past for the signal issue too, is to move the uffd
> delivery at a point where fork is guaranteed to succeed. We should
> probably try that too to see how it looks like and if it can be done
> in a not intrusive way, but the simple fix that uses RCU should work
> too.
>
> Rolling back in case of errors inside fork itself isn't easily doable:
> the moment we push the uffd ctx to the other side of the uffd pipe
> there's no coming back as that information can reach the userland of
> the uffd monitor/reader thread immediately after. The rolling back is
> really the other thread failing at mmget_not_zero eventually. It's the
> userland that has to rollback in such case when it gets a -ESRCH
> retval.
>
> Note that this fork feature is only ever needed in the non-cooperative
> case, these things never need to happen when userfaultfd is used by an
> app (or a lib) that is aware that it is using userfaultfd.
>
> Thanks,
> Andrea
>
> .
>
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-06  2:05                 ` Andrea Arcangeli
  2019-03-06  5:53                   ` zhong jiang
@ 2019-03-08  7:10                   ` zhong jiang
  2019-03-15 21:39                     ` Andrea Arcangeli
  1 sibling, 1 reply; 26+ messages in thread
From: zhong jiang @ 2019-03-08  7:10 UTC (permalink / raw)
  To: Andrea Arcangeli, Mike Rapoport, Peter Xu, Andrew Morton
  Cc: Dmitry Vyukov, syzbot, Michal Hocko, cgroups, Johannes Weiner,
	LKML, Linux-MM, syzkaller-bugs, Vladimir Davydov, David Rientjes,
	Hugh Dickins, Matthew Wilcox, Mel Gorman, Vlastimil Babka
On 2019/3/6 10:05, Andrea Arcangeli wrote:
> Hello everyone,
>
> [ CC'ed Mike and Peter ]
>
> On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote:
>> On 2019/3/5 14:26, Dmitry Vyukov wrote:
>>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang <zhongjiang@huawei.com> wrote:
>>>> On 2019/3/4 22:11, Dmitry Vyukov wrote:
>>>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang <zhongjiang@huawei.com> wrote:
>>>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
>>>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang <zhongjiang@huawei.com> wrote:
>>>>>>>> Hi, guys
>>>>>>>>
>>>>>>>> I also hit the following issue. but it fails to reproduce the issue by the log.
>>>>>>>>
>>>>>>>> it seems to the case that we access the mm->owner and deference it will result in the UAF.
>>>>>>>> But it should not be possible that we specify the incomplete process to be the mm->owner.
>>>>>>>>
>>>>>>>> Any thoughts?
>>>>>>> FWIW syzbot was able to reproduce this with this reproducer.
>>>>>>> This looks like a very subtle race (threaded reproducer that runs
>>>>>>> repeatedly in multiple processes), so most likely we are looking for
>>>>>>> something like few instructions inconsistency window.
>>>>>>>
>>>>>> I has a little doubtful about the instrustions inconsistency window.
>>>>>>
>>>>>> I guess that you mean some smb barriers should be taken into account.:-)
>>>>>>
>>>>>> Because IMO, It should not be the lock case to result in the issue.
>>>>> Since the crash was triggered on x86 _most likley_ this is not a
>>>>> missed barrier. What I meant is that one thread needs to executed some
>>>>> code, while another thread is stopped within few instructions.
>>>>>
>>>>>
>>>> It is weird and I can not find any relationship you had said with the issue.:-(
>>>>
>>>> Because It is the cause that mm->owner has been freed, whereas we still deference it.
>>>>
>>>> From the lastest freed task call trace, It fails to create process.
>>>>
>>>> Am I miss something or I misunderstand your meaning. Please correct me.
>>> Your analysis looks correct. I am just saying that the root cause of
>>> this use-after-free seems to be a race condition.
>>>
>>>
>>>
>> Yep, Indeed,  I can not figure out how the race works. I will dig up further.
> Yes it's a race condition.
>
> We were aware about the non-cooperative fork userfaultfd feature
> creating userfaultfd file descriptor that gets reported to the parent
> uffd, despite they belong to mm created by failed forks.
>
> https://www.spinics.net/lists/linux-mm/msg136357.html
>
> The fork failure in my testcase happened because of signal pending
> that interrupted fork after the failed-fork uffd context, was already
> pushed to the userfaultfd reader/monitor. CRIU then takes care of
> filtering the failed fork cases so we didn't want to make the fork
> code more complicated just for userfaultfd.
>
> In reality if MEMCG is enabled at build time, mm->owner maintenance
> code now creates a race condition in the above case, with any fork
> failure.
>
> I pinged Mike yesterday to ask if my theory could be true for this bug
> and one solution he suggested is to do the userfaultfd_dup at a point
> where fork cannot fail anymore. That's precisely what we were
> wondering to do back then to avoid the failed fork reports to the
> non cooperative uffd monitor.
>
> That will solve the false positive deliveries that CRIU manager
> currently filters out too. From a theoretical standpoint it's also
> quite strange to even allow any uffd ioctl to run on a otherwise long
> gone mm created for a process that in the end wasn't even created (the
> mm got temporarily fully created, but no child task really ever used
> such mm). However that mm is on its way to exit_mmap as soon as the
> ioctl returns and this only ever happens during race conditions, so
> the way CRIU monitor works there wasn't anything fundamentally
> concerning about this detail, despite it's remarkably "strange". Our
> priority was to keep the fork code as simple as possible and keep
> userfaultfd as non intrusive as possible.
>
> One alternative solution I'm wondering about for this memcg issue is
> to free the task struct with RCU also when fork has failed and to add
> the mm_update_next_owner before mmput. That will still report failed
> forks to the uffd monitor, so it's not the ideal fix, but since it's
> probably simpler I'm posting it below. Also I couldn't reproduce the
> problem with the testcase here yet.
>
> From 6cbf9d377b705476e5226704422357176f79e32c Mon Sep 17 00:00:00 2001
> From: Andrea Arcangeli <aarcange@redhat.com>
> Date: Tue, 5 Mar 2019 19:21:37 -0500
> Subject: [PATCH 1/1] userfaultfd: use RCU to free the task struct when fork
>  fails if MEMCG
>
> MEMCG depends on the task structure not to be freed under
> rcu_read_lock() in get_mem_cgroup_from_mm() after it dereferences
> mm->owner.
>
> A better fix would be to avoid registering forked vmas in userfaultfd
> contexts reported to the monitor, if case fork ends up failing.
Hi, Andrea
I can reproduce the issue on an arm64 qemu machine. The issue goes away after applying the
patch.
Tested-by: zhong jiang <zhongjiang@huawei.com>
Meanwhile, I have a small doubt about whether it is necessary to use RCU to free the task struct.
I think mm->owner will always be NULL after the process fails to be created, because we call mm_clear_owner.
Thanks,
zhong jiang
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> ---
>  kernel/fork.c | 34 ++++++++++++++++++++++++++++++++--
>  1 file changed, 32 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index eb9953c82104..3bcbb361ffbc 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -953,6 +953,15 @@ static void mm_init_aio(struct mm_struct *mm)
>  #endif
>  }
>  
> +static __always_inline void mm_clear_owner(struct mm_struct *mm,
> +					   struct task_struct *p)
> +{
> +#ifdef CONFIG_MEMCG
> +	if (mm->owner == p)
> +		mm->owner = NULL;
> +#endif
> +}
> +
>  static void mm_init_owner(struct mm_struct *mm, struct task_struct *p)
>  {
>  #ifdef CONFIG_MEMCG
> @@ -1345,6 +1354,7 @@ static struct mm_struct *dup_mm(struct task_struct *tsk)
>  free_pt:
>  	/* don't put binfmt in mmput, we haven't got module yet */
>  	mm->binfmt = NULL;
> +	mm_init_owner(mm, NULL);
>  	mmput(mm);
>  
>  fail_nomem:
> @@ -1676,6 +1686,24 @@ static inline void rcu_copy_process(struct task_struct *p)
>  #endif /* #ifdef CONFIG_TASKS_RCU */
>  }
>  
> +#ifdef CONFIG_MEMCG
> +static void __delayed_free_task(struct rcu_head *rhp)
> +{
> +	struct task_struct *tsk = container_of(rhp, struct task_struct, rcu);
> +
> +	free_task(tsk);
> +}
> +#endif /* CONFIG_MEMCG */
> +
> +static __always_inline void delayed_free_task(struct task_struct *tsk)
> +{
> +#ifdef CONFIG_MEMCG
> +	call_rcu(&tsk->rcu, __delayed_free_task);
> +#else /* CONFIG_MEMCG */
> +	free_task(tsk);
> +#endif /* CONFIG_MEMCG */
> +}
> +
>  /*
>   * This creates a new process as a copy of the old one,
>   * but does not actually start it yet.
> @@ -2137,8 +2165,10 @@ static __latent_entropy struct task_struct *copy_process(
>  bad_fork_cleanup_namespaces:
>  	exit_task_namespaces(p);
>  bad_fork_cleanup_mm:
> -	if (p->mm)
> +	if (p->mm) {
> +		mm_clear_owner(p->mm, p);
>  		mmput(p->mm);
> +	}
>  bad_fork_cleanup_signal:
>  	if (!(clone_flags & CLONE_THREAD))
>  		free_signal_struct(p->signal);
> @@ -2169,7 +2199,7 @@ static __latent_entropy struct task_struct *copy_process(
>  bad_fork_free:
>  	p->state = TASK_DEAD;
>  	put_task_stack(p);
> -	free_task(p);
> +	delayed_free_task(p);
>  fork_out:
>  	spin_lock_irq(&current->sighand->siglock);
>  	hlist_del_init(&delayed.node);
>
> .
>
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-08  7:10                   ` zhong jiang
@ 2019-03-15 21:39                     ` Andrea Arcangeli
  2019-03-16  9:38                       ` zhong jiang
  0 siblings, 1 reply; 26+ messages in thread
From: Andrea Arcangeli @ 2019-03-15 21:39 UTC (permalink / raw)
  To: zhong jiang
  Cc: Mike Rapoport, Peter Xu, Andrew Morton, Dmitry Vyukov, syzbot,
	Michal Hocko, cgroups, Johannes Weiner, LKML, Linux-MM,
	syzkaller-bugs, Vladimir Davydov, David Rientjes, Hugh Dickins,
	Matthew Wilcox, Mel Gorman, Vlastimil Babka
On Fri, Mar 08, 2019 at 03:10:08PM +0800, zhong jiang wrote:
> I can reproduce the issue in arm64 qemu machine.  The issue will leave after applying the
> patch.
> 
> Tested-by: zhong jiang <zhongjiang@huawei.com>
Thanks a lot for the quick testing!
> Meanwhile,  I just has a little doubt whether it is necessary to use RCU to free the task struct or not.
> I think that mm->owner alway be NULL after failing to create to process. Because we call mm_clear_owner.
I wish it were enough, but the problem is that the other CPU may be in
the middle of get_mem_cgroup_from_mm() while this runs. Without the
call_rcu after we clear mm->owner, it would dereference the old
mm->owner value while the task struct is being freed. What prevents
this race is the rcu_read_lock() in get_mem_cgroup_from_mm() and the
corresponding call_rcu to free the task struct in the fork failure
path (again only if CONFIG_MEMCG=y is defined). Considering you can
reproduce this tiny race on arm64 qemu (perhaps tcg JIT timing
variations help?), you might in theory still be able to reproduce the
race condition if you remove the call_rcu from delayed_free_task and
replace it with free_task.
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-15 21:39                     ` Andrea Arcangeli
@ 2019-03-16  9:38                       ` zhong jiang
  2019-03-16 19:42                         ` Andrea Arcangeli
  0 siblings, 1 reply; 26+ messages in thread
From: zhong jiang @ 2019-03-16  9:38 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Mike Rapoport, Peter Xu, Andrew Morton, Dmitry Vyukov, syzbot,
	Michal Hocko, cgroups, Johannes Weiner, LKML, Linux-MM,
	syzkaller-bugs, Vladimir Davydov, David Rientjes, Hugh Dickins,
	Matthew Wilcox, Mel Gorman, Vlastimil Babka
On 2019/3/16 5:39, Andrea Arcangeli wrote:
> On Fri, Mar 08, 2019 at 03:10:08PM +0800, zhong jiang wrote:
>> I can reproduce the issue in arm64 qemu machine.  The issue will leave after applying the
>> patch.
>>
>> Tested-by: zhong jiang <zhongjiang@huawei.com>
> Thanks a lot for the quick testing!
>
>> Meanwhile,  I just has a little doubt whether it is necessary to use RCU to free the task struct or not.
>> I think that mm->owner alway be NULL after failing to create to process. Because we call mm_clear_owner.
> I wish it was enough, but the problem is that the other CPU may be in
> the middle of get_mem_cgroup_from_mm() while this runs, and it would
> dereference mm->owner while it is been freed without the call_rcu
> affter we clear mm->owner. What prevents this race is the
As you said, it would dereference mm->owner after we clear mm->owner.
But after we clear mm->owner, mm->owner should be NULL. Is that right?
And mem_cgroup_from_task will check the parameter.
Do you mean that the owner can be cleared after the parameter has been checked,
and a NULL pointer dereference will then be triggered? :-(
Thanks,
zhong jiang
> rcu_read_lock() in get_mem_cgroup_from_mm() and the corresponding
> call_rcu to free the task struct in the fork failure path (again only
> if CONFIG_MEMCG=y is defined). Considering you can reproduce this tiny
> race on arm64 qemu (perhaps tcg JIT timing variantions helps?), you
> might also in theory be able to still reproduce the race condition if
> you remove the call_rcu from delayed_free_task and you replace it with
> free_task.
>
> .
>
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-16  9:38                       ` zhong jiang
@ 2019-03-16 19:42                         ` Andrea Arcangeli
  2019-03-18  6:23                           ` zhong jiang
  0 siblings, 1 reply; 26+ messages in thread
From: Andrea Arcangeli @ 2019-03-16 19:42 UTC (permalink / raw)
  To: zhong jiang
  Cc: Mike Rapoport, Peter Xu, Andrew Morton, Dmitry Vyukov, syzbot,
	Michal Hocko, cgroups, Johannes Weiner, LKML, Linux-MM,
	syzkaller-bugs, Vladimir Davydov, David Rientjes, Hugh Dickins,
	Matthew Wilcox, Mel Gorman, Vlastimil Babka
On Sat, Mar 16, 2019 at 05:38:54PM +0800, zhong jiang wrote:
> On 2019/3/16 5:39, Andrea Arcangeli wrote:
> > On Fri, Mar 08, 2019 at 03:10:08PM +0800, zhong jiang wrote:
> >> I can reproduce the issue in arm64 qemu machine.  The issue will leave after applying the
> >> patch.
> >>
> >> Tested-by: zhong jiang <zhongjiang@huawei.com>
> > Thanks a lot for the quick testing!
> >
> >> Meanwhile,  I just has a little doubt whether it is necessary to use RCU to free the task struct or not.
> >> I think that mm->owner alway be NULL after failing to create to process. Because we call mm_clear_owner.
> > I wish it was enough, but the problem is that the other CPU may be in
> > the middle of get_mem_cgroup_from_mm() while this runs, and it would
> > dereference mm->owner while it is been freed without the call_rcu
> > affter we clear mm->owner. What prevents this race is the
> As you had said, It would dereference mm->owner after we clear mm->owner.
> 
> But after we clear mm->owner,  mm->owner should be NULL.  Is it right?
> 
> And mem_cgroup_from_task will check the parameter. 
> you mean that it is possible after checking the parameter to  clear the owner .
> and the NULL pointer will trigger. :-(
Dereferencing mm->owner doesn't mean reading the value of the mm->owner
pointer; it means dereferencing the value of the pointer, i.e.
accessing the task struct it points to. It's like below:
get_mem_cgroup_from_mm()		failing fork()
----					---
task = mm->owner
					mm->owner = NULL;
					free(p) /* p was mm->owner */
*task /* use after free */
We didn't set mm->owner to NULL before, so the window for the race was
larger, but setting mm->owner to NULL only hides the problem and it
can still happen (albeit with a smaller window).
If get_mem_cgroup_from_mm() can at any time see a non-NULL mm->owner,
then the free of the task struct must be delayed until after
rcu_read_unlock has returned in get_mem_cgroup_from_mm(). This is
the standard RCU model: the freeing must be delayed until after the
next quiescent point.
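The deferred-free rule can be shown with a single-threaded toy model. It is
not how kernel RCU is implemented: the names are hypothetical, the reader
count stands in for rcu_read_lock/rcu_read_unlock, deferred_free_model() plays
the role of call_rcu, and the toy waits for all readers whereas a real grace
period only waits for readers that were already running.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task_model {
	bool freed;			/* set instead of really freeing */
	struct task_model *next_deferred;
};

struct rcu_model {
	int readers;			/* readers inside a critical section */
	struct task_model *deferred;	/* objects waiting to be freed */
};

/* Run every pending "free"; only called when no reader is active. */
static void run_deferred_model(struct rcu_model *rcu)
{
	while (rcu->deferred) {
		struct task_model *t = rcu->deferred;

		rcu->deferred = t->next_deferred;
		t->next_deferred = NULL;
		t->freed = true;
	}
}

static void read_lock_model(struct rcu_model *rcu)
{
	rcu->readers++;
}

static void read_unlock_model(struct rcu_model *rcu)
{
	assert(rcu->readers > 0);
	if (--rcu->readers == 0)
		run_deferred_model(rcu);	/* quiescent point */
}

/* Analogue of call_rcu(): free immediately only if no reader is active. */
static void deferred_free_model(struct rcu_model *rcu, struct task_model *t)
{
	t->next_deferred = rcu->deferred;
	rcu->deferred = t;
	if (rcu->readers == 0)
		run_deferred_model(rcu);
}
```

A reader that snapshots the owner pointer before the failing fork clears it
can keep using the snapshot until its read_unlock_model(); the object is
marked freed only at that point.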
BTW, both mm_update_next_owner() and mm_clear_owner() should use
WRITE_ONCE when they write to mm->owner. I can update that too, but
it's only to avoid assuming that gcc does the right thing (and we
still rely on gcc to do the right thing in other places), so that is
just an orthogonal cleanup.
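For readers unfamiliar with WRITE_ONCE, a simplified user-space approximation
looks like the following. The _MODEL macros and the structures are
hypothetical and much simpler than the real definitions in
include/linux/compiler.h; the point is only that the store goes through a
volatile access, so the compiler may not tear, merge or drop it.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified approximations of the kernel macros (GCC/Clang __typeof__). */
#define WRITE_ONCE_MODEL(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE_MODEL(x)       (*(volatile __typeof__(x) *)&(x))

struct task_model { int pid; };
struct mm_model { struct task_model *owner; };

/* mm_clear_owner()-style helper using the once-style accessors. */
static void clear_owner_once_model(struct mm_model *mm, struct task_model *p)
{
	if (READ_ONCE_MODEL(mm->owner) == p)
		WRITE_ONCE_MODEL(mm->owner, NULL);
}
```

This does not replace the call_rcu; it only constrains how the compiler emits
the store to mm->owner.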
Thanks,
Andrea
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2019-03-16 19:42                         ` Andrea Arcangeli
@ 2019-03-18  6:23                           ` zhong jiang
  0 siblings, 0 replies; 26+ messages in thread
From: zhong jiang @ 2019-03-18  6:23 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Mike Rapoport, Peter Xu, Andrew Morton, Dmitry Vyukov, syzbot,
	Michal Hocko, cgroups, Johannes Weiner, LKML, Linux-MM,
	syzkaller-bugs, Vladimir Davydov, David Rientjes, Hugh Dickins,
	Matthew Wilcox, Mel Gorman, Vlastimil Babka
On 2019/3/17 3:42, Andrea Arcangeli wrote:
> On Sat, Mar 16, 2019 at 05:38:54PM +0800, zhong jiang wrote:
>> On 2019/3/16 5:39, Andrea Arcangeli wrote:
>>> On Fri, Mar 08, 2019 at 03:10:08PM +0800, zhong jiang wrote:
>>>> I can reproduce the issue in arm64 qemu machine.  The issue will leave after applying the
>>>> patch.
>>>>
>>>> Tested-by: zhong jiang <zhongjiang@huawei.com>
>>> Thanks a lot for the quick testing!
>>>
>>>> Meanwhile,  I just has a little doubt whether it is necessary to use RCU to free the task struct or not.
>>>> I think that mm->owner alway be NULL after failing to create to process. Because we call mm_clear_owner.
>>> I wish it was enough, but the problem is that the other CPU may be in
>>> the middle of get_mem_cgroup_from_mm() while this runs, and it would
>>> dereference mm->owner while it is been freed without the call_rcu
>>> affter we clear mm->owner. What prevents this race is the
>> As you had said, It would dereference mm->owner after we clear mm->owner.
>>
>> But after we clear mm->owner,  mm->owner should be NULL.  Is it right?
>>
>> And mem_cgroup_from_task will check the parameter. 
>> you mean that it is possible after checking the parameter to  clear the owner .
>> and the NULL pointer will trigger. :-(
> Dereference mm->owner didn't mean reading the value of the mm->owner
> pointer, it really means to dereference the value of the pointer. It's
> like below:
>
> get_mem_cgroup_from_mm()		failing fork()
> ----					---
> task = mm->owner
> 					mm->owner = NULL;
> 					free(mm->owner)
> *task /* use after free */
>
> We didn't set mm->owner to NULL before, so the window for the race was
> larger, but setting mm->owner to NULL only hides the problem and it
> can still happen (albeit with a smaller window).
>
> If get_mem_cgroup_from_mm() can see at any time mm->owner not NULL,
> then the free of the task struct must be delayed until after
> rcu_read_unlock has returned in get_mem_cgroup_from_mm(). This is
> the standard RCU model, the freeing must be delayed until after the
> next quiescent point.
Thank you for your patient explanation. The patch should go upstream too. I think you
should send a formal patch to mainline. Other people may be suffering from
the issue.  :-)
Thanks,
zhong jiang
> BTW, both mm_update_next_owner() and mm_clear_owner() should have used
> WRITE_ONCE when they write to mm->owner, I can update that too but
> it's just to not to make assumptions that gcc does the right thing
> (and we still rely on gcc to do the right thing in other places) so
> that is just an orthogonal cleanup.
>
> Thanks,
> Andrea
>
> .
>
* Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
  2018-11-07  1:52 KASAN: use-after-free Read in get_mem_cgroup_from_mm syzbot
  2018-12-04 15:43 ` syzbot
@ 2019-03-22  9:36 ` syzbot
  1 sibling, 0 replies; 26+ messages in thread
From: syzbot @ 2019-03-22  9:36 UTC (permalink / raw)
  To: aarcange, akpm, cgroups, dvyukov, hannes, hughd, linux-kernel,
	linux-mm, mgorman, mhocko, peterx, rientjes, rppt, rppt,
	syzkaller-bugs, vbabka, vdavydov.dev, willy, zhongjiang
Bisection is inconclusive: the first bad commit could be any of:
2c43838c sched/isolation: Enable CONFIG_CPU_ISOLATION=y by default
bf29cb23 sched/isolation: Make CONFIG_NO_HZ_FULL select CONFIG_CPU_ISOLATION
d94d1053 sched/isolation: Document boot parameters dependency on CONFIG_CPU_ISOLATION=y
4c470317 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1592b037200000
start commit:   0072a0c1
git tree:       upstream
dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12835e25400000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a3400000
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
end of thread, other threads:[~2019-03-22  9:36 UTC | newest]
Thread overview: 26+ messages
2018-11-07  1:52 KASAN: use-after-free Read in get_mem_cgroup_from_mm syzbot
2018-12-04 15:43 ` syzbot
2019-03-03 16:19   ` zhong jiang
2019-03-04  7:40     ` Dmitry Vyukov
2019-03-04 14:00       ` zhong jiang
2019-03-04 14:11         ` Dmitry Vyukov
2019-03-04 15:32           ` zhong jiang
2019-03-05  6:26             ` Dmitry Vyukov
2019-03-05  6:42               ` zhong jiang
2019-03-06  2:05                 ` Andrea Arcangeli
2019-03-06  5:53                   ` zhong jiang
2019-03-06  6:26                     ` Mike Rapoport
2019-03-06  7:41                       ` zhong jiang
2019-03-06  8:12                         ` Peter Xu
2019-03-06 13:07                           ` zhong jiang
2019-03-06 18:29                             ` Andrea Arcangeli
2019-03-07  7:58                               ` zhong jiang
2019-03-06  8:20                         ` Mike Rapoport
2019-03-08  7:10                   ` zhong jiang
2019-03-15 21:39                     ` Andrea Arcangeli
2019-03-16  9:38                       ` zhong jiang
2019-03-16 19:42                         ` Andrea Arcangeli
2019-03-18  6:23                           ` zhong jiang
2019-03-04 21:51     ` Matthew Wilcox
2019-03-05  3:09       ` zhong jiang
2019-03-22  9:36 ` syzbot