* KCSAN: data-race in task_dump_owner / task_dump_owner @ 2019-10-17 12:36 syzbot 2019-10-17 12:56 ` Marco Elver 0 siblings, 1 reply; 6+ messages in thread From: syzbot @ 2019-10-17 12:36 UTC (permalink / raw) To: adobriyan, akpm, casey, christian, elver, keescook, kent.overstreet, khlebnikov, linux-fsdevel, linux-kernel, mhocko, shakeelb, syzkaller-bugs, tglx Hello, syzbot found the following crash on: HEAD commit: d724f94f x86, kcsan: Enable KCSAN for x86 git tree: https://github.com/google/ktsan.git kcsan console output: https://syzkaller.appspot.com/x/log.txt?x=17884db3600000 kernel config: https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80 dashboard link: https://syzkaller.appspot.com/bug?extid=e392f8008a294fdf8891 compiler: gcc (GCC) 9.0.0 20181231 (experimental) Unfortunately, I don't have any reproducer for this crash yet. IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com ================================================================== BUG: KCSAN: data-race in task_dump_owner / task_dump_owner write to 0xffff8881255bb7fc of 4 bytes by task 7804 on cpu 0: task_dump_owner+0xd8/0x260 fs/proc/base.c:1742 pid_update_inode+0x3c/0x70 fs/proc/base.c:1818 pid_revalidate+0x91/0xd0 fs/proc/base.c:1841 d_revalidate fs/namei.c:765 [inline] d_revalidate fs/namei.c:762 [inline] lookup_fast+0x7cb/0x7e0 fs/namei.c:1613 walk_component+0x6d/0xe80 fs/namei.c:1804 link_path_walk.part.0+0x5d3/0xa90 fs/namei.c:2139 link_path_walk fs/namei.c:2070 [inline] path_openat+0x14f/0x3530 fs/namei.c:3532 do_filp_open+0x11e/0x1b0 fs/namei.c:3563 do_sys_open+0x3b3/0x4f0 fs/open.c:1089 __do_sys_open fs/open.c:1107 [inline] __se_sys_open fs/open.c:1102 [inline] __x64_sys_open+0x55/0x70 fs/open.c:1102 do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x44/0xa9 write to 0xffff8881255bb7fc of 4 bytes by task 7813 on cpu 1: task_dump_owner+0xd8/0x260 fs/proc/base.c:1742 pid_update_inode+0x3c/0x70 fs/proc/base.c:1818 pid_revalidate+0x91/0xd0 fs/proc/base.c:1841 d_revalidate fs/namei.c:765 [inline] d_revalidate fs/namei.c:762 [inline] lookup_fast+0x7cb/0x7e0 fs/namei.c:1613 walk_component+0x6d/0xe80 fs/namei.c:1804 lookup_last fs/namei.c:2271 [inline] path_lookupat.isra.0+0x13a/0x5a0 fs/namei.c:2316 filename_lookup+0x145/0x2d0 fs/namei.c:2346 user_path_at_empty+0x4c/0x70 fs/namei.c:2606 user_path_at include/linux/namei.h:60 [inline] vfs_statx+0xd9/0x190 fs/stat.c:187 vfs_stat include/linux/fs.h:3188 [inline] __do_sys_newstat+0x51/0xb0 fs/stat.c:341 __se_sys_newstat fs/stat.c:337 [inline] __x64_sys_newstat+0x3a/0x50 fs/stat.c:337 do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 7813 Comm: ps Not tainted 5.3.0+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 ================================================================== --- This bug is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this bug report. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: KCSAN: data-race in task_dump_owner / task_dump_owner 2019-10-17 12:36 KCSAN: data-race in task_dump_owner / task_dump_owner syzbot @ 2019-10-17 12:56 ` Marco Elver 2019-10-17 18:17 ` Alexey Dobriyan 0 siblings, 1 reply; 6+ messages in thread From: Marco Elver @ 2019-10-17 12:56 UTC (permalink / raw) To: syzbot Cc: adobriyan, Andrew Morton, casey, christian, Kees Cook, kent.overstreet, khlebnikov, linux-fsdevel, LKML, mhocko, shakeelb, syzkaller-bugs, Thomas Gleixner Hi, On Thu, 17 Oct 2019 at 14:36, syzbot <syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com> wrote: > > Hello, > > syzbot found the following crash on: > > HEAD commit: d724f94f x86, kcsan: Enable KCSAN for x86 > git tree: https://github.com/google/ktsan.git kcsan > console output: https://syzkaller.appspot.com/x/log.txt?x=17884db3600000 > kernel config: https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80 > dashboard link: https://syzkaller.appspot.com/bug?extid=e392f8008a294fdf8891 > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > > Unfortunately, I don't have any reproducer for this crash yet. > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com > > ================================================================== > BUG: KCSAN: data-race in task_dump_owner / task_dump_owner > > write to 0xffff8881255bb7fc of 4 bytes by task 7804 on cpu 0: > task_dump_owner+0xd8/0x260 fs/proc/base.c:1742 > pid_update_inode+0x3c/0x70 fs/proc/base.c:1818 > pid_revalidate+0x91/0xd0 fs/proc/base.c:1841 > d_revalidate fs/namei.c:765 [inline] > d_revalidate fs/namei.c:762 [inline] > lookup_fast+0x7cb/0x7e0 fs/namei.c:1613 > walk_component+0x6d/0xe80 fs/namei.c:1804 > link_path_walk.part.0+0x5d3/0xa90 fs/namei.c:2139 > link_path_walk fs/namei.c:2070 [inline] > path_openat+0x14f/0x3530 fs/namei.c:3532 > do_filp_open+0x11e/0x1b0 fs/namei.c:3563 > do_sys_open+0x3b3/0x4f0 fs/open.c:1089 > __do_sys_open fs/open.c:1107 [inline] > __se_sys_open fs/open.c:1102 [inline] > __x64_sys_open+0x55/0x70 fs/open.c:1102 > do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296 > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > write to 0xffff8881255bb7fc of 4 bytes by task 7813 on cpu 1: > task_dump_owner+0xd8/0x260 fs/proc/base.c:1742 > pid_update_inode+0x3c/0x70 fs/proc/base.c:1818 > pid_revalidate+0x91/0xd0 fs/proc/base.c:1841 > d_revalidate fs/namei.c:765 [inline] > d_revalidate fs/namei.c:762 [inline] > lookup_fast+0x7cb/0x7e0 fs/namei.c:1613 > walk_component+0x6d/0xe80 fs/namei.c:1804 > lookup_last fs/namei.c:2271 [inline] > path_lookupat.isra.0+0x13a/0x5a0 fs/namei.c:2316 > filename_lookup+0x145/0x2d0 fs/namei.c:2346 > user_path_at_empty+0x4c/0x70 fs/namei.c:2606 > user_path_at include/linux/namei.h:60 [inline] > vfs_statx+0xd9/0x190 fs/stat.c:187 > vfs_stat include/linux/fs.h:3188 [inline] > __do_sys_newstat+0x51/0xb0 fs/stat.c:341 > __se_sys_newstat fs/stat.c:337 [inline] > __x64_sys_newstat+0x3a/0x50 fs/stat.c:337 > do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296 > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > Reported by Kernel Concurrency Sanitizer on: > CPU: 1 PID: 7813 Comm: ps Not tainted 5.3.0+ #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > ================================================================== My understanding is, that for every access to /proc/<pid>, d_revalidate is called, and /proc-fs implementation simply says that pid_revalidate always revalidates by rewriting uid/gid because "owning task may have performed a setuid(), etc." presumably so every access to a /proc/<pid> entry always has the right uid/gid (in effect updating /proc/<pid> lazily via d_revalidate). Is it possible that one of the tasks above could be preempted after doing its writes to *ruid/*rgid, another thread writing some other values (after setuid / seteuid), and then the preempted thread seeing the other values? Assertion here should never fail: === TASK 1 === | seteuid(1000); | seteuid(0); | stat("/proc/<pid-of-task-1>", &fstat); | assert(fstat.st_uid == 0); === TASK 2 === | stat("/proc/<pid-of-task-1>", ...); Best Wishes, -- Marco ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: KCSAN: data-race in task_dump_owner / task_dump_owner 2019-10-17 12:56 ` Marco Elver @ 2019-10-17 18:17 ` Alexey Dobriyan 2019-10-17 18:32 ` Marco Elver 0 siblings, 1 reply; 6+ messages in thread From: Alexey Dobriyan @ 2019-10-17 18:17 UTC (permalink / raw) To: Marco Elver Cc: syzbot, Andrew Morton, casey, christian, Kees Cook, kent.overstreet, khlebnikov, linux-fsdevel, LKML, mhocko, shakeelb, syzkaller-bugs, Thomas Gleixner On Thu, Oct 17, 2019 at 02:56:47PM +0200, Marco Elver wrote: > Hi, > > On Thu, 17 Oct 2019 at 14:36, syzbot > <syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com> wrote: > > > > Hello, > > > > syzbot found the following crash on: > > > > HEAD commit: d724f94f x86, kcsan: Enable KCSAN for x86 > > git tree: https://github.com/google/ktsan.git kcsan > > console output: https://syzkaller.appspot.com/x/log.txt?x=17884db3600000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80 > > dashboard link: https://syzkaller.appspot.com/bug?extid=e392f8008a294fdf8891 > > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > > > > Unfortunately, I don't have any reproducer for this crash yet. > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > > Reported-by: syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com > > > > ================================================================== > > BUG: KCSAN: data-race in task_dump_owner / task_dump_owner > > > > write to 0xffff8881255bb7fc of 4 bytes by task 7804 on cpu 0: > > task_dump_owner+0xd8/0x260 fs/proc/base.c:1742 > > pid_update_inode+0x3c/0x70 fs/proc/base.c:1818 > > pid_revalidate+0x91/0xd0 fs/proc/base.c:1841 > > d_revalidate fs/namei.c:765 [inline] > > d_revalidate fs/namei.c:762 [inline] > > lookup_fast+0x7cb/0x7e0 fs/namei.c:1613 > > walk_component+0x6d/0xe80 fs/namei.c:1804 > > link_path_walk.part.0+0x5d3/0xa90 fs/namei.c:2139 > > link_path_walk fs/namei.c:2070 [inline] > > path_openat+0x14f/0x3530 fs/namei.c:3532 > > do_filp_open+0x11e/0x1b0 fs/namei.c:3563 > > do_sys_open+0x3b3/0x4f0 fs/open.c:1089 > > __do_sys_open fs/open.c:1107 [inline] > > __se_sys_open fs/open.c:1102 [inline] > > __x64_sys_open+0x55/0x70 fs/open.c:1102 > > do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296 > > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > > > write to 0xffff8881255bb7fc of 4 bytes by task 7813 on cpu 1: > > task_dump_owner+0xd8/0x260 fs/proc/base.c:1742 > > pid_update_inode+0x3c/0x70 fs/proc/base.c:1818 > > pid_revalidate+0x91/0xd0 fs/proc/base.c:1841 > > d_revalidate fs/namei.c:765 [inline] > > d_revalidate fs/namei.c:762 [inline] > > lookup_fast+0x7cb/0x7e0 fs/namei.c:1613 > > walk_component+0x6d/0xe80 fs/namei.c:1804 > > lookup_last fs/namei.c:2271 [inline] > > path_lookupat.isra.0+0x13a/0x5a0 fs/namei.c:2316 > > filename_lookup+0x145/0x2d0 fs/namei.c:2346 > > user_path_at_empty+0x4c/0x70 fs/namei.c:2606 > > user_path_at include/linux/namei.h:60 [inline] > > vfs_statx+0xd9/0x190 fs/stat.c:187 > > vfs_stat include/linux/fs.h:3188 [inline] > > __do_sys_newstat+0x51/0xb0 fs/stat.c:341 > > __se_sys_newstat fs/stat.c:337 [inline] > > __x64_sys_newstat+0x3a/0x50 fs/stat.c:337 > > do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296 > > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > > > Reported by Kernel Concurrency Sanitizer on: > > CPU: 1 PID: 7813 Comm: ps Not tainted 5.3.0+ #0 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > > Google 01/01/2011 > > ================================================================== > > My understanding is, that for every access to /proc/<pid>, > d_revalidate is called, and /proc-fs implementation simply says that > pid_revalidate always revalidates by rewriting uid/gid because "owning > task may have performed a setuid(), etc." presumably so every access > to a /proc/<pid> entry always has the right uid/gid (in effect > updating /proc/<pid> lazily via d_revalidate). > > Is it possible that one of the tasks above could be preempted after > doing its writes to *ruid/*rgid, another thread writing some other > values (after setuid / seteuid), and then the preempted thread seeing > the other values? Assertion here should never fail: > === TASK 1 === > | seteuid(1000); > | seteuid(0); > | stat("/proc/<pid-of-task-1>", &fstat); > | assert(fstat.st_uid == 0); > === TASK 2 === > | stat("/proc/<pid-of-task-1>", ...); Is it the same as pid_revalidate() snapshots (uid,gid) correctly but writeback is done in any order? ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: KCSAN: data-race in task_dump_owner / task_dump_owner 2019-10-17 18:17 ` Alexey Dobriyan @ 2019-10-17 18:32 ` Marco Elver 2019-10-23 17:06 ` Dmitry Vyukov 0 siblings, 1 reply; 6+ messages in thread From: Marco Elver @ 2019-10-17 18:32 UTC (permalink / raw) To: Alexey Dobriyan Cc: syzbot, Andrew Morton, casey, christian, Kees Cook, kent.overstreet, khlebnikov, linux-fsdevel, LKML, mhocko, Shakeel Butt, syzkaller-bugs, Thomas Gleixner On Thu, 17 Oct 2019 at 20:17, Alexey Dobriyan <adobriyan@gmail.com> wrote: > > On Thu, Oct 17, 2019 at 02:56:47PM +0200, Marco Elver wrote: > > Hi, > > > > On Thu, 17 Oct 2019 at 14:36, syzbot > > <syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com> wrote: > > > > > > Hello, > > > > > > syzbot found the following crash on: > > > > > > HEAD commit: d724f94f x86, kcsan: Enable KCSAN for x86 > > > git tree: https://github.com/google/ktsan.git kcsan > > > console output: https://syzkaller.appspot.com/x/log.txt?x=17884db3600000 > > > kernel config: https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80 > > > dashboard link: https://syzkaller.appspot.com/bug?extid=e392f8008a294fdf8891 > > > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > > > > > > Unfortunately, I don't have any reproducer for this crash yet. > > > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > > > Reported-by: syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com > > > > > > ================================================================== > > > BUG: KCSAN: data-race in task_dump_owner / task_dump_owner > > > > > > write to 0xffff8881255bb7fc of 4 bytes by task 7804 on cpu 0: > > > task_dump_owner+0xd8/0x260 fs/proc/base.c:1742 > > > pid_update_inode+0x3c/0x70 fs/proc/base.c:1818 > > > pid_revalidate+0x91/0xd0 fs/proc/base.c:1841 > > > d_revalidate fs/namei.c:765 [inline] > > > d_revalidate fs/namei.c:762 [inline] > > > lookup_fast+0x7cb/0x7e0 fs/namei.c:1613 > > > walk_component+0x6d/0xe80 fs/namei.c:1804 > > > link_path_walk.part.0+0x5d3/0xa90 fs/namei.c:2139 > > > link_path_walk fs/namei.c:2070 [inline] > > > path_openat+0x14f/0x3530 fs/namei.c:3532 > > > do_filp_open+0x11e/0x1b0 fs/namei.c:3563 > > > do_sys_open+0x3b3/0x4f0 fs/open.c:1089 > > > __do_sys_open fs/open.c:1107 [inline] > > > __se_sys_open fs/open.c:1102 [inline] > > > __x64_sys_open+0x55/0x70 fs/open.c:1102 > > > do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296 > > > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > > > > > write to 0xffff8881255bb7fc of 4 bytes by task 7813 on cpu 1: > > > task_dump_owner+0xd8/0x260 fs/proc/base.c:1742 > > > pid_update_inode+0x3c/0x70 fs/proc/base.c:1818 > > > pid_revalidate+0x91/0xd0 fs/proc/base.c:1841 > > > d_revalidate fs/namei.c:765 [inline] > > > d_revalidate fs/namei.c:762 [inline] > > > lookup_fast+0x7cb/0x7e0 fs/namei.c:1613 > > > walk_component+0x6d/0xe80 fs/namei.c:1804 > > > lookup_last fs/namei.c:2271 [inline] > > > path_lookupat.isra.0+0x13a/0x5a0 fs/namei.c:2316 > > > filename_lookup+0x145/0x2d0 fs/namei.c:2346 > > > user_path_at_empty+0x4c/0x70 fs/namei.c:2606 > > > user_path_at include/linux/namei.h:60 [inline] > > > vfs_statx+0xd9/0x190 fs/stat.c:187 > > > vfs_stat include/linux/fs.h:3188 [inline] > > > __do_sys_newstat+0x51/0xb0 fs/stat.c:341 > > > __se_sys_newstat fs/stat.c:337 [inline] > > > __x64_sys_newstat+0x3a/0x50 fs/stat.c:337 > > > do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296 > > > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > > > > > Reported by Kernel Concurrency Sanitizer on: > > > CPU: 1 PID: 7813 Comm: ps Not tainted 5.3.0+ #0 > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > > > Google 01/01/2011 > > > ================================================================== > > > > My understanding is, that for every access to /proc/<pid>, > > d_revalidate is called, and /proc-fs implementation simply says that > > pid_revalidate always revalidates by rewriting uid/gid because "owning > > task may have performed a setuid(), etc." presumably so every access > > to a /proc/<pid> entry always has the right uid/gid (in effect > > updating /proc/<pid> lazily via d_revalidate). > > > > Is it possible that one of the tasks above could be preempted after > > doing its writes to *ruid/*rgid, another thread writing some other > > values (after setuid / seteuid), and then the preempted thread seeing > > the other values? Assertion here should never fail: > > === TASK 1 === > > | seteuid(1000); > > | seteuid(0); > > | stat("/proc/<pid-of-task-1>", &fstat); > > | assert(fstat.st_uid == 0); > > === TASK 2 === > > | stat("/proc/<pid-of-task-1>", ...); > > Is it the same as > pid_revalidate() snapshots (uid,gid) correctly > but writeback is done in any order? Yes, I think so. Snapshot is done in RCU reader critical section, but the writes can race with another thread. Is there logic that ensures this doesn't lead to the observable outcome above? ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: KCSAN: data-race in task_dump_owner / task_dump_owner 2019-10-17 18:32 ` Marco Elver @ 2019-10-23 17:06 ` Dmitry Vyukov 2019-10-24 8:35 ` Dmitry Vyukov 0 siblings, 1 reply; 6+ messages in thread From: Dmitry Vyukov @ 2019-10-23 17:06 UTC (permalink / raw) To: Marco Elver Cc: Alexey Dobriyan, syzbot, Andrew Morton, Casey Schaufler, Christian Brauner, Kees Cook, Kent Overstreet, Konstantin Khlebnikov, linux-fsdevel, LKML, Michal Hocko, Shakeel Butt, syzkaller-bugs, Thomas Gleixner, Eric Dumazet On Thu, Oct 17, 2019 at 8:33 PM 'Marco Elver' via syzkaller-bugs <syzkaller-bugs@googlegroups.com> wrote: > > On Thu, 17 Oct 2019 at 20:17, Alexey Dobriyan <adobriyan@gmail.com> wrote: > > > > On Thu, Oct 17, 2019 at 02:56:47PM +0200, Marco Elver wrote: > > > Hi, > > > > > > On Thu, 17 Oct 2019 at 14:36, syzbot > > > <syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com> wrote: > > > > > > > > Hello, > > > > > > > > syzbot found the following crash on: > > > > > > > > HEAD commit: d724f94f x86, kcsan: Enable KCSAN for x86 > > > > git tree: https://github.com/google/ktsan.git kcsan > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=17884db3600000 > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80 > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=e392f8008a294fdf8891 > > > > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > > > > > > > > Unfortunately, I don't have any reproducer for this crash yet. > > > > > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > > > > Reported-by: syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com > > > > > > > > ================================================================== > > > > BUG: KCSAN: data-race in task_dump_owner / task_dump_owner > > > > > > > > write to 0xffff8881255bb7fc of 4 bytes by task 7804 on cpu 0: > > > > task_dump_owner+0xd8/0x260 fs/proc/base.c:1742 > > > > pid_update_inode+0x3c/0x70 fs/proc/base.c:1818 > > > > pid_revalidate+0x91/0xd0 fs/proc/base.c:1841 > > > > d_revalidate fs/namei.c:765 [inline] > > > > d_revalidate fs/namei.c:762 [inline] > > > > lookup_fast+0x7cb/0x7e0 fs/namei.c:1613 > > > > walk_component+0x6d/0xe80 fs/namei.c:1804 > > > > link_path_walk.part.0+0x5d3/0xa90 fs/namei.c:2139 > > > > link_path_walk fs/namei.c:2070 [inline] > > > > path_openat+0x14f/0x3530 fs/namei.c:3532 > > > > do_filp_open+0x11e/0x1b0 fs/namei.c:3563 > > > > do_sys_open+0x3b3/0x4f0 fs/open.c:1089 > > > > __do_sys_open fs/open.c:1107 [inline] > > > > __se_sys_open fs/open.c:1102 [inline] > > > > __x64_sys_open+0x55/0x70 fs/open.c:1102 > > > > do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296 > > > > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > > > > > > > write to 0xffff8881255bb7fc of 4 bytes by task 7813 on cpu 1: > > > > task_dump_owner+0xd8/0x260 fs/proc/base.c:1742 > > > > pid_update_inode+0x3c/0x70 fs/proc/base.c:1818 > > > > pid_revalidate+0x91/0xd0 fs/proc/base.c:1841 > > > > d_revalidate fs/namei.c:765 [inline] > > > > d_revalidate fs/namei.c:762 [inline] > > > > lookup_fast+0x7cb/0x7e0 fs/namei.c:1613 > > > > walk_component+0x6d/0xe80 fs/namei.c:1804 > > > > lookup_last fs/namei.c:2271 [inline] > > > > path_lookupat.isra.0+0x13a/0x5a0 fs/namei.c:2316 > > > > filename_lookup+0x145/0x2d0 fs/namei.c:2346 > > > > user_path_at_empty+0x4c/0x70 fs/namei.c:2606 > > > > user_path_at include/linux/namei.h:60 [inline] > > > > vfs_statx+0xd9/0x190 fs/stat.c:187 > > > > vfs_stat include/linux/fs.h:3188 [inline] > > > > __do_sys_newstat+0x51/0xb0 fs/stat.c:341 > > > > __se_sys_newstat fs/stat.c:337 [inline] > > > > __x64_sys_newstat+0x3a/0x50 fs/stat.c:337 > > > > do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296 > > > > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > > > > > > > Reported by Kernel Concurrency Sanitizer on: > > > > CPU: 1 PID: 7813 Comm: ps Not tainted 5.3.0+ #0 > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > > > > Google 01/01/2011 > > > > ================================================================== > > > > > > My understanding is, that for every access to /proc/<pid>, > > > d_revalidate is called, and /proc-fs implementation simply says that > > > pid_revalidate always revalidates by rewriting uid/gid because "owning > > > task may have performed a setuid(), etc." presumably so every access > > > to a /proc/<pid> entry always has the right uid/gid (in effect > > > updating /proc/<pid> lazily via d_revalidate). > > > > > > Is it possible that one of the tasks above could be preempted after > > > doing its writes to *ruid/*rgid, another thread writing some other > > > values (after setuid / seteuid), and then the preempted thread seeing > > > the other values? Assertion here should never fail: > > > === TASK 1 === > > > | seteuid(1000); > > > | seteuid(0); > > > | stat("/proc/<pid-of-task-1>", &fstat); > > > | assert(fstat.st_uid == 0); > > > === TASK 2 === > > > | stat("/proc/<pid-of-task-1>", ...); > > > > Is it the same as > > pid_revalidate() snapshots (uid,gid) correctly > > but writeback is done in any order? > > Yes, I think so. Snapshot is done in RCU reader critical section, but > the writes can race with another thread. Is there logic that ensures > this doesn't lead to the observable outcome above? I found the case where this leads to an observable bug. common_perm_cond() in security/apparmor/lsm.c reads the inode uid and uses it for the security check: static int common_perm_cond(const char *op, const struct path *path, u32 mask) { struct path_cond cond = { d_backing_inode(path->dentry)->i_uid, d_backing_inode(path->dentry)->i_mode }; Now consider the following test program: #define _GNU_SOURCE #include <stdio.h> #include <sys/stat.h> #include <sys/types.h> #include <unistd.h> #include <stdlib.h> #include <pthread.h> void *thr(void *arg) { for (;;) { struct stat file_stat; stat((char*)arg, &file_stat); } return 0; } int main(int argc, char *argv[]) { char proc[32]; sprintf(proc, "/proc/%d", getpid()); printf("%s\n", proc); pthread_t th; pthread_create(&th, 0, thr, proc); for (;;) { seteuid(1000); usleep(1); seteuid(0); struct stat file_stat; stat(proc, &file_stat); } return 0; } Whenever the main thread does stat, it must observe inode.uid == 0 in common_perm_cond(). But since task_dump_owner() does writeback out of order, it can lead to non-linearizable executions and main thread observing inode.uid == 1000. This in turn can lead to both false negatives and false positives from AppArmour (false denying access and falsely permitting access). I don't know how to setup actual AppArmour profile to do this, but I see this guide mentions "owner @{PROC}/[0-9]*" in a policy, so I assume it's possible: https://gitlab.com/apparmor/apparmor/wikis/Profiling_by_hand Instead, I added the following check to common_perm_cond() (it's dirty, but you get the idea): @@ -218,6 +218,15 @@ static int common_perm_cond(const char *op, const struct path *path, u32 mask) d_backing_inode(path->dentry)->i_mode }; + if (op == OP_GETATTR && mask == AA_MAY_GETATTR && cond.uid.val != 0) { + char buf1[64], buf2[64]; + char *str = d_path(path, buf1, sizeof(buf1)); + sprintf(buf2, "/proc/%d", current->pid); + if (!strcmp(str, buf2)) + pr_err("common_perm_cond: path=%s pid=%d uid=%d\n", + str, current->pid, cond.uid.val); + } Now when I run the program, I see how it fires every few seconds: # ./a.out /proc/1548 [ 123.233107] common_perm_cond: path=/proc/1548 pid=1548 uid=1000 [ 126.142869] common_perm_cond: path=/proc/1548 pid=1548 uid=1000 [ 127.048353] common_perm_cond: path=/proc/1548 pid=1548 uid=1000 [ 128.181873] common_perm_cond: path=/proc/1548 pid=1548 uid=1000 [ 128.557104] common_perm_cond: path=/proc/1548 pid=1548 uid=1000 [ 144.690774] common_perm_cond: path=/proc/1548 pid=1548 uid=1000 Which means AppArmour acts based on the wrong UID. Obviously can lead to falsely denying access, but also falsely permitting access. Consider the following scenario. A process sets owner UID on a file so that a child process won't be able to access it, after that it starts the child process. common_perm_cond() in the child process should observe the new owner UID. However, if there a random other process simply doing stat() or something similar on the file, now the common_perm_cond() in the child can suddenly observe the old UID, which will be permitted by AppArmour. Boom! I've tried to apply "proc: fix inode uid/gid writeback race": https://lore.kernel.org/lkml/20191020173010.GA14744@avx2/ but it does _not_ help because it does not really resolve the non-atomic snapshot and writeback of UID. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: KCSAN: data-race in task_dump_owner / task_dump_owner 2019-10-23 17:06 ` Dmitry Vyukov @ 2019-10-24 8:35 ` Dmitry Vyukov 0 siblings, 0 replies; 6+ messages in thread From: Dmitry Vyukov @ 2019-10-24 8:35 UTC (permalink / raw) To: Marco Elver Cc: Alexey Dobriyan, syzbot, Andrew Morton, Casey Schaufler, Christian Brauner, Kees Cook, Kent Overstreet, Konstantin Khlebnikov, linux-fsdevel, LKML, Michal Hocko, Shakeel Butt, syzkaller-bugs, Thomas Gleixner, Eric Dumazet On Wed, Oct 23, 2019 at 7:06 PM Dmitry Vyukov <dvyukov@google.com> wrote: > > On Thu, Oct 17, 2019 at 8:33 PM 'Marco Elver' via syzkaller-bugs > <syzkaller-bugs@googlegroups.com> wrote: > > > > On Thu, 17 Oct 2019 at 20:17, Alexey Dobriyan <adobriyan@gmail.com> wrote: > > > > > > On Thu, Oct 17, 2019 at 02:56:47PM +0200, Marco Elver wrote: > > > > Hi, > > > > > > > > On Thu, 17 Oct 2019 at 14:36, syzbot > > > > <syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com> wrote: > > > > > > > > > > Hello, > > > > > > > > > > syzbot found the following crash on: > > > > > > > > > > HEAD commit: d724f94f x86, kcsan: Enable KCSAN for x86 > > > > > git tree: https://github.com/google/ktsan.git kcsan > > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=17884db3600000 > > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80 > > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=e392f8008a294fdf8891 > > > > > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > > > > > > > > > > Unfortunately, I don't have any reproducer for this crash yet. > > > > > > > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > > > > > Reported-by: syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com > > > > > > > > > > ================================================================== > > > > > BUG: KCSAN: data-race in task_dump_owner / task_dump_owner > > > > > > > > > > write to 0xffff8881255bb7fc of 4 bytes by task 7804 on cpu 0: > > > > > task_dump_owner+0xd8/0x260 fs/proc/base.c:1742 > > > > > pid_update_inode+0x3c/0x70 fs/proc/base.c:1818 > > > > > pid_revalidate+0x91/0xd0 fs/proc/base.c:1841 > > > > > d_revalidate fs/namei.c:765 [inline] > > > > > d_revalidate fs/namei.c:762 [inline] > > > > > lookup_fast+0x7cb/0x7e0 fs/namei.c:1613 > > > > > walk_component+0x6d/0xe80 fs/namei.c:1804 > > > > > link_path_walk.part.0+0x5d3/0xa90 fs/namei.c:2139 > > > > > link_path_walk fs/namei.c:2070 [inline] > > > > > path_openat+0x14f/0x3530 fs/namei.c:3532 > > > > > do_filp_open+0x11e/0x1b0 fs/namei.c:3563 > > > > > do_sys_open+0x3b3/0x4f0 fs/open.c:1089 > > > > > __do_sys_open fs/open.c:1107 [inline] > > > > > __se_sys_open fs/open.c:1102 [inline] > > > > > __x64_sys_open+0x55/0x70 fs/open.c:1102 > > > > > do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296 > > > > > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > > > > > > > > > write to 0xffff8881255bb7fc of 4 bytes by task 7813 on cpu 1: > > > > > task_dump_owner+0xd8/0x260 fs/proc/base.c:1742 > > > > > pid_update_inode+0x3c/0x70 fs/proc/base.c:1818 > > > > > pid_revalidate+0x91/0xd0 fs/proc/base.c:1841 > > > > > d_revalidate fs/namei.c:765 [inline] > > > > > d_revalidate fs/namei.c:762 [inline] > > > > > lookup_fast+0x7cb/0x7e0 fs/namei.c:1613 > > > > > walk_component+0x6d/0xe80 fs/namei.c:1804 > > > > > lookup_last fs/namei.c:2271 [inline] > > > > > path_lookupat.isra.0+0x13a/0x5a0 fs/namei.c:2316 > > > > > filename_lookup+0x145/0x2d0 fs/namei.c:2346 > > > > > user_path_at_empty+0x4c/0x70 fs/namei.c:2606 > > > > > user_path_at include/linux/namei.h:60 [inline] > > > > > vfs_statx+0xd9/0x190 fs/stat.c:187 > > > > > vfs_stat include/linux/fs.h:3188 [inline] > > > > > __do_sys_newstat+0x51/0xb0 fs/stat.c:341 > > > > > __se_sys_newstat fs/stat.c:337 [inline] > > > > > __x64_sys_newstat+0x3a/0x50 fs/stat.c:337 > > > > > do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296 > > > > > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > > > > > > > > > Reported by Kernel Concurrency Sanitizer on: > > > > > CPU: 1 PID: 7813 Comm: ps Not tainted 5.3.0+ #0 > > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > > > > > Google 01/01/2011 > > > > > ================================================================== > > > > > > > > My understanding is, that for every access to /proc/<pid>, > > > > d_revalidate is called, and /proc-fs implementation simply says that > > > > pid_revalidate always revalidates by rewriting uid/gid because "owning > > > > task may have performed a setuid(), etc." presumably so every access > > > > to a /proc/<pid> entry always has the right uid/gid (in effect > > > > updating /proc/<pid> lazily via d_revalidate). > > > > > > > > Is it possible that one of the tasks above could be preempted after > > > > doing its writes to *ruid/*rgid, another thread writing some other > > > > values (after setuid / seteuid), and then the preempted thread seeing > > > > the other values? Assertion here should never fail: > > > > === TASK 1 === > > > > | seteuid(1000); > > > > | seteuid(0); > > > > | stat("/proc/<pid-of-task-1>", &fstat); > > > > | assert(fstat.st_uid == 0); > > > > === TASK 2 === > > > > | stat("/proc/<pid-of-task-1>", ...); > > > > > > Is it the same as > > > pid_revalidate() snapshots (uid,gid) correctly > > > but writeback is done in any order? > > > > Yes, I think so. Snapshot is done in RCU reader critical section, but > > the writes can race with another thread. Is there logic that ensures > > this doesn't lead to the observable outcome above? > > > I found the case where this leads to an observable bug. > common_perm_cond() in security/apparmor/lsm.c reads the inode uid and > uses it for the security check: > > static int common_perm_cond(const char *op, const struct path *path, u32 mask) > { > struct path_cond cond = { d_backing_inode(path->dentry)->i_uid, > > d_backing_inode(path->dentry)->i_mode > }; > > Now consider the following test program: > > #define _GNU_SOURCE > #include <stdio.h> > #include <sys/stat.h> > #include <sys/types.h> > #include <unistd.h> > #include <stdlib.h> > #include <pthread.h> > > void *thr(void *arg) > { > for (;;) { > struct stat file_stat; > stat((char*)arg, &file_stat); > } > return 0; > } > > int main(int argc, char *argv[]) > { > char proc[32]; > sprintf(proc, "/proc/%d", getpid()); > printf("%s\n", proc); > pthread_t th; > pthread_create(&th, 0, thr, proc); > for (;;) { > seteuid(1000); > usleep(1); > seteuid(0); > struct stat file_stat; > stat(proc, &file_stat); > } > return 0; > } > > Whenever the main thread does stat, it must observe inode.uid == 0 in > common_perm_cond(). > > But since task_dump_owner() does writeback out of order, it can lead > to non-linearizable executions and main thread observing inode.uid == > 1000. > This in turn can lead to both false negatives and false positives from > AppArmour (false denying access and falsely permitting access). > > I don't know how to setup actual AppArmour profile to do this, but I > see this guide mentions "owner @{PROC}/[0-9]*" in a policy, so I > assume it's possible: > https://gitlab.com/apparmor/apparmor/wikis/Profiling_by_hand > > Instead, I added the following check to common_perm_cond() (it's > dirty, but you get the idea): > > @@ -218,6 +218,15 @@ static int common_perm_cond(const char *op, const > struct path *path, u32 mask) > d_backing_inode(path->dentry)->i_mode > }; > + if (op == OP_GETATTR && mask == AA_MAY_GETATTR && cond.uid.val != 0) { > + char buf1[64], buf2[64]; > + char *str = d_path(path, buf1, sizeof(buf1)); > + sprintf(buf2, "/proc/%d", current->pid); > + if (!strcmp(str, buf2)) > + pr_err("common_perm_cond: path=%s pid=%d uid=%d\n", > + str, current->pid, cond.uid.val); > + } > > Now when I run the program, I see how it fires every few seconds: > > # ./a.out > /proc/1548 > [ 123.233107] common_perm_cond: path=/proc/1548 pid=1548 uid=1000 > [ 126.142869] common_perm_cond: path=/proc/1548 pid=1548 uid=1000 > [ 127.048353] common_perm_cond: path=/proc/1548 pid=1548 uid=1000 > [ 128.181873] common_perm_cond: path=/proc/1548 pid=1548 uid=1000 > [ 128.557104] common_perm_cond: path=/proc/1548 pid=1548 uid=1000 > [ 144.690774] common_perm_cond: path=/proc/1548 pid=1548 uid=1000 > > Which means AppArmour acts based on the wrong UID. Obviously can lead > to falsely denying access, but also falsely permitting access. > Consider the following scenario. > A process sets owner UID on a file so that a child process won't be > able to access it, after that it starts the child process. > common_perm_cond() in the child process should observe the new owner > UID. However, if there a random other process simply doing stat() or > something similar on the file, now the common_perm_cond() in the child > can suddenly observe the old UID, which will be permitted by > AppArmour. Boom! > > I've tried to apply "proc: fix inode uid/gid writeback race": > https://lore.kernel.org/lkml/20191020173010.GA14744@avx2/ > but it does _not_ help because it does not really resolve the > non-atomic snapshot and writeback of UID. FTR here is the corresponding race report: https://groups.google.com/d/msg/syzkaller-bugs/Cs07ly_Nmtg/avW672LrAQAJ ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2019-10-24 8:35 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2019-10-17 12:36 KCSAN: data-race in task_dump_owner / task_dump_owner syzbot 2019-10-17 12:56 ` Marco Elver 2019-10-17 18:17 ` Alexey Dobriyan 2019-10-17 18:32 ` Marco Elver 2019-10-23 17:06 ` Dmitry Vyukov 2019-10-24 8:35 ` Dmitry Vyukov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox