KCSAN: data-race in task_dump_owner / task_dump

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* KCSAN: data-race in task_dump_owner / task_dump_owner
@ 2019-10-17 12:36 syzbot
  2019-10-17 12:56 ` Marco Elver
  0 siblings, 1 reply; 6+ messages in thread
From: syzbot @ 2019-10-17 12:36 UTC (permalink / raw)
  To: adobriyan, akpm, casey, christian, elver, keescook,
	kent.overstreet, khlebnikov, linux-fsdevel, linux-kernel, mhocko,
	shakeelb, syzkaller-bugs, tglx

Hello,

syzbot found the following crash on:

HEAD commit:    d724f94f x86, kcsan: Enable KCSAN for x86
git tree:       https://github.com/google/ktsan.git kcsan
console output: https://syzkaller.appspot.com/x/log.txt?x=17884db3600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80
dashboard link: https://syzkaller.appspot.com/bug?extid=e392f8008a294fdf8891
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com

==================================================================
BUG: KCSAN: data-race in task_dump_owner / task_dump_owner

write to 0xffff8881255bb7fc of 4 bytes by task 7804 on cpu 0:
  task_dump_owner+0xd8/0x260 fs/proc/base.c:1742
  pid_update_inode+0x3c/0x70 fs/proc/base.c:1818
  pid_revalidate+0x91/0xd0 fs/proc/base.c:1841
  d_revalidate fs/namei.c:765 [inline]
  d_revalidate fs/namei.c:762 [inline]
  lookup_fast+0x7cb/0x7e0 fs/namei.c:1613
  walk_component+0x6d/0xe80 fs/namei.c:1804
  link_path_walk.part.0+0x5d3/0xa90 fs/namei.c:2139
  link_path_walk fs/namei.c:2070 [inline]
  path_openat+0x14f/0x3530 fs/namei.c:3532
  do_filp_open+0x11e/0x1b0 fs/namei.c:3563
  do_sys_open+0x3b3/0x4f0 fs/open.c:1089
  __do_sys_open fs/open.c:1107 [inline]
  __se_sys_open fs/open.c:1102 [inline]
  __x64_sys_open+0x55/0x70 fs/open.c:1102
  do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

write to 0xffff8881255bb7fc of 4 bytes by task 7813 on cpu 1:
  task_dump_owner+0xd8/0x260 fs/proc/base.c:1742
  pid_update_inode+0x3c/0x70 fs/proc/base.c:1818
  pid_revalidate+0x91/0xd0 fs/proc/base.c:1841
  d_revalidate fs/namei.c:765 [inline]
  d_revalidate fs/namei.c:762 [inline]
  lookup_fast+0x7cb/0x7e0 fs/namei.c:1613
  walk_component+0x6d/0xe80 fs/namei.c:1804
  lookup_last fs/namei.c:2271 [inline]
  path_lookupat.isra.0+0x13a/0x5a0 fs/namei.c:2316
  filename_lookup+0x145/0x2d0 fs/namei.c:2346
  user_path_at_empty+0x4c/0x70 fs/namei.c:2606
  user_path_at include/linux/namei.h:60 [inline]
  vfs_statx+0xd9/0x190 fs/stat.c:187
  vfs_stat include/linux/fs.h:3188 [inline]
  __do_sys_newstat+0x51/0xb0 fs/stat.c:341
  __se_sys_newstat fs/stat.c:337 [inline]
  __x64_sys_newstat+0x3a/0x50 fs/stat.c:337
  do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 7813 Comm: ps Not tainted 5.3.0+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
==================================================================


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: KCSAN: data-race in task_dump_owner / task_dump_owner
  2019-10-17 12:36 KCSAN: data-race in task_dump_owner / task_dump_owner syzbot
@ 2019-10-17 12:56 ` Marco Elver
  2019-10-17 18:17   ` Alexey Dobriyan
  0 siblings, 1 reply; 6+ messages in thread
From: Marco Elver @ 2019-10-17 12:56 UTC (permalink / raw)
  To: syzbot
  Cc: adobriyan, Andrew Morton, casey, christian, Kees Cook,
	kent.overstreet, khlebnikov, linux-fsdevel, LKML, mhocko,
	shakeelb, syzkaller-bugs, Thomas Gleixner

Hi,

On Thu, 17 Oct 2019 at 14:36, syzbot
<syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    d724f94f x86, kcsan: Enable KCSAN for x86
> git tree:       https://github.com/google/ktsan.git kcsan
> console output: https://syzkaller.appspot.com/x/log.txt?x=17884db3600000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80
> dashboard link: https://syzkaller.appspot.com/bug?extid=e392f8008a294fdf8891
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
>
> Unfortunately, I don't have any reproducer for this crash yet.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com
>
> ==================================================================
> BUG: KCSAN: data-race in task_dump_owner / task_dump_owner
>
> write to 0xffff8881255bb7fc of 4 bytes by task 7804 on cpu 0:
>   task_dump_owner+0xd8/0x260 fs/proc/base.c:1742
>   pid_update_inode+0x3c/0x70 fs/proc/base.c:1818
>   pid_revalidate+0x91/0xd0 fs/proc/base.c:1841
>   d_revalidate fs/namei.c:765 [inline]
>   d_revalidate fs/namei.c:762 [inline]
>   lookup_fast+0x7cb/0x7e0 fs/namei.c:1613
>   walk_component+0x6d/0xe80 fs/namei.c:1804
>   link_path_walk.part.0+0x5d3/0xa90 fs/namei.c:2139
>   link_path_walk fs/namei.c:2070 [inline]
>   path_openat+0x14f/0x3530 fs/namei.c:3532
>   do_filp_open+0x11e/0x1b0 fs/namei.c:3563
>   do_sys_open+0x3b3/0x4f0 fs/open.c:1089
>   __do_sys_open fs/open.c:1107 [inline]
>   __se_sys_open fs/open.c:1102 [inline]
>   __x64_sys_open+0x55/0x70 fs/open.c:1102
>   do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> write to 0xffff8881255bb7fc of 4 bytes by task 7813 on cpu 1:
>   task_dump_owner+0xd8/0x260 fs/proc/base.c:1742
>   pid_update_inode+0x3c/0x70 fs/proc/base.c:1818
>   pid_revalidate+0x91/0xd0 fs/proc/base.c:1841
>   d_revalidate fs/namei.c:765 [inline]
>   d_revalidate fs/namei.c:762 [inline]
>   lookup_fast+0x7cb/0x7e0 fs/namei.c:1613
>   walk_component+0x6d/0xe80 fs/namei.c:1804
>   lookup_last fs/namei.c:2271 [inline]
>   path_lookupat.isra.0+0x13a/0x5a0 fs/namei.c:2316
>   filename_lookup+0x145/0x2d0 fs/namei.c:2346
>   user_path_at_empty+0x4c/0x70 fs/namei.c:2606
>   user_path_at include/linux/namei.h:60 [inline]
>   vfs_statx+0xd9/0x190 fs/stat.c:187
>   vfs_stat include/linux/fs.h:3188 [inline]
>   __do_sys_newstat+0x51/0xb0 fs/stat.c:341
>   __se_sys_newstat fs/stat.c:337 [inline]
>   __x64_sys_newstat+0x3a/0x50 fs/stat.c:337
>   do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 1 PID: 7813 Comm: ps Not tainted 5.3.0+ #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> ==================================================================

My understanding is, that for every access to /proc/<pid>,
d_revalidate is called, and /proc-fs implementation simply says that
pid_revalidate always revalidates by rewriting uid/gid because "owning
task may have performed a setuid(), etc." presumably so every access
to a /proc/<pid> entry always has the right uid/gid (in effect
updating /proc/<pid> lazily via d_revalidate).

Is it possible that one of the tasks above could be preempted after
doing its writes to *ruid/*rgid, another thread writing some other
values (after setuid / seteuid), and then the preempted thread seeing
the other values? Assertion here should never fail:
=== TASK 1 ===
| seteuid(1000);
| seteuid(0);
| stat("/proc/<pid-of-task-1>", &fstat);
| assert(fstat.st_uid == 0);
=== TASK 2 ===
| stat("/proc/<pid-of-task-1>", ...);


Best Wishes,
-- Marco

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: KCSAN: data-race in task_dump_owner / task_dump_owner
  2019-10-17 12:56 ` Marco Elver
@ 2019-10-17 18:17   ` Alexey Dobriyan
  2019-10-17 18:32     ` Marco Elver
  0 siblings, 1 reply; 6+ messages in thread
From: Alexey Dobriyan @ 2019-10-17 18:17 UTC (permalink / raw)
  To: Marco Elver
  Cc: syzbot, Andrew Morton, casey, christian, Kees Cook,
	kent.overstreet, khlebnikov, linux-fsdevel, LKML, mhocko,
	shakeelb, syzkaller-bugs, Thomas Gleixner

On Thu, Oct 17, 2019 at 02:56:47PM +0200, Marco Elver wrote:
> Hi,
> 
> On Thu, 17 Oct 2019 at 14:36, syzbot
> <syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com> wrote:
> >
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit:    d724f94f x86, kcsan: Enable KCSAN for x86
> > git tree:       https://github.com/google/ktsan.git kcsan
> > console output: https://syzkaller.appspot.com/x/log.txt?x=17884db3600000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80
> > dashboard link: https://syzkaller.appspot.com/bug?extid=e392f8008a294fdf8891
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com
> >
> > ==================================================================
> > BUG: KCSAN: data-race in task_dump_owner / task_dump_owner
> >
> > write to 0xffff8881255bb7fc of 4 bytes by task 7804 on cpu 0:
> >   task_dump_owner+0xd8/0x260 fs/proc/base.c:1742
> >   pid_update_inode+0x3c/0x70 fs/proc/base.c:1818
> >   pid_revalidate+0x91/0xd0 fs/proc/base.c:1841
> >   d_revalidate fs/namei.c:765 [inline]
> >   d_revalidate fs/namei.c:762 [inline]
> >   lookup_fast+0x7cb/0x7e0 fs/namei.c:1613
> >   walk_component+0x6d/0xe80 fs/namei.c:1804
> >   link_path_walk.part.0+0x5d3/0xa90 fs/namei.c:2139
> >   link_path_walk fs/namei.c:2070 [inline]
> >   path_openat+0x14f/0x3530 fs/namei.c:3532
> >   do_filp_open+0x11e/0x1b0 fs/namei.c:3563
> >   do_sys_open+0x3b3/0x4f0 fs/open.c:1089
> >   __do_sys_open fs/open.c:1107 [inline]
> >   __se_sys_open fs/open.c:1102 [inline]
> >   __x64_sys_open+0x55/0x70 fs/open.c:1102
> >   do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
> >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> > write to 0xffff8881255bb7fc of 4 bytes by task 7813 on cpu 1:
> >   task_dump_owner+0xd8/0x260 fs/proc/base.c:1742
> >   pid_update_inode+0x3c/0x70 fs/proc/base.c:1818
> >   pid_revalidate+0x91/0xd0 fs/proc/base.c:1841
> >   d_revalidate fs/namei.c:765 [inline]
> >   d_revalidate fs/namei.c:762 [inline]
> >   lookup_fast+0x7cb/0x7e0 fs/namei.c:1613
> >   walk_component+0x6d/0xe80 fs/namei.c:1804
> >   lookup_last fs/namei.c:2271 [inline]
> >   path_lookupat.isra.0+0x13a/0x5a0 fs/namei.c:2316
> >   filename_lookup+0x145/0x2d0 fs/namei.c:2346
> >   user_path_at_empty+0x4c/0x70 fs/namei.c:2606
> >   user_path_at include/linux/namei.h:60 [inline]
> >   vfs_statx+0xd9/0x190 fs/stat.c:187
> >   vfs_stat include/linux/fs.h:3188 [inline]
> >   __do_sys_newstat+0x51/0xb0 fs/stat.c:341
> >   __se_sys_newstat fs/stat.c:337 [inline]
> >   __x64_sys_newstat+0x3a/0x50 fs/stat.c:337
> >   do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
> >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> > Reported by Kernel Concurrency Sanitizer on:
> > CPU: 1 PID: 7813 Comm: ps Not tainted 5.3.0+ #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > ==================================================================
> 
> My understanding is, that for every access to /proc/<pid>,
> d_revalidate is called, and /proc-fs implementation simply says that
> pid_revalidate always revalidates by rewriting uid/gid because "owning
> task may have performed a setuid(), etc." presumably so every access
> to a /proc/<pid> entry always has the right uid/gid (in effect
> updating /proc/<pid> lazily via d_revalidate).
> 
> Is it possible that one of the tasks above could be preempted after
> doing its writes to *ruid/*rgid, another thread writing some other
> values (after setuid / seteuid), and then the preempted thread seeing
> the other values? Assertion here should never fail:
> === TASK 1 ===
> | seteuid(1000);
> | seteuid(0);
> | stat("/proc/<pid-of-task-1>", &fstat);
> | assert(fstat.st_uid == 0);
> === TASK 2 ===
> | stat("/proc/<pid-of-task-1>", ...);

Is it the same as
pid_revalidate() snapshots (uid,gid) correctly
but writeback is done in any order?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: KCSAN: data-race in task_dump_owner / task_dump_owner
  2019-10-17 18:17   ` Alexey Dobriyan
@ 2019-10-17 18:32     ` Marco Elver
  2019-10-23 17:06       ` Dmitry Vyukov
  0 siblings, 1 reply; 6+ messages in thread
From: Marco Elver @ 2019-10-17 18:32 UTC (permalink / raw)
  To: Alexey Dobriyan
  Cc: syzbot, Andrew Morton, casey, christian, Kees Cook,
	kent.overstreet, khlebnikov, linux-fsdevel, LKML, mhocko,
	Shakeel Butt, syzkaller-bugs, Thomas Gleixner

On Thu, 17 Oct 2019 at 20:17, Alexey Dobriyan <adobriyan@gmail.com> wrote:
>
> On Thu, Oct 17, 2019 at 02:56:47PM +0200, Marco Elver wrote:
> > Hi,
> >
> > On Thu, 17 Oct 2019 at 14:36, syzbot
> > <syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com> wrote:
> > >
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit:    d724f94f x86, kcsan: Enable KCSAN for x86
> > > git tree:       https://github.com/google/ktsan.git kcsan
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=17884db3600000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=e392f8008a294fdf8891
> > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > >
> > > Unfortunately, I don't have any reproducer for this crash yet.
> > >
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com
> > >
> > > ==================================================================
> > > BUG: KCSAN: data-race in task_dump_owner / task_dump_owner
> > >
> > > write to 0xffff8881255bb7fc of 4 bytes by task 7804 on cpu 0:
> > >   task_dump_owner+0xd8/0x260 fs/proc/base.c:1742
> > >   pid_update_inode+0x3c/0x70 fs/proc/base.c:1818
> > >   pid_revalidate+0x91/0xd0 fs/proc/base.c:1841
> > >   d_revalidate fs/namei.c:765 [inline]
> > >   d_revalidate fs/namei.c:762 [inline]
> > >   lookup_fast+0x7cb/0x7e0 fs/namei.c:1613
> > >   walk_component+0x6d/0xe80 fs/namei.c:1804
> > >   link_path_walk.part.0+0x5d3/0xa90 fs/namei.c:2139
> > >   link_path_walk fs/namei.c:2070 [inline]
> > >   path_openat+0x14f/0x3530 fs/namei.c:3532
> > >   do_filp_open+0x11e/0x1b0 fs/namei.c:3563
> > >   do_sys_open+0x3b3/0x4f0 fs/open.c:1089
> > >   __do_sys_open fs/open.c:1107 [inline]
> > >   __se_sys_open fs/open.c:1102 [inline]
> > >   __x64_sys_open+0x55/0x70 fs/open.c:1102
> > >   do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
> > >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > >
> > > write to 0xffff8881255bb7fc of 4 bytes by task 7813 on cpu 1:
> > >   task_dump_owner+0xd8/0x260 fs/proc/base.c:1742
> > >   pid_update_inode+0x3c/0x70 fs/proc/base.c:1818
> > >   pid_revalidate+0x91/0xd0 fs/proc/base.c:1841
> > >   d_revalidate fs/namei.c:765 [inline]
> > >   d_revalidate fs/namei.c:762 [inline]
> > >   lookup_fast+0x7cb/0x7e0 fs/namei.c:1613
> > >   walk_component+0x6d/0xe80 fs/namei.c:1804
> > >   lookup_last fs/namei.c:2271 [inline]
> > >   path_lookupat.isra.0+0x13a/0x5a0 fs/namei.c:2316
> > >   filename_lookup+0x145/0x2d0 fs/namei.c:2346
> > >   user_path_at_empty+0x4c/0x70 fs/namei.c:2606
> > >   user_path_at include/linux/namei.h:60 [inline]
> > >   vfs_statx+0xd9/0x190 fs/stat.c:187
> > >   vfs_stat include/linux/fs.h:3188 [inline]
> > >   __do_sys_newstat+0x51/0xb0 fs/stat.c:341
> > >   __se_sys_newstat fs/stat.c:337 [inline]
> > >   __x64_sys_newstat+0x3a/0x50 fs/stat.c:337
> > >   do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
> > >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > >
> > > Reported by Kernel Concurrency Sanitizer on:
> > > CPU: 1 PID: 7813 Comm: ps Not tainted 5.3.0+ #0
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > Google 01/01/2011
> > > ==================================================================
> >
> > My understanding is, that for every access to /proc/<pid>,
> > d_revalidate is called, and /proc-fs implementation simply says that
> > pid_revalidate always revalidates by rewriting uid/gid because "owning
> > task may have performed a setuid(), etc." presumably so every access
> > to a /proc/<pid> entry always has the right uid/gid (in effect
> > updating /proc/<pid> lazily via d_revalidate).
> >
> > Is it possible that one of the tasks above could be preempted after
> > doing its writes to *ruid/*rgid, another thread writing some other
> > values (after setuid / seteuid), and then the preempted thread seeing
> > the other values? Assertion here should never fail:
> > === TASK 1 ===
> > | seteuid(1000);
> > | seteuid(0);
> > | stat("/proc/<pid-of-task-1>", &fstat);
> > | assert(fstat.st_uid == 0);
> > === TASK 2 ===
> > | stat("/proc/<pid-of-task-1>", ...);
>
> Is it the same as
> pid_revalidate() snapshots (uid,gid) correctly
> but writeback is done in any order?

Yes, I think so. Snapshot is done in RCU reader critical section, but
the writes can race with another thread. Is there logic that ensures
this doesn't lead to the observable outcome above?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: KCSAN: data-race in task_dump_owner / task_dump_owner
  2019-10-17 18:32     ` Marco Elver
@ 2019-10-23 17:06       ` Dmitry Vyukov
  2019-10-24  8:35         ` Dmitry Vyukov
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitry Vyukov @ 2019-10-23 17:06 UTC (permalink / raw)
  To: Marco Elver
  Cc: Alexey Dobriyan, syzbot, Andrew Morton, Casey Schaufler,
	Christian Brauner, Kees Cook, Kent Overstreet,
	Konstantin Khlebnikov, linux-fsdevel, LKML, Michal Hocko,
	Shakeel Butt, syzkaller-bugs, Thomas Gleixner, Eric Dumazet

On Thu, Oct 17, 2019 at 8:33 PM 'Marco Elver' via syzkaller-bugs
<syzkaller-bugs@googlegroups.com> wrote:
>
> On Thu, 17 Oct 2019 at 20:17, Alexey Dobriyan <adobriyan@gmail.com> wrote:
> >
> > On Thu, Oct 17, 2019 at 02:56:47PM +0200, Marco Elver wrote:
> > > Hi,
> > >
> > > On Thu, 17 Oct 2019 at 14:36, syzbot
> > > <syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com> wrote:
> > > >
> > > > Hello,
> > > >
> > > > syzbot found the following crash on:
> > > >
> > > > HEAD commit:    d724f94f x86, kcsan: Enable KCSAN for x86
> > > > git tree:       https://github.com/google/ktsan.git kcsan
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=17884db3600000
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=e392f8008a294fdf8891
> > > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > > >
> > > > Unfortunately, I don't have any reproducer for this crash yet.
> > > >
> > > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > > Reported-by: syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com
> > > >
> > > > ==================================================================
> > > > BUG: KCSAN: data-race in task_dump_owner / task_dump_owner
> > > >
> > > > write to 0xffff8881255bb7fc of 4 bytes by task 7804 on cpu 0:
> > > >   task_dump_owner+0xd8/0x260 fs/proc/base.c:1742
> > > >   pid_update_inode+0x3c/0x70 fs/proc/base.c:1818
> > > >   pid_revalidate+0x91/0xd0 fs/proc/base.c:1841
> > > >   d_revalidate fs/namei.c:765 [inline]
> > > >   d_revalidate fs/namei.c:762 [inline]
> > > >   lookup_fast+0x7cb/0x7e0 fs/namei.c:1613
> > > >   walk_component+0x6d/0xe80 fs/namei.c:1804
> > > >   link_path_walk.part.0+0x5d3/0xa90 fs/namei.c:2139
> > > >   link_path_walk fs/namei.c:2070 [inline]
> > > >   path_openat+0x14f/0x3530 fs/namei.c:3532
> > > >   do_filp_open+0x11e/0x1b0 fs/namei.c:3563
> > > >   do_sys_open+0x3b3/0x4f0 fs/open.c:1089
> > > >   __do_sys_open fs/open.c:1107 [inline]
> > > >   __se_sys_open fs/open.c:1102 [inline]
> > > >   __x64_sys_open+0x55/0x70 fs/open.c:1102
> > > >   do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
> > > >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > >
> > > > write to 0xffff8881255bb7fc of 4 bytes by task 7813 on cpu 1:
> > > >   task_dump_owner+0xd8/0x260 fs/proc/base.c:1742
> > > >   pid_update_inode+0x3c/0x70 fs/proc/base.c:1818
> > > >   pid_revalidate+0x91/0xd0 fs/proc/base.c:1841
> > > >   d_revalidate fs/namei.c:765 [inline]
> > > >   d_revalidate fs/namei.c:762 [inline]
> > > >   lookup_fast+0x7cb/0x7e0 fs/namei.c:1613
> > > >   walk_component+0x6d/0xe80 fs/namei.c:1804
> > > >   lookup_last fs/namei.c:2271 [inline]
> > > >   path_lookupat.isra.0+0x13a/0x5a0 fs/namei.c:2316
> > > >   filename_lookup+0x145/0x2d0 fs/namei.c:2346
> > > >   user_path_at_empty+0x4c/0x70 fs/namei.c:2606
> > > >   user_path_at include/linux/namei.h:60 [inline]
> > > >   vfs_statx+0xd9/0x190 fs/stat.c:187
> > > >   vfs_stat include/linux/fs.h:3188 [inline]
> > > >   __do_sys_newstat+0x51/0xb0 fs/stat.c:341
> > > >   __se_sys_newstat fs/stat.c:337 [inline]
> > > >   __x64_sys_newstat+0x3a/0x50 fs/stat.c:337
> > > >   do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
> > > >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > >
> > > > Reported by Kernel Concurrency Sanitizer on:
> > > > CPU: 1 PID: 7813 Comm: ps Not tainted 5.3.0+ #0
> > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > > Google 01/01/2011
> > > > ==================================================================
> > >
> > > My understanding is, that for every access to /proc/<pid>,
> > > d_revalidate is called, and /proc-fs implementation simply says that
> > > pid_revalidate always revalidates by rewriting uid/gid because "owning
> > > task may have performed a setuid(), etc." presumably so every access
> > > to a /proc/<pid> entry always has the right uid/gid (in effect
> > > updating /proc/<pid> lazily via d_revalidate).
> > >
> > > Is it possible that one of the tasks above could be preempted after
> > > doing its writes to *ruid/*rgid, another thread writing some other
> > > values (after setuid / seteuid), and then the preempted thread seeing
> > > the other values? Assertion here should never fail:
> > > === TASK 1 ===
> > > | seteuid(1000);
> > > | seteuid(0);
> > > | stat("/proc/<pid-of-task-1>", &fstat);
> > > | assert(fstat.st_uid == 0);
> > > === TASK 2 ===
> > > | stat("/proc/<pid-of-task-1>", ...);
> >
> > Is it the same as
> > pid_revalidate() snapshots (uid,gid) correctly
> > but writeback is done in any order?
>
> Yes, I think so. Snapshot is done in RCU reader critical section, but
> the writes can race with another thread. Is there logic that ensures
> this doesn't lead to the observable outcome above?


I found the case where this leads to an observable bug.
common_perm_cond() in security/apparmor/lsm.c reads the inode uid and
uses it for the security check:

static int common_perm_cond(const char *op, const struct path *path, u32 mask)
{
      struct path_cond cond = { d_backing_inode(path->dentry)->i_uid,

d_backing_inode(path->dentry)->i_mode
      };

Now consider the following test program:

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>

void *thr(void *arg)
{
        for (;;) {
                struct stat file_stat;
                stat((char*)arg, &file_stat);
        }
        return 0;
}

int main(int argc, char *argv[])
{
        char proc[32];
        sprintf(proc, "/proc/%d", getpid());
        printf("%s\n", proc);
        pthread_t th;
        pthread_create(&th, 0, thr, proc);
        for (;;) {
                seteuid(1000);
                usleep(1);
                seteuid(0);
                struct stat file_stat;
                stat(proc, &file_stat);
        }
        return 0;
}

Whenever the main thread does stat, it must observe inode.uid == 0 in
common_perm_cond().

But since task_dump_owner() does writeback out of order, it can lead
to non-linearizable executions and main thread observing inode.uid ==
1000.
This in turn can lead to both false negatives and false positives from
AppArmour (false denying access and falsely permitting access).

I don't know how to setup actual AppArmour profile to do this, but I
see this guide mentions "owner @{PROC}/[0-9]*" in a policy, so I
assume it's possible:
https://gitlab.com/apparmor/apparmor/wikis/Profiling_by_hand

Instead, I added the following check to common_perm_cond() (it's
dirty, but you get the idea):

@@ -218,6 +218,15 @@ static int common_perm_cond(const char *op, const
struct path *path, u32 mask)
                                  d_backing_inode(path->dentry)->i_mode
        };
+       if (op == OP_GETATTR && mask == AA_MAY_GETATTR && cond.uid.val != 0) {
+               char buf1[64], buf2[64];
+               char *str = d_path(path, buf1, sizeof(buf1));
+               sprintf(buf2, "/proc/%d", current->pid);
+               if (!strcmp(str, buf2))
+                       pr_err("common_perm_cond: path=%s pid=%d uid=%d\n",
+                               str, current->pid, cond.uid.val);
+       }

Now when I run the program, I see how it fires every few seconds:

# ./a.out
/proc/1548
[  123.233107] common_perm_cond: path=/proc/1548 pid=1548 uid=1000
[  126.142869] common_perm_cond: path=/proc/1548 pid=1548 uid=1000
[  127.048353] common_perm_cond: path=/proc/1548 pid=1548 uid=1000
[  128.181873] common_perm_cond: path=/proc/1548 pid=1548 uid=1000
[  128.557104] common_perm_cond: path=/proc/1548 pid=1548 uid=1000
[  144.690774] common_perm_cond: path=/proc/1548 pid=1548 uid=1000

Which means AppArmour acts based on the wrong UID. Obviously can lead
to falsely denying access, but also falsely permitting access.
Consider the following scenario.
A process sets owner UID on a file so that a child process won't be
able to access it, after that it starts the child process.
common_perm_cond() in the child process should observe the new owner
UID. However, if there a random other process simply doing stat() or
something similar on the file, now the common_perm_cond() in the child
can suddenly observe the old UID, which will be permitted by
AppArmour. Boom!

I've tried to apply "proc: fix inode uid/gid writeback race":
https://lore.kernel.org/lkml/20191020173010.GA14744@avx2/
but it does _not_ help because it does not really resolve the
non-atomic snapshot and writeback of UID.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: KCSAN: data-race in task_dump_owner / task_dump_owner
  2019-10-23 17:06       ` Dmitry Vyukov
@ 2019-10-24  8:35         ` Dmitry Vyukov
  0 siblings, 0 replies; 6+ messages in thread
From: Dmitry Vyukov @ 2019-10-24  8:35 UTC (permalink / raw)
  To: Marco Elver
  Cc: Alexey Dobriyan, syzbot, Andrew Morton, Casey Schaufler,
	Christian Brauner, Kees Cook, Kent Overstreet,
	Konstantin Khlebnikov, linux-fsdevel, LKML, Michal Hocko,
	Shakeel Butt, syzkaller-bugs, Thomas Gleixner, Eric Dumazet

On Wed, Oct 23, 2019 at 7:06 PM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Thu, Oct 17, 2019 at 8:33 PM 'Marco Elver' via syzkaller-bugs
> <syzkaller-bugs@googlegroups.com> wrote:
> >
> > On Thu, 17 Oct 2019 at 20:17, Alexey Dobriyan <adobriyan@gmail.com> wrote:
> > >
> > > On Thu, Oct 17, 2019 at 02:56:47PM +0200, Marco Elver wrote:
> > > > Hi,
> > > >
> > > > On Thu, 17 Oct 2019 at 14:36, syzbot
> > > > <syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com> wrote:
> > > > >
> > > > > Hello,
> > > > >
> > > > > syzbot found the following crash on:
> > > > >
> > > > > HEAD commit:    d724f94f x86, kcsan: Enable KCSAN for x86
> > > > > git tree:       https://github.com/google/ktsan.git kcsan
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=17884db3600000
> > > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80
> > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=e392f8008a294fdf8891
> > > > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > > > >
> > > > > Unfortunately, I don't have any reproducer for this crash yet.
> > > > >
> > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > > > Reported-by: syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com
> > > > >
> > > > > ==================================================================
> > > > > BUG: KCSAN: data-race in task_dump_owner / task_dump_owner
> > > > >
> > > > > write to 0xffff8881255bb7fc of 4 bytes by task 7804 on cpu 0:
> > > > >   task_dump_owner+0xd8/0x260 fs/proc/base.c:1742
> > > > >   pid_update_inode+0x3c/0x70 fs/proc/base.c:1818
> > > > >   pid_revalidate+0x91/0xd0 fs/proc/base.c:1841
> > > > >   d_revalidate fs/namei.c:765 [inline]
> > > > >   d_revalidate fs/namei.c:762 [inline]
> > > > >   lookup_fast+0x7cb/0x7e0 fs/namei.c:1613
> > > > >   walk_component+0x6d/0xe80 fs/namei.c:1804
> > > > >   link_path_walk.part.0+0x5d3/0xa90 fs/namei.c:2139
> > > > >   link_path_walk fs/namei.c:2070 [inline]
> > > > >   path_openat+0x14f/0x3530 fs/namei.c:3532
> > > > >   do_filp_open+0x11e/0x1b0 fs/namei.c:3563
> > > > >   do_sys_open+0x3b3/0x4f0 fs/open.c:1089
> > > > >   __do_sys_open fs/open.c:1107 [inline]
> > > > >   __se_sys_open fs/open.c:1102 [inline]
> > > > >   __x64_sys_open+0x55/0x70 fs/open.c:1102
> > > > >   do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
> > > > >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > > >
> > > > > write to 0xffff8881255bb7fc of 4 bytes by task 7813 on cpu 1:
> > > > >   task_dump_owner+0xd8/0x260 fs/proc/base.c:1742
> > > > >   pid_update_inode+0x3c/0x70 fs/proc/base.c:1818
> > > > >   pid_revalidate+0x91/0xd0 fs/proc/base.c:1841
> > > > >   d_revalidate fs/namei.c:765 [inline]
> > > > >   d_revalidate fs/namei.c:762 [inline]
> > > > >   lookup_fast+0x7cb/0x7e0 fs/namei.c:1613
> > > > >   walk_component+0x6d/0xe80 fs/namei.c:1804
> > > > >   lookup_last fs/namei.c:2271 [inline]
> > > > >   path_lookupat.isra.0+0x13a/0x5a0 fs/namei.c:2316
> > > > >   filename_lookup+0x145/0x2d0 fs/namei.c:2346
> > > > >   user_path_at_empty+0x4c/0x70 fs/namei.c:2606
> > > > >   user_path_at include/linux/namei.h:60 [inline]
> > > > >   vfs_statx+0xd9/0x190 fs/stat.c:187
> > > > >   vfs_stat include/linux/fs.h:3188 [inline]
> > > > >   __do_sys_newstat+0x51/0xb0 fs/stat.c:341
> > > > >   __se_sys_newstat fs/stat.c:337 [inline]
> > > > >   __x64_sys_newstat+0x3a/0x50 fs/stat.c:337
> > > > >   do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
> > > > >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > > >
> > > > > Reported by Kernel Concurrency Sanitizer on:
> > > > > CPU: 1 PID: 7813 Comm: ps Not tainted 5.3.0+ #0
> > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > > > Google 01/01/2011
> > > > > ==================================================================
> > > >
> > > > My understanding is, that for every access to /proc/<pid>,
> > > > d_revalidate is called, and /proc-fs implementation simply says that
> > > > pid_revalidate always revalidates by rewriting uid/gid because "owning
> > > > task may have performed a setuid(), etc." presumably so every access
> > > > to a /proc/<pid> entry always has the right uid/gid (in effect
> > > > updating /proc/<pid> lazily via d_revalidate).
> > > >
> > > > Is it possible that one of the tasks above could be preempted after
> > > > doing its writes to *ruid/*rgid, another thread writing some other
> > > > values (after setuid / seteuid), and then the preempted thread seeing
> > > > the other values? Assertion here should never fail:
> > > > === TASK 1 ===
> > > > | seteuid(1000);
> > > > | seteuid(0);
> > > > | stat("/proc/<pid-of-task-1>", &fstat);
> > > > | assert(fstat.st_uid == 0);
> > > > === TASK 2 ===
> > > > | stat("/proc/<pid-of-task-1>", ...);
> > >
> > > Is it the same as
> > > pid_revalidate() snapshots (uid,gid) correctly
> > > but writeback is done in any order?
> >
> > Yes, I think so. Snapshot is done in RCU reader critical section, but
> > the writes can race with another thread. Is there logic that ensures
> > this doesn't lead to the observable outcome above?
>
>
> I found the case where this leads to an observable bug.
> common_perm_cond() in security/apparmor/lsm.c reads the inode uid and
> uses it for the security check:
>
> static int common_perm_cond(const char *op, const struct path *path, u32 mask)
> {
>       struct path_cond cond = { d_backing_inode(path->dentry)->i_uid,
>
> d_backing_inode(path->dentry)->i_mode
>       };
>
> Now consider the following test program:
>
> #define _GNU_SOURCE
> #include <stdio.h>
> #include <sys/stat.h>
> #include <sys/types.h>
> #include <unistd.h>
> #include <stdlib.h>
> #include <pthread.h>
>
> void *thr(void *arg)
> {
>         for (;;) {
>                 struct stat file_stat;
>                 stat((char*)arg, &file_stat);
>         }
>         return 0;
> }
>
> int main(int argc, char *argv[])
> {
>         char proc[32];
>         sprintf(proc, "/proc/%d", getpid());
>         printf("%s\n", proc);
>         pthread_t th;
>         pthread_create(&th, 0, thr, proc);
>         for (;;) {
>                 seteuid(1000);
>                 usleep(1);
>                 seteuid(0);
>                 struct stat file_stat;
>                 stat(proc, &file_stat);
>         }
>         return 0;
> }
>
> Whenever the main thread does stat, it must observe inode.uid == 0 in
> common_perm_cond().
>
> But since task_dump_owner() does writeback out of order, it can lead
> to non-linearizable executions and main thread observing inode.uid ==
> 1000.
> This in turn can lead to both false negatives and false positives from
> AppArmour (false denying access and falsely permitting access).
>
> I don't know how to setup actual AppArmour profile to do this, but I
> see this guide mentions "owner @{PROC}/[0-9]*" in a policy, so I
> assume it's possible:
> https://gitlab.com/apparmor/apparmor/wikis/Profiling_by_hand
>
> Instead, I added the following check to common_perm_cond() (it's
> dirty, but you get the idea):
>
> @@ -218,6 +218,15 @@ static int common_perm_cond(const char *op, const
> struct path *path, u32 mask)
>                                   d_backing_inode(path->dentry)->i_mode
>         };
> +       if (op == OP_GETATTR && mask == AA_MAY_GETATTR && cond.uid.val != 0) {
> +               char buf1[64], buf2[64];
> +               char *str = d_path(path, buf1, sizeof(buf1));
> +               sprintf(buf2, "/proc/%d", current->pid);
> +               if (!strcmp(str, buf2))
> +                       pr_err("common_perm_cond: path=%s pid=%d uid=%d\n",
> +                               str, current->pid, cond.uid.val);
> +       }
>
> Now when I run the program, I see how it fires every few seconds:
>
> # ./a.out
> /proc/1548
> [  123.233107] common_perm_cond: path=/proc/1548 pid=1548 uid=1000
> [  126.142869] common_perm_cond: path=/proc/1548 pid=1548 uid=1000
> [  127.048353] common_perm_cond: path=/proc/1548 pid=1548 uid=1000
> [  128.181873] common_perm_cond: path=/proc/1548 pid=1548 uid=1000
> [  128.557104] common_perm_cond: path=/proc/1548 pid=1548 uid=1000
> [  144.690774] common_perm_cond: path=/proc/1548 pid=1548 uid=1000
>
> Which means AppArmour acts based on the wrong UID. Obviously can lead
> to falsely denying access, but also falsely permitting access.
> Consider the following scenario.
> A process sets owner UID on a file so that a child process won't be
> able to access it, after that it starts the child process.
> common_perm_cond() in the child process should observe the new owner
> UID. However, if there a random other process simply doing stat() or
> something similar on the file, now the common_perm_cond() in the child
> can suddenly observe the old UID, which will be permitted by
> AppArmour. Boom!
>
> I've tried to apply "proc: fix inode uid/gid writeback race":
> https://lore.kernel.org/lkml/20191020173010.GA14744@avx2/
> but it does _not_ help because it does not really resolve the
> non-atomic snapshot and writeback of UID.

FTR here is the corresponding race report:
https://groups.google.com/d/msg/syzkaller-bugs/Cs07ly_Nmtg/avW672LrAQAJ

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-10-24  8:35 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-10-17 12:36 KCSAN: data-race in task_dump_owner / task_dump_owner syzbot
2019-10-17 12:56 ` Marco Elver
2019-10-17 18:17   ` Alexey Dobriyan
2019-10-17 18:32     ` Marco Elver
2019-10-23 17:06       ` Dmitry Vyukov
2019-10-24  8:35         ` Dmitry Vyukov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox