linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* 2.6.35+vfs-scale: INFO: suspicious rcu_dereference_check() usage (kernel/exit.c:1387)
@ 2010-08-16 16:57 Vegard Nossum
  2010-08-16 17:01 ` Vegard Nossum
  0 siblings, 1 reply; 6+ messages in thread
From: Vegard Nossum @ 2010-08-16 16:57 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Peter Zijlstra, David Howells, Paul E. McKenney, LKML

Hi,

I was testing your VFS scaling tree*, but I get the following in my dmesg:

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
kernel/exit.c:1387 invoked rcu_dereference_check() without protection!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 1
2 locks held by bash/2878:
 #0:  (tasklist_lock){.+.+..}, at: [<ffffffff8103a275>] do_wait+0xb3/0x200
 #1:  (&(&sighand->siglock)->rlock){-.....}, at: [<ffffffff810399e9>]
wait_consider_task+0x2ca/0xaa3

stack backtrace:
Pid: 2878, comm: bash Not tainted 2.6.35-00079-g4067b85 #7
Call Trace:
 [<ffffffff8105e9f2>] lockdep_rcu_dereference+0x9b/0xa6
 [<ffffffff8103a036>] wait_consider_task+0x917/0xaa3
 [<ffffffff8102cf36>] ? add_preempt_count+0xc2/0xc9
 [<ffffffff8103a29d>] do_wait+0xdb/0x200
 [<ffffffff8103a45b>] sys_wait4+0x99/0xd3
 [<ffffffff810387e5>] ? child_wait_callback+0x0/0x5d
 [<ffffffff8109be66>] ? might_fault+0x96/0xbc
 [<ffffffff810737a6>] compat_sys_wait4+0x8a/0xc1
 [<ffffffff8109be83>] ? might_fault+0xb3/0xbc
 [<ffffffff8109be38>] ? might_fault+0x68/0xbc
 [<ffffffff815d02c3>] ? trace_hardirqs_off_thunk+0x3a/0x3c
 [<ffffffff8102862c>] sys32_waitpid+0xb/0xd
 [<ffffffff81027bb4>] sysenter_dispatch+0x7/0x30
 [<ffffffff815d0284>] ? trace_hardirqs_on_thunk+0x3a/0x3f

I don't think it's related to the VFS patches, however, so I added
some other (maybe relevant) CCs too.

At the time I was simply invoking gcc from bash.


Vegard

* git://git.kernel.org/pub/scm/linux/kernel/git/npiggin/linux-npiggin.git
branch vfs-scale

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 2.6.35+vfs-scale: INFO: suspicious rcu_dereference_check() usage (kernel/exit.c:1387)
  2010-08-16 16:57 2.6.35+vfs-scale: INFO: suspicious rcu_dereference_check() usage (kernel/exit.c:1387) Vegard Nossum
@ 2010-08-16 17:01 ` Vegard Nossum
  2010-08-16 17:03   ` Vegard Nossum
  0 siblings, 1 reply; 6+ messages in thread
From: Vegard Nossum @ 2010-08-16 17:01 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Peter Zijlstra, David Howells, Paul E. McKenney, LKML

Nick Piggin might update his git config because the commits in the
vfs-scale tree appear to bear an invalid e-mail address. Please reply
to this e-mail instead. Thanks.

Vegard


---------- Forwarded message ----------
From: Vegard Nossum <vegard.nossum@gmail.com>
Date: 16 August 2010 18:57
Subject: 2.6.35+vfs-scale: INFO: suspicious rcu_dereference_check()
usage (kernel/exit.c:1387)
To: Nick Piggin <npiggin@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>, David Howells
<dhowells@redhat.com>, "Paul E. McKenney"
<paulmck@linux.vnet.ibm.com>, LKML <linux-kernel@vger.kernel.org>


Hi,

I was testing your VFS scaling tree*, but I get the following in my dmesg:

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
kernel/exit.c:1387 invoked rcu_dereference_check() without protection!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 1
2 locks held by bash/2878:
 #0:  (tasklist_lock){.+.+..}, at: [<ffffffff8103a275>] do_wait+0xb3/0x200
 #1:  (&(&sighand->siglock)->rlock){-.....}, at: [<ffffffff810399e9>]
wait_consider_task+0x2ca/0xaa3

stack backtrace:
Pid: 2878, comm: bash Not tainted 2.6.35-00079-g4067b85 #7
Call Trace:
 [<ffffffff8105e9f2>] lockdep_rcu_dereference+0x9b/0xa6
 [<ffffffff8103a036>] wait_consider_task+0x917/0xaa3
 [<ffffffff8102cf36>] ? add_preempt_count+0xc2/0xc9
 [<ffffffff8103a29d>] do_wait+0xdb/0x200
 [<ffffffff8103a45b>] sys_wait4+0x99/0xd3
 [<ffffffff810387e5>] ? child_wait_callback+0x0/0x5d
 [<ffffffff8109be66>] ? might_fault+0x96/0xbc
 [<ffffffff810737a6>] compat_sys_wait4+0x8a/0xc1
 [<ffffffff8109be83>] ? might_fault+0xb3/0xbc
 [<ffffffff8109be38>] ? might_fault+0x68/0xbc
 [<ffffffff815d02c3>] ? trace_hardirqs_off_thunk+0x3a/0x3c
 [<ffffffff8102862c>] sys32_waitpid+0xb/0xd
 [<ffffffff81027bb4>] sysenter_dispatch+0x7/0x30
 [<ffffffff815d0284>] ? trace_hardirqs_on_thunk+0x3a/0x3f

I don't think it's related to the VFS patches, however, so I added
some other (maybe relevant) CCs too.

At the time I was simply invoking gcc from bash.


Vegard

* git://git.kernel.org/pub/scm/linux/kernel/git/npiggin/linux-npiggin.git
branch vfs-scale

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 2.6.35+vfs-scale: INFO: suspicious rcu_dereference_check() usage (kernel/exit.c:1387)
  2010-08-16 17:01 ` Vegard Nossum
@ 2010-08-16 17:03   ` Vegard Nossum
  2010-08-16 20:01     ` David Howells
  2010-08-16 20:21     ` Paul E. McKenney
  0 siblings, 2 replies; 6+ messages in thread
From: Vegard Nossum @ 2010-08-16 17:03 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Peter Zijlstra, David Howells, Paul E. McKenney, LKML

On the third try it will hopefully work.


Vegard

---------- Forwarded message ----------
From: Vegard Nossum <vegard.nossum@gmail.com>
Date: 16 August 2010 19:01
Subject: Re: 2.6.35+vfs-scale: INFO: suspicious
rcu_dereference_check() usage (kernel/exit.c:1387)
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>, David Howells
<dhowells@redhat.com>, "Paul E. McKenney"
<paulmck@linux.vnet.ibm.com>, LKML <linux-kernel@vger.kernel.org>


Nick Piggin might update his git config because the commits in the
vfs-scale tree appear to bear an invalid e-mail address. Please reply
to this e-mail instead. Thanks.

Vegard


---------- Forwarded message ----------
From: Vegard Nossum <vegard.nossum@gmail.com>
Date: 16 August 2010 18:57
Subject: 2.6.35+vfs-scale: INFO: suspicious rcu_dereference_check()
usage (kernel/exit.c:1387)
To: Nick Piggin <npiggin@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>, David Howells
<dhowells@redhat.com>, "Paul E. McKenney"
<paulmck@linux.vnet.ibm.com>, LKML <linux-kernel@vger.kernel.org>


Hi,

I was testing your VFS scaling tree*, but I get the following in my dmesg:

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
kernel/exit.c:1387 invoked rcu_dereference_check() without protection!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 1
2 locks held by bash/2878:
 #0:  (tasklist_lock){.+.+..}, at: [<ffffffff8103a275>] do_wait+0xb3/0x200
 #1:  (&(&sighand->siglock)->rlock){-.....}, at: [<ffffffff810399e9>]
wait_consider_task+0x2ca/0xaa3

stack backtrace:
Pid: 2878, comm: bash Not tainted 2.6.35-00079-g4067b85 #7
Call Trace:
 [<ffffffff8105e9f2>] lockdep_rcu_dereference+0x9b/0xa6
 [<ffffffff8103a036>] wait_consider_task+0x917/0xaa3
 [<ffffffff8102cf36>] ? add_preempt_count+0xc2/0xc9
 [<ffffffff8103a29d>] do_wait+0xdb/0x200
 [<ffffffff8103a45b>] sys_wait4+0x99/0xd3
 [<ffffffff810387e5>] ? child_wait_callback+0x0/0x5d
 [<ffffffff8109be66>] ? might_fault+0x96/0xbc
 [<ffffffff810737a6>] compat_sys_wait4+0x8a/0xc1
 [<ffffffff8109be83>] ? might_fault+0xb3/0xbc
 [<ffffffff8109be38>] ? might_fault+0x68/0xbc
 [<ffffffff815d02c3>] ? trace_hardirqs_off_thunk+0x3a/0x3c
 [<ffffffff8102862c>] sys32_waitpid+0xb/0xd
 [<ffffffff81027bb4>] sysenter_dispatch+0x7/0x30
 [<ffffffff815d0284>] ? trace_hardirqs_on_thunk+0x3a/0x3f

I don't think it's related to the VFS patches, however, so I added
some other (maybe relevant) CCs too.

At the time I was simply invoking gcc from bash.


Vegard

* git://git.kernel.org/pub/scm/linux/kernel/git/npiggin/linux-npiggin.git
branch vfs-scale

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 2.6.35+vfs-scale: INFO: suspicious rcu_dereference_check() usage (kernel/exit.c:1387)
  2010-08-16 17:03   ` Vegard Nossum
@ 2010-08-16 20:01     ` David Howells
  2010-08-17 18:09       ` Oleg Nesterov
  2010-08-16 20:21     ` Paul E. McKenney
  1 sibling, 1 reply; 6+ messages in thread
From: David Howells @ 2010-08-16 20:01 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: dhowells, Oleg Nesterov, Nick Piggin, Peter Zijlstra,
	Paul E. McKenney, LKML


Vegard Nossum <vegard.nossum@gmail.com> wrote:

>  #0:  (tasklist_lock){.+.+..}, at: [<ffffffff8103a275>] do_wait+0xb3/0x200
>  #1:  (&(&sighand->siglock)->rlock){-.....}, at: [<ffffffff810399e9>]
> wait_consider_task+0x2ca/0xaa3
> 
> stack backtrace:
> Pid: 2878, comm: bash Not tainted 2.6.35-00079-g4067b85 #7
> Call Trace:
>  [<ffffffff8105e9f2>] lockdep_rcu_dereference+0x9b/0xa6
>  [<ffffffff8103a036>] wait_consider_task+0x917/0xaa3

The attached patch should fix this.

David
---
From: Daniel J Blueman <daniel.blueman@gmail.com>
Subject: [PATCH] Fix unprotected access to task credentials in waitid()

Using a program like the following:

	#include <stdlib.h>
	#include <unistd.h>
	#include <sys/types.h>
	#include <sys/wait.h>

	int main() {
		id_t id;
		siginfo_t infop;
		pid_t res;

		id = fork();
		if (id == 0) { sleep(1); exit(0); }
		kill(id, SIGSTOP);
		alarm(1);
		waitid(P_PID, id, &infop, WCONTINUED);
		return 0;
	}

to call waitid() on a stopped process results in access to the child task's
credentials without the RCU read lock being held - which may be replaced in the
meantime - eliciting the following warning:

	===================================================
	[ INFO: suspicious rcu_dereference_check() usage. ]
	---------------------------------------------------
	kernel/exit.c:1460 invoked rcu_dereference_check() without protection!

	other info that might help us debug this:

	rcu_scheduler_active = 1, debug_locks = 1
	2 locks held by waitid02/22252:
	 #0:  (tasklist_lock){.?.?..}, at: [<ffffffff81061ce5>] do_wait+0xc5/0x310
	 #1:  (&(&sighand->siglock)->rlock){-.-...}, at: [<ffffffff810611da>]
	wait_consider_task+0x19a/0xbe0

	stack backtrace:
	Pid: 22252, comm: waitid02 Not tainted 2.6.35-323cd+ #3
	Call Trace:
	 [<ffffffff81095da4>] lockdep_rcu_dereference+0xa4/0xc0
	 [<ffffffff81061b31>] wait_consider_task+0xaf1/0xbe0
	 [<ffffffff81061d15>] do_wait+0xf5/0x310
	 [<ffffffff810620b6>] sys_waitid+0x86/0x1f0
	 [<ffffffff8105fce0>] ? child_wait_callback+0x0/0x70
	 [<ffffffff81003282>] system_call_fastpath+0x16/0x1b

This is fixed by holding the RCU read lock in wait_task_continued() to ensure
that the task's current credentials aren't destroyed between us reading the
cred pointer and us reading the UID from those credentials.

Furthermore, protect wait_task_stopped() in the same way.

We don't need to keep holding the RCU read lock once we've read the UID from
the credentials as holding the RCU read lock doesn't stop the target task from
changing its creds under us - so the credentials may be outdated immediately
after we've read the pointer, lock or no lock.

Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

 kernel/exit.c |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)


diff --git a/kernel/exit.c b/kernel/exit.c
index 671ed56..0312022 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1386,8 +1386,7 @@ static int wait_task_stopped(struct wait_opts *wo,
 	if (!unlikely(wo->wo_flags & WNOWAIT))
 		*p_code = 0;
 
-	/* don't need the RCU readlock here as we're holding a spinlock */
-	uid = __task_cred(p)->uid;
+	uid = task_uid(p);
 unlock_sig:
 	spin_unlock_irq(&p->sighand->siglock);
 	if (!exit_code)
@@ -1460,7 +1459,7 @@ static int wait_task_continued(struct wait_opts *wo, struct task_struct *p)
 	}
 	if (!unlikely(wo->wo_flags & WNOWAIT))
 		p->signal->flags &= ~SIGNAL_STOP_CONTINUED;
-	uid = __task_cred(p)->uid;
+	uid = task_uid(p);
 	spin_unlock_irq(&p->sighand->siglock);
 
 	pid = task_pid_vnr(p);

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: 2.6.35+vfs-scale: INFO: suspicious rcu_dereference_check() usage (kernel/exit.c:1387)
  2010-08-16 17:03   ` Vegard Nossum
  2010-08-16 20:01     ` David Howells
@ 2010-08-16 20:21     ` Paul E. McKenney
  1 sibling, 0 replies; 6+ messages in thread
From: Paul E. McKenney @ 2010-08-16 20:21 UTC (permalink / raw)
  To: Vegard Nossum; +Cc: Nick Piggin, Peter Zijlstra, David Howells, LKML

On Mon, Aug 16, 2010 at 07:03:40PM +0200, Vegard Nossum wrote:
> On the third try it will hopefully work.
> 
> 
> Vegard
> 
> ---------- Forwarded message ----------
> From: Vegard Nossum <vegard.nossum@gmail.com>
> Date: 16 August 2010 19:01
> Subject: Re: 2.6.35+vfs-scale: INFO: suspicious
> rcu_dereference_check() usage (kernel/exit.c:1387)
> To: Nick Piggin <nickpiggin@yahoo.com.au>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>, David Howells
> <dhowells@redhat.com>, "Paul E. McKenney"
> <paulmck@linux.vnet.ibm.com>, LKML <linux-kernel@vger.kernel.org>
> 
> 
> Nick Piggin might update his git config because the commits in the
> vfs-scale tree appear to bear an invalid e-mail address. Please reply
> to this e-mail instead. Thanks.
> 
> Vegard
> 
> 
> ---------- Forwarded message ----------
> From: Vegard Nossum <vegard.nossum@gmail.com>
> Date: 16 August 2010 18:57
> Subject: 2.6.35+vfs-scale: INFO: suspicious rcu_dereference_check()
> usage (kernel/exit.c:1387)
> To: Nick Piggin <npiggin@suse.de>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>, David Howells
> <dhowells@redhat.com>, "Paul E. McKenney"
> <paulmck@linux.vnet.ibm.com>, LKML <linux-kernel@vger.kernel.org>
> 
> 
> Hi,
> 
> I was testing your VFS scaling tree*, but I get the following in my dmesg:
> 
> ===================================================
> [ INFO: suspicious rcu_dereference_check() usage. ]
> ---------------------------------------------------
> kernel/exit.c:1387 invoked rcu_dereference_check() without protection!
> 
> other info that might help us debug this:
> 
> 
> rcu_scheduler_active = 1, debug_locks = 1
> 2 locks held by bash/2878:
>  #0:  (tasklist_lock){.+.+..}, at: [<ffffffff8103a275>] do_wait+0xb3/0x200
>  #1:  (&(&sighand->siglock)->rlock){-.....}, at: [<ffffffff810399e9>]
> wait_consider_task+0x2ca/0xaa3

Thank you for locating this!

OK, so we aren't holding rcu_read_lock(), nor is this task dead, so
something needs some help.

We get from wait_consider_task() as follows: wait_consider_task()
calls security_task_wait(), which calls either selinux_task_wait() or
smack_task_wait().  Let's start with selinux_task_wait(), which calls
task_has_perm(), which does a couple of __task_cred(), but does so
under rcu_read_lock() protection, so presumably we didn't come in
that way.

Dave, am I missing a code path somewhere here?

							Thanx, Paul

> stack backtrace:
> Pid: 2878, comm: bash Not tainted 2.6.35-00079-g4067b85 #7
> Call Trace:
>  [<ffffffff8105e9f2>] lockdep_rcu_dereference+0x9b/0xa6
>  [<ffffffff8103a036>] wait_consider_task+0x917/0xaa3
>  [<ffffffff8102cf36>] ? add_preempt_count+0xc2/0xc9
>  [<ffffffff8103a29d>] do_wait+0xdb/0x200
>  [<ffffffff8103a45b>] sys_wait4+0x99/0xd3
>  [<ffffffff810387e5>] ? child_wait_callback+0x0/0x5d
>  [<ffffffff8109be66>] ? might_fault+0x96/0xbc
>  [<ffffffff810737a6>] compat_sys_wait4+0x8a/0xc1
>  [<ffffffff8109be83>] ? might_fault+0xb3/0xbc
>  [<ffffffff8109be38>] ? might_fault+0x68/0xbc
>  [<ffffffff815d02c3>] ? trace_hardirqs_off_thunk+0x3a/0x3c
>  [<ffffffff8102862c>] sys32_waitpid+0xb/0xd
>  [<ffffffff81027bb4>] sysenter_dispatch+0x7/0x30
>  [<ffffffff815d0284>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> 
> I don't think it's related to the VFS patches, however, so I added
> some other (maybe relevant) CCs too.
> 
> At the time I was simply invoking gcc from bash.
> 
> 
> Vegard
> 
> * git://git.kernel.org/pub/scm/linux/kernel/git/npiggin/linux-npiggin.git
> branch vfs-scale

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 2.6.35+vfs-scale: INFO: suspicious rcu_dereference_check() usage (kernel/exit.c:1387)
  2010-08-16 20:01     ` David Howells
@ 2010-08-17 18:09       ` Oleg Nesterov
  0 siblings, 0 replies; 6+ messages in thread
From: Oleg Nesterov @ 2010-08-17 18:09 UTC (permalink / raw)
  To: David Howells
  Cc: Vegard Nossum, Nick Piggin, Peter Zijlstra, Paul E. McKenney,
	LKML

On 08/16, David Howells wrote:
>
> Vegard Nossum <vegard.nossum@gmail.com> wrote:
>
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -1386,8 +1386,7 @@ static int wait_task_stopped(struct wait_opts *wo,
>  	if (!unlikely(wo->wo_flags & WNOWAIT))
>  		*p_code = 0;
>
> -	/* don't need the RCU readlock here as we're holding a spinlock */
> -	uid = __task_cred(p)->uid;
> +	uid = task_uid(p);
>  unlock_sig:
>  	spin_unlock_irq(&p->sighand->siglock);
>  	if (!exit_code)
> @@ -1460,7 +1459,7 @@ static int wait_task_continued(struct wait_opts *wo, struct task_struct *p)
>  	}
>  	if (!unlikely(wo->wo_flags & WNOWAIT))
>  		p->signal->flags &= ~SIGNAL_STOP_CONTINUED;
> -	uid = __task_cred(p)->uid;
> +	uid = task_uid(p);
>  	spin_unlock_irq(&p->sighand->siglock);
>
>  	pid = task_pid_vnr(p);

Acked-by: Oleg Nesterov <oleg@redhat.com>


Vegard, perhaps you can take care of wait_task_zombie() too?
Of course, if you wish.

Like wait_task_stopped(), it doesn't really need RCU lock. But if
we change it to use task_uid() to shut up rcu_dereference_check(),
then we can kill task_is_dead().

Oleg.


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2010-08-17 18:12 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-08-16 16:57 2.6.35+vfs-scale: INFO: suspicious rcu_dereference_check() usage (kernel/exit.c:1387) Vegard Nossum
2010-08-16 17:01 ` Vegard Nossum
2010-08-16 17:03   ` Vegard Nossum
2010-08-16 20:01     ` David Howells
2010-08-17 18:09       ` Oleg Nesterov
2010-08-16 20:21     ` Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).