Date: Mon, 11 Oct 2010 21:20:30 -0700
From: "Paul E. McKenney"
To: Greg Thelen
Cc: linux-kernel@vger.kernel.org
Subject: Re: INFO: suspicious rcu_dereference_check() usage - kernel/sched.c:618 invoked rcu_dereference_check() without protection!
Message-ID: <20101012042030.GB2496@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com

On Mon, Oct 11, 2010 at 06:19:55PM -0700, Greg Thelen wrote:
> I reliably see an rcu_dereference_check() failure with v2.6.36-rc7 in
> a 512MiB VM.  I would be happy to test out proposed patches to this
> issue.

Hello, Greg,

Commit 6506cf6ce68 in my -rcu tree should address this.

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git rcu/next

Please see below for a patch against tip/core/rcu that gathers up the four
commits.

> [    0.036082] lockdep: fixing up alternatives.
> [    0.037184]
> [    0.037185] ===================================================
> [    0.037999] [ INFO: suspicious rcu_dereference_check() usage. ]
> [    0.037999] ---------------------------------------------------
> [    0.037999] kernel/sched.c:618 invoked rcu_dereference_check() without protection!
> [    0.037999]
> [    0.037999] other info that might help us debug this:
> [    0.037999]
> [    0.037999]
> [    0.037999] rcu_scheduler_active = 1, debug_locks = 0
> [    0.037999] 3 locks held by kworker/0:0/4:
> [    0.037999]  #0:  (events){+.+.+.}, at: [] process_one_work+0x195/0x422
> [    0.037999]  #1:  ((&c_idle.work)){+.+.+.}, at: [] process_one_work+0x195/0x422
> [    0.037999]  #2:  (&rq->lock){-.-...}, at: [] init_idle+0x2b/0x114
> [    0.037999]
> [    0.037999] stack backtrace:
> [    0.037999] Pid: 4, comm: kworker/0:0 Not tainted 2.6.36-rc7 #1
> [    0.037999] Call Trace:
> [    0.037999]  [] lockdep_rcu_dereference+0xaa/0xb2
> [    0.037999]  [] task_group+0x7b/0x8b
> [    0.037999]  [] set_task_rq+0x15/0x40
> [    0.037999]  [] init_idle+0xd1/0x114
> [    0.037999]  [] fork_idle+0xb8/0xc9
> [    0.037999]  [] ? check_preempt_wakeup+0xf0/0x177
> [    0.037999]  [] do_fork_idle+0x17/0x28
> [    0.037999]  [] process_one_work+0x265/0x422
> [    0.037999]  [] ? process_one_work+0x195/0x422
> [    0.037999]  [] ? wake_up_process+0x10/0x12
> [    0.037999]  [] ? manage_workers+0x106/0x191
> [    0.037999]  [] worker_thread+0x136/0x24c
> [    0.037999]  [] ? worker_thread+0x0/0x24c
> [    0.037999]  [] kthread+0x7d/0x85
> [    0.037999]  [] kernel_thread_helper+0x4/0x10
> [    0.037999]  [] ? restore_args+0x0/0x30
> [    0.037999]  [] ? kthread+0x0/0x85
> [    0.037999]  [] ? kernel_thread_helper+0x0/0x10
>
> Below is the .config, which was generated from:
>   $ make defconfig
>   $ make menuconfig
>     - enable CONFIG_SPINLOCK_SLEEP
>     - enable CONFIG_PREEMPT
>     - enable CONFIG_PROVE_LOCKING
>     - enable CONFIG_PROVE_RCU

Please let me know how it goes!

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index e750735..ccdc04c 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -545,9 +545,9 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)

 	if (rcu_cpu_stall_suppress)
 		return;
-	delta = jiffies - rsp->jiffies_stall;
+	delta = jiffies - ACCESS_ONCE(rsp->jiffies_stall);
 	rnp = rdp->mynode;
-	if ((rnp->qsmask & rdp->grpmask) && delta >= 0) {
+	if ((ACCESS_ONCE(rnp->qsmask) & rdp->grpmask) && delta >= 0) {

 		/* We haven't checked in, so go dump stack. */
 		print_cpu_stall(rsp);
diff --git a/kernel/sched.c b/kernel/sched.c
index dc85ceb..ae8f75a 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -5337,7 +5337,19 @@ void __cpuinit init_idle(struct task_struct *idle, int cpu)
 	idle->se.exec_start = sched_clock();

 	cpumask_copy(&idle->cpus_allowed, cpumask_of(cpu));
+	/*
+	 * We're having a chicken and egg problem, even though we are
+	 * holding rq->lock, the cpu isn't yet set to this cpu so the
+	 * lockdep check in task_group() will fail.
+	 *
+	 * Similar case to sched_fork(). / Alternatively we could
+	 * use task_rq_lock() here and obtain the other rq->lock.
+	 *
+	 * Silence PROVE_RCU
+	 */
+	rcu_read_lock();
 	__set_task_cpu(idle, cpu);
+	rcu_read_unlock();

 	rq->curr = rq->idle = idle;
 #if defined(CONFIG_SMP) && defined(__ARCH_WANT_UNLOCKED_CTXSW)
diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index db3f674..5f996d3 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -3751,8 +3751,11 @@ static void task_fork_fair(struct task_struct *p)

 	update_rq_clock(rq);

-	if (unlikely(task_cpu(p) != this_cpu))
+	if (unlikely(task_cpu(p) != this_cpu)) {
+		rcu_read_lock();
 		__set_task_cpu(p, this_cpu);
+		rcu_read_unlock();
+	}

 	update_curr(cfs_rq);

diff --git a/net/core/sock.c b/net/core/sock.c
index ef30e9d..7d99e13 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1078,8 +1078,11 @@ static void sk_prot_free(struct proto *prot, struct sock *sk)
 #ifdef CONFIG_CGROUPS
 void sock_update_classid(struct sock *sk)
 {
-	u32 classid = task_cls_classid(current);
+	u32 classid;

+	rcu_read_lock();  /* doing current task, which cannot vanish. */
+	classid = task_cls_classid(current);
+	rcu_read_unlock();
 	if (classid && classid != sk->sk_classid)
 		sk->sk_classid = classid;
 }