From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 20 Sep 2012 08:23:41 -0700
From: "Paul E. McKenney"
To: Sasha Levin
Cc: Michael Wang, Dave Jones, "linux-kernel@vger.kernel.org"
Subject: Re: RCU idle CPU detection is broken in linux-next
Message-ID: <20120920152341.GE2449@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <5050CCE0.4090403@gmail.com> <20120919153934.GB2455@linux.vnet.ibm.com> <5059F458.3000407@gmail.com> <20120919170648.GF2455@linux.vnet.ibm.com> <505AC6C8.9060706@linux.vnet.ibm.com> <505AC979.7000008@gmail.com>
In-Reply-To: <505AC979.7000008@gmail.com>

On Thu, Sep 20, 2012 at 09:44:57AM +0200, Sasha Levin wrote:
> On 09/20/2012 09:33 AM, Michael Wang wrote:
> > On 09/20/2012 01:06 AM, Paul E. McKenney wrote:
> >> On Wed, Sep 19, 2012 at 06:35:36PM +0200, Sasha Levin wrote:
> >>> On 09/19/2012 05:39 PM, Paul E. McKenney wrote:
> >>>> On Wed, Sep 12, 2012 at 07:56:48PM +0200, Sasha Levin wrote:
> >>>>>> Hi Paul,
> >>>>>>
> >>>>>> While fuzzing using trinity inside a KVM tools guest, I've managed to
> >>>>>> trigger "RCU used illegally from idle CPU!" warnings several times.
> >>>>>>
> >>>>>> There are a bunch of traces which seem to pop up at exactly the same
> >>>>>> time and from different places around the kernel.  Here are several
> >>>>>> of them:
> >>>>
> >>>> Hello, Sasha,
> >>>>
> >>>> OK, interesting.  Could you please try reproducing with the diagnostic
> >>>> patch shown below?
> >>>
> >>> Sure - here are the results (btw, it reproduces very easily):
> >>>
> >>> [   13.525119] ================================================
> >>> [   13.527165] [ BUG: lock held when returning to user space! ]
> >>> [   13.528752] 3.6.0-rc6-next-20120918-sasha-00002-g190c311-dirty #362 Tainted: G        W
> >>> [   13.531314] ------------------------------------------------
> >>> [   13.532918] init/1 is leaving the kernel with locks still held!
> >>> [   13.534574] 1 lock held by init/1:
> >>> [   13.535533]  #0:  (rcu_idle){.+.+..}, at: [] rcu_eqs_enter_common+0x1a0/0x9a0
> >>>
> >>> I'm basically seeing lots of the above, so I can't even get to the point
> >>> where I get the previous lockdep warnings.
> >>
> >> OK, that diagnostic patch was unhelpful.  Back to the drawing board...
> >
> > Maybe we could first make sure that cpu_idle() behaves properly?
> >
> > According to the log, RCU thinks the CPU is idle while the current pid
> > is not 0.  That could happen if something is broken in cpu_idle(), which
> > is very platform-dependent.
> >
> > So a first step might be to check, when the idle thread is switched out,
> > whether RCU still thinks the CPU is idle -- something like the patch below.
> >
> > Regards,
> > Michael Wang
> >
> > diff --git a/kernel/sched/idle_task.c b/kernel/sched/idle_task.c
> > index b6baf37..f8c7354 100644
> > --- a/kernel/sched/idle_task.c
> > +++ b/kernel/sched/idle_task.c
> > @@ -43,6 +43,7 @@ dequeue_task_idle(struct rq *rq, struct task_struct *p, int flags)
> >
> >  static void put_prev_task_idle(struct rq *rq, struct task_struct *prev)
> >  {
> > +	WARN_ON(rcu_is_cpu_idle());
> >  }
> >
> >  static void task_tick_idle(struct rq *rq, struct task_struct *curr, int queued)

> Looks like you're on to something -- with the small patch above applied:
>
> [   23.514223] ------------[ cut here ]------------
> [   23.515496] WARNING: at kernel/sched/idle_task.c:46 put_prev_task_idle+0x1e/0x30()
> [   23.517498] Pid: 0, comm: swapper/0 Tainted: G        W    3.6.0-rc6-next-20120919-sasha-00001-gb54aafe-dirty #366
> [   23.520393] Call Trace:
> [   23.521882]  [] ? put_prev_task_idle+0x1e/0x30
> [   23.524220]  [] warn_slowpath_common+0x86/0xb0
> [   23.524220]  [] warn_slowpath_null+0x15/0x20
> [   23.524220]  [] put_prev_task_idle+0x1e/0x30
> [   23.524220]  [] __schedule+0x25e/0x8f0
> [   23.524220]  [] ? tick_nohz_idle_exit+0x18d/0x1c0
> [   23.524220]  [] schedule+0x55/0x60
> [   23.524220]  [] cpu_idle+0x90/0x160
> [   23.524220]  [] rest_init+0x130/0x144
> [   23.524220]  [] ? csum_partial_copy_generic+0x16c/0x16c
> [   23.524220]  [] start_kernel+0x38d/0x39a
> [   23.524220]  [] ? repair_env_string+0x5e/0x5e
> [   23.524220]  [] x86_64_start_reservations+0x101/0x105
> [   23.524220]  [] x86_64_start_kernel+0x148/0x157
> [   23.524220] ---[ end trace 2c3061ab727afec2 ]---

It looks like someone is exiting the idle loop without telling RCU about
it.  Architectures are supposed to invoke rcu_idle_exit() before they
leave the idle loop.  This was in fact my guess yesterday, which is why
I tried to enlist lockdep's help -- forgetting that lockdep complains
about holding a lock when entering the idle loop.

A couple of possible things to try:

1.	Inspect the idle loop to see whether it can invoke schedule()
	without first invoking rcu_idle_exit().  This might happen
	indirectly, for example, via a call to mutex_lock().

2.	Bisect to see what caused the warning to appear -- perhaps someone
	put a mutex_lock() or some such into the idle loop without
	protecting it with rcu_idle_exit() or RCU_NONIDLE().

Seem reasonable?

							Thanx, Paul
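
P.S.  For reference, the bracketing that architectures are supposed to
provide looks roughly like the sketch below.  This is a minimal sketch
loosely modeled on the 3.6-era x86 cpu_idle() loop, not any particular
architecture's actual code: arch_safe_halt() stands in for whatever
low-power wait the architecture really uses, and the trace call in the
comment is just one example of an RCU user that would need RCU_NONIDLE().

	void cpu_idle(void)
	{
		while (1) {
			tick_nohz_idle_enter();

			while (!need_resched()) {
				rcu_idle_enter();	/* RCU stops watching this CPU. */

				/*
				 * Anything in here that uses RCU (tracing,
				 * mutex_lock(), ...) must be wrapped so that
				 * RCU watches again, for example:
				 *
				 *	RCU_NONIDLE(trace_cpu_idle(1, smp_processor_id()));
				 */
				arch_safe_halt();	/* illustrative low-power wait */

				rcu_idle_exit();	/* RCU watches this CPU again. */
			}

			tick_nohz_idle_exit();

			/* Anything that can schedule must run -after- rcu_idle_exit(). */
			schedule_preempt_disabled();
		}
	}

If schedule() can be reached between rcu_idle_enter() and rcu_idle_exit()
-- even indirectly, say via mutex_lock() -- then the scheduler runs while
RCU still believes the CPU to be idle, which produces exactly the sort of
splat shown above.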