From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Paul E. McKenney" Subject: Re: [PATCH v2 6/8] kvm tools: Add rwlock wrapper Date: Fri, 3 Jun 2011 14:20:28 -0700 Message-ID: <20110603212028.GS2333@linux.vnet.ibm.com> References: <20110530114949.GD22324@elte.hu> <4DE387DA.3020307@redhat.com> <20110530123602.GK22324@elte.hu> <1307086079.13088.8.camel@lappy> <20110603073427.GF15375@elte.hu> <1307087659.13088.10.camel@lappy> <20110603193157.GO2333@linux.vnet.ibm.com> <1307130980.2811.4.camel@lappy> <20110603202211.GP2333@linux.vnet.ibm.com> <1307135039.2811.6.camel@lappy> Reply-To: paulmck@linux.vnet.ibm.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Ingo Molnar , Yinghai Lu , Avi Kivity , Takuya Yoshikawa , Pekka Enberg , kvm@vger.kernel.org, asias.hejun@gmail.com, gorcunov@gmail.com, prasadjoshi124@gmail.com, takuya.yoshikawa@gmail.com To: Sasha Levin Return-path: Received: from e7.ny.us.ibm.com ([32.97.182.137]:44632 "EHLO e7.ny.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752478Ab1FCVUf (ORCPT ); Fri, 3 Jun 2011 17:20:35 -0400 Received: from d01relay07.pok.ibm.com (d01relay07.pok.ibm.com [9.56.227.147]) by e7.ny.us.ibm.com (8.14.4/8.13.1) with ESMTP id p53Kuj7B029254 for ; Fri, 3 Jun 2011 16:56:45 -0400 Received: from d01av01.pok.ibm.com (d01av01.pok.ibm.com [9.56.224.215]) by d01relay07.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id p53LKUt31429626 for ; Fri, 3 Jun 2011 17:20:30 -0400 Received: from d01av01.pok.ibm.com (loopback [127.0.0.1]) by d01av01.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p53LKTmM001370 for ; Fri, 3 Jun 2011 17:20:30 -0400 Content-Disposition: inline In-Reply-To: <1307135039.2811.6.camel@lappy> Sender: kvm-owner@vger.kernel.org List-ID: On Sat, Jun 04, 2011 at 12:03:59AM +0300, Sasha Levin wrote: > On Fri, 2011-06-03 at 13:22 -0700, Paul E. McKenney wrote: > > On Fri, Jun 03, 2011 at 10:56:20PM +0300, Sasha Levin wrote: > > > On Fri, 2011-06-03 at 12:31 -0700, Paul E. McKenney wrote: > > > > On Fri, Jun 03, 2011 at 10:54:19AM +0300, Sasha Levin wrote: > > > > > On Fri, 2011-06-03 at 09:34 +0200, Ingo Molnar wrote: > > > > > > * Sasha Levin wrote: > > > > > > > > > > > > > > with no apparent progress being made. > > > > > > > > > > > > > > Since it's something that worked in 2.6.37, I've looked into it to > > > > > > > find what might have caused this issue. > > > > > > > > > > > > > > I've bisected guest kernels and found that the problem starts with: > > > > > > > > > > > > > > a26ac2455ffcf3be5c6ef92bc6df7182700f2114 is the first bad commit > > > > > > > commit a26ac2455ffcf3be5c6ef92bc6df7182700f2114 > > > > > > > Author: Paul E. McKenney > > > > > > > Date: Wed Jan 12 14:10:23 2011 -0800 > > > > > > > > > > > > > > rcu: move TREE_RCU from softirq to kthread > > > > > > > > > > > > > > Ingo, could you confirm that the problem goes away for you when you > > > > > > > use an earlier commit? > > > > > > > > > > > > testing will have to wait, but there's a recent upstream fix: > > > > > > > > > > > > d72bce0e67e8: rcu: Cure load woes > > > > > > > > > > > > That *might* perhaps address this problem too. > > > > > > > > > > > I've re-tested with Linus's current git, the problem is still there. > > > > > > > > > > > If not then this appears to be some sort of RCU related livelock with > > > > > > brutally overcommitted vcpus. On native this would show up too, in a > > > > > > less drastic form, as a spurious bootup delay. > > > > > > > > > > I don't think it was overcommited by *that* much. With that commit it > > > > > usually hangs at 20-40 vcpus, while without it I can go up to 255. > > > > > > > > Here is a diagnostic patch, untested. It assumes that your system > > > > has only a few CPUs (maybe 8-16) and that timers are still running. > > > > It dumps out some RCU state if grace periods extend for more than > > > > a few seconds. > > > > > > > > To activate it, call rcu_diag_timer_start() from process context. > > > > To stop it, call rcu_diag_timer_stop(), also from process context. > > > > > > Since the hang happens in guest kernel very early during boot, I can't > > > call rcu_diag_timer_start(). What would be a good place to put the > > > _start() code instead? > > > > Assuming that the failure occurs in the host OS rather than in the guest > > OS, I suggest placing rcu_diag_timer_start() in the host code that starts > > up the guest. > > > > On the other hand, if the failure is occuring in the guest OS, then > > I suggest placing the call to rcu_diag_timer_start() just after timer > > initialization -- that is, assuming that interrupts are enabled at the > > time of the failure. If interrupts are not yet enabled at the time of > > the failure, color me clueless. > > It happens in the guest OS, the bisection was done on a guest kernel. > > I've placed the rcu debug _start() right at the end of init_timers() > which happens before the hang, but I'm not seeing any output from the > debug function. Sounds like the hang is happening before timers start? Are scheduling clock interrupts happening in the guest, as in scheduler_tick()? If so, I could just as easily put the debug there. Thanx, Paul