From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758093Ab1GKQZI (ORCPT ); Mon, 11 Jul 2011 12:25:08 -0400
Received: from rcsinet15.oracle.com ([148.87.113.117]:59932 "EHLO rcsinet15.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757945Ab1GKQZG convert rfc822-to-8bit (ORCPT ); Mon, 11 Jul 2011 12:25:06 -0400
Date: Mon, 11 Jul 2011 12:24:51 -0400
From: Konrad Rzeszutek Wilk
To: "Paul E. McKenney" , xen-devel@lists.xensource.com
Cc: julie Sullivan , linux-kernel@vger.kernel.org
Subject: Re: PROBLEM: 3.0-rc kernels unbootable since -rc3
Message-ID: <20110711162450.GA22913@dumpdata.com>
References: <20110709100900.GB6014@linux.vnet.ibm.com> <20110710032510.GG6014@linux.vnet.ibm.com> <20110710171626.GK6014@linux.vnet.ibm.com> <20110710173530.GA16954@linux.vnet.ibm.com> <20110710214639.GP6014@linux.vnet.ibm.com> <20110710231449.GQ6014@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
In-Reply-To: <20110710231449.GQ6014@linux.vnet.ibm.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Content-Transfer-Encoding: 8BIT
X-Source-IP: rtcsinet21.oracle.com [66.248.204.29]
X-CT-RefId: str=0001.0A090203.4E1B23DD.012B:SCFSTAT5015188,ss=1,re=-4.000,fgs=0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Jul 10, 2011 at 04:14:49PM -0700, Paul E. McKenney wrote:
> On Sun, Jul 10, 2011 at 10:50:48PM +0100, julie Sullivan wrote:
> > > Very cool!  Thank you very much for the testing --

.. snip..
> And here is what I am proposing sending upstream. I have your Tested-by,

Hey Paul,

I am hitting a similar bug.

Starting udev Kernel Device Manager...
Starting Configure read-only root support...
[ 79.942067] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 2, t=60002 jiffies)
[ 79.942089] sending NMI to all CPUs:

This happens when running 3.0-rc6 under Xen as a 32-bit guest (I don't see
this issue when running a 64-bit guest) and when the guest has more than
two CPUs.

I've tried the patch below against 3.0-rc6 and it did not fix the issue.
I've also tried 3.0-rc3, since somewhere in the thread one of the reporters
mentioned that it worked for them, but that did not help me.

The config is Fedora Core based. The stack traces of the four CPUs look as
follows:

CPU0:
Call Trace:
  [] hypercall_page+0x3a7 <--
  [] xen_safe_halt+0x12
  [] default_idle+0x5a
  [] cpu_idle+0x8e
  [] rest_init+0x5d
  [] start_kernel+0x34d
  [] unknown_bootoption
  [] i386_start_kernel+0xa9
  [] xen_start_kernel+0x55d
  [] sys_rt_sigreturn+0xb

CPU1 and CPU2:
Call Trace:
  [] hypercall_page+0x3a7 <--
  [] xen_safe_halt+0x12
  [] default_idle+0x5a
  [] cpu_idle+0x8e
  [] cpu_bringup_and_idle+0xd

CPU3:
Call Trace:
  [] task_waking_fair+0x11 <--
  [] try_to_wake_up+0xb2
  [] default_wake_function+0x10
  [] __wake_up_common+0x3b
  [] complete+0x3e
  [] wakeme_after_rcu+0x10
  [] __rcu_process_callbacks+0x172
  [] rcu_process_callbacks+0x20
  [] __do_softirq+0xa2
  [] __do_softirq
  [] do_softirq+0x5a

The full config is at http://darnok.org/xen/config-rcu-stall
The full bootup log is at http://darnok.org/xen/log-rcu-stall

Any thoughts on what I ought to try? I don't know whether the RCU patches
are missing some functionality needed to work under Xen. Is there any older
version of the Linux kernel you would like me to try?
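
For anyone comparing several boot logs, the stall line quoted above can be picked apart mechanically. The following is only a minimal sketch: the regular expression is inferred from the single sample line in this mail, and the names STALL_RE and parse_stall are hypothetical helpers, not part of any kernel tooling. Other kernel versions may word the message differently.

```python
import re

# Assumption: the pattern mirrors the one stall line quoted in this mail.
STALL_RE = re.compile(
    r"INFO: (?P<flavor>\w+) detected stalls on CPUs/tasks: "
    r"\{(?P<cpus>[^}]*)\} "
    r"\(detected by (?P<detector>\d+), t=(?P<jiffies>\d+) jiffies\)"
)


def parse_stall(line):
    """Return the fields of an RCU stall line, or None if it does not match."""
    m = STALL_RE.search(line)
    if m is None:
        return None
    return {
        "flavor": m.group("flavor"),
        # The CPU list is whitespace separated inside the braces.
        "stalled_cpus": [int(c) for c in m.group("cpus").split()],
        "detected_by": int(m.group("detector")),
        "jiffies": int(m.group("jiffies")),
    }


sample = ("[ 79.942067] INFO: rcu_sched_state detected stalls on CPUs/tasks: "
          "{ 0} (detected by 2, t=60002 jiffies)")
print(parse_stall(sample))
```

Turning the jiffies count into seconds would additionally need the HZ value from the kernel config, which this mail does not state.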