From mboxrd@z Thu Jan 1 00:00:00 1970
From: Don Zickus
Subject: Re: WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
Date: Fri, 23 Mar 2012 09:26:48 -0400
Message-ID: <20120323132648.GA18218@redhat.com>
References: <4F34EC35.7010109@linux.vnet.ibm.com>
 <1328900283.25989.45.camel@laptop>
 <1328900633.25989.47.camel@laptop>
 <20120210200250.GG5650@redhat.com>
 <1328905121.25989.52.camel@laptop>
 <20120210203117.GI5650@redhat.com>
 <1328906163.25989.59.camel@laptop>
 <20120210210423.GJ5650@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: Peter Zijlstra, "Srivatsa S. Bhat", Josh Boyer, "H. Peter Anvin",
 Ingo Molnar, Thomas Gleixner, Avi Kivity, kvm, linux-kernel, x86,
 Suresh B Siddha, Sergey Senozhatsky
To: Sasha Levin
Return-path:
Content-Disposition: inline
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org

On Fri, Mar 23, 2012 at 12:47:38PM +0200, Sasha Levin wrote:
> I'm just wondering about the status of the patches to fix this issue,
> this is still happening on linux-next.

I got distracted with other stuff.  I have been running code that does the
following in the shutdown path:

foreach_online_cpu
	cpu_down

but I get occasional hangs on reboot that I haven't gotten around to
debugging.  I assumed this is the approach Peter was suggesting though I
don't think he was sure if it was going to be reliable.

Cheers,
Don

>
> On Fri, Feb 10, 2012 at 11:04 PM, Don Zickus wrote:
> > On Fri, Feb 10, 2012 at 09:36:03PM +0100, Peter Zijlstra wrote:
> >> On Fri, 2012-02-10 at 15:31 -0500, Don Zickus wrote:
> >> > So my second patch which I will eventually post will just skip the WARN_ON
> >> > if the system is going down.
> >> > Not sure if that is the proper way to address
> >> > this problem or change all of the stop_this_cpu code to use a different
> >> > bitmask than the cpu_online bitmask (but then you run the risk of a stuck
> >> > IPI I guess if the cpu is halted without notifying anyone).
> >>
> >> Yeah, the async hard kill of all cpus is bound to make problems.. what
> >> I'm wondering is, why is this in the normal shutdown path and not
> >> specific to a hard panic?
> >
> > I didn't write the original code, I just changed it from REBOOT_IRQ to
> > NMI and left all the stop_this_cpu stuff alone.
> >
> >>
> >> Trying to make this work is just not going to be pretty, and in the
> >> panic case we really don't care much.
> >
> > Sure.
> >
> > Cheers,
> > Don