From mboxrd@z Thu Jan 1 00:00:00 1970
From: ebiederm@xmission.com (Eric W. Biederman)
To: Don Zickus <dzickus@redhat.com>
Cc: Yinghai Lu, linux-kernel@vger.kernel.org, mingo@redhat.com,
	hpa@zytor.com, torvalds@linux-foundation.org,
	kexec@lists.infradead.org, vgoyal@redhat.com,
	akpm@linux-foundation.org, tglx@linutronix.de, mingo@elte.hu,
	linux-tip-commits@vger.kernel.org
Subject: Re: [tip:x86/debug] x86/kdump: No need to disable ioapic/ lapic in crash path
References: <20120216172735.GX9751@redhat.com> <20120216215603.GH9751@redhat.com>
Date: Thu, 16 Feb 2012 19:38:21 -0800
In-Reply-To: <20120216215603.GH9751@redhat.com> (Don Zickus's message of "Thu, 16 Feb 2012 16:56:03 -0500")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8

Don Zickus <dzickus@redhat.com> writes:

> On Thu, Feb 16, 2012 at 01:53:29PM -0800, Yinghai Lu wrote:
>> On Thu, Feb 16, 2012 at 9:27 AM, Don Zickus <dzickus@redhat.com> wrote:
>>
>> > So I think I figured it out.  I went through and commented out code in
>> > disable_local_APIC until I narrowed it down to the piece of code that
>> > needs to be disabled for it to work.
>> >
>> > Surprise, surprise... its LVTPC or perf! :-)  Actually it is the
>> > nmi_watchdog which uses perf.  My theory is NMIs are not disabled and one
>> > is generated by the local apic during decompression (just bad timing) and
>> > *splat*.
>> >
>> > Yinghai, you can probably prove this by
>> >
>> > echo 0 > /proc/sys/kernel/nmi_watchdog
>> >
>> > then do your kdump crash test.
>>
>> yes.  that will make kdump crash working.
>
> Cool.  Thanks.
>
> Eric,
>
> Just let me know how you want to handle disabling NMIs in the kexec in
> panic shutdown case.

Interesting.  Apparently we have been avoiding this problem by accident.

Thanks for hunting this down.

The options I can see are:
- Ensure we can handle and ignore exceptions like this.
- Always shut off the lapic and ioapic entries that can generate this.

The good news is that both solutions should be lock free.

The current kernel boot code relies on the assumption that all
interrupts can be disabled.  With NMIs that is clearly not the case.

The most robust solution, and what we want to do long term, is to
install an idt that will simply ignore all interrupts until the
idt is replaced.  Since really all we need to deal with is the NMI
vector, which is vector #2, we can have a very small interrupt
descriptor table.

Unfortunately we go through some cpu mode switches in /sbin/kexec,
allowing us to enter the kernel's 32-bit entry point before we
run the decompressor, so at first glance both /sbin/kexec and the
kernel need to be fixed in a coordinated fashion.

There are two ways I can see of removing the need for an exactly
coordinated release.
- Document that an old /sbin/kexec userspace requires you not to
  use the nmi watchdog with modern kernels.
- For a short while simply retain code that stomps the nmi watchdog.
  (But that still leaves us open to other kinds of NMIs.)

Grr.  Looking a little more closely, all throughout the linux kernel's
boot there is the assumption that any interrupt during boot is a failure
of some kind, and except for an errant nmi watchdog that is a true
assumption.

Don, I guess I really have to recommend disabling the nmi watchdog in the
kexec on panic path if we can do so at all reasonably.

I like the idea of ignoring NMIs during boot, but that seems to be a
slightly larger project with little practical improvement in kexec
on panic quality, other than getting what should be one or two
i/o writes out of the kexec on panic path.

Eric
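
The "very small interrupt descriptor table" idea above can be sketched
roughly as follows.  This is only an illustration of the descriptor
layout, not code from the kernel or from /sbin/kexec: it packs a 32-bit
protected-mode IDT that covers vectors 0 through 2 and marks only the
NMI gate present.  The handler address 0x00100000, the code selector
0x10, the table base 0x2000 and all function names are hypothetical,
and the handler is assumed to be a stub that does nothing but iret.

```python
import struct

NMI_VECTOR = 2             # architectural NMI vector on x86
GATE_SIZE = 8              # bytes per gate in 32-bit protected mode
INTERRUPT_GATE_32 = 0x8E   # present, DPL 0, 32-bit interrupt gate
IRET_STUB = bytes([0xCF])  # assumed handler body: a lone iret


def pack_gate(handler_offset, code_selector, type_attr=INTERRUPT_GATE_32):
    """Encode one 32-bit IDT gate: offset low, selector, 0, type, offset high."""
    if not 0 <= handler_offset <= 0xFFFFFFFF:
        raise ValueError("handler offset must fit in 32 bits")
    if not 0 <= code_selector <= 0xFFFF:
        raise ValueError("selector must fit in 16 bits")
    return struct.pack(
        "<HHBBH",
        handler_offset & 0xFFFF,          # offset bits 15..0
        code_selector,                    # code segment selector
        0,                                # reserved byte, must be zero
        type_attr,                        # P, DPL and gate type
        (handler_offset >> 16) & 0xFFFF,  # offset bits 31..16
    )


def build_nmi_only_idt(handler_offset, code_selector):
    """Return a table for vectors 0..2 where only the NMI gate is present."""
    table = bytearray(GATE_SIZE * (NMI_VECTOR + 1))  # all-zero = not present
    start = NMI_VECTOR * GATE_SIZE
    table[start:start + GATE_SIZE] = pack_gate(handler_offset, code_selector)
    return bytes(table)


def idtr_image(base, table):
    """Return the 6-byte lidt operand: 16-bit limit, then 32-bit base."""
    return struct.pack("<HI", len(table) - 1, base)


if __name__ == "__main__":
    # Hypothetical addresses, chosen only to make the byte layout visible.
    idt = build_nmi_only_idt(handler_offset=0x00100000, code_selector=0x10)
    print("idt :", idt.hex())
    print("idtr:", idtr_image(0x2000, idt).hex())
```

Three gates take 24 bytes, so the limit handed to lidt is 23.  Vectors 0
and 1 are left not-present in this sketch, so only the NMI is ignored; a
table meant to ignore every interrupt would need a present gate for each
vector it covers.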