From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757296AbXHHKf6 (ORCPT ); Wed, 8 Aug 2007 06:35:58 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751287AbXHHKft (ORCPT ); Wed, 8 Aug 2007 06:35:49 -0400
Received: from e5.ny.us.ibm.com ([32.97.182.145]:39937 "EHLO e5.ny.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751032AbXHHKfs (ORCPT ); Wed, 8 Aug 2007 06:35:48 -0400
Date: Wed, 8 Aug 2007 16:06:03 +0530
From: Vivek Goyal
To: Martin Wilck
Cc: Haren Myneni , "kexec@lists.infradead.org" , "linux-kernel@vger.kernel.org" , "Eric W.
Biederman"
Subject: Re: PATCH/RFC: [kdump] fix APIC shutdown sequence
Message-ID: <20070808103603.GC13808@in.ibm.com>
Reply-To: vgoyal@in.ibm.com
References: <46B73955.2080007@fujitsu-siemens.com> <20070807142928.GA18839@in.ibm.com> <46B8AECA.7050908@fujitsu-siemens.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <46B8AECA.7050908@fujitsu-siemens.com>
User-Agent: Mutt/1.5.11
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Aug 07, 2007 at 07:41:30PM +0200, Martin Wilck wrote:
[..]
>
> Such a situation has never been observed in the "good" case.
> So, we do have some evidence, not just bare speculation.
>
> >> 2. The crashing CPU itself disables its local APIC
> >>    before the IO-APIC, leaving a short time window
> >>    where the IOAPIC can receive IRQs, but not
> >>    deliver them.
> >>
> >
> > I doubt that it would be the issue. Looking at intel IOAPIC (82093AA)
> > documentation, it says that IRR bit of IOAPIC will be set only if
> > destination CPU has accepted the interrupt. So if we have disabled
> > the LAPIC, it will not accept the interrupt and IRR bit of IOAPIC
> > should not be set.
>
> Can you explain how, on the front side bus, the IO-APIC knows whether
> a CPU has accepted the INT message? There is no response
> to the INT message on the bus, except for the EOI which comes much later.
> I'm not saying that you're wrong, I just really don't understand this
> point.
>

I don't know exactly what the hardware protocol is. I am just going by
the Intel documentation:

Remote IRR—RO. This bit is used for level triggered interrupts. Its meaning
is undefined for edge triggered interrupts. For level triggered interrupts,
this bit is set to 1 when local APIC(s) accept the level interrupt sent by
the IOAPIC. The Remote IRR bit is set to 0 when an EOI message with a matching
interrupt vector is received from a local APIC.
According to this, the IRR bit is set to one only when the local APIC
accepts the level-triggered interrupt. This implies there should be a way
for the IOAPIC to know that the destination local APIC has accepted the
interrupt.

Anyway, I think that after the LAPIC is disabled, it should not accept any
more interrupts from the IOAPIC. It will only deliver pending interrupts to
the CPU if the CPU has not masked them. So in your case it looks like we
have pending interrupts at the LAPIC which never received an EOI in the
first kernel. The second kernel will field these interrupts, mark them as
spurious interrupts and issue an EOI. But by that time the IOAPIC no longer
has the vector info, so it will not clear the IRR bit.

> In the logical analyzer, we can't see when exactly the local APICs are
> disabled. But we see that IRQs arriving after the IO APIC pin is
> masked never do any harm, while IRQs arriving "during the shutdown
> sequence" (we can see e.g. the 2nd CPU taking the bus after the NMI
> IPI) cause the error situation.
>
> >> 3. An IRQ is received and delivered to a local APIC, but
> >>    no CPU ever executes the IRQ handler and therefore no
> >>    EOI is sent.
> >>
> >
> > We do issue EOI for all the pending interrupts in second
> > kernel. Look at setup_local_APIC(). Once the second kernel is booting, it
> > checks if there are any pending interrupts (ISR bit is set). If yes,
> > it goes ahead and issues an extra EOI. This should also clear the
> > IRR register of IOAPIC.
>
> In an earlier patch, I tried to add that same code in
> machine_crash_shutdown() and crash_nmi_callback(), in order
> to send EOIs for pending IRQs on all CPUs. Unfortunately,
> that had no effect.
>

So you tried issuing an EOI for all the pending interrupts in the first
kernel. Did you check that, at the time of issuing the EOI, the
corresponding vector bits were set in ISR/IRR at the LAPIC and IRR was set
at the IOAPIC? If yes, then only some hardware guy can tell us why issuing
an EOI did not clear the IRR bit at the IOAPIC.

I have never used a trace analyzer.
Can you really parse the EOI message and see what the contents are? I mean
in terms of making sure the right vector info is there so that the IOAPIC
can reset IRR?


> > disable_IO_APIC() code does not clear the vector information
> > in routing table. It just masks the interrupt. So even if
> > an EOI is issued later in second kernel, it should clear the
> > IRR bit at IOAPIC.
>
> Hmm... ioapic_mask_entry() writes
> "union entry_union eu = { .entry.mask = 1 }" to the LVT register.
> That clears all bits except the mask bit, so that the vector information
> is lost. Please correct me if I'm mistaken.
>

Sorry, you are right. I read the code too fast. The IOAPIC entries will be
cleared, and that seems to be precisely the reason why issuing an EOI in
the second kernel, in its current form, will not help in this situation.

> >> c) There are indications that besides the EOI, it's also
> >> necessary that the PCI IRQ pin is deasserted at least for
> >> a short time.
>
> > I doubt this. There are situations when there is no device
> > driver for the device and device pushes the interrupt (frequently
> > observed in the case of kdump). Kernel still keeps on receiving
> > the interrupt without driver telling device to lower the interrupt
> > line.
>
> So far I haven't come up with a patch that just sends EOI without
> actually calling any HW IRQ handler. That would clarify this question.
> It's on my todo list.
>

Isn't the "nobody cared for irq x" situation similar? Some device has
asserted a level-triggered interrupt and there is no corresponding device
driver, so the kernel sees a flurry of interrupts (it issues an EOI but
immediately sees the next interrupt, as the device has not de-asserted the
interrupt line).

[..]
> > - Can you please print local apic (print_local_APIC) and
> >   ioapic registers (print_IO_APIC) and verify above theory?
>
> We always see the IO-APIC IRR bit in the error situation, before and
> after the start of the kdump kernel.
>
> *Before* the kdump kernel starts (more precisely: before the call
> to disable_IO_APIC()), the IO-APIC "delivery status" bit is also set.
>
> I checked local APIC ISR and IRR bits in an earlier version of my patch
> (see above). They were sometimes set, and sometimes not (unlike the IO-APIC
> IRR/Delivery Status which behave always the same).
>

This is interesting. So you are saying that there are cases where the IRR
bit is set at the IOAPIC but the corresponding IRR/ISR bit is not set at
the LAPIC? If that is the case, then even enabling interrupts will not
help: because the IRR/ISR bit is not set, the CPU will never receive an
interrupt and will never issue an EOI for that vector. I am not sure how
your patch will solve the issue in such cases.

I think we should not be enabling interrupts after the crash. If needed,
we should just issue the EOI and it should work. Otherwise we need to get
in touch with hardware folks to tell us what is going on.

Thanks
Vivek