From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1163054AbdDUVjw (ORCPT ); Fri, 21 Apr 2017 17:39:52 -0400
Received: from mga04.intel.com ([192.55.52.120]:27039 "EHLO mga04.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1041408AbdDUVjr (ORCPT ); Fri, 21 Apr 2017 17:39:47 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.37,231,1488873600"; d="scan'208";a="92780976"
From: "Verma, Vishal L"
To: "Luck, Tony", "bp@suse.de"
CC: "tglx@linutronix.de", "Williams, Dan J", "linux-kernel@vger.kernel.org",
	"ross.zwisler@linux.intel.com", "x86@kernel.org", "linux-nvdimm@ml01.01.org"
Subject: Re: [RFC PATCH] x86, mce: change the mce notifier to 'blocking' from 'atomic'
Thread-Topic: [RFC PATCH] x86, mce: change the mce notifier to 'blocking' from 'atomic'
Thread-Index: AQHSsxV3s/VS0/ZISkGoKwFEXJsIM6HB6Z0AgABPc4CAAGssAIAAAN0AgAAHewCAAADNAIAABL+AgAACAQCAAAfngIAACA+AgAACs4CAAADegIAA2qWAgA08FIA=
Date: Fri, 21 Apr 2017 21:39:45 +0000
Message-ID: <1492810703.2738.27.camel@intel.com>
References: <20170412202238.5d327vmwjqvbzzop@pd.tnic> <1492028744.2738.14.camel@intel.com>
	<20170412205229.GA13659@intel.com> <20170412211931.GA15771@intel.com>
	<20170412214749.jyt7cmyhovivtb2m@pd.tnic> <20170412221639.5klmqk4mjbvy6btx@pd.tnic>
	<20170412222619.GA17839@intel.com> <20170412222925.r3izasv3yuyjy62e@pd.tnic>
	<20170413113159.rc32ebiswn64nzrr@pd.tnic>
In-Reply-To: <20170413113159.rc32ebiswn64nzrr@pd.tnic>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [10.232.112.53]
Content-Type: text/plain; charset="utf-8"
Content-ID: <827E35C5767ABA4DB933C03C84D49F84@intel.com>
MIME-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from base64 to 8bit by mail.home.local id v3LLeDpB002926

On Thu, 2017-04-13 at 13:31 +0200, Borislav Petkov wrote:
> On Thu, Apr 13, 2017 at 12:29:25AM +0200, Borislav Petkov wrote:
> > On Wed, Apr 12, 2017 at 03:26:19PM -0700, Luck, Tony wrote:
> > > We can futz with that and have them specify which chain (or both)
> > > that they want to be added to.
> >
> > Well, I didn't want the atomic chain to be a notifier because we can
> > keep it simple and non-blocking. Only the process context one will
> > be.
> >
> > So the question is, do we even have a use case for outside consumers
> > hanging on the atomic chain? Because if not, we're good to go.
>
> Ok, new day, new patch.
>
> Below is what we could do: we don't call the notifier at all on the
> atomic path but only print the MCEs. We do log them and if the machine
> survives, we process them accordingly. This is only a fix for upstream
> so that the current issue at hand is addressed.
>
> For later, we'd need to split the paths in:
>
> critical_print_mce()
>
> or somesuch which immediately dumps the MCE to dmesg, and
>
> mce_log()
>
> which does the slow path of logging MCEs and calling the blocking
> notifier.
>
> Now, I'd want to have decoding of the MCE on the critical path too so
> I have to think about how to do that nicely. Maybe move the decoding
> bits which are the same between Intel and AMD in mce.c and have some
> vendor-specific, fast calls. We'll see. Btw, this is something Ingo
> has been mentioning for a while.
>
> Anyway, here's just the urgent fix for now.
>
> Thanks.
>
> ---
> From: Vishal Verma <vishal.l.verma@intel.com>
> Date: Tue, 11 Apr 2017 16:44:57 -0600
> Subject: [PATCH] x86/mce: Make the MCE notifier a blocking one
>
> The NFIT MCE handler callback (for handling media errors on NVDIMMs)
> takes a mutex to add the location of a memory error to a list. But
> since the notifier call chain for machine checks
> (x86_mce_decoder_chain) is atomic, we get a lockdep splat like:
>
>   BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
>   in_atomic(): 1, irqs_disabled(): 0, pid: 4, name: kworker/0:0
>   [..]
>   Call Trace:
>    dump_stack
>    ___might_sleep
>    __might_sleep
>    mutex_lock_nested
>    ? __lock_acquire
>    nfit_handle_mce
>    notifier_call_chain
>    atomic_notifier_call_chain
>    ? atomic_notifier_call_chain
>    mce_gen_pool_process
>
> Convert the notifier to a blocking one which gets to run only in
> process context.
>
> Boris: remove the notifier call in atomic context in print_mce(). For
> now, let's print the MCE on the atomic path so that we can make sure
> it goes out. We still log it for process context later.
>
> Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: linux-edac <linux-edac@vger.kernel.org>
> Cc: x86-ml <x86@kernel.org>
> Cc: <stable@vger.kernel.org>
> Link: http://lkml.kernel.org/r/20170411224457.24777-1-vishal.l.verma@intel.com
> Fixes: 6839a6d96f4e ("nfit: do an ARS scrub on hitting a latent media error")
> Signed-off-by: Borislav Petkov <bp@suse.de>
> ---
>  arch/x86/kernel/cpu/mcheck/mce-genpool.c  |  2 +-
>  arch/x86/kernel/cpu/mcheck/mce-internal.h |  2 +-
>  arch/x86/kernel/cpu/mcheck/mce.c          | 18 ++++--------------
>  3 files changed, 6 insertions(+), 16 deletions(-)
>

I noticed this patch was picked up in tip, in ras/urgent, but didn't see
a pull request for 4.11 - was this the intention? Or will it just be
added for 4.12?

	-Vishal
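
For anyone reading this in the archive, the split described in the quoted
patch (print on the atomic path, queue the record, and only call handlers
that may sleep from process context) can be sketched in plain userspace C.
This is a rough model under stated assumptions, not the kernel code: every
name below (report_event_atomic, process_pool, sleeping_handler,
in_atomic_ctx) is hypothetical, the "atomic context" is just a flag, and
the pool is a fixed array rather than the real MCE gen pool.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace model only: none of these names are kernel APIs. */
#define POOL_SIZE 8

struct event {
	unsigned long addr;		/* location of the (pretend) memory error */
};

typedef void (*handler_fn)(const struct event *ev);

static bool in_atomic_ctx;		/* stands in for "we must not sleep here" */
static int sleep_violations;		/* handler ran while in_atomic_ctx was set */
static int handled;			/* events seen by the blocking handler */
static int printed;			/* events printed on the atomic path */

static struct event pool[POOL_SIZE];
static size_t pool_len;

/* Models a handler that takes a mutex, as the NFIT callback in the thread does. */
static void sleeping_handler(const struct event *ev)
{
	(void)ev;
	if (in_atomic_ctx)
		sleep_violations++;	/* the real kernel would emit the splat here */
	handled++;
}

static handler_fn blocking_chain[] = { sleeping_handler };

/* Atomic path: print and queue the record, never walk the handler chain. */
static bool report_event_atomic(unsigned long addr)
{
	bool queued = false;

	in_atomic_ctx = true;
	printf("event at %#lx\n", addr);
	printed++;
	if (pool_len < POOL_SIZE) {
		pool[pool_len++].addr = addr;
		queued = true;
	}
	in_atomic_ctx = false;
	return queued;
}

/* Process-context path: drain the pool and call every blocking handler. */
static int process_pool(void)
{
	size_t nr_handlers = sizeof(blocking_chain) / sizeof(blocking_chain[0]);
	int drained = 0;

	for (size_t i = 0; i < pool_len; i++) {
		for (size_t h = 0; h < nr_handlers; h++)
			blocking_chain[h](&pool[i]);
		drained++;
	}
	pool_len = 0;
	return drained;
}
```

Calling report_event_atomic() twice and then process_pool() once runs the
handler twice with sleep_violations left at zero, which is the property the
conversion to a blocking notifier is after: the handler only ever runs where
sleeping is allowed.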