From mboxrd@z Thu Jan 1 00:00:00 1970
From: Borislav Petkov
Date: Wed, 20 Feb 2019 20:47:59 +0100
Subject: Re: [PATCH v3 1/2] nfit, mce: only handle uncorrectable machine checks
Message-ID: <20190220194759.GH3447@zn.tnic>
References: <20181026003729.8420-1-vishal.l.verma@intel.com> <20190220191852.GF3447@zn.tnic>
To: Dan Williams
Cc: Jeff Moyer, Vishal Verma, linux-nvdimm, Tony Luck, linux-edac@vger.kernel.org

On Wed, Feb 20, 2019 at 11:40:10AM -0800, Dan Williams wrote:
> There is a difference. NVDIMMs have local tracking of discovered
> poison, methods to scan for latent poison, and methods to clear. A CEC
> connection, iiuc, would seem an awkward fit. Awkward because what CEC
> enables is meant to be implemented natively in the hardware, and CEC
> seems to have no concept of the fact that errors can be repaired.

CEC is a leaky bucket of sorts which does call memory_failure_queue() in
the end. So we poison only those errors which report the same address
over and over again.

Correctable errors are by definition already repaired, i.e., corrected
so there's no need to do anything.

The way stuff is plumbed now is, all correctable errors go to the CEC so
NFIT doesn't see them, if CEC is enabled.

But the patch Jeff quoted already changed NFIT to ignore correctable
errors so I guess we don't have to do anything. And this is still needed
for the case where CEC is not enabled.

I'd say.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
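
For readers unfamiliar with the "leaky bucket" idea mentioned in the reply, here is a rough userspace sketch of it. This is not the kernel's actual CEC code. The table size, the threshold, and the names report_corrected_error(), decay() and offline_page() are all made up for illustration, and offline_page() only stands in for the point where the real code would end up calling memory_failure_queue(). It shows only the general behaviour: repeated reports for the same page frame accumulate, periodic aging drains the counters, and only an address that keeps reporting gets acted on.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SLOTS      8   /* hypothetical table size */
#define THRESHOLD  3   /* hypothetical number of reports before acting */

struct slot {
    uint64_t pfn;        /* page frame number that reported an error */
    unsigned int count;  /* reports seen, minus what has leaked away */
    bool used;
};

static struct slot table[SLOTS];
static unsigned int offlined;       /* how often the stand-in handler ran */
static uint64_t last_offlined_pfn;

/* Hypothetical stand-in for the eventual memory_failure_queue() call. */
static void offline_page(uint64_t pfn)
{
    offlined++;
    last_offlined_pfn = pfn;
}

/*
 * Record one corrected error for @pfn.
 * Returns true if this report pushed the page over THRESHOLD.
 */
static bool report_corrected_error(uint64_t pfn)
{
    struct slot *free_slot = NULL;

    for (size_t i = 0; i < SLOTS; i++) {
        if (table[i].used && table[i].pfn == pfn) {
            if (++table[i].count >= THRESHOLD) {
                table[i].used = false;  /* forget it once acted on */
                offline_page(pfn);
                return true;
            }
            return false;
        }
        if (!table[i].used && !free_slot)
            free_slot = &table[i];
    }

    if (free_slot) {
        free_slot->pfn = pfn;
        free_slot->count = 1;
        free_slot->used = true;
    }
    return false;  /* a full table simply drops the report in this sketch */
}

/* The "leak": run periodically, ages every counter by one. */
static void decay(void)
{
    for (size_t i = 0; i < SLOTS; i++) {
        if (table[i].used && --table[i].count == 0)
            table[i].used = false;
    }
}
```

In this model an address that reports once and then goes quiet is forgotten after a decay pass, which matches the point above that a corrected error by itself needs no action.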