From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751529AbdGZSSN (ORCPT ); Wed, 26 Jul 2017 14:18:13 -0400
Received: from bombadil.infradead.org ([65.50.211.133]:59649
        "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1750867AbdGZSSL (ORCPT );
        Wed, 26 Jul 2017 14:18:11 -0400
Date: Wed, 26 Jul 2017 15:17:55 -0300
From: Mauro Carvalho Chehab
To: "Luck, Tony"
Cc: Borislav Petkov , linux-edac , Toshimitsu Kani ,
        "Rafael J. Wysocki" , LKML
Subject: Re: [PATCH 3/3] EDAC, ghes: Make it a proper module
Message-ID: <20170726151755.571e5979@vento.lan>
In-Reply-To: <3908561D78D1C84285E8C5FCA982C28F61316D09@ORSMSX114.amr.corp.intel.com>
References: <20170726084827.11447-1-bp@alien8.de>
        <20170726084827.11447-4-bp@alien8.de>
        <20170726072404.3cb283c5@vento.lan>
        <20170726103708.GA28875@nazgul.tnic>
        <20170726075118.22f2ca85@vento.lan>
        <3908561D78D1C84285E8C5FCA982C28F61316D09@ORSMSX114.amr.corp.intel.com>
X-Mailer: Claws Mail 3.14.1 (GTK+ 2.24.31; x86_64-redhat-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org.
        See http://www.infradead.org/rpr.html
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 26 Jul 2017 17:27:12 +0000, "Luck, Tony" wrote:

> > > > Hmm... I'm not seeing any implementation that would allow setting
> > > > between firmware first, hardware first or "auto", as we've discussed.
> > >
> > > This is all coming up. As the 0/3 message said, these 3 patches are the
> > > bare minimum of reorganizing stuff only and should serve as a base.
> >
> > I'll then wait for such patch before acking this series.
>
> I didn't think that a BIOS that set "firmware first" gave the OS any choice about this.
>
> What exactly is this option going to do? Fiddle with ACPI OSC??

Currently, the HP server that I use to build the kernel is FF:

[ 3.783803] GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC.

I didn't try to disable FF in its BIOS. Not sure if it is even possible.

Still, EDAC is working there using sb_edac. As I pointed out before, one
of the MC channels is not being detected, but I don't use it on this
machine. Except for that, EDAC seems to be working fine there:

$ ras-mc-ctl --layout
       +-----------------------------------------------------------------------+
       |                mc0                |                mc1                |
       | channel0  | channel1  | channel2  | channel0  | channel1  | channel2  |
-------+-----------------------------------------------------------------------+
slot2: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
slot1: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
slot0: |  16384 MB |     0 MB  |  16384 MB |  16384 MB |     0 MB  |  16384 MB |
-------+-----------------------------------------------------------------------+

# ras-mc-ctl --guess-labels
memory stick 'PROC 1 DIMM 1' is located at 'Not Specified'
memory stick 'PROC 1 DIMM 2' is located at 'Not Specified'
memory stick 'PROC 1 DIMM 3' is located at 'Not Specified'
memory stick 'PROC 1 DIMM 4' is located at 'Not Specified'
memory stick 'PROC 1 DIMM 5' is located at 'Not Specified'
memory stick 'PROC 1 DIMM 6' is located at 'Not Specified'
memory stick 'PROC 1 DIMM 7' is located at 'Not Specified'
memory stick 'PROC 1 DIMM 8' is located at 'Not Specified'
memory stick 'PROC 1 DIMM 9' is located at 'Not Specified'
memory stick 'PROC 1 DIMM 10' is located at 'Not Specified'
memory stick 'PROC 1 DIMM 11' is located at 'Not Specified'
memory stick 'PROC 1 DIMM 12' is located at 'Not Specified'
memory stick 'PROC 2 DIMM 1' is located at 'Not Specified'
memory stick 'PROC 2 DIMM 2' is located at 'Not Specified'
memory stick 'PROC 2 DIMM 3' is located at 'Not Specified'
memory stick 'PROC 2 DIMM 4' is located at 'Not Specified'
memory stick 'PROC 2 DIMM 5' is located at 'Not Specified'
memory stick 'PROC 2 DIMM 6' is located at 'Not Specified'
memory stick 'PROC 2 DIMM 7' is located at 'Not Specified'
memory stick 'PROC 2 DIMM 8' is located at 'Not Specified'
memory stick 'PROC 2 DIMM 9' is located at 'Not Specified'
memory stick 'PROC 2 DIMM 10' is located at 'Not Specified'
memory stick 'PROC 2 DIMM 11' is located at 'Not Specified'
memory stick 'PROC 2 DIMM 12' is located at 'Not Specified'

I didn't try to inject an error, as I'm not sure if the EINJ feature is
enabled on this BIOS. Probably not.

At least on this machine, I very much prefer to use the sb_edac driver.
As I explained earlier in the previous thread, I just don't know if the
BIOS would be doing the right thing for CE, as I don't know its internal
algorithm.

Also, as I'm maintaining the EDAC userspace tools (rasdaemon), I would
really love to get a few CE error reports there from time to time, as
they could be used to check if rasdaemon is doing the right thing with
them. So, I very much prefer to not have any threshold at all in the
BIOS there.

Thanks,
Mauro