From: "Kani, Toshimitsu" <toshi.kani@hpe.com>
To: "mchehab@s-opensource.com" <mchehab@s-opensource.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mchehab@kernel.org" <mchehab@kernel.org>,
	"rjw@rjwysocki.net" <rjw@rjwysocki.net>,
	"srinivas.pandruvada@linux.intel.com" 
	<srinivas.pandruvada@linux.intel.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"tony.luck@intel.com" <tony.luck@intel.com>,
	"lenb@kernel.org" <lenb@kernel.org>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Subject: Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac
Date: Wed, 19 Jul 2017 16:40:25 +0000
Message-ID: <1500481869.2042.29.camel@hpe.com>
In-Reply-To: <20170718181545.32bd9181@vento.lan>

On Tue, 2017-07-18 at 18:15 -0300, Mauro Carvalho Chehab wrote:
> On Tue, 18 Jul 2017 19:58:54 +0000:
> We had a similar discussion several years ago when I wrote this
> driver. At that time, I talked with Red Hat, HP, Dell, Intel people
> and with some customers with large clusters.
> 
> The way it is, ghes_edac is a poor man's driver. What it hopefully
> provides is detection that an error happened, without really telling
> the user what component should be replaced.

"poor man's driver" is a bit misleading, but yes, firmware-first
platforms have RAS features built into the platforms, and they do not
need intelligence in EDAC drivers, which may conflict with the
platform's RAS features.  I cannot speak for other vendors, but HPE
platforms log errors and provide FRU info.  ghes_edac allows errors to
be reported to OS management tools like rasdaemon in addition to
platform-specific management tools.

> Ok, on machines with their own error reporting mechanism (like
> HP servers), a sys admin can look at some proprietary software
> (or BIOS), in order to identify what happened.
> 
> Yet, BIOS doesn't provide any clue about the memory
> architecture, as it maps memory as if it was a single DIMM memory:
> 
> (from ghes_edac_register)
> 
> 	layers[0].type = EDAC_MC_LAYER_ALL_MEM;
> 	layers[0].size = num_dimm;
> 	layers[0].is_virt_csrow = true;
> 
> So, even on systems where the BIOS actually knows how the memory
> cards are wired, it will mask the memory controller data.
> 
> Now, the EDAC driver can also be used to identify what
> channels are used. That helps the sys admin to know whether the
> memories are connected in a way that uses multiple channels,
> which helps to set up the machine to obtain the maximum
> possible performance.
> 
> So, for example, on my Intel-based HP server, I can check
> such info with:
> 
> $ ras-mc-ctl --mainboard
> ras-mc-ctl: mainboard: HP model ProLiant ML350 Gen9
> $ ras-mc-ctl --layout
>        +-----------------------------------------------------------------------+
>        |                mc0                |                mc1                |
>        | channel0  | channel1  | channel2  | channel0  | channel1  | channel2  |
> -------+-----------------------------------------------------------------------+
> slot2: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
> slot1: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
> slot0: | 16384 MB  |     0 MB  | 16384 MB  | 16384 MB  |     0 MB  | 16384 MB  |
> -------+-----------------------------------------------------------------------+
> 
> So, I know that both CPUs will be connected to my memories, and,
> on both, it is using 2 channels.
> 
> If I was using the ghes driver, that information would be hidden.
> 
> So, due to all problems with ghes, it is enabled only if there is no
> better solution, e.g. on systems where there's no way to talk
> directly to the hardware (like on E7 Xeon machines, where the memory
> controller is actually on a separate chip that is controlled only by
> the BIOS).

Thanks for the info!  That's very helpful.  I will check whether
ghes_edac provides the info we need.
-Toshi
