All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mauro Carvalho Chehab <mchehab@redhat.com>
To: Borislav Petkov <bp@amd64.org>
Cc: "Luck, Tony" <tony.luck@intel.com>,
	Linux Edac Mailing List <linux-edac@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Aristeu Rozanski <arozansk@redhat.com>,
	Doug Thompson <norsk5@yahoo.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Ingo Molnar <mingo@redhat.com>
Subject: Re: [PATCH] RAS: Add a tracepoint for reporting memory controller events
Date: Thu, 31 May 2012 13:14:26 -0300	[thread overview]
Message-ID: <4FC798E2.4000402@redhat.com> (raw)
In-Reply-To: <20120531151408.GJ14515@aftab.osrc.amd.com>

Em 31-05-2012 12:14, Borislav Petkov escreveu:
> On Thu, May 31, 2012 at 12:01:19PM -0300, Mauro Carvalho Chehab wrote:
>> Grain is an error property, associated with the error address.
>> It is as simple as that. It is not a "change grain frequently" type
>> of thing: each address have its associated grain.
> 
> ... which almost never changes:
> 
> 5 amd76x_edac.c     amd76x_init_csrows          214 dimm->grain = dimm->nr_pages << PAGE_SHIFT;
> 6 cpc925_edac.c     cpc925_init_csrows          367 dimm->grain = 32;
> 7 cpc925_edac.c     cpc925_init_csrows          371 dimm->grain = 64;
> 8 e752x_edac.c      e752x_init_csrows          1119 dimm->grain = 1 << 12;
> 9 e7xxx_edac.c      e7xxx_init_csrows           399 dimm->grain = 1 << 12;
> k i3000_edac.c      i3000_probe1                416 dimm->grain = I3000_DEAP_GRAIN;
> l i3200_edac.c      i3200_probe1                395 dimm->grain = nr_pages << PAGE_SHIFT;
> m i5000_edac.c      i5000_init_csrows          1286 dimm->grain = 8;
> n i5100_edac.c      i5100_init_csrows           852 dimm->grain = 32;
> o i5400_edac.c      i5400_init_dimms           1212 dimm->grain = 8;
> p i7300_edac.c      decode_mtr                  662 dimm->grain = 8;
> q i7core_edac.c     get_dimm_config             637 dimm->grain = 8;
> r i82443bxgx_edac.c i82443bxgx_init_csrows      225 dimm->grain = 1 << 12;
> s i82860_edac.c     i82860_init_csrows          180 dimm->grain = 1 << 12;
> t i82875p_edac.c    i82875p_init_csrows         388 dimm->grain = 1 << 12;
> v i82975x_edac.c    i82975x_init_csrows         430 dimm->grain = 1 << 7;
> w mpc85xx_edac.c    mpc85xx_init_csrows         956 dimm->grain = 8;
> x mv64x60_edac.c    mv64x60_init_csrows         677 dimm->grain = 8;
> y pasemi_edac.c     pasemi_edac_init_csrows     183 dimm->grain = PASEMI_EDAC_ERROR_GRAIN;
> z ppc4xx_edac.c     ppc4xx_edac_init_csrows     983 dimm->grain = 1;
> A r82600_edac.c     r82600_init_csrows          259 dimm->grain = 1 << 14;
> B sb_edac.c         get_dimm_config             597 dimm->grain = 32;
> C tile_edac.c       tile_edac_init_csrows       117 dimm->grain = TILE_EDAC_ERROR_GRAIN;
> D x38_edac.c        x38_probe1                  394 dimm->grain = nr_pages << PAGE_SHIFT;

The grains among the drivers are different; userspace needs to know, so an
API is needed.

> 
> From all possible EDAC grain assignments above, only 3 are not static.

+ sb_edac
+ i7core_edac

On both, the grain should be given via MCE regs (it is on my TODO list).

> 
>> Ok, on _old_ hardware, this used to be constant, but on modern ones,
>> this is associated with the error type, as Tony already explained.
> 
> You mean "different" hardware.

I mean _old_ hardware, e. g. non-MCA hardware. On MCA, the MISCV flag 
(at least on Intel) changes the address granularity.

>> Don't create a crappy API, just because you want to save 32 bits.
>> Btw, a "string" grain will spare much more than just 32 bits.
> 
> Don't create a bloated API just to fit your purpose because you're
> staring at the world through your glasses.

It is not a bloated API. The error grain should be reported to userspace,
as:
	- Not all drivers have the same address granularity, as you've shown
	  above;
	- No other userspace API provides it;
	- The granularity is a property of the per-error address;
	- There are well-known cases where the address grain changes are
	  dynamically filled by the error registers (MCA arch on Intel).

So, the memory error tracepoint is the proper place to store it, as it is
the place where the address and the other memory error information is
reported to userspace.

Also, converting the grain to a string, as you proposed would require at 
least 26 bytes to store "grain: 0xdeadbeef:deadbeef", while putting it as
a u64 will consume only 8 bytes.

Regards,
Mauro.

  reply	other threads:[~2012-05-31 16:14 UTC|newest]

Thread overview: 124+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-05-18 16:31 [PATCH EDAC v26 00/66] EDAC patches for v3.5 Mauro Carvalho Chehab
2012-05-18 16:31 ` [PATCH EDAC v26 01/66] edac: Create a dimm struct and move the labels into it Mauro Carvalho Chehab
2012-05-18 16:31 ` [PATCH EDAC v26 02/66] edac: move dimm properties to struct dimm_info Mauro Carvalho Chehab
2012-05-18 16:31 ` [PATCH EDAC v26 03/66] edac: Don't initialize csrow's first_page & friends when not needed Mauro Carvalho Chehab
2012-05-18 16:31 ` [PATCH EDAC v26 04/66] edac: move nr_pages to dimm struct Mauro Carvalho Chehab
2012-05-18 16:31   ` Mauro Carvalho Chehab
2012-05-18 16:31 ` [PATCH EDAC v26 05/66] edac: rewrite edac_align_ptr() Mauro Carvalho Chehab
2012-05-18 16:31 ` [PATCH EDAC v26 06/66] edac.h: Add generic layers for describing a memory location Mauro Carvalho Chehab
2012-05-18 16:31 ` [PATCH EDAC v26 07/66] edac: Change internal representation to work with layers Mauro Carvalho Chehab
2012-05-18 16:31 ` [PATCH EDAC v26 08/66] amd64_edac: convert driver to use the new edac ABI Mauro Carvalho Chehab
2012-05-18 16:31 ` [PATCH EDAC v26 09/66] amd76x_edac: " Mauro Carvalho Chehab
2012-05-18 16:31 ` [PATCH EDAC v26 10/66] cell_edac: " Mauro Carvalho Chehab
2012-05-18 16:31 ` [PATCH EDAC v26 11/66] cpc925_edac: " Mauro Carvalho Chehab
2012-05-18 16:31 ` [PATCH EDAC v26 12/66] e752x_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 13/66] e7xxx_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 14/66] i3000_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 15/66] i3200_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 16/66] i5000_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 17/66] i5100_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 18/66] i5400_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 19/66] i7300_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 20/66] i7core_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 21/66] i82443bxgx_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 22/66] i82860_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 23/66] i82875p_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 24/66] i82975x_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 25/66] mpc85xx_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 26/66] mv64x60_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 27/66] pasemi_edac: " Mauro Carvalho Chehab
2012-05-18 16:32   ` Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 28/66] ppc4xx_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 29/66] r82600_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 30/66] sb_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 31/66] tile_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 32/66] x38_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 33/66] edac: Remove the legacy EDAC ABI Mauro Carvalho Chehab
2012-05-18 17:51   ` Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 34/66] edac: Initialize the dimm label with the known information Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 35/66] edac: Cleanup the logs for i7core and sb edac drivers Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 36/66] i5400_edac: improve debug messages to better represent the filled memory Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 37/66] RAS: Add a tracepoint for reporting memory controller events Mauro Carvalho Chehab
2012-05-24 10:14   ` [PATCH] " Mauro Carvalho Chehab
2012-05-24 10:56     ` Borislav Petkov
2012-05-24 16:13       ` Mauro Carvalho Chehab
2012-05-24 16:17         ` Mauro Carvalho Chehab
2012-05-24 16:45         ` Borislav Petkov
2012-05-24 18:00           ` Mauro Carvalho Chehab
2012-05-29 11:58             ` Borislav Petkov
2012-05-29 14:02               ` Mauro Carvalho Chehab
2012-05-29 14:52                 ` Borislav Petkov
2012-05-29 15:23                   ` Mauro Carvalho Chehab
2012-05-30 23:24                     ` Luck, Tony
2012-05-31 10:00                       ` Borislav Petkov
2012-05-31 10:33                         ` Mauro Carvalho Chehab
2012-05-31 12:17                           ` Borislav Petkov
2012-05-31 13:56                             ` Mauro Carvalho Chehab
2012-05-31 14:22                               ` Borislav Petkov
2012-05-31 14:44                                 ` Mauro Carvalho Chehab
2012-05-31 14:54                                   ` Borislav Petkov
2012-05-31 15:01                                     ` Mauro Carvalho Chehab
2012-05-31 15:14                                       ` Borislav Petkov
2012-05-31 16:14                                         ` Mauro Carvalho Chehab [this message]
2012-05-31 17:13                                           ` Borislav Petkov
2012-05-31 18:04                                             ` Mauro Carvalho Chehab
2012-05-31 18:33                                               ` Aristeu Rozanski
2012-05-31 19:37                                               ` Borislav Petkov
2012-05-31 19:32                                             ` Steven Rostedt
2012-05-31 19:42                                               ` Borislav Petkov
2012-05-31 20:11                                                 ` Steven Rostedt
2012-05-31 20:18                                                   ` Borislav Petkov
2012-05-31 20:52                                                     ` Luck, Tony
2012-06-01  9:10                                                       ` Borislav Petkov
2012-06-01  9:40                                                         ` Chen Gong
2012-06-01 12:15                                                         ` Mauro Carvalho Chehab
2012-06-01 15:42                                                         ` Luck, Tony
2012-06-01 16:00                                                           ` Borislav Petkov
2012-06-01 18:21                                                             ` Luck, Tony
2012-06-01 23:00                                                               ` Borislav Petkov
2012-06-01 23:19                                                                 ` Luck, Tony
2012-06-01 23:28                                                                   ` Borislav Petkov
2012-05-31 16:51                         ` Luck, Tony
2012-05-31 17:20                           ` Borislav Petkov
2012-05-31 18:14                             ` Luck, Tony
2012-05-31 19:26                               ` Borislav Petkov
2012-05-31 18:24                             ` Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 38/66] i5000_edac: Fix the logic that retrieves memory information Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 39/66] e752x_edac: provide more info about how DIMMS/ranks are mapped Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 40/66] edac: Rename the parent dev to pdev Mauro Carvalho Chehab
2012-05-18 16:32   ` Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 41/66] edac: use Documentation-nano format for some data structs Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 42/66] edac: rewrite the sysfs code to use struct device Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 43/66] mpc85xx_edac: convert sysfs logic " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 44/66] amd64_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 45/66] i7core_edac: convert it " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 46/66] edac: Get rid of the old kobj's from the edac mc code Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 47/66] edac: add a new per-dimm API and make the old per-virtual-rank API obsolete Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 48/66] edac: add a sysfs node to report the maximum location for the system Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 49/66] edac: Add debufs nodes to allow doing fake error inject Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 50/66] edac: Move grain/dtype/edac_type calculus to be out of channel loop Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 51/66] i82975x_edac: Test nr_pages earlier to save a few CPU cycles Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 52/66] i5100_edac: Fix a warning when compiled with 32 bits Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 53/66] i7300_edac: Get rid of some wrongly-solved rebase conflict Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 54/66] edac: Only expose csrows/channels on legacy API if they're populated Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 55/66] edac: change the mem allocation scheme to make Documentation/kobject.txt happy Mauro Carvalho Chehab
2012-05-18 16:32   ` Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 56/66] i7core_edac: " Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 57/66] edac: move documentation ABI to ABI/testing/sysfs-devices-edac Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 58/66] Edac: Add ABI Documentation for the new device nodes Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 59/66] i5000: Fix the fatal error handling Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 60/66] i7core: fix ranks information at the per-channel struct Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 61/66] edac: Don't add __func__ or __FILE__ for debugf[0-9] msgs Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 62/66] edac: Use more normal debugging macro style Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 63/66] edac: Convert debugfX to edac_dbg(X, Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 64/66] edac_mc: Cleanup per-dimm_info debug messages Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 65/66] edac: Increase version to 3.0.0 Mauro Carvalho Chehab
2012-05-18 16:32 ` [PATCH EDAC v26 66/66] edac_mc: check for allocation failure in edac_mc_alloc() Mauro Carvalho Chehab
2012-05-18 16:46 ` [PATCH EDAC v26 00/66] EDAC patches for v3.5 Borislav Petkov
2012-05-18 17:43   ` Mauro Carvalho Chehab
2012-05-18 17:53     ` Borislav Petkov
2012-05-28 15:46       ` Mauro Carvalho Chehab
2012-05-28 20:36         ` Borislav Petkov
2012-05-28 23:13           ` Mauro Carvalho Chehab
2012-05-29  2:40             ` Chen Gong
2012-05-29 11:45               ` Mauro Carvalho Chehab

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4FC798E2.4000402@redhat.com \
    --to=mchehab@redhat.com \
    --cc=arozansk@redhat.com \
    --cc=bp@amd64.org \
    --cc=fweisbec@gmail.com \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=norsk5@yahoo.com \
    --cc=rostedt@goodmis.org \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.