From: Borislav Petkov <bp@amd64.org>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>,
Borislav Petkov <bp@amd64.org>,
Linux Edac Mailing List <linux-edac@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Aristeu Rozanski <arozansk@redhat.com>,
Doug Thompson <norsk5@yahoo.com>,
Steven Rostedt <rostedt@goodmis.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Ingo Molnar <mingo@redhat.com>
Subject: Re: [PATCH] RAS: Add a tracepoint for reporting memory controller events
Date: Thu, 31 May 2012 12:00:05 +0200
Message-ID: <20120531100005.GC14074@aftab.osrc.amd.com>
In-Reply-To: <3908561D78D1C84285E8C5FCA982C28F192F6672@ORSMSX104.amr.corp.intel.com>
On Wed, May 30, 2012 at 11:24:41PM +0000, Luck, Tony wrote:
> > u32 grain; /* granularity of reported error in bytes */
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> >> dimm->grain = nr_pages << PAGE_SHIFT;
>
> I'm not at all sure what we'll see digging into the chipset registers
> like EDAC does - but we do have different granularity when reporting
> via machine check banks. That's why we have this code:
>
> 	/*
> 	 * Mask the reported address by the reported granularity.
> 	 */
> 	if (mce_ser && (m->status & MCI_STATUS_MISCV)) {
> 		u8 shift = MCI_MISC_ADDR_LSB(m->misc);
> 		m->addr >>= shift;
> 		m->addr <<= shift;
> 	}
That's 64 bytes max, IIRC.
> in mce_read_aux(). In practice right now I think that many errors will
> report with cache line granularity,
Yep.
> while a few (IIRC patrol scrub) will report with page (4K)
> granularity. Linux doesn't really care - they all have to get rounded
> up to page size because we can't take away just one cache line from a
> process.
I'd like to see that :-)
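
For illustration only, the masking and the page rounding discussed above can be sketched in plain userspace C. This is not the kernel code: the names mask_to_granularity(), addr_to_pfn() and SKETCH_PAGE_SHIFT are made up for this example, and 4K pages are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Clear the low `lsb` bits of addr, using the same shift-right then
 * shift-left idiom as the quoted mce_read_aux() snippet.
 */
static uint64_t mask_to_granularity(uint64_t addr, unsigned int lsb)
{
	assert(lsb < 64);	/* shifting a 64-bit value by >= 64 is undefined */
	addr >>= lsb;
	addr <<= lsb;
	return addr;
}

/* Page frame containing addr: the smallest unit that can be taken away. */
static uint64_t addr_to_pfn(uint64_t addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}
```

With lsb = 6 the address is reported at cache line (64 byte) granularity, with lsb = 12 at page granularity. Both results land in the same page frame, which is why the difference does not matter once everything is rounded up to page size.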
> > @Tony: Can you assure us that, on Intel memory controllers, the address
> > mask remains constant for the module's lifetime, or are there any events
> > that may change it (memory hot-plug, mirror mode changes, interleaving
> > reconfiguration, ...)?
>
> I could see different controllers (or even different channels) having
> different setup if you have a system with different size/speed/#ranks
> DIMMs ... most systems today allow almost arbitrary mix & match, and the
> BIOS will decide which interleave modes are possible based on what it
> finds in the slots. Mirroring imposes more constraints, so you will
> see less crazy options. Hot plug for Linux reduces to just the hot add
> case (as we still don't have a good way to remove DIMM sized chunks of
> memory) ... so I don't see any clever reconfiguration possibilities
> there (when you add memory, all the existing memory had better stay
> where it is, preserving contents).
You're funny :-)
> Perhaps the only option where things might change radically is socket
> migration ... where the constraint is only that the target of the
> migration have >= memory of the source. So you might move from some
> weird configuration with mixed DIMM sizes and thus no interleave, to a
> homogeneous socket with matched DIMMs and full interleave. But from an
> EDAC level, this is a new controller on a new socket ... not a changed
> configuration on an existing socket.
Right, given how rarely such events happen, it still sounds to me like
sysfs is the perfect place for the grain value.
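
As a rough userspace sketch of that idea (hypothetical names throughout, not the EDAC API): the grain is stored once when the controller is set up and formatted on demand as a single newline-terminated value, the way a sysfs show callback conventionally reports an attribute, instead of being repeated in every error event.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for per-DIMM data; not the kernel's struct. */
struct sketch_dimm {
	uint32_t grain;	/* granularity of reported error in bytes */
};

/* Record the grain once, at setup time. */
static void sketch_dimm_init(struct sketch_dimm *dimm, uint32_t grain)
{
	dimm->grain = grain;
}

/*
 * Format the stored grain as one value followed by a newline.
 * Returns the number of characters written, as snprintf() does.
 */
static int sketch_grain_show(const struct sketch_dimm *dimm, char *buf,
			     size_t len)
{
	assert(len > 0);
	return snprintf(buf, len, "%u\n", (unsigned int)dimm->grain);
}
```

A reader that needs the grain fetches it once and applies it to every later event from that controller.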
Thanks.
--
Regards/Gruss,
Boris.
Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551