linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mauro Carvalho Chehab <mchehab@redhat.com>
To: Borislav Petkov <bp@amd64.org>
Cc: Linux Edac Mailing List <linux-edac@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Doug Thompson <norsk5@yahoo.com>
Subject: Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
Date: Tue, 24 Apr 2012 10:11:50 -0300	[thread overview]
Message-ID: <4F96A696.40308@redhat.com> (raw)
In-Reply-To: <20120424125538.GC11559@aftab.osrc.amd.com>

Em 24-04-2012 09:55, Borislav Petkov escreveu:
> On Tue, Apr 24, 2012 at 08:46:53AM -0300, Mauro Carvalho Chehab wrote:
>> Em 24-04-2012 07:40, Borislav Petkov escreveu:
>>> On Mon, Apr 23, 2012 at 06:30:54PM +0000, Mauro Carvalho Chehab wrote:
>>>>>> +};
>>>>>> +
>>>>>> +/**
>>>>>> + * struct edac_mc_layer - describes the memory controller hierarchy
>>>>>> + * @layer:		layer type
>>>>>> + * @size:maximum size of the layer
>>>>>> + * @is_csrow:		This layer is part of the "csrow" when old API
>>>>>> + *			compatibility mode is enabled. Otherwise, it is
>>>>>> + *			a channel
>>>>>> + */
>>>>>> +struct edac_mc_layer {
>>>>>> +	enum edac_mc_layer_type	type;
>>>>>> +	unsigned		size;
>>>>>> +	bool			is_csrow;
>>>>>> +};
>>>>>
>>>>> Huh, why do you need is_csrow? Can't do
>>>>>
>>>>> 	type = EDAC_MC_LAYER_CHIP_SELECT;
>>>>>
>>>>> ?
>>>>
>>>> No, that's different. For a csrow-based memory controller, is_csrow is equal to
>>>> type == EDAC_MC_LAYER_CHIP_SELECT, but, for the other memory controllers, this
>>>> is used to mark with layers will be used for the "fake csrow" exported by the
>>>> EDAC core by the legacy API.
>>>
>>> I don't understand this, do you mean: "this will be used to mark which
>>> layer will be used to fake a csrow"...?
>>
>> I've already explained this dozens of times: on x86, except for amd64_edac and
>> the drivers for legacy hardware (+7 years old), the information filled at struct 
>> csrow_info is FAKE. That's basically one of the main reasons for this patchset.
>>
>> There's no csrow signals accessed by the memory controller on FB-DIMM/RAMBUS, and on DDR3
>> Intel memory controllers, it is possible to fill memories on different channels with
>> different sizes. For example, this is how the 4 DIMM banks are filled on an HP Z400
>> with a Intel W3505 CPU:
>>
>> $ ./edac-ctl --layout
>>        +-----------------------------------+
>>        |                mc0                |
>>        | channel0  | channel1  | channel2  |
>> -------+-----------------------------------+
>> slot2: |     0 MB  |     0 MB  |     0 MB  |
>> slot1: |  1024 MB  |     0 MB  |     0 MB  |
>> slot0: |  1024 MB  |  1024 MB  |  1024 MB  |
>> -------+-----------------------------------+
>>
>> Those are the logs that dump the Memory Controller registers: 
>>
>> [  115.818947] EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f4031): 2 ranks, UDIMMs
>> [  115.818950] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
>> [  115.818955] EDAC DEBUG: get_dimm_config: 	dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400
>> [  115.818982] EDAC DEBUG: get_dimm_config: Ch1 phy rd1, wr1 (0x063f4031): 2 ranks, UDIMMs
>> [  115.818985] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
>> [  115.819012] EDAC DEBUG: get_dimm_config: Ch2 phy rd3, wr3 (0x063f4031): 2 ranks, UDIMMs
>> [  115.819016] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
>>
>> The Nehalem memory controllers allow up to 3 DIMMs per channel, and has 3 channels (so,
>> a total of 9 DIMMs). Most motherboards, however, expose either 4 or 8 DIMMs per CPU,
>> so it isn't possible to have all channels and dimms filled on them.
>>
>> On this motherboard, DIMM1 to DIMM3 are mapped to the the first dimm# at channels 0 to 2, and
>> DIMM4 goes to the second dimm# at channel 0.
>>
>> See? On slot 1, only channel 0 is filled.
> 
> Ok, wait a second, wait a second.
> 
> It's good that you brought up an example, that will probably help
> clarify things better.
> 
> So, how many physical DIMMs are we talking in the example above? 4, and
> all of them single-ranked? They must be because it says "rank: 1" above.
> 
> How would the table look if you had dual-ranked or quad-ranked DIMMs on
> the motherboard?

It won't change. The only changes will be at the debug logs. It would print
something like:

EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f4031): 4 ranks, UDIMMs
EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 2, row: 0x4000, col: 0x400
EDAC DEBUG: get_dimm_config: 	dimm 1 1024 Mb offset: 4, bank: 8, rank: 2, row: 0x4000, col: 0x400

> I understand channel{0,1,2} so what is slot now, is that the physical
> DIMM slot on the motherboard?

physical slots:
	DIMM1 - at MCU channel 0, dimm slot#0
	DIMM2 - at MCU channel 1, dimm slot#0
	DIMM3 - at MCU channel 2, dimm slot#0
	DIMM4 - at MCU channel 0, dimm slot#1

This motherboard has only 4 slots.

The i7core_edac driver is not able to discover how many physical DIMM slots
are there at the motherboard.

> If so, why are there 9 slots (3x3) when you say that most motherboards
> support 4 or 8 DIMMs per socket? Are the "slot{0,1,2}" things the
> view from the memory controller or what you physically have on the
> motherboard?

slot{0,1,2} channel{0,1,2} are the addresses given by the memory controller.
Not all motherboards add 9 DIMM physical slots though. Only high-end
motherboards provide 9 slots per MCU.

We have one Nehalem motherboard with 18 DIMM slots, and 2 CPUs. On that
machine, it is possible to use the maximum supported range of DIMMs.

> 
>> Even if this memory controller would be rank-based[1], the channel
>> information can't be mapped using the legacy EDAC API, as, on the old
>> API, all channels need to be filled with memories with the same size.
>> So, this driver uses both the slot layer and the channel layer as the
>> fake csrow.
> 
> So what is the slot layer, is it something you've come up with or is it
> a real DIMM slot on the motherboard?

It is the slot# inside each channel.

>> [1] As you can see from the logs and from the source code, the MC
>> registers aren't per rank, they are per DIMM. The number of ranks
>> is just one attribute of the register that describes a DIMM. The
>> MCA Error registers, however, don't map the rank when reporting an
>> errors, nor the error counters are per rank. So, while it is possible
>> to enumerate information per rank, the error detection is always per
>> DIMM.
> 
> Ok.
> 
> [..]
> 


  reply	other threads:[~2012-04-24 13:12 UTC|newest]

Thread overview: 161+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-29 16:45 [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Mauro Carvalho Chehab
2012-03-29 16:45 ` [PATCH 01/13] edac: Create a dimm struct and move the labels into it Mauro Carvalho Chehab
2012-03-30 10:50   ` Borislav Petkov
2012-03-30 13:26     ` Mauro Carvalho Chehab
2012-03-30 15:38       ` Borislav Petkov
2012-04-16  8:41     ` Mauro Carvalho Chehab
2012-04-16 11:02       ` Borislav Petkov
2012-04-16 11:44         ` Mauro Carvalho Chehab
2012-04-16 13:21           ` Borislav Petkov
2012-03-29 16:45 ` [PATCH 02/13] edac: move dimm properties to struct memset_info Mauro Carvalho Chehab
2012-03-30 13:10   ` Borislav Petkov
2012-03-30 13:22     ` Mauro Carvalho Chehab
2012-03-30 17:03   ` Borislav Petkov
2012-04-16  8:56     ` Mauro Carvalho Chehab
2012-04-16 13:31       ` Borislav Petkov
2012-03-29 16:45 ` [PATCH 03/13] edac: Don't initialize csrow's first_page & friends when not needed Mauro Carvalho Chehab
2012-04-02 12:33   ` Borislav Petkov
2012-03-29 16:45 ` [PATCH 04/13] edac: move nr_pages to dimm struct Mauro Carvalho Chehab
2012-04-02 13:18   ` Borislav Petkov
2012-03-29 16:45 ` [PATCH 05/13] edac: Fix core support for MC's that see DIMMS instead of ranks Mauro Carvalho Chehab
2012-03-29 16:45 ` [PATCH 06/13] edac: Initialize the dimm label with the known information Mauro Carvalho Chehab
2012-03-29 16:45 ` [PATCH 07/13] edac: Cleanup the logs for i7core and sb edac drivers Mauro Carvalho Chehab
2012-03-29 16:45 ` [PATCH 08/13] i5400_edac: improve debug messages to better represent the filled memory Mauro Carvalho Chehab
2012-03-29 16:45 ` [PATCH 09/13] events/hw_event: Create a Hardware Events Report Mecanism (HERM) Mauro Carvalho Chehab
2012-03-29 16:45 ` [PATCH 10/13] i5000_edac: Fix the logic that retrieves memory information Mauro Carvalho Chehab
2012-03-29 16:45 ` [PATCH 11/13] e752x_edac: provide more info about how DIMMS/ranks are mapped Mauro Carvalho Chehab
2012-03-29 16:45 ` [PATCH 12/13] edac: Rename the parent dev to pdev Mauro Carvalho Chehab
2012-03-29 16:45 ` [PATCH 13/13] edac: use Documentation-nano format for some data structs Mauro Carvalho Chehab
2012-03-29 20:46 ` [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Aristeu Rozanski Filho
2012-04-02 13:59 ` Borislav Petkov
2012-04-16 12:58   ` Mauro Carvalho Chehab
2012-04-16 14:06     ` Borislav Petkov
2012-04-16 20:12 ` [EDAC PATCH v13 0/7] Convert EDAC core to work with non-csrow-based memory controllers Mauro Carvalho Chehab
2012-04-16 20:12   ` [EDAC PATCH v13 1/7] edac: Create a dimm struct and move the labels into it Mauro Carvalho Chehab
2012-04-26 14:26     ` Borislav Petkov
2012-04-16 20:12   ` [EDAC PATCH v13 3/7] edac: Don't initialize csrow's first_page & friends when not needed Mauro Carvalho Chehab
2012-04-16 20:12   ` [EDAC PATCH v13 4/7] edac: move nr_pages to dimm struct Mauro Carvalho Chehab
2012-04-17 18:48     ` Borislav Petkov
2012-04-17 19:28       ` Mauro Carvalho Chehab
2012-04-17 21:40         ` Borislav Petkov
2012-04-18 12:58           ` Mauro Carvalho Chehab
2012-04-18 17:53           ` [PATCH] " Mauro Carvalho Chehab
2012-04-16 20:12   ` [EDAC PATCH v13 5/7] edac: rewrite edac_align_ptr() Mauro Carvalho Chehab
2012-04-18 14:06     ` Borislav Petkov
2012-04-18 15:25       ` Borislav Petkov
2012-04-18 18:15       ` Mauro Carvalho Chehab
2012-04-18 18:19       ` [PATCH] " Mauro Carvalho Chehab
2012-04-23 14:05         ` Borislav Petkov
2012-04-23 15:19           ` Mauro Carvalho Chehab
2012-04-23 15:26             ` Mauro Carvalho Chehab
2012-04-16 20:12   ` [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers Mauro Carvalho Chehab
2012-04-23 17:49     ` Borislav Petkov
2012-04-23 18:30       ` Mauro Carvalho Chehab
2012-04-23 18:56         ` Mauro Carvalho Chehab
2012-04-23 19:19           ` [PATCH] edac.h: Add generic layers for describing a memory location Mauro Carvalho Chehab
2012-04-23 20:07             ` Mauro Carvalho Chehab
2012-04-24 10:46               ` Borislav Petkov
2012-04-24 10:40         ` [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers Borislav Petkov
2012-04-24 11:46           ` Mauro Carvalho Chehab
2012-04-24 12:42             ` Mauro Carvalho Chehab
2012-04-24 12:49               ` [PATCH] edac.h: Add generic layers for describing a memory location Mauro Carvalho Chehab
2012-04-24 13:09                 ` Borislav Petkov
2012-04-24 13:22                   ` Mauro Carvalho Chehab
2012-04-24 13:38                     ` Borislav Petkov
2012-04-24 16:39                       ` Mauro Carvalho Chehab
2012-04-24 16:49                         ` Borislav Petkov
2012-04-24 17:38                           ` Mauro Carvalho Chehab
     [not found]                             ` <1335291342-14922-1-git-send-email-mchehab@redhat.com>
2012-04-24 18:15                               ` [PATCH EDACv16 2/2] amd64_edac: convert driver to use the new edac ABI Mauro Carvalho Chehab
2012-04-27 10:42                                 ` Mauro Carvalho Chehab
2012-04-27 13:33                               ` [PATCH EDACv16 1/2] edac: Change internal representation to work with layers Borislav Petkov
2012-04-27 14:11                                 ` Joe Perches
2012-04-27 15:12                                   ` Borislav Petkov
2012-04-27 16:07                                   ` Mauro Carvalho Chehab
2012-04-28  8:52                                     ` Borislav Petkov
2012-04-28 20:38                                       ` Joe Perches
2012-04-29 14:25                                       ` Mauro Carvalho Chehab
2012-04-29 15:11                                         ` Mauro Carvalho Chehab
2012-04-29 16:03                                           ` Joe Perches
2012-04-29 17:18                                             ` Mauro Carvalho Chehab
2012-04-29 16:20                                           ` Mauro Carvalho Chehab
2012-04-29 16:43                                             ` Joe Perches
2012-04-29 17:39                                               ` Mauro Carvalho Chehab
2012-04-30  7:47                                                 ` Borislav Petkov
2012-04-30 11:09                                                   ` Mauro Carvalho Chehab
2012-04-30 11:15                                                     ` Borislav Petkov
2012-04-30 11:46                                                       ` Mauro Carvalho Chehab
2012-04-27 15:36                                 ` Mauro Carvalho Chehab
2012-04-28  9:05                                   ` Borislav Petkov
2012-04-29 13:49                                     ` Mauro Carvalho Chehab
2012-04-30  8:15                                       ` Borislav Petkov
2012-04-30 10:58                                         ` Mauro Carvalho Chehab
2012-04-30 11:11                                           ` Borislav Petkov
2012-04-30 11:45                                             ` Mauro Carvalho Chehab
2012-04-30 12:38                                               ` Borislav Petkov
2012-04-30 13:00                                                 ` Mauro Carvalho Chehab
2012-04-30 13:53                                                   ` Mauro Carvalho Chehab
2012-04-30 15:02                                                     ` [PATCH v2] edac_mc: Cleanup per-dimm_info debug messages Mauro Carvalho Chehab
2012-04-30 15:10                                                       ` Mauro Carvalho Chehab
2012-04-30 15:20                                                         ` Borislav Petkov
2012-04-30 15:33                                                           ` Mauro Carvalho Chehab
2012-04-30 16:16                                                       ` Joe Perches
2012-04-30 16:47                                                         ` Mauro Carvalho Chehab
2012-04-30 16:44                                                       ` [PATCHv3] " Mauro Carvalho Chehab
2012-04-30 11:37                                         ` [PATCH EDACv16 1/2] edac: Change internal representation to work with layers Mauro Carvalho Chehab
2012-04-27 17:52                                 ` Mauro Carvalho Chehab
2012-04-28  9:16                                   ` Borislav Petkov
2012-04-28 17:07                                     ` Joe Perches
2012-04-29 14:02                                       ` Mauro Carvalho Chehab
2012-04-29 14:16                                     ` Mauro Carvalho Chehab
2012-04-30  7:59                                       ` Borislav Petkov
2012-04-30 11:23                                         ` Mauro Carvalho Chehab
2012-04-30 12:51                                           ` Borislav Petkov
2012-04-24 12:55             ` [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers Borislav Petkov
2012-04-24 13:11               ` Mauro Carvalho Chehab [this message]
2012-04-24 13:32                 ` Borislav Petkov
2012-04-24 14:24                   ` Mauro Carvalho Chehab
2012-04-24 16:27                     ` Borislav Petkov
2012-04-24 17:24                       ` Mauro Carvalho Chehab
2012-04-25 17:19                         ` Borislav Petkov
2012-04-25 17:47                           ` Mauro Carvalho Chehab
2012-04-25 18:32                             ` Luck, Tony
2012-04-25 18:44                               ` Mauro Carvalho Chehab
2012-04-26 14:11                             ` Borislav Petkov
2012-04-26 14:25                               ` Mauro Carvalho Chehab
2012-04-26 14:59                                 ` Mauro Carvalho Chehab
2012-04-25 17:55                           ` Luck, Tony
2012-04-24 17:31                       ` Luck, Tony
2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 01/26] amd64_edac: convert driver to use the new edac ABI Mauro Carvalho Chehab
2012-05-07 14:31       ` Borislav Petkov
2012-05-07 16:12         ` Mauro Carvalho Chehab
2012-05-07 16:17           ` Borislav Petkov
2012-05-07 16:59             ` Mauro Carvalho Chehab
2012-05-07 19:49               ` Borislav Petkov
2012-05-07 16:24           ` Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 02/26] amd76x_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 03/26] cell_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 04/26] cpc925_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 05/26] e752x_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 06/26] e7xxx_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 07/26] i3000_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 08/26] i3200_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 09/26] i5000_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 10/26] i5100_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 11/26] i5400_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 12/26] i7300_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 13/26] i7core_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 14/26] i82443bxgx_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 15/26] i82860_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 16/26] i82875p_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 17/26] i82975x_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 18/26] mpc85xx_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 19/26] mv64x60_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 20/26] pasemi_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 21/26] ppc4xx_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 22/26] r82600_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 23/26] sb_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 24/26] tile_edac: " Mauro Carvalho Chehab
2012-04-26 19:47       ` Chris Metcalf
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 25/26] x38_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 26/26] edac: Remove the legacy EDAC ABI Mauro Carvalho Chehab

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F96A696.40308@redhat.com \
    --to=mchehab@redhat.com \
    --cc=bp@amd64.org \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=norsk5@yahoo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).