linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mauro Carvalho Chehab <mchehab@redhat.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Borislav Petkov <petkovbb@googlemail.com>,
	mingo@redhat.com, hpa@zytor.com, linux-kernel@vger.kernel.org,
	andi@firstfloor.org, tglx@linutronix.de,
	Andreas Herrmann <andreas.herrmann3@amd.com>,
	Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>,
	linux-tip-commits@vger.kernel.org,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Fr??d??ric Weisbecker <fweisbec@gmail.com>,
	Aristeu Rozanski <aris@redhat.com>,
	Doug Thompson <norsk5@yahoo.com>,
	Huang Ying <ying.huang@intel.com>,
	Arjan van de Ven <arjan@infradead.org>
Subject: Re: [tip:x86/mce] x86, mce: Rename cpu_specific_poll to mce_cpu_specific_poll
Date: Wed, 27 Jan 2010 10:34:47 -0200	[thread overview]
Message-ID: <4B6032E7.1090900@redhat.com> (raw)
In-Reply-To: <20100123090003.GA20056@elte.hu>

Ingo Molnar wrote:
> * Borislav Petkov <petkovbb@googlemail.com> wrote:
> 
>> (Adding some more interested parties to Cc:)
>>
>> On Sat, Jan 23, 2010 at 06:17:17AM +0100, Ingo Molnar wrote:
>>> * tip-bot for H. Peter Anvin <hpa@zytor.com> wrote:
>>>
>>>> Commit-ID:  f91c4d2649531cc36e10c6bc0f92d0f99116b209
>>>> Gitweb:     http://git.kernel.org/tip/f91c4d2649531cc36e10c6bc0f92d0f99116b209
>>>> Author:     H. Peter Anvin <hpa@zytor.com>
>>>> AuthorDate: Thu, 21 Jan 2010 18:31:54 -0800
>>>> Committer:  H. Peter Anvin <hpa@zytor.com>
>>>> CommitDate: Thu, 21 Jan 2010 18:31:54 -0800
>>>>
>>>> x86, mce: Rename cpu_specific_poll to mce_cpu_specific_poll
>>>>
>>>> cpu_specific_poll is a global variable, and it should have a global
>>>> namespace name.  Since it is MCE-specific (it takes a struct mce *),
>>>> rename it mce_cpu_specific_poll.
>>>>
>>>> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
>>>> Cc: Andi Kleen <andi@firstfloor.org>
>>>> LKML-Reference: <20100121221711.GA8242@basil.fritz.box>
>>> FYI, this commit triggered a -tip test failure:
>>>
>>>  arch/x86/kernel/cpu/mcheck/mce-xeon75xx.c: In function 'xeon75xx_mce_init':
>>>  arch/x86/kernel/cpu/mcheck/mce-xeon75xx.c:340: error: implicit declaration of function 'pci_match_id'
>>>
>>> I'm excluding it from tip:master.
>>>
>>> But the bigger problem with this commit is structure of it - or the lack 
>>> thereof.
>>>
>>> It just blindly goes into the direction the MCE code has been going for some 
>>> time, minimally enabling the hardware, ignoring both the new EDAC design and 
>>> the new performance monitoring related design i outlined some time ago.
>> I completely agree - from what I see this is adding vendor- or rather 
>> vendor-and-machine-specific hooks to read out (1) the position of the the 
>> memory translation table from PCI config space (0x8c), (2) then to read out 
>> the offset from the first MCA status register in order to (3) rdmsr the 
>> status information.
>>
>> In AMD's case, we need similar hooks too, in order to evaluate correctable 
>> MCEs for different RAS reasons like for example L3 cache or data arrays 
>> errors for disabling L3 indices. I was looking into adding hooks into 
>> machine_check_poll() and cpu_specific_poll() interface could work.
>>
>> Furthermore, lets leave mcheck be mcheck and do error decoding in EDAC 
>> modules. For example, there was a core i7 EDAC module submission from Mauro 
>> and the Xeon75xx-specific decoding bits could be added to it or even as a 
>> new machine-specific module instead of mcelog.
>>
>> With the evergrowing complexity of memory controller design I don't think 
>> that the userspace mcelog approach will scale - you need the whole decoding 
>> in the kernel where the module knows the exact memory controllers setup and 
>> which DRAM addresses belong to which nodes and whether you do memory 
>> hoisting and whether you interleave, if yes, how and on what granilarity you 
>> interleave and on and on...
> 
> Yep. Could you give a few pointers to Andi where exactly you'd like to see the 
> Intel Xeon functionality added to the EDAC code? There is some Intel 
> functionality there already, but the current upstream code does not look very 
> uptodate. I've looked at e752x_edac.c. (there's some Corei7 work pending, 
> right?) In any case there's a lot of fixing to be done to the Intel code 
> there.

The i7core_edac driver covers the Nehalem chipsets. It is at:
	http://git.kernel.org/?p=linux/kernel/git/mchehab/i7core.git

The error decoding is done by i7core_mce_output_error(). If Xeon 75xx needs to do
something different, it is just a matter of calling a function there, to retrieve
the values from the proper registers.

In order to avoid duplication at the code, the driver is receiving the error events
from the mcelog driver, and having two drivers fighting to access the same error
registers at the same time, this driver relies on mce driver to send the events.

I expect to allocate some time to put the driver into a better shape for
upstream submission soon.

> Memory controller functionality integrated into the kernel proper is a good 
> idea partly for similar reasons an on-die memory controller is a good idea in 
> the hardware space.

I agree. The EDAC module handles memory errors for several memory controllers,
including some on non-x86 architectures. Letting this decode to happen outside
the kernel, just for a very few set of memory controllers, seems a bad idea.

Cheers,
Mauro.

  parent reply	other threads:[~2010-01-27 12:35 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-01-21 22:17 [PATCH] x86: mce: Xeon75xx specific interface to get corrected memory error information Andi Kleen
2010-01-22 10:51 ` [tip:x86/mce] x86, " tip-bot for Andi Kleen
2010-01-22 10:51 ` [tip:x86/mce] x86, mce: Rename cpu_specific_poll to mce_cpu_specific_poll tip-bot for H. Peter Anvin
2010-01-23  5:17   ` Ingo Molnar
2010-01-23  7:58     ` Borislav Petkov
2010-01-23  9:00       ` Ingo Molnar
2010-01-24 10:08         ` Borislav Petkov
2010-01-25 13:19           ` Andi Kleen
2010-01-26  6:33             ` Borislav Petkov
2010-01-26  9:06               ` Hidetoshi Seto
2010-01-26 16:09                 ` Andi Kleen
2010-01-26 15:36               ` Andi Kleen
2010-02-16 21:02           ` Ingo Molnar
2010-02-22  8:28             ` Borislav Petkov
2010-02-22  9:47               ` Ingo Molnar
2010-02-22 11:59                 ` Mauro Carvalho Chehab
2010-02-24 17:42                   ` Mauro Carvalho Chehab
2010-02-24 20:28                     ` Andi Kleen
2010-01-27 12:34         ` Mauro Carvalho Chehab [this message]
2010-01-27 14:39           ` Andi Kleen
2010-01-27 15:04             ` Mauro Carvalho Chehab
2010-01-27 16:36               ` Andi Kleen
2010-01-23 11:33     ` Andi Kleen
2010-02-05 23:31       ` [tip:x86/mce] x86, mce: Make xeon75xx memory driver dependent on PCI tip-bot for Andi Kleen
2010-02-16 20:47         ` Ingo Molnar
2010-02-16 22:29           ` Andi Kleen
2010-02-19 10:50             ` Thomas Gleixner
2010-02-19 12:17               ` Andi Kleen
2010-02-19 12:45                 ` Borislav Petkov
2010-02-19 13:21                   ` Andi Kleen
2010-02-19 15:17                     ` Mauro Carvalho Chehab
2010-02-19 15:37                       ` Andi Kleen
2010-02-20  0:14                         ` Mauro Carvalho Chehab
2010-02-20  9:01                           ` Andi Kleen
2010-02-19 15:46                 ` Thomas Gleixner
2010-02-22  7:38             ` Hidetoshi Seto

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4B6032E7.1090900@redhat.com \
    --to=mchehab@redhat.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=andi@firstfloor.org \
    --cc=andreas.herrmann3@amd.com \
    --cc=aris@redhat.com \
    --cc=arjan@infradead.org \
    --cc=fweisbec@gmail.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=mingo@redhat.com \
    --cc=norsk5@yahoo.com \
    --cc=petkovbb@googlemail.com \
    --cc=seto.hidetoshi@jp.fujitsu.com \
    --cc=tglx@linutronix.de \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).