From: Borislav Petkov <bp@amd64.org>
To: Tony Luck <tony.luck@intel.com>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
	"Huang, Ying" <ying.huang@intel.com>,
	Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Subject: Re: [PATCH 6/6] x86, mce: Recognise machine check bank signature for data path error
Date: Wed, 14 Dec 2011 16:47:38 +0100
Message-ID: <20111214154738.GF23589@aftab>
In-Reply-To: <9fc2d80b67a31c7b3c25bdabb20c361443113c04.1323803130.git.tony.luck@intel.com>

On Thu, Dec 08, 2011 at 02:49:09PM -0800, Tony Luck wrote:
> Action required data path signature is defined in table 15-19 of SDM:
> 
> +-----------------------------------------------------------------------------+
> | SRAR Error | Valid | OVER | UC | EN | MISCV | ADDRV | PCC | S | AR | MCACOD |
> | Data Load  |     1 |    0 |  1 |  1 |     1 |     1 |   0 | 1 |  1 |  0x134 |
> +-----------------------------------------------------------------------------+
> 
> Recognise this, and pass MCE_AR_SEVERITY code back to do_machine_check()
> 
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
>  arch/x86/kernel/cpu/mcheck/mce-severity.c |   14 +++++++++++++-
>  1 files changed, 13 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
> index 7395d5f..c4d8b24 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
> @@ -54,6 +54,7 @@ static struct severity {
>  #define  MASK(x, y)	.mask = x, .result = y
>  #define MCI_UC_S (MCI_STATUS_UC|MCI_STATUS_S)
>  #define MCI_UC_SAR (MCI_STATUS_UC|MCI_STATUS_S|MCI_STATUS_AR)
> +#define	MCI_ADDR (MCI_STATUS_ADDRV|MCI_STATUS_MISCV)
>  #define MCACOD 0xffff
>  
>  	MCESEV(
> @@ -102,11 +103,22 @@ static struct severity {
>  		SER, BITCLR(MCI_STATUS_S)
>  		),
>  
> -	/* AR add known MCACODs here */
>  	MCESEV(
>  		PANIC, "Action required with lost events",
>  		SER, BITSET(MCI_STATUS_OVER|MCI_UC_SAR)
>  		),
> +
> +	/* known AR MCACODs: */
> +	MCESEV(
> +		KEEP, "HT thread notices Action required: data load error",
> +		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|0x0134),
> +		MCGMASK(MCG_STATUS_EIPV, 0)

Oh, this is the case where the core "observed" the error, ok.

This is marked as MCE_KEEP_SEVERITY, which means that we panic if we
lose the AR error on the affected CPU, which should be conservative
enough...

ACK.

> +		),
> +	MCESEV(
> +		AR, "Action required: data load error",
> +		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|0x0134),
> +		USER
> +		),
>  	MCESEV(
>  		PANIC, "Action required: unknown MCACOD",
>  		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR, MCI_UC_SAR)
> -- 
> 1.7.3.1
> 
> 
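
For readers following along, here is a minimal sketch of how a mask/result
table like the one quoted above sorts a bank status word into a severity.
It is an illustration, not the kernel code: struct sev_rule, rule_matches()
and classify() are hypothetical names, the SER and USER/context checks are
left out, and the bit positions are taken from the MCi_STATUS layout that
the SDM table in the commit message refers to.

```c
#include <stddef.h>
#include <stdint.h>

/* MCi_STATUS / MCG_STATUS bits used by the quoted rules. */
#define MCI_STATUS_OVER   (1ULL << 62)
#define MCI_STATUS_UC     (1ULL << 61)
#define MCI_STATUS_EN     (1ULL << 60)
#define MCI_STATUS_MISCV  (1ULL << 59)
#define MCI_STATUS_ADDRV  (1ULL << 58)
#define MCI_STATUS_S      (1ULL << 56)
#define MCI_STATUS_AR     (1ULL << 55)
#define MCG_STATUS_EIPV   (1ULL << 1)

#define MCI_UC_SAR (MCI_STATUS_UC | MCI_STATUS_S | MCI_STATUS_AR)
#define MCI_ADDR   (MCI_STATUS_ADDRV | MCI_STATUS_MISCV)
#define MCACOD     0xffffULL

enum sev { SEV_NONE, SEV_KEEP, SEV_AR, SEV_PANIC };

/* Hypothetical, simplified stand-in for one table entry. */
struct sev_rule {
	enum sev sev;
	uint64_t mask, result;     /* compared against MCi_STATUS */
	uint64_t mcgmask, mcgres;  /* compared against MCG_STATUS */
};

/* Same order as in the patch: the first matching rule wins. */
static const struct sev_rule rules[] = {
	/* Action required with lost events */
	{ SEV_PANIC, MCI_STATUS_OVER | MCI_UC_SAR,
		     MCI_STATUS_OVER | MCI_UC_SAR, 0, 0 },
	/* Data load error seen with EIPV clear */
	{ SEV_KEEP,  MCI_STATUS_OVER | MCI_UC_SAR | MCI_ADDR | MCACOD,
		     MCI_UC_SAR | MCI_ADDR | 0x0134, MCG_STATUS_EIPV, 0 },
	/* Data load error (context check omitted in this sketch) */
	{ SEV_AR,    MCI_STATUS_OVER | MCI_UC_SAR | MCI_ADDR | MCACOD,
		     MCI_UC_SAR | MCI_ADDR | 0x0134, 0, 0 },
	/* Action required: unknown MCACOD */
	{ SEV_PANIC, MCI_STATUS_OVER | MCI_UC_SAR, MCI_UC_SAR, 0, 0 },
};

/* A rule matches when the masked bits equal the expected values. */
static int rule_matches(const struct sev_rule *r, uint64_t status,
			uint64_t mcgstatus)
{
	return (status & r->mask) == r->result &&
	       (mcgstatus & r->mcgmask) == r->mcgres;
}

static enum sev classify(uint64_t status, uint64_t mcgstatus)
{
	for (size_t i = 0; i < sizeof(rules) / sizeof(rules[0]); i++)
		if (rule_matches(&rules[i], status, mcgstatus))
			return rules[i].sev;
	return SEV_NONE;
}
```

With these definitions, a status word of EN | MCI_UC_SAR | MCI_ADDR | 0x134
lands on the KEEP rule when EIPV is clear and on the AR rule when EIPV is
set. The same word with OVER set, or with a different MCACOD, hits one of
the PANIC rules instead.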

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551
