public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/2] mce-inject: extend mce-inject for support threshold interrupt event injection on AMD platform
@ 2014-10-31  1:24 Chen Yucong
  2014-10-31  1:24 ` [PATCH 1/2] x86, mce: apply MCE MSR wrappers to AMD platform for testing threshold interrupt handler Chen Yucong
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Chen Yucong @ 2014-10-31  1:24 UTC (permalink / raw)
  To: bp; +Cc: tony.luck, ak, gong.chen, linux-edac, linux-kernel



This work is based on Boris's ras-for-3.19 branch.
https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git -b ras-for-3.19

Until now, `mce-inject' has been unable to inject threshold interrupt
events on AMD platforms. That is because Threshold Interrupt and POLL
have separate event handlers there: amd_threshold_interrupt() handles
Threshold Interrupt events, while machine_check_poll() polls for other
events, such as `deferred' errors. The main items of this work are:

  * apply the MCE MSR wrappers to the AMD-specific threshold interrupt
    handler so that it supports mce-inject
  * introduce a new flag, MCJ_INTERRUPT, that separates
    CMCI/Threshold Interrupt from POLL in mce-inject.

Note that the Linux machine check injector tool, mce-inject, should also
be updated to accommodate the above kernel-space changes:
  * [PATCH] separate CMCI/Threshold Interrupt and POLL in mce-inject
  * https://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git


* [PATCH 1/2] x86, mce: apply MCE MSR wrappers to AMD platform for testing threshold interrupt handler
  2014-10-31  1:24 [PATCH 0/2] mce-inject: extend mce-inject for support threshold interrupt event injection on AMD platform Chen Yucong
@ 2014-10-31  1:24 ` Chen Yucong
  2014-10-31  1:24 ` [PATCH 2/2] x86, mce, amd: extend mce-inject for support threshold interrupt event injection on AMD platform Chen Yucong
  2014-10-31  1:24 ` [PATCH] separate CMCI/Threshold Interrupt and POLL in mce-inject Chen Yucong
  2 siblings, 0 replies; 8+ messages in thread
From: Chen Yucong @ 2014-10-31  1:24 UTC (permalink / raw)
  To: bp; +Cc: tony.luck, ak, gong.chen, linux-edac, linux-kernel, Chen Yucong

Until now, the `mce-inject' mechanism has not supported error injection
for threshold interrupt events on AMD platforms.

This patch applies the MCE MSR wrappers to the AMD-specific threshold
interrupt handler so that it supports mce-inject.

Signed-off-by: Chen Yucong <slaoub@gmail.com>
---
 arch/x86/include/asm/mce.h           |    4 ++++
 arch/x86/kernel/cpu/mcheck/mce.c     |   25 +++++++++++++++++++++++--
 arch/x86/kernel/cpu/mcheck/mce_amd.c |    6 +++---
 3 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 276392f..3a430ad 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -185,6 +185,10 @@ enum mcp_flags {
 };
 void machine_check_poll(enum mcp_flags flags, mce_banks_t *b);
 
+u64 mce_rdmsrl(u32 msr);
+void mce_wrmsrl(u32 msr, u64 v);
+int mce_rdmsr_safe(u32 msr, u32 *low, u32 *high);
+
 int mce_notify_irq(void);
 void mce_notify_process(void);
 
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 61a9668ce..b8fe5ae 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -391,7 +391,7 @@ static int msr_to_offset(u32 msr)
 }
 
 /* MSR access wrappers used for error injection */
-static u64 mce_rdmsrl(u32 msr)
+u64 mce_rdmsrl(u32 msr)
 {
 	u64 v;
 
@@ -416,7 +416,7 @@ static u64 mce_rdmsrl(u32 msr)
 	return v;
 }
 
-static void mce_wrmsrl(u32 msr, u64 v)
+void mce_wrmsrl(u32 msr, u64 v)
 {
 	if (__this_cpu_read(injectm.finished)) {
 		int offset = msr_to_offset(msr);
@@ -428,6 +428,27 @@ static void mce_wrmsrl(u32 msr, u64 v)
 	wrmsrl(msr, v);
 }
 
+int mce_rdmsr_safe(u32 msr, u32 *low, u32 *high) 
+{
+	int err = -1;
+	u64 val;
+
+	if (__this_cpu_read(injectm.finished)) {
+		int offset = msr_to_offset(msr);
+
+		/* An MSR that injectm does not cover reads as 0 */
+		val = offset < 0 ? 0 :
+		      *(u64 *)((char *)&__get_cpu_var(injectm) + offset);
+		err = 0;
+	} else
+		err = rdmsrl_safe(msr, &val);
+
+	(*low) = (u32)val;
+	(*high) = (u32)(val >> 32);
+
+	return err;
+}
+
 /*
  * Collect all global (w.r.t. this processor) status about this machine
  * check into our "mce" struct so that we can use it later to assess
diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 6606523..926e8a3 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -292,7 +292,7 @@ static void amd_threshold_interrupt(void)
 				++address;
 			}
 
-			if (rdmsr_safe(address, &low, &high))
+			if (mce_rdmsr_safe(address, &low, &high))
 				break;
 
 			if (!(high & MASK_VALID_HI)) {
@@ -318,12 +318,12 @@ static void amd_threshold_interrupt(void)
 
 log:
 	mce_setup(&m);
-	rdmsrl(MSR_IA32_MCx_STATUS(bank), m.status);
+	m.status = mce_rdmsrl(MSR_IA32_MCx_STATUS(bank));
 	m.misc = ((u64)high << 32) | low;
 	m.bank = bank;
 	mce_log(&m);
 
-	wrmsrl(MSR_IA32_MCx_STATUS(bank), 0);
+	mce_wrmsrl(MSR_IA32_MCx_STATUS(bank), 0);
 }
 
 /*
-- 
1.7.10.4
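
For illustration only, the injection-aware read that this patch adds can
be modelled in user space as below. This is a minimal sketch, not the
kernel code: every sketch_-prefixed name, the two MSR numbers, and the
stub hardware read are assumptions made for the example. Only the overall
flow follows the diff above: use the injected record when its `finished'
flag is set, otherwise read the hardware, then split the 64-bit value
into low and high halves.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical MSR numbers, chosen for this sketch only. */
#define SKETCH_MSR_STATUS 0x401u
#define SKETCH_MSR_MISC   0x403u

/* Cut-down stand-in for the per-CPU injected record (injectm). */
struct sketch_mce {
	uint64_t status;
	uint64_t misc;
	int finished;	/* non-zero once an injected record is armed */
};

static struct sketch_mce sketch_injectm;

/* Stand-in for a real hardware MSR read; always succeeds with 0 here. */
static int sketch_hw_rdmsr(uint32_t msr, uint64_t *val)
{
	(void)msr;
	*val = 0;
	return 0;
}

/* Map an MSR to a field offset in the record, or -1 if not covered. */
static int sketch_msr_to_offset(uint32_t msr)
{
	if (msr == SKETCH_MSR_STATUS)
		return (int)offsetof(struct sketch_mce, status);
	if (msr == SKETCH_MSR_MISC)
		return (int)offsetof(struct sketch_mce, misc);
	return -1;
}

/* Read an MSR as two 32-bit halves, preferring the injected record. */
static int sketch_rdmsr_safe(uint32_t msr, uint32_t *low, uint32_t *high)
{
	uint64_t val = 0;
	int err = 0;

	if (sketch_injectm.finished) {
		int offset = sketch_msr_to_offset(msr);

		/* An MSR the record does not cover reads as 0. */
		if (offset >= 0)
			memcpy(&val, (char *)&sketch_injectm + offset,
			       sizeof(val));
	} else {
		err = sketch_hw_rdmsr(msr, &val);
	}

	*low = (uint32_t)val;
	*high = (uint32_t)(val >> 32);
	return err;
}
```

In this model, arming the record and then calling sketch_rdmsr_safe() for
the MISC register returns the injected halves, which is roughly what
amd_threshold_interrupt() would observe through mce_rdmsr_safe().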



* [PATCH 2/2] x86, mce, amd: extend mce-inject for support threshold interrupt event injection on AMD platform
  2014-10-31  1:24 [PATCH 0/2] mce-inject: extend mce-inject for support threshold interrupt event injection on AMD platform Chen Yucong
  2014-10-31  1:24 ` [PATCH 1/2] x86, mce: apply MCE MSR wrappers to AMD platform for testing threshold interrupt handler Chen Yucong
@ 2014-10-31  1:24 ` Chen Yucong
       [not found]   ` <CAOjmkp9Aec9Ec-93YvT5S_mMaxrOoZSYCDbjyWaxGV_dac6qog@mail.gmail.com>
  2014-10-31  1:24 ` [PATCH] separate CMCI/Threshold Interrupt and POLL in mce-inject Chen Yucong
  2 siblings, 1 reply; 8+ messages in thread
From: Chen Yucong @ 2014-10-31  1:24 UTC (permalink / raw)
  To: bp; +Cc: tony.luck, ak, gong.chen, linux-edac, linux-kernel, Chen Yucong

There are three ways to report a machine check event: MCE,
CMCI/Threshold Interrupt, and POLL. On Intel platforms, CMCI/Threshold
Interrupt and POLL share the same event handler, machine_check_poll().
On AMD platforms, however, they have separate event handlers:
amd_threshold_interrupt() handles Threshold Interrupt events, while
machine_check_poll() polls for other events.

This patch introduces a new flag, MCJ_INTERRUPT, that is used to
separate the CMCI/Threshold Interrupt handler from the POLL handler in
mce-inject.

Signed-off-by: Chen Yucong <slaoub@gmail.com>
---
 arch/x86/include/asm/mce.h              |    5 +++--
 arch/x86/kernel/cpu/mcheck/mce-inject.c |   16 ++++++++++++++++
 arch/x86/kernel/cpu/mcheck/threshold.c  |    1 +
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 3a430ad..cf25839 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -70,8 +70,9 @@
 #define MCJ_CTX_PROCESS		0x1  /* inject context: process */
 #define MCJ_CTX_IRQ		0x2  /* inject context: IRQ */
 #define MCJ_NMI_BROADCAST	0x4  /* do NMI broadcasting */
-#define MCJ_EXCEPTION		0x8  /* raise as exception */
-#define MCJ_IRQ_BROADCAST	0x10 /* do IRQ broadcasting */
+#define MCJ_IRQ_BROADCAST	0x8  /* do IRQ broadcasting */
+#define MCJ_EXCEPTION		0x10  /* raise as exception */
+#define MCJ_INTERRUPT		0x20  /* raise as interruption */
 
 #define MCE_OVERFLOW 0		/* bit 0 in flags means overflow */
 
diff --git a/arch/x86/kernel/cpu/mcheck/mce-inject.c b/arch/x86/kernel/cpu/mcheck/mce-inject.c
index 4cfba43..8428746 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-inject.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-inject.c
@@ -59,6 +59,16 @@ static void raise_poll(struct mce *m)
 	m->finished = 0;
 }
 
+static void raise_interrupt(struct mce *m)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	mce_threshold_vector();
+	local_irq_restore(flags);
+	m->finished = 0;
+}
+
 static void raise_exception(struct mce *m, struct pt_regs *pregs)
 {
 	struct pt_regs regs;
@@ -89,6 +99,8 @@ static int mce_raise_notify(unsigned int cmd, struct pt_regs *regs)
 	cpumask_clear_cpu(cpu, mce_inject_cpumask);
 	if (m->inject_flags & MCJ_EXCEPTION)
 		raise_exception(m, regs);
+	else if (m->inject_flags & MCJ_INTERRUPT)
+		raise_interrupt(m);
 	else if (m->status)
 		raise_poll(m);
 	return NMI_HANDLED;
@@ -132,6 +144,10 @@ static int raise_local(void)
 			ret = -EINVAL;
 		}
 		printk(KERN_INFO "MCE exception done on CPU %d\n", cpu);
+	} else if (m->inject_flags & MCJ_INTERRUPT) {
+		printk(KERN_INFO "Raising threshold interrupt on CPU %d\n", cpu);
+		raise_interrupt(m);
+		printk(KERN_INFO "Threshold interrupt done on CPU %d\n", cpu);
 	} else if (m->status) {
 		printk(KERN_INFO "Starting machine check poll CPU %d\n", cpu);
 		raise_poll(m);
diff --git a/arch/x86/kernel/cpu/mcheck/threshold.c b/arch/x86/kernel/cpu/mcheck/threshold.c
index 7245980..e324bf9 100644
--- a/arch/x86/kernel/cpu/mcheck/threshold.c
+++ b/arch/x86/kernel/cpu/mcheck/threshold.c
@@ -17,6 +17,7 @@ static void default_threshold_interrupt(void)
 }
 
 void (*mce_threshold_vector)(void) = default_threshold_interrupt;
+EXPORT_SYMBOL_GPL(mce_threshold_vector);
 
 static inline void __smp_threshold_interrupt(void)
 {
-- 
1.7.10.4
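
The selection order that the diff above gives mce_raise_notify() and
raise_local() can be sketched in user space as follows. This is only a
hedged model: the two flag values are the ones introduced by the patch,
but the enum, the struct, and every sketch_-prefixed name are assumptions
made for the example. It shows that MCJ_EXCEPTION is tested first, then
MCJ_INTERRUPT, and that a poll happens only when neither flag is set and
the status is non-zero. The interrupt path goes through a function
pointer, in the spirit of mce_threshold_vector, so the same call reaches
whichever handler has been installed.

```c
#include <stdint.h>

/* Flag values as renumbered by the patch. */
#define MCJ_EXCEPTION 0x10
#define MCJ_INTERRUPT 0x20

enum sketch_path {
	SKETCH_NONE,
	SKETCH_EXCEPTION,
	SKETCH_INTERRUPT,
	SKETCH_POLL
};

/* Cut-down stand-in for struct mce. */
struct sketch_dispatch_mce {
	uint64_t status;
	unsigned int inject_flags;
};

static int sketch_vector_calls;

/* Default handler: does nothing in this model. */
static void sketch_default_threshold_interrupt(void)
{
}

/* Stand-in for a vendor handler such as amd_threshold_interrupt(). */
static void sketch_vendor_threshold_interrupt(void)
{
	sketch_vector_calls++;
}

/* Shared hook; a vendor handler may replace the default at init time. */
static void (*sketch_threshold_vector)(void) =
	sketch_default_threshold_interrupt;

/* Pick the injection path in the same order as the patched kernel code. */
static enum sketch_path sketch_raise(const struct sketch_dispatch_mce *m)
{
	if (m->inject_flags & MCJ_EXCEPTION)
		return SKETCH_EXCEPTION;
	if (m->inject_flags & MCJ_INTERRUPT) {
		sketch_threshold_vector();
		return SKETCH_INTERRUPT;
	}
	if (m->status)
		return SKETCH_POLL;
	return SKETCH_NONE;
}
```

Because the call goes through the pointer, nothing in this model is
specific to one vendor: it runs whatever handler was installed.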



* [PATCH] separate CMCI/Threshold Interrupt and POLL in mce-inject
  2014-10-31  1:24 [PATCH 0/2] mce-inject: extend mce-inject for support threshold interrupt event injection on AMD platform Chen Yucong
  2014-10-31  1:24 ` [PATCH 1/2] x86, mce: apply MCE MSR wrappers to AMD platform for testing threshold interrupt handler Chen Yucong
  2014-10-31  1:24 ` [PATCH 2/2] x86, mce, amd: extend mce-inject for support threshold interrupt event injection on AMD platform Chen Yucong
@ 2014-10-31  1:24 ` Chen Yucong
  2 siblings, 0 replies; 8+ messages in thread
From: Chen Yucong @ 2014-10-31  1:24 UTC (permalink / raw)
  To: bp; +Cc: tony.luck, ak, gong.chen, linux-edac, linux-kernel, Chen Yucong

This patch introduces a new flag, MCJ_INTERRUPT, that is used to
separate CMCI/Threshold Interrupt from POLL in mce-inject.

Signed-off-by: Chen Yucong <slaoub@gmail.com>
---
 mce.h   |    5 +++--
 mce.lex |    1 +
 mce.y   |    6 +++++-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/mce.h b/mce.h
index c0668ad..d0bd39a 100644
--- a/mce.h
+++ b/mce.h
@@ -38,8 +38,9 @@
 #define MCJ_CTX_PROCESS		1    /* inject context: process */
 #define MCJ_CTX_IRQ		2    /* inject context: IRQ */
 #define MCJ_NMI_BROADCAST	4    /* do NMI broadcasting */
-#define MCJ_EXCEPTION		8    /* raise as exception */
-#define MCJ_IRQ_BRAODCAST	0x10 /* do IRQ broadcasting */
+#define MCJ_IRQ_BRAODCAST	8    /* do IRQ broadcasting */
+#define MCJ_EXCEPTION		0x10 /* raise as exception */
+#define MCJ_INTERRUPT		0x20 /* raise as interrupt */
 
 #define MCJ_CTX_SET(flags, ctx)				\
 	do {						\
diff --git a/mce.lex b/mce.lex
index ce8a9ae..ce4ea69 100644
--- a/mce.lex
+++ b/mce.lex
@@ -83,6 +83,7 @@ static struct key {
 	KEY(IN_IRQ),
 	KEY(IN_PROC),
 	KEY(POLL),
+	KEY(INTERRUPT),
 	KEY(EXCP),
 	KEYVAL(CORRECTED, MCI_STATUS_VAL|MCI_STATUS_EN), 	// checkme
 	KEYVAL(UNCORRECTED, MCI_STATUS_VAL|MCI_STATUS_UC|MCI_STATUS_EN),
diff --git a/mce.y b/mce.y
index a9421ee..84095a1 100644
--- a/mce.y
+++ b/mce.y
@@ -43,7 +43,7 @@ static void init(void);
 %token STATUS RIP TSC ADDR MISC CPU BANK MCGSTATUS HOLD
 %token NOBROADCAST IRQBROADCAST NMIBROADCAST 
 %token IN_IRQ IN_PROC PROCESSOR TIME SOCKETID APICID MCGCAP
-%token POLL EXCP
+%token POLL INTERRUPT EXCP
 %token CORRECTED UNCORRECTED FATAL MCE
 %token NUMBER
 %token SYMBOL
@@ -94,7 +94,11 @@ mce_term:   STATUS status_list  { m.status = $2; }
      | IN_IRQ		   { MCJ_CTX_SET(m.inject_flags, MCJ_CTX_IRQ); }
      | IN_PROC		   { MCJ_CTX_SET(m.inject_flags, MCJ_CTX_PROCESS); }
      | POLL		   { mce_flags |= MCE_RAISE_MODE;
+			     m.inject_flags &= ~MCJ_INTERRUPT;
 			     m.inject_flags &= ~MCJ_EXCEPTION; }
+     | INTERRUPT	   { mce_flags |= MCE_RAISE_MODE;
+			     m.inject_flags &= ~MCJ_EXCEPTION;
+			     m.inject_flags |= MCJ_INTERRUPT; }
      | EXCP		   { mce_flags |= MCE_RAISE_MODE;
 			     m.inject_flags |= MCJ_EXCEPTION; }
      ;
-- 
1.7.10.4
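
The effect of the three raise-mode keywords on inject_flags, as written
in the mce.y hunk above, can be modelled as a small pure function. This
is a sketch under assumptions: the enum and the function name are
invented for the example, and the `mce_flags |= MCE_RAISE_MODE' side
effect that all three grammar actions share is left out.

```c
/* Flag values from the patched mce.h. */
#define MCJ_EXCEPTION 0x10
#define MCJ_INTERRUPT 0x20

/* Hypothetical names for the three raise-mode keywords. */
enum sketch_keyword { KW_POLL, KW_INTERRUPT, KW_EXCP };

/* Return inject_flags after one keyword, following the grammar actions. */
static unsigned int sketch_apply_keyword(unsigned int inject_flags,
					 enum sketch_keyword kw)
{
	switch (kw) {
	case KW_POLL:
		/* POLL clears both raise flags. */
		inject_flags &= ~(unsigned int)MCJ_INTERRUPT;
		inject_flags &= ~(unsigned int)MCJ_EXCEPTION;
		break;
	case KW_INTERRUPT:
		/* INTERRUPT clears EXCEPTION and sets INTERRUPT. */
		inject_flags &= ~(unsigned int)MCJ_EXCEPTION;
		inject_flags |= MCJ_INTERRUPT;
		break;
	case KW_EXCP:
		/* EXCP only sets EXCEPTION; INTERRUPT is left untouched. */
		inject_flags |= MCJ_EXCEPTION;
		break;
	}
	return inject_flags;
}
```

One thing the model makes visible is that EXCP after INTERRUPT leaves
both flags set. With the kernel-side patch in this series that still
selects the exception path, since MCJ_EXCEPTION is tested first there.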



* Re: [PATCH 2/2] x86, mce, amd: extend mce-inject for support threshold interrupt event injection on AMD platform
       [not found]   ` <CAOjmkp9Aec9Ec-93YvT5S_mMaxrOoZSYCDbjyWaxGV_dac6qog@mail.gmail.com>
@ 2014-11-03 17:51     ` Aravind Gopalakrishnan
  2014-11-03 18:00       ` Borislav Petkov
  2014-11-04  1:39       ` Chen Yucong
  0 siblings, 2 replies; 8+ messages in thread
From: Aravind Gopalakrishnan @ 2014-11-03 17:51 UTC (permalink / raw)
  To: Borislav Petkov, Chen Yucong; +Cc: tony.luck, ak, gong.chen, linux-edac, LKML

On 11/3/2014 11:05 AM, Aravind Gopalakrishnan wrote:
>
> There are three ways that have been used to report machine check event.
> And they are MCE, CMCI/Threshold Interrupt, and POLL. On the Intel
> platform, CMCI/Threshold Interrupt and POLL share the same event handler
> - machine_check_poll(). However, on the AMD platform, they have a
> separate event handler. amd_threshold_interrupt() is used for handling
> Threshold Interrupt event. And machine_check_poll() has been used for
> polling other events.
>
> This patch introduces a new flag MCJ_INTERRUPT that will be used to
> separate CMCI/Threshold Interrupt and POLL handler in mce-inject.
>
> Signed-off-by: Chen Yucong <slaoub@gmail.com <mailto:slaoub@gmail.com>>
> ---
>  arch/x86/include/asm/mce.h              |    5 +++--
>  arch/x86/kernel/cpu/mcheck/mce-inject.c |   16 ++++++++++++++++
>  arch/x86/kernel/cpu/mcheck/threshold.c  |    1 +
>  3 files changed, 20 insertions(+), 2 deletions(-)
>


We currently test the decoding logic on AMD by performing MCE injections
using edac/mce_amd_inj.c. So instead of modifying mce-inject just for
testing amd_threshold_interrupt(), why not put it under mce_amd_inj?
(It's AMD-specific code anyway.)


> diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
> index 3a430ad..cf25839 100644
> --- a/arch/x86/include/asm/mce.h
> +++ b/arch/x86/include/asm/mce.h
> @@ -70,8 +70,9 @@
>  #define MCJ_CTX_PROCESS                0x1  /* inject context: process */
>  #define MCJ_CTX_IRQ            0x2  /* inject context: IRQ */
>  #define MCJ_NMI_BROADCAST      0x4  /* do NMI broadcasting */
> -#define MCJ_EXCEPTION          0x8  /* raise as exception */
> -#define MCJ_IRQ_BROADCAST      0x10 /* do IRQ broadcasting */
> +#define MCJ_IRQ_BROADCAST      0x8  /* do IRQ broadcasting */
> +#define MCJ_EXCEPTION          0x10  /* raise as exception */
> +#define MCJ_INTERRUPT          0x20  /* raise as interruption */
>
>  #define MCE_OVERFLOW 0         /* bit 0 in flags means overflow */
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce-inject.c 
> b/arch/x86/kernel/cpu/mcheck/mce-inject.c
> index 4cfba43..8428746 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce-inject.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce-inject.c
> @@ -59,6 +59,16 @@ static void raise_poll(struct mce *m)
>         m->finished = 0;
>  }
>
> +static void raise_interrupt(struct mce *m)
> +{
> +       unsigned long flags;
> +
> +       local_irq_save(flags);
> +       mce_threshold_vector();
> +       local_irq_restore(flags);
> +       m->finished = 0;
> +}
> +
>  static void raise_exception(struct mce *m, struct pt_regs *pregs)
>  {
>         struct pt_regs regs;
> @@ -89,6 +99,8 @@ static int mce_raise_notify(unsigned int cmd, struct 
> pt_regs *regs)
>         cpumask_clear_cpu(cpu, mce_inject_cpumask);
>         if (m->inject_flags & MCJ_EXCEPTION)
>                 raise_exception(m, regs);
> +       else if (m->inject_flags & MCJ_INTERRUPT)
> +               raise_interrupt(m);
>         else if (m->status)
>                 raise_poll(m);
>         return NMI_HANDLED;
> @@ -132,6 +144,10 @@ static int raise_local(void)
>                         ret = -EINVAL;
>                 }
>                 printk(KERN_INFO "MCE exception done on CPU %d\n", cpu);
> +       } else if (m->inject_flags & MCJ_INTERRUPT) {
> +               printk(KERN_INFO "Raising threshold interrupt on CPU 
> %d\n", cpu);
> +               raise_interrupt(m);
> +               printk(KERN_INFO "Threshold interrupt done on CPU 
> %d\n", cpu);
>         } else if (m->status) {
>                 printk(KERN_INFO "Starting machine check poll CPU 
> %d\n", cpu);
>                 raise_poll(m);
> diff --git a/arch/x86/kernel/cpu/mcheck/threshold.c 
> b/arch/x86/kernel/cpu/mcheck/threshold.c
> index 7245980..e324bf9 100644
> --- a/arch/x86/kernel/cpu/mcheck/threshold.c
> +++ b/arch/x86/kernel/cpu/mcheck/threshold.c
> @@ -17,6 +17,7 @@ static void default_threshold_interrupt(void)
>  }
>
>  void (*mce_threshold_vector)(void) = default_threshold_interrupt;
> +EXPORT_SYMBOL_GPL(mce_threshold_vector);
>
>  static inline void __smp_threshold_interrupt(void)
>  {
> --
> 1.7.10.4
>



* Re: [PATCH 2/2] x86, mce, amd: extend mce-inject for support threshold interrupt event injection on AMD platform
  2014-11-03 17:51     ` Aravind Gopalakrishnan
@ 2014-11-03 18:00       ` Borislav Petkov
  2014-11-04  2:02         ` Chen Yucong
  2014-11-04  1:39       ` Chen Yucong
  1 sibling, 1 reply; 8+ messages in thread
From: Borislav Petkov @ 2014-11-03 18:00 UTC (permalink / raw)
  To: Aravind Gopalakrishnan
  Cc: Chen Yucong, tony.luck, ak, gong.chen, linux-edac, LKML

On Mon, Nov 03, 2014 at 11:51:47AM -0600, Aravind Gopalakrishnan wrote:
> On 11/3/2014 11:05 AM, Aravind Gopalakrishnan wrote:
> >
> >There are three ways that have been used to report machine check event.
> >And they are MCE, CMCI/Threshold Interrupt, and POLL. On the Intel
> >platform, CMCI/Threshold Interrupt and POLL share the same event handler
> >- machine_check_poll(). However, on the AMD platform, they have a
> >separate event handler. amd_threshold_interrupt() is used for handling
> >Threshold Interrupt event. And machine_check_poll() has been used for
> >polling other events.
> >
> >This patch introduces a new flag MCJ_INTERRUPT that will be used to
> >separate CMCI/Threshold Interrupt and POLL handler in mce-inject.
> >
> >Signed-off-by: Chen Yucong <slaoub@gmail.com <mailto:slaoub@gmail.com>>
> >---
> > arch/x86/include/asm/mce.h              |    5 +++--
> > arch/x86/kernel/cpu/mcheck/mce-inject.c |   16 ++++++++++++++++
> > arch/x86/kernel/cpu/mcheck/threshold.c  |    1 +
> > 3 files changed, 20 insertions(+), 2 deletions(-)
> >
> 
> 
> We currently test decoding logic on AMD by performing mce injections using
> edac/mce_amd_inj.c,
> So instead of modifying mce-inject just for testing
> amd_threshold_interrupt(),
> Why not put it under mce_amd_inj? (It's AMD specific code anyway)

Right, I think this is supposed to be vendor-agnostic as it is calling
mce_threshold_vector() directly.

Btw, I wouldn't mind if someone would sit down and unify those injection
methods and come up with a saner interface which can actually be used by
humans, not those yucky files you feed mce-inject with...

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


* Re: [PATCH 2/2] x86, mce, amd: extend mce-inject for support threshold interrupt event injection on AMD platform
  2014-11-03 17:51     ` Aravind Gopalakrishnan
  2014-11-03 18:00       ` Borislav Petkov
@ 2014-11-04  1:39       ` Chen Yucong
  1 sibling, 0 replies; 8+ messages in thread
From: Chen Yucong @ 2014-11-04  1:39 UTC (permalink / raw)
  To: Aravind Gopalakrishnan
  Cc: Borislav Petkov, tony.luck, ak, gong.chen, linux-edac, LKML

On Mon, 2014-11-03 at 11:51 -0600, Aravind Gopalakrishnan wrote:
> On 11/3/2014 11:05 AM, Aravind Gopalakrishnan wrote:
> >
> > There are three ways that have been used to report machine check event.
> > And they are MCE, CMCI/Threshold Interrupt, and POLL. On the Intel
> > platform, CMCI/Threshold Interrupt and POLL share the same event handler
> > - machine_check_poll(). However, on the AMD platform, they have a
> > separate event handler. amd_threshold_interrupt() is used for handling
> > Threshold Interrupt event. And machine_check_poll() has been used for
> > polling other events.
> >
> > This patch introduces a new flag MCJ_INTERRUPT that will be used to
> > separate CMCI/Threshold Interrupt and POLL handler in mce-inject.
> >
> > Signed-off-by: Chen Yucong <slaoub@gmail.com <mailto:slaoub@gmail.com>>
> > ---
> >  arch/x86/include/asm/mce.h              |    5 +++--
> >  arch/x86/kernel/cpu/mcheck/mce-inject.c |   16 ++++++++++++++++
> >  arch/x86/kernel/cpu/mcheck/threshold.c  |    1 +
> >  3 files changed, 20 insertions(+), 2 deletions(-)
> >
> 
> 
> We currently test decoding logic on AMD by performing mce injections 
> using edac/mce_amd_inj.c,
> So instead of modifying mce-inject just for testing 
> amd_threshold_interrupt(),
> Why not put it under mce_amd_inj? (It's AMD specific code anyway)
> 
Until now, edac/mce_amd_inj.c has only been used for testing the EDAC
decoding logic on AMD. But other tools, such as `rasdaemon' and
`mcelog', can also decode machine check error information. If we want
to use mce_amd_inj.c for error injection, we may need to move it.

In addition, the EDAC decoding logic does not need to access the machine
check specific MSRs, so edac/mce_amd_inj.c can work well for error
injection.

Finally, amd_threshold_interrupt is AMD-specific code, just as
intel_threshold_interrupt is Intel-specific code.

thx!
cyc




* Re: [PATCH 2/2] x86, mce, amd: extend mce-inject for support threshold interrupt event injection on AMD platform
  2014-11-03 18:00       ` Borislav Petkov
@ 2014-11-04  2:02         ` Chen Yucong
  0 siblings, 0 replies; 8+ messages in thread
From: Chen Yucong @ 2014-11-04  2:02 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Aravind Gopalakrishnan, tony.luck, ak, gong.chen, linux-edac,
	LKML

On Mon, 2014-11-03 at 19:00 +0100, Borislav Petkov wrote:
> On Mon, Nov 03, 2014 at 11:51:47AM -0600, Aravind Gopalakrishnan wrote:
> > On 11/3/2014 11:05 AM, Aravind Gopalakrishnan wrote:
> > >
> > >There are three ways that have been used to report machine check event.
> > >And they are MCE, CMCI/Threshold Interrupt, and POLL. On the Intel
> > >platform, CMCI/Threshold Interrupt and POLL share the same event handler
> > >- machine_check_poll(). However, on the AMD platform, they have a
> > >separate event handler. amd_threshold_interrupt() is used for handling
> > >Threshold Interrupt event. And machine_check_poll() has been used for
> > >polling other events.
> > >
> > >This patch introduces a new flag MCJ_INTERRUPT that will be used to
> > >separate CMCI/Threshold Interrupt and POLL handler in mce-inject.
> > >
> > >Signed-off-by: Chen Yucong <slaoub@gmail.com <mailto:slaoub@gmail.com>>
> > >---
> > > arch/x86/include/asm/mce.h              |    5 +++--
> > > arch/x86/kernel/cpu/mcheck/mce-inject.c |   16 ++++++++++++++++
> > > arch/x86/kernel/cpu/mcheck/threshold.c  |    1 +
> > > 3 files changed, 20 insertions(+), 2 deletions(-)
> > >
> > 
> > 
> > We currently test decoding logic on AMD by performing mce injections using
> > edac/mce_amd_inj.c,
> > So instead of modifying mce-inject just for testing
> > amd_threshold_interrupt(),
> > Why not put it under mce_amd_inj? (It's AMD specific code anyway)
> 
> Right, I think this is supposed to be vendor-agnostic as it is calling
> mce_threshold_vector() directly.
> 
I'm not sure I understand your point. But mce_threshold_vector is shared
by AMD and Intel. 

> Btw, I wouldn't mind if someone would sit down and unify those injection
> methods and come up with a saner interface which can actually be used by
> humans, not those yucky files you feed mce-inject with...
> 
Anyway, I think it can work well for testing the EDAC/rasdaemon/mcelog
decoding logic. So I suggest you try it, and you can add it to your
list of test tools.

thx!
cyc

