All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
To: Huang Ying <ying.huang@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>, "H. Peter Anvin" <hpa@zytor.com>,
	Andi Kleen <ak@linux.intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [BUGFIX -v2] x86, mce, inject: Make injected mce valid only during faked handler call
Date: Mon, 28 Sep 2009 15:40:23 +0900	[thread overview]
Message-ID: <4AC05A57.1080505@jp.fujitsu.com> (raw)
In-Reply-To: <1254100882.15717.1312.camel@yhuang-dev.sh.intel.com>

Hi Huang,

Huang Ying wrote:
> In the current implementation, injected MCE is valid from the point
> the MCE is injected to the point the MCE is processed by the faked
> handler call.
> 
> This has an undesired side-effect: it is possible for it to be
> consumed by real machine_check_poll. This may confuse a real system
> error and may confuse the mce test suite.
> 
> To fix this, this patch introduces another flag MCJ_VALID to indicate
                                                  ^^^^^^^^^
MCJ_LOADED?

> that the MCE entry is valid for injector but not for the
> handler. Another flag, mce.finished is used to indicate the MCE entry
> is valid for the handler.
> 
> mce.finished is enabled only during faked MCE handler call and
> protected by IRQ disabling. This make it impossible for real
> machine_check_poll to consume it.

Are there the reverse case - is it possible that the faked handler
call might consume real error which is not handled yet by the real
machine_check_poll?

> 
> Signed-off-by: Huang Ying <ying.huang@intel.com>
> 
> v2:
> - Revise commit changelog (Thanks Ingo)
> - Change naming (XX_BIT for bit definition)
> 
> ---
>  arch/x86/include/asm/mce.h              |   17 +++++++++++------
>  arch/x86/kernel/cpu/mcheck/mce-inject.c |   23 ++++++++++++++++-------
>  2 files changed, 27 insertions(+), 13 deletions(-)
> 
> --- a/arch/x86/include/asm/mce.h
> +++ b/arch/x86/include/asm/mce.h
> @@ -38,13 +38,18 @@
>  #define MCM_ADDR_MEM	 3	/* memory address */
>  #define MCM_ADDR_GENERIC 7	/* generic */
>  
> -#define MCJ_CTX_MASK		3
> +#define MCJ_NMI_BROADCAST_BIT	2    /* do NMI broadcasting */
> +#define MCJ_EXCEPTION_BIT	3    /* raise as exception */
> +#define MCJ_LOADED_BIT		4    /* entry is valid for injector */
> +
> +#define MCJ_CTX_MASK		0x03
>  #define MCJ_CTX(flags)		((flags) & MCJ_CTX_MASK)
> -#define MCJ_CTX_RANDOM		0    /* inject context: random */
> -#define MCJ_CTX_PROCESS		1    /* inject context: process */
> -#define MCJ_CTX_IRQ		2    /* inject context: IRQ */
> -#define MCJ_NMI_BROADCAST	4    /* do NMI broadcasting */
> -#define MCJ_EXCEPTION		8    /* raise as exception */
> +#define MCJ_CTX_RANDOM		0x00    /* inject context: random */
> +#define MCJ_CTX_PROCESS		0x01    /* inject context: process */
> +#define MCJ_CTX_IRQ		0x02    /* inject context: IRQ */
> +#define MCJ_NMI_BROADCAST	(1 << MCJ_NMI_BROADCAST_BIT)
> +#define MCJ_EXCEPTION		(1 << MCJ_EXCEPTION_BIT)
> +#define MCJ_LOADED		(1 << MCJ_LOADED_BIT)

I'd like to see a patch to replace MCJ_* to MCE_INJ_* before
adding new flag.

>  
>  /* Fields are zero when not available */
>  struct mce {
> --- a/arch/x86/kernel/cpu/mcheck/mce-inject.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce-inject.c
> @@ -32,16 +32,16 @@ static void inject_mce(struct mce *m)
>  
>  	/* Make sure noone reads partially written injectm */
>  	i->finished = 0;
> +	clear_bit(MCJ_LOADED_BIT, (unsigned long *)&i->inject_flags);
>  	mb();
>  	m->finished = 0;
> -	/* First set the fields after finished */
> +	clear_bit(MCJ_LOADED_BIT, (unsigned long *)&m->inject_flags);
>  	i->extcpu = m->extcpu;
>  	mb();
> -	/* Now write record in order, finished last (except above) */
>  	memcpy(i, m, sizeof(struct mce));
>  	/* Finally activate it */
>  	mb();
> -	i->finished = 1;
> +	set_bit(MCJ_LOADED_BIT, (unsigned long *)&i->inject_flags);

Why
  clear_bit(MCJ_LOADED_BIT, (unsigned long *)&m->inject_flags);
  set_bit(MCJ_LOADED_BIT, (unsigned long *)&i->inject_flags);
cannot be
  m->inject_flags &= ~MCJ_LOADED;
  m->inject_flags |= MCJ_LOADED;
?

If it can, defined *_BIT will not be necessary here.

>  }
>  
>  static void raise_poll(struct mce *m)
> @@ -51,9 +51,11 @@ static void raise_poll(struct mce *m)
>  
>  	memset(&b, 0xff, sizeof(mce_banks_t));
>  	local_irq_save(flags);
> +	m->finished = 1;
>  	machine_check_poll(0, &b);
> -	local_irq_restore(flags);
>  	m->finished = 0;
> +	clear_bit(MCJ_LOADED_BIT, (unsigned long *)&m->inject_flags);
> +	local_irq_restore(flags);
>  }

I think the "finished" is not good name. (I suppose it is named
after "loading data to structure have been finished" or so.)
And also I think "MCJ_LOADED" is not good name, because I could not
figure out the difference between "loading finished" and "loaded."

OTOH in the course of nature mce_rdmsrl() returns injected data
anytime if data is loaded, because it is defined to do so.
    304 /* MSR access wrappers used for error injection */
    305 static u64 mce_rdmsrl(u32 msr)
    306 {
    307         u64 v;
    308
    309         if (__get_cpu_var(injectm).finished) {

I believe what you want to do here is "make mce_rdmsrl()/mce_wrmsrl()
to refer faked data only during faked handler call."
Then what we have to do is making a flag to indicate that "now
in faked handler call," for an example:

    309         if (__get_cpu_var(mce_fake_in_progress)) {

and:
	local_irq_save(flags);
	__get_cpu_var(mce_fake_in_progress) = 1;
	machine_check_poll(0, &b);
	__get_cpu_var(mce_fake_in_progress) = 0;
	local_irq_restore(flags);

>  
>  static void raise_exception(struct mce *m, struct pt_regs *pregs)
> @@ -69,9 +71,11 @@ static void raise_exception(struct mce *
>  	}
>  	/* in mcheck exeception handler, irq will be disabled */
>  	local_irq_save(flags);
> +	m->finished = 1;
>  	do_machine_check(pregs, 0);
> -	local_irq_restore(flags);
>  	m->finished = 0;
> +	clear_bit(MCJ_LOADED_BIT, (unsigned long *)&m->inject_flags);
> +	local_irq_restore(flags);
>  }
>  
>  static cpumask_t mce_inject_cpumask;
> @@ -89,6 +93,8 @@ static int mce_raise_notify(struct notif
>  		raise_exception(m, args->regs);
>  	else if (m->status)
>  		raise_poll(m);
> +	else
> +		clear_bit(MCJ_LOADED_BIT, (unsigned long *)&m->inject_flags);
>  	return NOTIFY_STOP;
>  }
>  
> @@ -129,7 +135,7 @@ static int raise_local(void)
>  		mce_notify_irq();
>  		printk(KERN_INFO "Machine check poll done on CPU %d\n", cpu);
>  	} else
> -		m->finished = 0;
> +		clear_bit(MCJ_LOADED_BIT, (unsigned long *)&m->inject_flags);
>  
>  	return ret;
>  }
> @@ -152,10 +158,13 @@ static void raise_mce(struct mce *m)
>  		cpu_clear(get_cpu(), mce_inject_cpumask);
>  		for_each_online_cpu(cpu) {
>  			struct mce *mcpu = &per_cpu(injectm, cpu);
> -			if (!mcpu->finished ||
> +			if (!test_bit(MCJ_LOADED_BIT,
> +				      (unsigned long *)&mcpu->inject_flags) ||
>  			    MCJ_CTX(mcpu->inject_flags) != MCJ_CTX_RANDOM)
>  				cpu_clear(cpu, mce_inject_cpumask);
>  		}
> +		/* make sure needed data is available on other CPUs */
> +		smp_mb();

What data are you taking care here for?


Thanks,
H.Seto



  reply	other threads:[~2009-09-28  6:41 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-09-28  1:21 [BUGFIX -v2] x86, mce, inject: Make injected mce valid only during faked handler call Huang Ying
2009-09-28  6:40 ` Hidetoshi Seto [this message]
2009-09-28  7:09   ` Huang Ying
2009-09-28  8:02     ` Hidetoshi Seto
2009-09-28  8:27       ` Huang Ying
2009-09-28  8:59         ` Hidetoshi Seto
2009-09-28  9:15           ` Huang Ying
2009-09-28  6:46 ` [PATCH 0/5] x86, mce-inject: misc fix Hidetoshi Seto
2009-09-28  6:51   ` [PATCH 1/5] mce-inject: replace MCJ_ to MCE_INJ_ Hidetoshi Seto
2009-09-28  6:51   ` [PATCH 2/5] x86, mce: rename finished to valid in struct mce Hidetoshi Seto
2009-09-28 18:24     ` Andi Kleen
2009-09-29  8:40       ` Hidetoshi Seto
2009-09-29 23:02         ` Andi Kleen
2009-09-29 23:04         ` [PATCH 2/5] x86, mce: rename finished to valid in struct mce II Andi Kleen
2009-09-29 23:20           ` H. Peter Anvin
2009-09-30  4:39             ` Andi Kleen
2009-10-01 11:08               ` Ingo Molnar
2009-09-28  6:52   ` [PATCH 3/5] mce-inject: make injected mce valid only during faked handler call Hidetoshi Seto
2009-09-28  7:27     ` Huang Ying
2009-09-28 18:50     ` Andi Kleen
2009-09-29  8:42       ` Hidetoshi Seto
2009-09-29 20:45         ` Andi Kleen
2009-09-28  6:53   ` [PATCH 4/5] mce-inject: no wait on write with MCE_INJ_CTX_RANDOM Hidetoshi Seto
2009-09-28  6:54   ` [PATCH 5/5] mce-inject: allow injecting status=0 to poll handler Hidetoshi Seto
2009-10-05  2:52 ` [PATCH 0/6] x86, mce, mce-inject: misc fix v2 Hidetoshi Seto
2009-10-05  3:05   ` [PATCH 1/6] x86, mce: replace MCJ_ to MCE_INJ_ Hidetoshi Seto
2009-10-05  3:06   ` [PATCH 2/6] x86, mce: replace MCM_ to MCI_MISC_ Hidetoshi Seto
2009-10-05  3:07   ` [PATCH 3/6] mce-inject: no wait on write with MCE_INJ_CTX_RANDOM Hidetoshi Seto
2009-10-05  3:07   ` [PATCH 4/6] mce-inject: allow injecting status=0 to poll handler Hidetoshi Seto
2009-10-05  3:08   ` [PATCH 5/6] mce-inject: add a barrier to raise_mce() Hidetoshi Seto
2009-10-05  3:10   ` [PATCH 6/6] mce-inject: use injected mce only during faked handler call Hidetoshi Seto
2009-10-09  1:54     ` Huang Ying
2009-10-09  5:38       ` Hidetoshi Seto
2009-10-09  5:44         ` [PATCH 1/4] mce-inject: make raise_global() Hidetoshi Seto
2009-10-09  5:45         ` [PATCH 2/4] mce-inject: use individual members instead of struct mce Hidetoshi Seto
2009-10-09  6:50           ` Huang Ying
2009-10-09  7:18             ` Hidetoshi Seto
2009-10-09  5:45         ` [PATCH 3/4] mce-inject: change msr_to_offset() to mce_get_fake_reg() Hidetoshi Seto
2009-10-09  5:46         ` [PATCH 4/4] mce-inject: support injecting multiple error to a CPU Hidetoshi Seto
2009-10-09  7:14     ` [PATCH 6/6] mce-inject: use injected mce only during faked handler call Huang Ying
2009-10-09  7:27       ` Hidetoshi Seto
2009-10-09  7:44         ` Huang Ying
2009-10-09  8:31           ` Hidetoshi Seto
2009-10-09  9:11             ` Huang Ying
2009-10-13  2:34               ` Hidetoshi Seto
2009-10-13  3:28                 ` Huang Ying
2009-10-13  6:00                   ` Hidetoshi Seto
2009-10-13  6:19                     ` Huang Ying
2009-10-13  6:29                       ` Ingo Molnar
2009-10-13  7:19                         ` [RFC] x86, mce: use of TRACE_EVENT for mce Hidetoshi Seto
2009-10-13  8:43                           ` Ingo Molnar
2009-10-13  8:46                           ` [tip:perf/mce] perf_event, x86, mce: Use TRACE_EVENT() for MCE logging tip-bot for Hidetoshi Seto

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4AC05A57.1080505@jp.fujitsu.com \
    --to=seto.hidetoshi@jp.fujitsu.com \
    --cc=ak@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.