Re: [PATCH 09/10] MCE: run through processors with more severe problems first

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
To: Tony Luck <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>, Borislav Petkov <bp@amd64.org>,
	linux-kernel@vger.kernel.org, "Huang,
	Ying" <ying.huang@intel.com>, Avi Kivity <avi@redhat.com>
Subject: Re: [PATCH 09/10] MCE: run through processors with more severe problems first
Date: Tue, 14 Jun 2011 10:27:18 +0900	[thread overview]
Message-ID: <4DF6B8F6.2000902@jp.fujitsu.com> (raw)
In-Reply-To: <4df6892b12944b314b@agluck-desktop.sc.intel.com>

(2011/06/14 7:03), Tony Luck wrote:
>>> Or how about checking rip in each mces_seen?
>>
>> This is equivalent to what I did - but I think the code
>> will be cleaner. I'll give it a try.
> 
> Here's a patch on top of my previous series that just looks at
> mces_seen to choose the order. Obviously I'd fold this into the
> other patch for a final version - but this one lets you see what
> the "mce_nextcpu()" function would look like (and how removing
> the bitmaps cleans up the other parts of the code). It does look
> better to me.
> 
> Seto-san: Does this fit with what you were thinking?
> 
> Compile tested only.
> 
> -Tony
> 
> ---
> 
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> index a7a8c53..6b4176b 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -791,31 +791,47 @@ static void mce_reign(void)
>  
>  static atomic_t global_nwo;
>  
> -/*
> - * Keep separate bitmaps for cpus that have the option return from
> - * machine check handler (MCG_STATUS.RIPV == 1) and those for that
> - * cannot.
> - */
> -static cpumask_t	can_return;
> -static cpumask_t	cant_return;
> -
>  static int	monarch;
>  
>  /*
> - * next cpu choosing first from cant_return, and then from can_return
> + * Find next cpu that will run through the core of do_machine_check()
> + * checking all the banks of machine check registers. We first take
> + * cpus with serious problems (as indicated by MCG_STATUS_RIPV being
> + * clear in the mcgstatus register). A second pass through mces_seen
> + * is made to process the remaining cpus.
> + * We do this because some machine check banks are shared between cpus,
> + * and it is better to find the error on the cpu that has the problem
> + * and clear the bank so that the innocent bystanders do not have to
> + * worry about errors that do not affect them.

BTW in case of "no_way_out" events, we don't clear banks because they
could be carried over to the next boot (expecting logged as bootlog).
So we may need to have some trick for some known cases; e.g. ignore
observed AR by bystanders, anyway.

>   */
> -int mce_nextcpu(int this)
> +int mce_nextcpu(int cur)
>  {
> -	int next;
> +	struct mce	*m;
> +	int		cpu = cur;
> +	u64	mask = MCG_STATUS_MCIP;

Why do you check the MCG_STATUS_MCIP too here?
What happens if there is a problematic cpu that could not read
MCG register properly so indicates "PANIC with !MCIP"?

>  
> -	if (this == -1 || cpumask_test_cpu(this, &cant_return)) {
> -		next = cpumask_next(this, &cant_return);
> -		if (next >= nr_cpu_ids)
> -			next = cpumask_next(-1, &can_return);
> -		return next;
> +	if (cpu != -1) {
> +		m = &per_cpu(mces_seen, cpu);
> +		if (m->mcgstatus & MCG_STATUS_RIPV)
> +			mask |= MCG_STATUS_RIPV;
>  	}
>  
> -	return cpumask_next(this, &can_return);
> +	while (1) {
> +		cpu = cpumask_next(cpu, cpu_possible_mask);

possible? online?

Ah, I guess you assumed that all cpus checked in should have
mces_seen with MCIP while offline cpus have cleaned mces_seen.  

Though we know there might be races with cpu hotplug, now we
already use num_online_cpus() in this rendezvous mechanism,
it is OK to use cpu_online_mask here at the moment, I think.

Or we should invent new, better rendezvous mechanism...

> +		if (cpu >= nr_cpu_ids) {
> +			if (mask & MCG_STATUS_RIPV)
> +				return cpu;
> +			mask |= MCG_STATUS_RIPV;
> +			cpu = -1;
> +			continue;
> +		}
> +
> +		m = &per_cpu(mces_seen, cpu);
> +		if ((m->mcgstatus & (MCG_STATUS_MCIP|MCG_STATUS_RIPV)) == mask)
> +			break;
> +	}
> +
> +	return cpu;
>  }
>  
>  /*
> @@ -825,7 +841,7 @@ int mce_nextcpu(int this)
>   * one at a time.
>   * TBD double check parallel CPU hotunplug
>   */
> -static int mce_start(int *no_way_out, int noreturn)
> +static int mce_start(int *no_way_out)
>  {
>  	int order;
>  	int cpus = num_online_cpus();
> @@ -841,11 +857,6 @@ static int mce_start(int *no_way_out, int noreturn)
>  	smp_wmb();
>  	order = atomic_inc_return(&mce_callin);
>  
> -	if (noreturn)
> -		cpumask_set_cpu(smp_processor_id(), &cant_return);
> -	else
> -		cpumask_set_cpu(smp_processor_id(), &can_return);
> -
>  	/*
>  	 * Wait for everyone.
>  	 */
> @@ -951,8 +962,6 @@ static int mce_end(int order)
>  reset:
>  	atomic_set(&global_nwo, 0);
>  	atomic_set(&mce_callin, 0);
> -	cpumask_clear(&can_return);
> -	cpumask_clear(&cant_return);
>  	barrier();
>  
>  	/*
> @@ -1134,7 +1143,7 @@ void do_machine_check(struct pt_regs *regs, long error_code)
>  	 * This way we don't report duplicated events on shared banks
>  	 * because the first one to see it will clear it.
>  	 */
> -	order = mce_start(&no_way_out, kill_it);
> +	order = mce_start(&no_way_out);
>  	for (i = 0; i < banks; i++) {
>  		__clear_bit(i, toclear);
>  		if (!mce_banks[i].ctl)
> 
> 

The rest looks good.


Thanks,
H.Seto

next prev parent reply	other threads:[~2011-06-14  1:28 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-06-09 21:25 [RFC] reworked machine check recovery patches Luck, Tony
2011-06-09 21:29 ` [PATCH 01/10] MCE: fixes for mce severity table Luck, Tony
2011-06-09 21:30 ` [PATCH 02/10] MCE: save most severe error information Luck, Tony
2011-06-10  8:06   ` Hidetoshi Seto
2011-06-10 18:08     ` Tony Luck
2011-06-09 21:31 ` [PATCH 03/10] MCE: introduce mce_gather_info() Luck, Tony
2011-06-09 21:32 ` [PATCH 04/10] MCE: Move ADDR/MISC reading code into common function Luck, Tony
2011-06-10  9:33   ` Borislav Petkov
2011-06-10 18:17     ` Tony Luck
2011-06-09 21:33 ` [PATCH 05/10] MCE: Mask out address mask bits below address granuality Luck, Tony
2011-06-10  8:07   ` Hidetoshi Seto
2011-06-10  9:46     ` Borislav Petkov
2011-06-10 19:06     ` Tony Luck
2011-06-11  0:12       ` Andi Kleen
2011-06-10  9:42   ` Borislav Petkov
2011-06-10 19:09     ` Tony Luck
2011-06-09 21:34 ` [PATCH 06/10] HWPOISON: Handle hwpoison in current process Luck, Tony
2011-06-10  8:07   ` Hidetoshi Seto
2011-06-10 20:36     ` Tony Luck
2011-06-09 21:35 ` [PATCH 07/10] MCE: replace mce.c use of TIF_MCE_NOTIFY with user_return_notifier Luck, Tony
2011-06-10  8:08   ` Hidetoshi Seto
2011-06-10 20:42     ` Tony Luck
2011-06-11 10:24       ` Borislav Petkov
2011-06-12  8:31       ` Avi Kivity
2011-06-12  8:29   ` Avi Kivity
2011-06-12 10:24     ` Borislav Petkov
2011-06-12 10:30       ` Avi Kivity
2011-06-12 13:53         ` Borislav Petkov
2011-06-09 21:36 ` [PATCH 08/10] NOTIFIER: Take over TIF_MCE_NOTIFY and implement task return notifier Luck, Tony
2011-06-12 22:38   ` Borislav Petkov
2011-06-13  5:31     ` Tony Luck
2011-06-13  7:59       ` Avi Kivity
2011-06-13  9:55         ` Borislav Petkov
2011-06-13 11:40           ` Avi Kivity
2011-06-13 12:40             ` Borislav Petkov
2011-06-13 12:47               ` Avi Kivity
2011-06-13 15:12                 ` Borislav Petkov
2011-06-13 16:31                   ` Avi Kivity
2011-06-13 17:13                     ` Tony Luck
2011-06-14  2:50                       ` Hidetoshi Seto
2011-06-14  2:51                         ` [PATCH 1/2] x86, mce: introduce mce_memory_failure_process Hidetoshi Seto
2011-06-14  2:53                         ` [PATCH 2/2] x86, mce: rework use of TIF_MCE_NOTIFY Hidetoshi Seto
2011-06-14 18:02                           ` Tony Luck
2011-06-14 18:28                             ` Tony Luck
2011-06-15  1:29                               ` Hidetoshi Seto
2011-06-15  2:10                                 ` Tony Luck
2011-06-15  3:17                                   ` Hidetoshi Seto
2011-06-14  3:09                         ` [PATCH 08/10] NOTIFIER: Take over TIF_MCE_NOTIFY and implement task return notifier Tony Luck
2011-06-14 11:40                       ` Avi Kivity
2011-06-14 13:33                         ` Borislav Petkov
2011-06-14 13:43                           ` Avi Kivity
2011-06-14 17:13                             ` Luck, Tony
2011-06-15  8:51                               ` Avi Kivity
2011-06-14 16:59                         ` Luck, Tony
2011-06-15  8:52                           ` Avi Kivity
2011-06-13 16:43               ` Tony Luck
2011-06-09 21:37 ` [PATCH 09/10] MCE: run through processors with more severe problems first Luck, Tony
2011-06-10  8:09   ` Hidetoshi Seto
2011-06-10 20:49     ` Tony Luck
2011-06-13 22:03       ` Tony Luck
2011-06-14  1:27         ` Hidetoshi Seto [this message]
2011-06-14  3:04           ` Tony Luck
2011-06-09 21:38 ` [PATCH 10/10] MCE: Add Action-Required support Luck, Tony
2011-06-10  8:06   ` Hidetoshi Seto
2011-06-10 21:00     ` Tony Luck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4DF6B8F6.2000902@jp.fujitsu.com \
    --to=seto.hidetoshi@jp.fujitsu.com \
    --cc=avi@redhat.com \
    --cc=bp@amd64.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=tony.luck@intel.com \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.