public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
To: linux-kernel@vger.kernel.org
Cc: Ingo Molnar <mingo@elte.hu>, "H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andi Kleen <ak@linux.intel.com>
Subject: [PATCH 02/16] x86, mce: call-in should be after updating global_nwo
Date: Mon, 15 Jun 2009 17:19:40 +0900	[thread overview]
Message-ID: <4A36041C.9050202@jp.fujitsu.com> (raw)
In-Reply-To: <4A3601BF.2000201@jp.fujitsu.com>

At the beginning of Monarch synchronization, processors wait until
all of them have entered the exception handler and then check the
global_nwo to determine if any of them saw a fatal event.

However since current code does call-in before updating global_nwo,
it might happen that the global_nwo does not reflect some of local
nwo at the time.  This might break printing corrected errors not
handled yet on panic.

Reported-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
---
 arch/x86/kernel/cpu/mcheck/mce.c |   65 ++++++++++++++++++--------------------
 1 files changed, 31 insertions(+), 34 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 6f9db11..84b2630 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -691,18 +691,18 @@ static atomic_t global_nwo;
  * in the entry order.
  * TBD double check parallel CPU hotunplug
  */
-static int mce_start(int no_way_out, int *order)
+static int mce_start(int *no_way_out)
 {
-	int nwo;
+	int order;
 	int cpus = num_online_cpus();
 	u64 timeout = (u64)monarch_timeout * NSEC_PER_USEC;
 
-	if (!timeout) {
-		*order = -1;
-		return no_way_out;
-	}
+	if (!timeout)
+		return -1;
 
-	atomic_add(no_way_out, &global_nwo);
+	atomic_add(*no_way_out, &global_nwo);
+
+	order = atomic_add_return(1, &mce_callin);
 
 	/*
 	 * Wait for everyone.
@@ -710,40 +710,38 @@ static int mce_start(int no_way_out, int *order)
 	while (atomic_read(&mce_callin) != cpus) {
 		if (mce_timed_out(&timeout)) {
 			atomic_set(&global_nwo, 0);
-			*order = -1;
-			return no_way_out;
+			return -1;
 		}
 		ndelay(SPINUNIT);
 	}
 
-	/*
-	 * Cache the global no_way_out state.
-	 */
-	nwo = atomic_read(&global_nwo);
-
-	/*
-	 * Monarch starts executing now, the others wait.
-	 */
-	if (*order == 1) {
+	if (order == 1) {
+		/*
+		 * Monarch: Starts executing now, the others wait.
+		 */
 		atomic_set(&mce_executing, 1);
-		return nwo;
+	} else {
+		/*
+		 * Subject: Now start the scanning loop one by one in
+		 * the original callin order.
+		 * This way when there are any shared banks it will be
+		 * only seen by one CPU before cleared, avoiding duplicates.
+		 */
+		while (atomic_read(&mce_executing) < order) {
+			if (mce_timed_out(&timeout)) {
+				atomic_set(&global_nwo, 0);
+				return -1;
+			}
+			ndelay(SPINUNIT);
+		}
 	}
 
 	/*
-	 * Now start the scanning loop one by one
-	 * in the original callin order.
-	 * This way when there are any shared banks it will
-	 * be only seen by one CPU before cleared, avoiding duplicates.
+	 * Cache the global no_way_out state.
 	 */
-	while (atomic_read(&mce_executing) < *order) {
-		if (mce_timed_out(&timeout)) {
-			atomic_set(&global_nwo, 0);
-			*order = -1;
-			return no_way_out;
-		}
-		ndelay(SPINUNIT);
-	}
-	return nwo;
+	*no_way_out = atomic_read(&global_nwo);
+
+	return order;
 }
 
 /*
@@ -887,7 +885,6 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 	if (!banks)
 		goto out;
 
-	order = atomic_add_return(1, &mce_callin);
 	mce_setup(&m);
 
 	m.mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
@@ -909,7 +906,7 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 	 * This way we don't report duplicated events on shared banks
 	 * because the first one to see it will clear it.
 	 */
-	no_way_out = mce_start(no_way_out, &order);
+	order = mce_start(&no_way_out);
 	for (i = 0; i < banks; i++) {
 		__clear_bit(i, toclear);
 		if (!bank[i])
-- 
1.6.3



  parent reply	other threads:[~2009-06-15  8:19 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-06-15  8:09 [PATCH 00/16] last-minute patches for MCE .31 Hidetoshi Seto
2009-06-15  8:18 ` [PATCH 01/16] x86, mce: don't init timer if !mce_available Hidetoshi Seto
2009-06-15  8:19 ` Hidetoshi Seto [this message]
2009-06-15  8:26   ` [PATCH 02/16] x86, mce: call-in should be after updating global_nwo huang ying
2009-06-15  8:40     ` Hidetoshi Seto
2009-06-15  9:18     ` [PATCH] x86, mce: cleanup mce_start() Hidetoshi Seto
2009-06-15  8:20 ` [PATCH 03/16] x86, mce: add __read_mostly Hidetoshi Seto
2009-06-15  8:20 ` [PATCH 04/16] x86, mce: rename static variables around trigger Hidetoshi Seto
2009-06-15  8:21 ` [PATCH 05/16] x86, mce: sysfs entries for new options Hidetoshi Seto
2009-06-15  8:22 ` [PATCH 06/16] x86, mce: unify mce.h Hidetoshi Seto
2009-06-15  8:22 ` [PATCH 07/16] x86, mce: make mce_disabled boolean Hidetoshi Seto
2009-06-15  8:23 ` [PATCH 08/16] x86, mce: unify smp_thermal_interrupt, prepare p4 Hidetoshi Seto
2009-06-15  8:24 ` [PATCH 09/16] x86, mce: unify smp_thermal_interrupt, prepare mce_intel_64 Hidetoshi Seto
2009-06-15  8:24 ` [PATCH 10/16] x86, mce: unify smp_thermal_interrupt, prepare Hidetoshi Seto
2009-06-15  8:25 ` [PATCH 11/16] x86, mce: unify smp_thermal_interrupt Hidetoshi Seto
2009-06-15  8:26 ` [PATCH 12/16] x86, mce: squash mce_intel.c into therm_throt.c Hidetoshi Seto
2009-06-18  5:49   ` huang ying
2009-06-18  7:31     ` Hidetoshi Seto
2009-06-18  8:00       ` huang ying
2009-06-18 13:59         ` H. Peter Anvin
2009-06-18 13:56     ` H. Peter Anvin
2009-06-15  8:26 ` [PATCH 13/16] x86, mce: remove intel_set_thermal_handler() Hidetoshi Seto
2009-06-15  8:27 ` [PATCH 14/16] x86, mce: remove therm_throt.h Hidetoshi Seto
2009-06-15  8:27 ` [PATCH 15/16] x86, mce: mce.h cleanup Hidetoshi Seto
2009-06-15  8:28 ` [PATCH 16/16] x86, mce: rename Hidetoshi Seto

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4A36041C.9050202@jp.fujitsu.com \
    --to=seto.hidetoshi@jp.fujitsu.com \
    --cc=ak@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox