public inbox for linux-kernel@vger.kernel.org
From: Chen Yucong <slaoub@gmail.com>
To: Tony Luck <tony.luck@gmail.com>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: kill the current thread if MCG_STATUS_RIPV is not set
Date: Sun, 10 Aug 2014 21:42:15 +0800
Message-ID: <1407678135.9689.4.camel@debian>

Hi Tony Luck,

According to the x86 ASDM vol.3A 15.9.3.2, a recoverable-not-continuable
SRAR error (RIPV=0, EIPV=x) covers the following two cases:
  - IA32_MCG_STATUS.RIPV=0, IA32_MCG_STATUS.EIPV=0, or
  - IA32_MCG_STATUS.RIPV=0, IA32_MCG_STATUS.EIPV=1.

For the first case, the MCE handler panics the kernel directly,
according to this entry in severities[]:

/* Neither return not error IP -- no chance to recover -> PANIC */
MCESEV(
       PANIC, "Neither restart nor error IP",
       MCGMASK(MCG_STATUS_RIPV|MCG_STATUS_EIPV, 0)
       ),
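
A minimal sketch of how such a mask/result entry can be read, assuming
the match rule is (mcgstatus & mask) == result. The struct and function
names are hypothetical and are not the kernel's own; the bit positions
follow the IA32_MCG_STATUS layout (RIPV is bit 0, EIPV is bit 1, MCIP is
bit 2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions of IA32_MCG_STATUS. */
#define MCG_STATUS_RIPV (1ULL << 0) /* restart IP valid */
#define MCG_STATUS_EIPV (1ULL << 1) /* error IP valid */
#define MCG_STATUS_MCIP (1ULL << 2) /* machine check in progress */

/* Hypothetical stand-in for the mask/result part of a severities[] entry. */
struct mcg_rule {
	uint64_t mask;   /* bits of MCG_STATUS to inspect */
	uint64_t result; /* value those bits must have for a match */
};

/* Assumed match rule: the masked status must equal the expected result. */
static bool mcg_rule_matches(const struct mcg_rule *r, uint64_t mcgstatus)
{
	return (mcgstatus & r->mask) == r->result;
}

/* Mirrors MCGMASK(MCG_STATUS_RIPV|MCG_STATUS_EIPV, 0) from the entry above. */
static const struct mcg_rule no_ip_rule = {
	.mask = MCG_STATUS_RIPV | MCG_STATUS_EIPV,
	.result = 0,
};

/* Usage: only the RIPV=0, EIPV=0 case hits the PANIC rule. */
void check_no_ip_rule(void)
{
	/* First case: RIPV=0, EIPV=0 matches. */
	assert(mcg_rule_matches(&no_ip_rule, MCG_STATUS_MCIP));
	/* Second case: RIPV=0, EIPV=1 does not match. */
	assert(!mcg_rule_matches(&no_ip_rule, MCG_STATUS_MCIP | MCG_STATUS_EIPV));
}
```

Under this reading, the RIPV=0, EIPV=1 case falls through to later
entries, which is why it is handled differently from the first case.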

For the second case, the MCE handler should kill the current thread
directly, because the ASDM vol.3A 15.9.3.2 states:

The current executing thread cannot be continued. System software must
terminate the interrupted stream of execution and provide a new stream
of execution on return from the machine check handler for the affected
logical processor.

In practice, however, the MCE handler does not kill the current thread.
Instead it defers to further handling: it sets TIF_MCE_NOTIFY so that
memory_failure() is invoked before the return to userland.
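
A minimal sketch of the ordering issue, modelling only the if/else branch
touched by the patch in this mail (it assumes tolerant < 3 and that
no_way_out is clear). The enum and function names are hypothetical;
worst_is_ar stands for worst == MCE_AR_SEVERITY.

```c
#include <stdbool.h>

/* Hypothetical outcome labels; the kernel does not define this enum. */
enum mce_outcome {
	OUTCOME_NONE,   /* nothing further to do in this branch */
	OUTCOME_NOTIFY, /* set TIF_MCE_NOTIFY, memory_failure() runs later */
	OUTCOME_SIGBUS, /* force_sig(SIGBUS, current) */
};

/* Current order: the MCE_AR_SEVERITY test comes first. */
static enum mce_outcome outcome_current(bool worst_is_ar, bool kill_it)
{
	if (worst_is_ar)
		return OUTCOME_NOTIFY;
	if (kill_it)
		return OUTCOME_SIGBUS;
	return OUTCOME_NONE;
}

/* Proposed order: kill_it (set when RIPV is clear) is tested first. */
static enum mce_outcome outcome_proposed(bool worst_is_ar, bool kill_it)
{
	if (kill_it)
		return OUTCOME_SIGBUS;
	if (worst_is_ar)
		return OUTCOME_NOTIFY;
	return OUTCOME_NONE;
}
```

The two orders differ only when both conditions hold: with kill_it set
and worst == MCE_AR_SEVERITY, the current order defers to
memory_failure(), while the proposed order sends SIGBUS.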

I am confused by this gap between the documentation and the source
code. Perhaps a small fix is needed.

thx!
cyc


Signed-off-by: Chen Yucong <slaoub@gmail.com>
---
 arch/x86/kernel/cpu/mcheck/mce.c |   14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index bd9ccda..3394494 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1055,9 +1055,12 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 
 	/*
 	 * When no restart IP might need to kill or panic.
-	 * Assume the worst for now, but if we find the
-	 * severity is MCE_AR_SEVERITY we have other options.
+	 * This indicates that the error is detected at the instruction
+	 * pointer saved on the stack for this machine check exception
+	 * and restarting execution with the interrupted context is not
+	 * possible.(ASDM vol.3A 15.9.3.2)
 	 */
+
 	if (!(m.mcgstatus & MCG_STATUS_RIPV))
 		kill_it = 1;
 
@@ -1154,12 +1157,13 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 	if (cfg->tolerant < 3) {
 		if (no_way_out)
 			mce_panic("Fatal machine check on current CPU", &m, msg);
-		if (worst == MCE_AR_SEVERITY) {
+
+		if (kill_it) {
+			force_sig(SIGBUS, current);
+		} else if (worst == MCE_AR_SEVERITY) {
 			/* schedule action before return to userland */
 			mce_save_info(m.addr, m.mcgstatus & MCG_STATUS_RIPV);
 			set_thread_flag(TIF_MCE_NOTIFY);
-		} else if (kill_it) {
-			force_sig(SIGBUS, current);
 		}
 	}
 
-- 
1.7.10.4



