From: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
To: Andi Kleen <andi@firstfloor.org>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
	Andi Kleen <ak@linux.intel.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: [PATCH] x86, mce: Add options for corrected errors (no_cmci/dont_log_ce/ignore_ce)
Date: Wed, 22 Apr 2009 12:25:04 +0900
Message-ID: <49EE8E10.9060400@jp.fujitsu.com>
In-Reply-To: <49EC3AC0.8070109@jp.fujitsu.com>

Here is the updated one.
Thanks in advance!

H.Seto


This patch introduces three boot options that control the handling
of corrected errors.

The "mce=no_cmci" boot option disables the CMCI feature.
Since CMCI is a new feature, a boot control to disable it
will help if the hardware is misbehaving.

The "mce=dont_log_ce" boot option disables logging for corrected
errors.  All reported corrected errors will be cleared silently.
This option is useful if you do not care about corrected errors.

The "mce=ignore_ce" boot option disables the features for corrected
errors, i.e. the polling timer and CMCI.  Corrected events are not
cleared and remain in the bank MSRs.  Disabling these features is
usually not recommended, but it helps when the OS conflicts with the
BIOS or a hardware monitoring application that clears corrected
events in the banks itself.

A trivial cleanup (space -> tab) of the documentation is also included.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
---
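As an illustration for reviewers (not part of the patch), the combined
effect of the three options can be modelled in userspace. This is a minimal
sketch, not kernel code: struct mce_opts and the helper names are
hypothetical. Only the option strings and the three conditions are taken
from the diff below.

```c
#include <string.h>

/*
 * Hypothetical userspace model of the three flags; the struct and helper
 * names are illustrative and do not exist in the kernel.
 */
struct mce_opts {
	int cmci_disabled;	/* set by mce=no_cmci */
	int dont_log_ce;	/* set by mce=dont_log_ce */
	int ignore_ce;		/* set by mce=ignore_ce */
};

/* Mirrors the strcmp() chain added to mcheck_enable(); returns 0 on a match. */
static int parse_ce_option(struct mce_opts *o, const char *str)
{
	if (!strcmp(str, "no_cmci"))
		o->cmci_disabled = 1;
	else if (!strcmp(str, "dont_log_ce"))
		o->dont_log_ce = 1;
	else if (!strcmp(str, "ignore_ce"))
		o->ignore_ce = 1;
	else
		return -1;	/* not one of the three new options */
	return 0;
}

/* cmci_supported() returns 0 early when either flag is set. */
static int cmci_allowed(const struct mce_opts *o)
{
	return !(o->cmci_disabled || o->ignore_ce);
}

/* mce_init_timer() returns before setting up the timer under ignore_ce. */
static int poll_timer_armed(const struct mce_opts *o)
{
	return !o->ignore_ce;
}

/* machine_check_poll() skips mce_log() under dont_log_ce. */
static int poll_logs_events(const struct mce_opts *o)
{
	return !o->dont_log_ce;
}
```

In this model mce=no_cmci leaves the polling timer armed, so corrected
errors are still picked up by polling, while mce=ignore_ce turns off both
the interrupt and the timer.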
 Documentation/x86/x86_64/boot-options.txt |   36 ++++++++++++++++++++++++-----
 arch/x86/include/asm/mce.h                |    2 +
 arch/x86/kernel/cpu/mcheck/mce_64.c       |   20 ++++++++++++++-
 arch/x86/kernel/cpu/mcheck/mce_intel_64.c |    3 ++
 4 files changed, 53 insertions(+), 8 deletions(-)

diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt
index 34c1304..04834ad 100644
--- a/Documentation/x86/x86_64/boot-options.txt
+++ b/Documentation/x86/x86_64/boot-options.txt
@@ -5,12 +5,36 @@ only the AMD64 specific ones are listed here.
 
 Machine check
 
-   mce=off disable machine check
-   mce=bootlog Enable logging of machine checks left over from booting.
-               Disabled by default on AMD because some BIOS leave bogus ones.
-               If your BIOS doesn't do that it's a good idea to enable though
-               to make sure you log even machine check events that result
-               in a reboot. On Intel systems it is enabled by default.
+   mce=off
+		Disable machine check
+   mce=no_cmci
+		Disable CMCI (Corrected Machine Check Interrupt), which
+		Intel processors support.  Usually this disablement is
+		not recommended, but it might be handy if your hardware
+		is misbehaving.
+		Note that you'll get more problems without CMCI than with
+		due to the shared banks, i.e. you might get duplicated
+		error logs.
+   mce=dont_log_ce
+		Don't make logs for corrected errors.  All events reported
+		as corrected are silently cleared by OS.
+		This option will be useful if you have no interest in any
+		of corrected errors.
+   mce=ignore_ce
+		Disable features for corrected errors, e.g. polling timer
+		and CMCI.  All events reported as corrected are not cleared
+		by OS and remain in its error banks.
+		Usually this disablement is not recommended, however if
+		there is an agent checking/clearing corrected errors
+		(e.g. BIOS or hardware monitoring applications), conflicting
+		with OS's error handling, and you cannot deactivate the agent,
+		then this option will be a help.
+   mce=bootlog
+		Enable logging of machine checks left over from booting.
+		Disabled by default on AMD because some BIOS leave bogus ones.
+		If your BIOS doesn't do that it's a good idea to enable though
+		to make sure you log even machine check events that result
+		in a reboot. On Intel systems it is enabled by default.
    mce=nobootlog
 		Disable boot machine check logging.
    mce=tolerancelevel (number)
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 563933e..a8a6cd5 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -104,6 +104,8 @@ extern void (*threshold_cpu_callback)(unsigned long action, unsigned int cpu);
 #define MAX_NR_BANKS (MCE_EXTENDED_BANK - 1)
 
 #ifdef CONFIG_X86_MCE_INTEL
+extern int mce_cmci_disabled;
+extern int mce_ignore_ce;
 void mce_intel_feature_init(struct cpuinfo_x86 *c);
 void cmci_clear(void);
 void cmci_reenable(void);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_64.c b/arch/x86/kernel/cpu/mcheck/mce_64.c
index 33d612e..6abefe3 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_64.c
@@ -41,6 +41,9 @@
 atomic_t mce_entry;
 
 static int mce_dont_init;
+static int mce_dont_log_ce;
+int mce_cmci_disabled;
+int mce_ignore_ce;
 
 /*
  * Tolerant levels:
@@ -240,7 +243,8 @@ void machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 		 * have anything to do with the actual error location.
 		 */
 
-		mce_log(&m);
+		if (!mce_dont_log_ce)
+			mce_log(&m);
 		add_taint(TAINT_MACHINE_CHECK);
 
 		/*
@@ -633,6 +637,9 @@ static void mce_init_timer(void)
 {
 	struct timer_list *t = &__get_cpu_var(mce_timer);
 
+	if (mce_ignore_ce)
+		return;
+
 	/* data race harmless because everyone sets to the same value */
 	if (!next_interval)
 		next_interval = check_interval * HZ;
@@ -841,7 +848,10 @@ static int __init mcheck_disable(char *str)
 __setup("nomce", mcheck_disable);
 
 /*
- * mce=off disables machine check
+ * mce=off Disables machine check
+ * mce=no_cmci Disables CMCI
+ * mce=dont_log_ce Clears corrected events silently, no log created for CEs.
+ * mce=ignore_ce Disables polling and CMCI, corrected events are not cleared.
  * mce=TOLERANCELEVEL (number, see above)
  * mce=bootlog Log MCEs from before booting. Disabled by default on AMD.
  * mce=nobootlog Don't log MCEs from before booting.
@@ -850,6 +860,12 @@ static int __init mcheck_enable(char *str)
 {
 	if (!strcmp(str, "off"))
 		mce_dont_init = 1;
+	else if (!strcmp(str, "no_cmci"))
+		mce_cmci_disabled = 1;
+	else if (!strcmp(str, "dont_log_ce"))
+		mce_dont_log_ce = 1;
+	else if (!strcmp(str, "ignore_ce"))
+		mce_ignore_ce = 1;
 	else if (!strcmp(str, "bootlog") || !strcmp(str, "nobootlog"))
 		mce_bootlog = (str[0] == 'b');
 	else if (isdigit(str[0]))
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel_64.c b/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
index d6b72df..a88bad9 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
@@ -109,6 +109,9 @@ static int cmci_supported(int *banks)
 {
 	u64 cap;
 
+	if (mce_cmci_disabled || mce_ignore_ce)
+		return 0;
+
 	/*
 	 * Vendor check is not strictly needed, but the initial
 	 * initialization is vendor keyed and this
-- 
1.6.2.2


