From: tip-bot for Ashok Raj <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: linux-edac@vger.kernel.org, bp@alien8.de, brgerst@gmail.com,
hpa@zytor.com, bp@suse.de, tglx@linutronix.de,
peterz@infradead.org, linux-kernel@vger.kernel.org,
ashok.raj@intel.com, mingo@kernel.org, dvlasenk@redhat.com,
tony.luck@intel.com, torvalds@linux-foundation.org,
luto@amacapital.net, serge.ayoun@intel.com
Subject: [tip:ras/core] x86/mce: Don' t clear shared banks on Intel when offlining CPUs
Date: Tue, 29 Sep 2015 01:37:07 -0700 [thread overview]
Message-ID: <tip-6e06780a98f149f131d46c1108d4ae27f05a9357@git.kernel.org> (raw)
In-Reply-To: <1441391390-16985-1-git-send-email-ashok.raj@intel.com>
Commit-ID: 6e06780a98f149f131d46c1108d4ae27f05a9357
Gitweb: http://git.kernel.org/tip/6e06780a98f149f131d46c1108d4ae27f05a9357
Author: Ashok Raj <ashok.raj@intel.com>
AuthorDate: Mon, 28 Sep 2015 09:21:43 +0200
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 28 Sep 2015 10:15:26 +0200
x86/mce: Don't clear shared banks on Intel when offlining CPUs
It is not safe to clear global MCi_CTL banks during CPU offline
or suspend/resume operations. These MSRs are either
thread-scoped (meaning private to a thread), or core-scoped
(private to threads in that core only), or with a socket scope:
visible and controllable from all threads in the socket.
When we offline a single CPU, clearing those MCi_CTL bits will
stop signaling for all the shared, i.e., socket-wide resources,
such as LLC, iMC, etc.
In addition, it might be possible to compromise the integrity of
an Intel Secure Guard eXtentions (SGX) system if the attacker
has control of the host system and is able to inject errors
which would be otherwise ignored when MCi_CTL bits are cleared.
Hence on SGX enabled systems, if MCi_CTL is cleared, SGX gets
disabled.
Tested-by: Serge Ayoun <serge.ayoun@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
[ Cleanup text. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1441391390-16985-1-git-send-email-ashok.raj@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/cpu/mcheck/mce.c | 30 ++++++++++++++++++++----------
1 file changed, 20 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 9d014b82..17b5ec6 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2042,7 +2042,7 @@ int __init mcheck_init(void)
* Disable machine checks on suspend and shutdown. We can't really handle
* them later.
*/
-static int mce_disable_error_reporting(void)
+static void mce_disable_error_reporting(void)
{
int i;
@@ -2052,17 +2052,32 @@ static int mce_disable_error_reporting(void)
if (b->init)
wrmsrl(MSR_IA32_MCx_CTL(i), 0);
}
- return 0;
+ return;
+}
+
+static void vendor_disable_error_reporting(void)
+{
+ /*
+ * Don't clear on Intel CPUs. Some of these MSRs are socket-wide.
+ * Disabling them for just a single offlined CPU is bad, since it will
+ * inhibit reporting for all shared resources on the socket like the
+ * last level cache (LLC), the integrated memory controller (iMC), etc.
+ */
+ if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
+ return;
+
+ mce_disable_error_reporting();
}
static int mce_syscore_suspend(void)
{
- return mce_disable_error_reporting();
+ vendor_disable_error_reporting();
+ return 0;
}
static void mce_syscore_shutdown(void)
{
- mce_disable_error_reporting();
+ vendor_disable_error_reporting();
}
/*
@@ -2342,19 +2357,14 @@ static void mce_device_remove(unsigned int cpu)
static void mce_disable_cpu(void *h)
{
unsigned long action = *(unsigned long *)h;
- int i;
if (!mce_available(raw_cpu_ptr(&cpu_info)))
return;
if (!(action & CPU_TASKS_FROZEN))
cmci_clear();
- for (i = 0; i < mca_cfg.banks; i++) {
- struct mce_bank *b = &mce_banks[i];
- if (b->init)
- wrmsrl(MSR_IA32_MCx_CTL(i), 0);
- }
+ vendor_disable_error_reporting();
}
static void mce_reenable_cpu(void *h)
prev parent reply other threads:[~2015-09-29 8:42 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-04 18:29 [Patch V1] x86, mce: Don't clear global error reporting banks during cpu_offline Ashok Raj
2015-09-11 19:31 ` Borislav Petkov
2015-09-29 8:37 ` tip-bot for Ashok Raj [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=tip-6e06780a98f149f131d46c1108d4ae27f05a9357@git.kernel.org \
--to=tipbot@zytor.com \
--cc=ashok.raj@intel.com \
--cc=bp@alien8.de \
--cc=bp@suse.de \
--cc=brgerst@gmail.com \
--cc=dvlasenk@redhat.com \
--cc=hpa@zytor.com \
--cc=linux-edac@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-tip-commits@vger.kernel.org \
--cc=luto@amacapital.net \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=serge.ayoun@intel.com \
--cc=tglx@linutronix.de \
--cc=tony.luck@intel.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).