From: tip-bot for Dave Hansen <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: jpoimboe@redhat.com, bp@alien8.de, torvalds@linux-foundation.org,
	brgerst@gmail.com, luto@kernel.org, hpa@zytor.com,
	dave.hansen@linux.intel.com, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, peterz@infradead.org, mcgrof@suse.com,
	tglx@linutronix.de, dave@sr71.net, toshi.kani@hp.com,
	dvlasenk@redhat.com, mingo@kernel.org
Subject: [tip:x86/mm] x86/mm: Disallow running with 32-bit PTEs to work around erratum
Date: Wed, 13 Jul 2016 01:04:13 -0700
Message-ID: <tip-e4a84be6f05eab4778732d799f63b3cd15427885@git.kernel.org>
In-Reply-To: <20160708001914.D0B50110@viggo.jf.intel.com>

Commit-ID:  e4a84be6f05eab4778732d799f63b3cd15427885
Gitweb:     http://git.kernel.org/tip/e4a84be6f05eab4778732d799f63b3cd15427885
Author:     Dave Hansen <dave.hansen@linux.intel.com>
AuthorDate: Thu, 7 Jul 2016 17:19:14 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 13 Jul 2016 09:43:25 +0200

x86/mm: Disallow running with 32-bit PTEs to work around erratum

The Intel(R) Xeon Phi(TM) Processor x200 Family (codename: Knights
Landing) has an erratum where a processor thread setting the Accessed
or Dirty bits may not do so atomically against its checks for the
Present bit.  This may cause a thread (which is about to page fault)
to set A and/or D, even though the Present bit had already been
atomically cleared.

These bits are truly "stray".  In the case of the Dirty bit, the
thread associated with the stray set was *not* allowed to write to
the page.  This means that we do not have to launder the bit(s); we
can simply ignore them.

If the PTE is used for storing a swap index or a NUMA migration index,
the A bit could be misinterpreted as part of the swap type.  The stray
bits being set cause a software-cleared PTE to be interpreted as a
swap entry.  In some cases (like when the swap index ends up being
for a non-existent swapfile), the kernel detects the stray value
and WARN()s about it, but there is no guarantee that the kernel can
always detect it.
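
To make the failure mode concrete, here is a minimal sketch of how a
stray Accessed bit turns a software-cleared 32-bit PTE into something
that parses as a swap entry.  The Present/Accessed/Dirty positions
(bits 0, 5 and 6) are the architectural ones.  The swap layout (a
5-bit type field directly above the Present bit) and all helper names
are assumptions made for this illustration, not the kernel's actual
macros.

```c
#include <stdint.h>

/* Architectural x86 PTE flag positions. */
#define PTE_PRESENT   (1u << 0)
#define PTE_ACCESSED  (1u << 5)
#define PTE_DIRTY     (1u << 6)

/*
 * Assumed swap-entry layout for a 32-bit non-PAE PTE (illustration
 * only): the swap type lives in the 5 bits just above the Present bit.
 */
#define SWP_TYPE_SHIFT 1
#define SWP_TYPE_BITS  5

/* Extract the (assumed) swap type field from a raw PTE value. */
static unsigned int swp_type(uint32_t pte)
{
	return (pte >> SWP_TYPE_SHIFT) & ((1u << SWP_TYPE_BITS) - 1);
}

/* Strict emptiness test: any set bit makes the PTE look in use. */
static int pte_is_none_strict(uint32_t pte)
{
	return pte == 0;
}

/* Would a !Present, non-zero PTE be taken for a swap entry? */
static int looks_like_swap_entry(uint32_t pte)
{
	return !(pte & PTE_PRESENT) && !pte_is_none_strict(pte);
}
```

Under these assumptions a cleared PTE (0) is correctly seen as empty,
while the same PTE with a stray Accessed bit (0x20) is no longer
"none" and decodes to swap type 16, which may or may not name an
existing swapfile.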

With 64-bit PTEs (64-bit mode or 32-bit PAE), we were able to move
the swap PTE format around to avoid these troublesome bits.  But
32-bit non-PAE PTEs are tight on bits, so disallow such kernels from
running on this hardware.  I can't imagine anyone wanting to run
32-bit non-PAE kernels on this hardware, but disallowing them from
running entirely is surely the safe thing to do.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: dave.hansen@intel.com
Cc: linux-mm@kvack.org
Cc: mhocko@suse.com
Link: http://lkml.kernel.org/r/20160708001914.D0B50110@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/boot/boot.h     |  1 +
 arch/x86/boot/cpu.c      |  2 ++
 arch/x86/boot/cpucheck.c | 33 +++++++++++++++++++++++++++++++++
 arch/x86/boot/cpuflags.c |  1 +
 arch/x86/boot/cpuflags.h |  1 +
 5 files changed, 38 insertions(+)

diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
index 9011a88..a5ce666 100644
--- a/arch/x86/boot/boot.h
+++ b/arch/x86/boot/boot.h
@@ -294,6 +294,7 @@ static inline int cmdline_find_option_bool(const char *option)
 
 /* cpu.c, cpucheck.c */
 int check_cpu(int *cpu_level_ptr, int *req_level_ptr, u32 **err_flags_ptr);
+int check_knl_erratum(void);
 int validate_cpu(void);
 
 /* early_serial_console.c */
diff --git a/arch/x86/boot/cpu.c b/arch/x86/boot/cpu.c
index 29207f6..26240dd 100644
--- a/arch/x86/boot/cpu.c
+++ b/arch/x86/boot/cpu.c
@@ -93,6 +93,8 @@ int validate_cpu(void)
 		show_cap_strs(err_flags);
 		putchar('\n');
 		return -1;
+	} else if (check_knl_erratum()) {
+		return -1;
 	} else {
 		return 0;
 	}
diff --git a/arch/x86/boot/cpucheck.c b/arch/x86/boot/cpucheck.c
index 1fd7d57..4ad7d70 100644
--- a/arch/x86/boot/cpucheck.c
+++ b/arch/x86/boot/cpucheck.c
@@ -24,6 +24,7 @@
 # include "boot.h"
 #endif
 #include <linux/types.h>
+#include <asm/intel-family.h>
 #include <asm/processor-flags.h>
 #include <asm/required-features.h>
 #include <asm/msr-index.h>
@@ -175,6 +176,8 @@ int check_cpu(int *cpu_level_ptr, int *req_level_ptr, u32 **err_flags_ptr)
 			puts("WARNING: PAE disabled. Use parameter 'forcepae' to enable at your own risk!\n");
 		}
 	}
+	if (!err)
+		err = check_knl_erratum();
 
 	if (err_flags_ptr)
 		*err_flags_ptr = err ? err_flags : NULL;
@@ -185,3 +188,33 @@ int check_cpu(int *cpu_level_ptr, int *req_level_ptr, u32 **err_flags_ptr)
 
 	return (cpu.level < req_level || err) ? -1 : 0;
 }
+
+int check_knl_erratum(void)
+{
+	/*
+	 * First check for the affected model/family:
+	 */
+	if (!is_intel() ||
+	    cpu.family != 6 ||
+	    cpu.model != INTEL_FAM6_XEON_PHI_KNL)
+		return 0;
+
+	/*
+	 * This erratum affects the Accessed/Dirty bits, and can
+	 * cause stray bits to be set in !Present PTEs.  We have
+	 * enough bits in our 64-bit PTEs (which we have on real
+	 * 64-bit mode or PAE) to avoid using these troublesome
+	 * bits.  But, we do not have enough space in our 32-bit
+	 * PTEs.  So, refuse to run on 32-bit non-PAE kernels.
+	 */
+	if (IS_ENABLED(CONFIG_X86_64) || IS_ENABLED(CONFIG_X86_PAE))
+		return 0;
+
+	puts("This 32-bit kernel can not run on this Xeon Phi x200\n"
+	     "processor due to a processor erratum.  Use a 64-bit\n"
+	     "kernel, or enable PAE in this 32-bit kernel.\n\n");
+
+	return -1;
+}
+
+
diff --git a/arch/x86/boot/cpuflags.c b/arch/x86/boot/cpuflags.c
index 431fa5f..6687ab9 100644
--- a/arch/x86/boot/cpuflags.c
+++ b/arch/x86/boot/cpuflags.c
@@ -102,6 +102,7 @@ void get_cpuflags(void)
 			cpuid(0x1, &tfms, &ignored, &cpu.flags[4],
 			      &cpu.flags[0]);
 			cpu.level = (tfms >> 8) & 15;
+			cpu.family = cpu.level;
 			cpu.model = (tfms >> 4) & 15;
 			if (cpu.level >= 6)
 				cpu.model += ((tfms >> 16) & 0xf) << 4;
diff --git a/arch/x86/boot/cpuflags.h b/arch/x86/boot/cpuflags.h
index 4cb404f..15ad56a 100644
--- a/arch/x86/boot/cpuflags.h
+++ b/arch/x86/boot/cpuflags.h
@@ -6,6 +6,7 @@
 
 struct cpu_features {
 	int level;		/* Family, or 64 for x86-64 */
+	int family;		/* Family, always */
 	int model;
 	u32 flags[NCAPINTS];
 };
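
For readers following the family/model check, the decoding done in
get_cpuflags() can be sketched as a standalone function.  This is an
illustration under stated assumptions, not the boot code itself: the
struct and function names are hypothetical, and 0x00050671 is used
only as an example CPUID.1:EAX signature assumed to belong to a
Knights Landing part.  Like the boot code, the sketch ignores the
extended family field.

```c
#include <stdint.h>

/* Hypothetical container for the decoded signature. */
struct cpu_sig {
	int family;
	int model;
};

/* Decode family and model from CPUID leaf 1 EAX ("tfms"). */
static struct cpu_sig decode_tfms(uint32_t tfms)
{
	struct cpu_sig sig;

	sig.family = (tfms >> 8) & 15;		/* base family, bits 11:8 */
	sig.model = (tfms >> 4) & 15;		/* base model, bits 7:4 */
	if (sig.family >= 6)			/* extended model, bits 19:16 */
		sig.model += ((tfms >> 16) & 0xf) << 4;
	return sig;
}
```

With the example signature, decode_tfms(0x00050671) gives family 6 and
model 0x57, the value that INTEL_FAM6_XEON_PHI_KNL stands for, so
check_knl_erratum() would go on to test the PTE width.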
