All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Hansen <dave@sr71.net>
To: dave@sr71.net
Cc: dave.hansen@linux.intel.com, mingo@kernel.org, bp@alien8.de,
	fenghua.yu@intel.com, hpa@zytor.com, x86@kernel.org,
	tim.c.chen@linux.intel.com, linux-kernel@vger.kernel.org
Subject: [PATCH 09/11] x86, fpu: correct and check XSAVE xstate size calculations
Date: Tue, 25 Aug 2015 13:12:05 -0700	[thread overview]
Message-ID: <20150825201205.C590CFEF@viggo.jf.intel.com> (raw)
In-Reply-To: <20150825201201.CF766C1B@viggo.jf.intel.com>


From: Dave Hansen <dave.hansen@linux.intel.com>

Note: our xsaves support is currently broken and disabled.  This
patch does not fix it, but it is an incremental improvement.
This might be useful to someone backporting the entire set of
XSAVES patches at some point, but it should not be backported
alone.

Ingo said he wanted something like this (bullets 2 and 3):

	http://lkml.kernel.org/r/20150808091508.GB32641@gmail.com

There are currently two xsave buffer formats: standard and
compacted.  The standard format is waht 'XSAVE' and 'XSAVEOPT'
produce while 'XSAVES' and 'XSAVEC' produce a compacted-formet
buffer.  (The kernel never uses XSAVEC)

But, the XSAVES buffer *ALSO* contains "system state components"
which are never saved by a plain XSAVE.  So, XSAVES has two
things that might make its buffer differently-sized from an
XSAVE-produced one.

The current code assumes that an XSAVES buffer's size is simply
the sum of the sizes of the (user) states which are supported.
This seems to work in most cases, but it is not consistent with
what the SDM says, and it breaks if we 'align' a component in the
buffer.  The calculation is also unnecessary work since the CPU
*tells* us the size of the buffer directly.

This patch just reads the size of the buffer right out of the
CPUID leaf instead of trying to derive it.

But, blindly trusting the CPU like this is dangerous.  We add
a verification pass in do_extra_xstate_size_checks() to ensure
that the size we calculate matches with what we see from the
hardware.  When it comes down to it, we trust but verify the
CPU.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: linux-kernel@vger.kernel.org
---

 b/arch/x86/kernel/fpu/xstate.c |  181 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 171 insertions(+), 10 deletions(-)

diff -puN arch/x86/kernel/fpu/xstate.c~fix-xstate_size-calculation arch/x86/kernel/fpu/xstate.c
--- a/arch/x86/kernel/fpu/xstate.c~fix-xstate_size-calculation	2015-08-25 12:50:01.109627463 -0700
+++ b/arch/x86/kernel/fpu/xstate.c	2015-08-25 12:50:01.112627598 -0700
@@ -293,27 +293,188 @@ static void __init setup_init_fpu_buf(vo
 	copy_xregs_to_kernel_booting(&init_fpstate.xsave);
 }
 
+static int xfeature_is_supervisor(int xfeature_nr)
+{
+	/*
+	 * We currently do not suport supervisor states, but if
+	 * we did, we could find out like this.
+	 *
+	 * SDM says: If state component i is a user state component,
+	 * ECX[0] return 0; if state component i is a supervisor
+	 * state component, ECX[0] returns 1.
+	u32 eax, ebx, ecx, edx;
+	cpuid_count(XSTATE_CPUID, xfeature_nr, &eax, &ebx, &ecx, &edx);
+	return !!(ecx & 1);
+	*/
+	return 0;
+}
+/*
+static int xfeature_is_user(int xfeature_nr)
+{
+	return !xfeature_is_supervisor(xfeature_nr);
+}
+*/
+
+/*
+ * This check is important because it is easy to get XSTATE_*
+ * confused with XSTATE_BIT_*.
+ */
+#define CHECK_XFEATURE_NR(nr) do {		\
+	WARN_ON(nr < FIRST_EXTENDED_XFEATURE_NR);	\
+	WARN_ON(nr >= XFEATURES_NR_MAX);	\
+} while (0);
+
+/*
+ * We could cache this like xstate_size[], but we only use
+ * it here, so it would be a waste of space.
+ */
+static int xfeature_is_aligned(int xfeature_nr)
+{
+	u32 eax, ebx, ecx, edx;
+	CHECK_XFEATURE_NR(xfeature_nr);
+	cpuid_count(XSTATE_CPUID, xfeature_nr, &eax, &ebx, &ecx, &edx);
+	/*
+	 * The value returned by ECX[1] indicates the alignment
+	 * of state component i when the compacted format
+	 * of the extended region of an XSAVE area is used
+	 */
+	return !!(ecx & 2);
+}
+
+static int xfeature_uncompacted_offset(int xfeature_nr)
+{
+	u32 eax, ebx, ecx, edx;
+	CHECK_XFEATURE_NR(xfeature_nr);
+	cpuid_count(XSTATE_CPUID, xfeature_nr, &eax, &ebx, &ecx, &edx);
+	return ebx;
+}
+
+static int xfeature_size(int xfeature_nr)
+{
+	u32 eax, ebx, ecx, edx;
+	CHECK_XFEATURE_NR(xfeature_nr);
+	cpuid_count(XSTATE_CPUID, xfeature_nr, &eax, &ebx, &ecx, &edx);
+	return eax;
+}
+
+/*
+ * 'XSAVES' implies two different things:
+ * 1. saving of supervisor/system state
+ * 2. using the compacted format
+ *
+ * Use this function when dealing with the compacted format so
+ * that it is obvious which aspect of 'XSAVES' is being handled
+ * by the calling code.
+ */
+static int using_compacted_format(void)
+{
+	return cpu_has_xsaves;
+}
+
+static void __xstate_dump_leaves(void)
+{
+	int i;
+	u32 eax, ebx, ecx, edx;
+	static int dumped_once = 0;
+
+	if (dumped_once)
+		return;
+	dumped_once = 1;
+	/*
+	 * Dump out a few leaves past the ones that we support
+	 * just in case there are some goodies up there
+	 */
+	for (i = 0; i < XFEATURES_NR_MAX + 10; i++) {
+		cpuid_count(XSTATE_CPUID, i, &eax, &ebx, &ecx, &edx);
+		pr_warn("CPUID[%02x, %02x]: eax=%08x ebx=%08x ecx=%08x edx=%08x\n",
+			XSTATE_CPUID, i, eax, ebx, ecx, edx);
+	}
+}
+
+#define XSTATE_WARN_ON(x) do {							\
+	if (WARN_ONCE(x, "XSAVE consistency problem, dumping leaves")) {	\
+		__xstate_dump_leaves();						\
+	}									\
+} while (0)
+
+/*
+ * This essentially double-checks what the cpu told us about
+ * how large the XSAVE buffer needs to be.  We are recalculating
+ * it to be safe.
+ */
+static void do_extra_xstate_size_checks(void)
+{
+	int paranoid_xstate_size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
+	int i;
+
+	for (i = FIRST_EXTENDED_XFEATURE_NR; i < XFEATURES_NR_MAX; i++) {
+		if (!xfeature_nr_enabled(i))
+			continue;
+		/*
+		 * Supervisor state components can be managed only by
+		 * XSAVES, which is compacted-format only.
+		 */
+		if (!using_compacted_format())
+			XSTATE_WARN_ON(xfeature_is_supervisor(i));
+
+		/* Align from the end of the previous feature */
+		if (xfeature_is_aligned(i))
+			paranoid_xstate_size = ALIGN(paranoid_xstate_size, 64);
+		/*
+		 * The offset of a given state in the non-compacted
+		 * format is given to us in a CPUID leaf.  We check
+		 * them for being ordered (increasing offsets) in
+		 * setup_xstate_features().
+		 */
+		if (!using_compacted_format())
+			paranoid_xstate_size = xfeature_uncompacted_offset(i);
+		/*
+		 * The compacted-format offset always depends on where
+		 * the previous state ended.
+		 */
+		paranoid_xstate_size += xfeature_size(i);
+	}
+	XSTATE_WARN_ON(paranoid_xstate_size != xstate_size);
+}
+
 /*
  * Calculate total size of enabled xstates in XCR0/xfeatures_mask.
+ *
+ * Note the SDM's wording here.  "sub-function 0" only enumerates
+ * the size of the *user* states.  If we use it to size a buffer
+ * that we use 'XSAVES' on, we could potentially overflow the
+ * buffer because 'XSAVES' saves system states too.
+ *
+ * Note that we do not currently set any bits on IA32_XSS so
+ * 'XCR0 | IA32_XSS == XCR0' for now.
  */
 static void __init init_xstate_size(void)
 {
 	unsigned int eax, ebx, ecx, edx;
-	int i;
 
 	if (!cpu_has_xsaves) {
+		/*
+		 * - CPUID function 0DH, sub-function 0:
+		 *    EBX enumerates the size (in bytes) required by
+		 *    the XSAVE instruction for an XSAVE area
+		 *    containing all the *user* state components
+		 *    corresponding to bits currently set in XCR0.
+		 */
 		cpuid_count(XSTATE_CPUID, 0, &eax, &ebx, &ecx, &edx);
 		xstate_size = ebx;
-		return;
-	}
-
-	xstate_size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
-	for (i = FIRST_EXTENDED_XFEATURE_NR; i < 64; i++) {
-		if (xfeature_nr_enabled(i)) {
-			cpuid_count(XSTATE_CPUID, i, &eax, &ebx, &ecx, &edx);
-			xstate_size += eax;
-		}
+	} else {
+		/*
+		 * - CPUID function 0DH, sub-function 1:
+		 *    EBX enumerates the size (in bytes) required by
+		 *    the XSAVES instruction for an XSAVE area
+		 *    containing all the state components
+		 *    corresponding to bits currently set in
+		 *    XCR0 | IA32_XSS.
+		 */
+		cpuid_count(XSTATE_CPUID, 1, &eax, &ebx, &ecx, &edx);
+		xstate_size = ebx;
 	}
+	do_extra_xstate_size_checks();
 }
 
 /*
_

  parent reply	other threads:[~2015-08-25 20:13 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-25 20:12 [PATCH 00/11] x86, fpu: XSAVE cleanups and sanity checks Dave Hansen
2015-08-25 20:12 ` [PATCH 01/11] x86, fpu: kill LWP support Dave Hansen
2015-08-25 20:12 ` [PATCH 02/11] x86, fpu: rename xfeature_bit Dave Hansen
2015-08-26 16:06   ` Borislav Petkov
2015-08-26 16:10     ` Dave Hansen
2015-08-25 20:12 ` [PATCH 04/11] x86, fpu: remove xfeature_nr Dave Hansen
2015-08-25 20:12 ` [PATCH 03/11] x86, fpu: rework XSTATE_* macros to remove magic '2' Dave Hansen
2015-08-25 20:12 ` [PATCH 05/11] x86, fpu: add helper xfeature_nr_enabled() instead of test_bit() Dave Hansen
2015-08-25 20:12 ` [PATCH 07/11] x86, fpu: rework YMM definition Dave Hansen
2015-08-25 20:12 ` [PATCH 06/11] x86, fpu: rework MPX 'xstate' types Dave Hansen
2015-08-25 20:12 ` [PATCH 08/11] x86, fpu: add C structures for AVX-512 state components Dave Hansen
2015-08-25 20:12 ` Dave Hansen [this message]
2015-08-25 20:12 ` [PATCH 10/11] x86, fpu: check to ensure increasing-offset xstate offsets Dave Hansen
2015-08-25 20:12 ` [PATCH 11/11] x86, fpu: check CPU-provided sizes against struct declarations Dave Hansen
2015-08-26 16:18   ` Tim Chen
2015-08-26 16:19     ` Dave Hansen
  -- strict thread matches above, loose matches on Subject: below --
2015-08-27 17:11 [PATCH 00/11] [v2] x86, fpu: XSAVE cleanups and sanity checks Dave Hansen
2015-08-27 17:11 ` [PATCH 09/11] x86, fpu: correct and check XSAVE xstate size calculations Dave Hansen
2015-08-28  4:54   ` Ingo Molnar
2015-08-28 14:31     ` Dave Hansen
2015-08-28 16:44       ` Andi Kleen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150825201205.C590CFEF@viggo.jf.intel.com \
    --to=dave@sr71.net \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=fenghua.yu@intel.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=tim.c.chen@linux.intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.