public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Chang S. Bae" <chang.seok.bae@intel.com>
To: bp@suse.de, luto@kernel.org, tglx@linutronix.de,
	mingo@kernel.org, x86@kernel.org
Cc: len.brown@intel.com, lenb@kernel.org, dave.hansen@intel.com,
	thiago.macieira@intel.com, jing2.liu@intel.com,
	ravi.v.shankar@intel.com, linux-kernel@vger.kernel.org,
	chang.seok.bae@intel.com
Subject: [PATCH v11 09/29] x86/fpu/xstate: Introduce helpers to manage the XSTATE buffer dynamically
Date: Fri,  1 Oct 2021 15:37:08 -0700	[thread overview]
Message-ID: <20211001223728.9309-10-chang.seok.bae@intel.com> (raw)
In-Reply-To: <20211001223728.9309-1-chang.seok.bae@intel.com>

The static XSTATE per-task buffer contains the extended register states --
but it is not expandable at runtime. Introduce runtime methods and a new
fpu struct field to support the expansion.

fpu->state_mask indicates which state components are to be saved in the
XSTATE buffer.

realloc_xstate_buffer() uses vzalloc(). If use of this mechanism grows to
re-allocate buffers larger than 64KB, a more sophisticated allocation
scheme that includes purpose-built reclaim capability might be justified.

Introduce a new helper -- calculate_xstate_buf_size_from_mask() to
calculate the buffer size.

Also, use the new field and helper to initialize the buffer.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes from v10:
* Update comment and a variable name for XSTATE calculation function. (Dave
  Hansen)

Changes from v9:
* Rename and simplify helpers. (Borislav Petkov)
* Add and fix the code comment and the variable name. (Borislav Petkov)
* Use cpu_feature_enabled() instead of boot_cpu_has(). (Borislav Petkov)
* Use fpu->state_mask to ensure states to be written in
  copy_uabi_to_xstate() -- moved from Patch11. (Borislav Petkov)

Changes from v5:
* Added to ensure XSAVES format with current in fpu_reset_fpstate() for new
  base code.

Changes from v3:
* Updated code comments. (Borislav Petkov)
* Used vzalloc() instead of vmalloc() with memset(). (Borislav Petkov)
* Removed the max size check for >64KB. (Borislav Petkov)
* Removed the allocation size check in the helper. (Borislav Petkov)
* Switched the function description in the kernel-doc style.
* Used them for buffer initialization -- moved from the next patch.

Changes from v2:
* Updated the changelog with task->fpu removed. (Borislav Petkov)
* Replaced 'area' with 'buffer' in the comments and the changelog.
* Updated the code comments.

Changes from v1:
* Removed unneeded interrupt masking (Andy Lutomirski)
* Added vmalloc() error tracing (Dave Hansen, PeterZ, and Andy Lutomirski)
---
 arch/x86/include/asm/fpu/types.h  |   7 ++
 arch/x86/include/asm/fpu/xstate.h |   3 +
 arch/x86/kernel/fpu/core.c        |  19 +++--
 arch/x86/kernel/fpu/xstate.c      | 125 ++++++++++++++++++++++++++++++
 4 files changed, 147 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index ad5cbf922e30..0cc9f6c5a10c 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -336,6 +336,13 @@ struct fpu {
 	 */
 	unsigned long			avx512_timestamp;
 
+	/*
+	 * @state_mask:
+	 *
+	 * The bitmap represents state components to be saved in ->state.
+	 */
+	u64				state_mask;
+
 	/*
 	 * @state:
 	 *
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index c4a0914b7717..9574ee20c6aa 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -153,7 +153,10 @@ struct fpu_xstate_buffer_config {
 
 extern struct fpu_xstate_buffer_config fpu_buf_cfg;
 
+unsigned int calculate_xstate_buf_size_from_mask(u64 mask);
 void *get_xsave_addr(struct fpu *fpu, int xfeature_nr);
+int realloc_xstate_buffer(struct fpu *fpu, u64 mask);
+void free_xstate_buffer(union fpregs_state *state);
 int xfeature_size(int xfeature_nr);
 int copy_uabi_from_kernel_to_xstate(struct fpu *fpu, const void *kbuf);
 int copy_sigframe_from_user_to_xstate(struct fpu *fpu, const void __user *ubuf);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 6b55b8c651f6..2941d03912db 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -236,9 +236,8 @@ void fpstate_init(struct fpu *fpu)
 
 	if (likely(fpu)) {
 		state = fpu->state;
-		/* The dynamic user states are not prepared yet. */
-		mask = xfeatures_mask_all & ~xfeatures_mask_user_dynamic;
-		size = fpu_buf_cfg.min_size;
+		mask = fpu->state_mask;
+		size = calculate_xstate_buf_size_from_mask(fpu->state_mask);
 	} else {
 		state = &init_fpstate;
 		mask = xfeatures_mask_all;
@@ -274,14 +273,16 @@ int fpu_clone(struct task_struct *dst)
 	if (!cpu_feature_enabled(X86_FEATURE_FPU))
 		return 0;
 
+	/*
+	 * The child does not inherit the dynamic states. Thus, use the
+	 * buffer embedded in struct task_struct, which has the minimum
+	 * size.
+	 */
+	dst_fpu->state_mask = (xfeatures_mask_all & ~xfeatures_mask_user_dynamic);
 	dst_fpu->state = &dst_fpu->__default_state;
-
 	/*
 	 * Don't let 'init optimized' areas of the XSAVE area
 	 * leak into the child task:
-	 *
-	 * The child does not inherit the dynamic states. So,
-	 * the xstate buffer has the minimum size.
 	 */
 	memset(&dst_fpu->state->xsave, 0, fpu_buf_cfg.min_size);
 
@@ -380,6 +381,10 @@ static void fpu_reset_fpstate(void)
 	 * flush_thread().
 	 */
 	memcpy(fpu->state, &init_fpstate, init_fpstate_copy_size());
+	/* Adjust the xstate buffer format for current. */
+	if (cpu_feature_enabled(X86_FEATURE_XSAVES))
+		fpstate_init_xstate(&fpu->state->xsave, fpu->state_mask);
+
 	set_thread_flag(TIF_NEED_FPU_LOAD);
 	fpregs_unlock();
 }
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 4496750208a8..eafedb58b23b 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -10,6 +10,7 @@
 #include <linux/pkeys.h>
 #include <linux/seq_file.h>
 #include <linux/proc_fs.h>
+#include <linux/vmalloc.h>
 
 #include <asm/fpu/api.h>
 #include <asm/fpu/internal.h>
@@ -19,6 +20,7 @@
 
 #include <asm/tlbflush.h>
 #include <asm/cpufeature.h>
+#include <asm/trace/fpu.h>
 
 /*
  * Although we spell it out in here, the Processor Trace
@@ -76,6 +78,12 @@ static unsigned int xstate_comp_offsets[XFEATURE_MAX] __ro_after_init =
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_supervisor_only_offsets[XFEATURE_MAX] __ro_after_init =
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
+/*
+ * True if the buffer of the corresponding XFEATURE is located on the next 64
+ * byte boundary. Otherwise, it follows the preceding component immediately.
+ */
+static bool xstate_64byte_aligned[XFEATURE_MAX] __ro_after_init =
+	{ [ 0 ... XFEATURE_MAX - 1] = false};
 
 struct fpu_xstate_buffer_config fpu_buf_cfg __ro_after_init;
 EXPORT_SYMBOL_GPL(fpu_buf_cfg);
@@ -131,6 +139,60 @@ static bool xfeature_is_supervisor(int xfeature_nr)
 	return ecx & 1;
 }
 
+/**
+ * calculate_xstate_buf_size_from_mask - Calculate the amount of space
+ *					 needed to store an xstate buffer
+ *					 with the given features
+ * @mask:	The set of components for which the space is needed.
+ *
+ * Consults values populated in setup_xstate_features(). Must be called
+ * after that setup.
+ *
+ * Returns:	The buffer size
+ */
+unsigned int calculate_xstate_buf_size_from_mask(u64 mask)
+{
+	unsigned int size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
+	int i, last_feature_nr;
+
+	if (!mask)
+		return 0;
+
+	/*
+	 * The minimum buffer size excludes the dynamic user state. When a
+	 * task uses the state, the buffer can grow up to the max size.
+	 */
+	if (mask == (xfeatures_mask_all & ~xfeatures_mask_user_dynamic))
+		return fpu_buf_cfg.min_size;
+	else if (mask == xfeatures_mask_all)
+		return fpu_buf_cfg.max_size;
+
+	last_feature_nr = fls64(mask) - 1;
+	if (last_feature_nr < FIRST_EXTENDED_XFEATURE)
+		return size;
+
+	/*
+	 * Each state offset in the non-compacted format is fixed. Take the
+	 * size from the last feature 'nr'.
+	 */
+	if (!cpu_feature_enabled(X86_FEATURE_XSAVES))
+		return xstate_offsets[last_feature_nr] + xstate_sizes[last_feature_nr];
+
+	/*
+	 * With the given mask, no relevant size is found so far. So,
+	 * calculate it by summing up each state size.
+	 */
+	for (i = FIRST_EXTENDED_XFEATURE; i <= last_feature_nr; i++) {
+		if (!(mask & BIT_ULL(i)))
+			continue;
+
+		if (xstate_64byte_aligned[i])
+			size = ALIGN(size, 64);
+		size += xstate_sizes[i];
+	}
+	return size;
+}
+
 /*
  * Enable the extended processor state save/restore feature.
  * Called once per CPU onlining.
@@ -202,6 +264,7 @@ static void __init setup_xstate_features(void)
 			continue;
 
 		xstate_offsets[i] = ebx;
+		xstate_64byte_aligned[i] = (ecx & 2) ? true : false;
 
 		/*
 		 * In our xstate size checks, we assume that the highest-numbered
@@ -805,6 +868,12 @@ void __init fpu__init_system_xstate(void)
 	if (err)
 		goto out_disable;
 
+	/*
+	 * Initially, the FPU buffer used is the static one, without
+	 * dynamic states.
+	 */
+	current->thread.fpu.state_mask = (xfeatures_mask_all & ~xfeatures_mask_user_dynamic);
+
 	/*
 	 * Update info used for ptrace frames; use standard-format size and no
 	 * supervisor xstates:
@@ -995,6 +1064,60 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 }
 #endif /* ! CONFIG_ARCH_HAS_PKEYS */
 
+void free_xstate_buffer(union fpregs_state *state)
+{
+	vfree(state);
+}
+
+/**
+ * realloc_xstate_buffer - Re-alloc a buffer with the size calculated from
+ *			   @mask.
+ *
+ * @fpu:	A struct fpu * pointer
+ * @mask:	The bitmap tells which components to be saved in the new
+ *		buffer.
+ *
+ * It deals with enlarging the xstate buffer with dynamic states.
+ *
+ * Use vzalloc() simply here. If the task with a vzalloc()-allocated buffer
+ * tends to terminate quickly, vfree()-induced IPIs may be a concern.
+ * Caching may be helpful for this. But the task with large state is likely
+ * to live longer.
+ *
+ * Also, this method does not shrink or reclaim the buffer.
+ *
+ * Returns 0 on success, -ENOMEM on allocation error.
+ */
+int realloc_xstate_buffer(struct fpu *fpu, u64 mask)
+{
+	union fpregs_state *state;
+	u64 state_mask;
+
+	state_mask = fpu->state_mask | mask;
+	if ((state_mask & fpu->state_mask) == state_mask)
+		return 0;
+
+	state = vzalloc(calculate_xstate_buf_size_from_mask(state_mask));
+	if (!state)
+		return -ENOMEM;
+
+	if (cpu_feature_enabled(X86_FEATURE_XSAVES))
+		fpstate_init_xstate(&state->xsave, state_mask);
+
+	/* Free the old buffer */
+	if (fpu->state != &fpu->__default_state)
+		free_xstate_buffer(fpu->state);
+
+	/*
+	 * As long as the register state is intact, save the xstate in the
+	 * new buffer at the next context switch or ptrace's context
+	 * injection.
+	 */
+	fpu->state = state;
+	fpu->state_mask = state_mask;
+	return 0;
+}
+
 static void copy_feature(bool from_xstate, struct membuf *to, void *xstate,
 			 void *init_xstate, unsigned int size)
 {
@@ -1147,6 +1270,8 @@ static int copy_uabi_to_xstate(struct fpu *fpu, const void *kbuf,
 	if (validate_user_xstate_header(&hdr))
 		return -EINVAL;
 
+	hdr.xfeatures &= fpu->state_mask;
+
 	/* Validate MXCSR when any of the related features is in use */
 	mask = XFEATURE_MASK_FP | XFEATURE_MASK_SSE | XFEATURE_MASK_YMM;
 	if (hdr.xfeatures & mask) {
-- 
2.17.1


  parent reply	other threads:[~2021-10-01 22:44 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-01 22:36 [PATCH v11 00/29] x86: Support Intel Advanced Matrix Extensions Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 01/29] x86/fpu/xstate: Fix the state copy function to the XSTATE buffer Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 02/29] x86/fpu/xstate: Modify the initialization helper to handle both static and dynamic buffers Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 03/29] x86/fpu/xstate: Modify state copy helpers " Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 04/29] x86/fpu/xstate: Modify address finders " Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 05/29] x86/fpu/xstate: Add a new variable to indicate dynamic user states Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 06/29] x86/fpu/xstate: Add new variables to indicate dynamic XSTATE buffer size Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 07/29] x86/fpu/xstate: Calculate and remember dynamic XSTATE buffer sizes Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 08/29] x86/fpu/xstate: Convert the struct fpu 'state' field to a pointer Chang S. Bae
2021-10-01 22:37 ` Chang S. Bae [this message]
2021-10-01 22:37 ` [PATCH v11 10/29] x86/fpu/xstate: Update the XSTATE save function to support dynamic states Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 11/29] x86/fpu/xstate: Update the XSTATE buffer address finder " Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 12/29] x86/fpu/xstate: Update the XSTATE context copy function " Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 13/29] x86/fpu/xstate: Use feature disable (XFD) to protect dynamic user state Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 14/29] x86/fpu/xstate: Support ptracer-induced XSTATE buffer expansion Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 15/29] x86/arch_prctl: Create ARCH_SET_STATE_ENABLE/ARCH_GET_STATE_ENABLE Chang S. Bae
2021-10-05  0:30   ` Thomas Gleixner
2021-10-05  9:49     ` Thomas Gleixner
2021-10-05 11:23       ` Peter Zijlstra
2021-10-05 12:27         ` Thomas Gleixner
2021-10-01 22:37 ` [PATCH v11 16/29] x86/fpu/xstate: Support both legacy and expanded signal XSTATE size Chang S. Bae
2021-10-05 12:30   ` Thomas Gleixner
2021-10-05 15:19   ` Thomas Gleixner
2021-10-01 22:37 ` [PATCH v11 17/29] x86/fpu/xstate: Adjust the XSAVE feature table to address gaps in state component numbers Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 18/29] x86/fpu/xstate: Disable XSTATE support if an inconsistent state is detected Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 19/29] x86/cpufeatures/amx: Enumerate Advanced Matrix Extension (AMX) feature bits Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 20/29] x86/fpu/amx: Define AMX state components and have it used for boot-time checks Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 21/29] x86/fpu/amx: Initialize child's AMX state Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 22/29] x86/fpu/amx: Enable the AMX feature in 64-bit mode Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 23/29] x86/fpu/xstate: Skip writing zeros to signal frame for dynamic user states if in INIT-state Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 24/29] selftest/x86/amx: Test cases for the AMX state management Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 25/29] x86/insn/amx: Add TILERELEASE instruction to the opcode map Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 26/29] intel_idle/amx: Add SPR support with XTILEDATA capability Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 27/29] x86/fpu/xstate: Add a sanity check for XFD state when saving XSTATE Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 28/29] x86/arch_prctl: ARCH_GET_FEATURES_WITH_KERNEL_ASSISTANCE Chang S. Bae
2021-10-01 22:37 ` [PATCH v11 29/29] x86/arch_prctl: ARCH_SET_STATE_ENABLE_ALLOC Chang S. Bae
2021-10-01 22:47 ` [PATCH v11 00/29] x86: Support Intel Advanced Matrix Extensions Bae, Chang Seok
2021-10-01 22:50 ` Bae, Chang Seok
2021-10-03  1:05   ` Thomas Gleixner
2021-10-04 14:48     ` Bae, Chang Seok
2021-10-02 21:54 ` Thomas Gleixner
2021-10-02 22:11   ` Bae, Chang Seok
2021-10-04 13:44     ` Thomas Gleixner
2021-10-04 14:47       ` Bae, Chang Seok
2021-10-02 22:20   ` Bae, Chang Seok

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211001223728.9309-10-chang.seok.bae@intel.com \
    --to=chang.seok.bae@intel.com \
    --cc=bp@suse.de \
    --cc=dave.hansen@intel.com \
    --cc=jing2.liu@intel.com \
    --cc=len.brown@intel.com \
    --cc=lenb@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=ravi.v.shankar@intel.com \
    --cc=tglx@linutronix.de \
    --cc=thiago.macieira@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox