From: "Chang S. Bae" <chang.seok.bae@intel.com>
To: bp@suse.de, luto@kernel.org, tglx@linutronix.de,
mingo@kernel.org, x86@kernel.org
Cc: len.brown@intel.com, lenb@kernel.org, dave.hansen@intel.com,
thiago.macieira@intel.com, jing2.liu@intel.com,
ravi.v.shankar@intel.com, linux-kernel@vger.kernel.org,
chang.seok.bae@intel.com
Subject: [PATCH v11 09/29] x86/fpu/xstate: Introduce helpers to manage the XSTATE buffer dynamically
Date: Fri, 1 Oct 2021 15:37:08 -0700
Message-ID: <20211001223728.9309-10-chang.seok.bae@intel.com>
In-Reply-To: <20211001223728.9309-1-chang.seok.bae@intel.com>
The static per-task XSTATE buffer holds the extended register states, but
it cannot be expanded at runtime. Introduce runtime helpers and a new
struct fpu field to support the expansion.
fpu->state_mask indicates which state components are to be saved in the
XSTATE buffer.
realloc_xstate_buffer() uses vzalloc(). If use of this mechanism grows to
re-allocate buffers larger than 64KB, a more sophisticated allocation
scheme that includes purpose-built reclaim capability might be justified.
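The grow-only behaviour of realloc_xstate_buffer() can be illustrated with
a user-space sketch. This is not the kernel code: struct sketch_fpu,
sketch_buf_size() and sketch_realloc() are hypothetical names, calloc() and
free() stand in for vzalloc() and vfree(), and the size helper uses made-up
numbers. The XSAVES header initialization is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct fpu; sizes are illustrative only. */
struct sketch_fpu {
	uint64_t state_mask;			/* components held in ->state */
	void *state;				/* current buffer */
	unsigned char default_state[576];	/* embedded minimum buffer */
};

/* Hypothetical size helper: 576 bytes plus 64 bytes per set bit. */
static size_t sketch_buf_size(uint64_t mask)
{
	size_t size = 576;

	for (; mask; mask >>= 1)
		size += (size_t)(mask & 1) * 64;
	return size;
}

/* Grow the buffer if @mask asks for components not yet covered. */
static int sketch_realloc(struct sketch_fpu *fpu, uint64_t mask)
{
	uint64_t new_mask = fpu->state_mask | mask;
	void *state;

	/* Nothing new requested: keep the current buffer. */
	if (new_mask == fpu->state_mask)
		return 0;

	/* Zeroed allocation, as vzalloc() provides in the kernel. */
	state = calloc(1, sketch_buf_size(new_mask));
	if (!state)
		return -ENOMEM;

	/* The embedded buffer is never freed; only a prior allocation is. */
	if (fpu->state != fpu->default_state)
		free(fpu->state);

	fpu->state = state;
	fpu->state_mask = new_mask;
	return 0;
}
```

As in the patch, the old contents are not copied: the sketch only swaps
the buffer and records the wider mask, and it never shrinks.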
Introduce a new helper, calculate_xstate_buf_size_from_mask(), to
calculate the buffer size.
Also, use the new field and helper to initialize the buffer.
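For the compacted (XSAVES) format, the size calculation reduces to walking
the mask and summing component sizes, rounding up to 64 bytes where a
component requires it. The following is a minimal user-space sketch of that
loop only; the tables and SKETCH_* names are hypothetical, the sizes are
invented for illustration (the kernel obtains them from CPUID), and the
min/max-size shortcuts and the non-compacted path are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_LEGACY_SIZE	512u	/* legacy FXSAVE region */
#define SKETCH_HEADER_SIZE	64u	/* XSAVE header */
#define SKETCH_FIRST_EXT	2	/* first extended component */
#define SKETCH_NR_FEATURES	5

/* Invented per-component sizes; components 0 and 1 live in the legacy region. */
static const unsigned int sketch_sizes[SKETCH_NR_FEATURES] = {
	[2] = 256, [3] = 8, [4] = 64,
};

/* Invented alignment flags: only component 4 starts on a 64-byte boundary. */
static const bool sketch_aligned[SKETCH_NR_FEATURES] = {
	[4] = true,
};

static unsigned int sketch_compacted_size(uint64_t mask)
{
	unsigned int size = SKETCH_LEGACY_SIZE + SKETCH_HEADER_SIZE;
	int i;

	if (!mask)
		return 0;

	for (i = SKETCH_FIRST_EXT; i < SKETCH_NR_FEATURES; i++) {
		if (!(mask & (UINT64_C(1) << i)))
			continue;

		/* Round up before placing an aligned component. */
		if (sketch_aligned[i])
			size = (size + 63u) & ~63u;
		size += sketch_sizes[i];
	}

	/* Any non-empty mask needs at least the legacy region and header. */
	assert(size >= SKETCH_LEGACY_SIZE + SKETCH_HEADER_SIZE);
	return size;
}
```

With these invented tables, a mask of 0x1f gives 960 bytes rather than
904, because component 4 is pushed from offset 840 to offset 896.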
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes from v10:
* Update comment and a variable name for XSTATE calculation function. (Dave
Hansen)
Changes from v9:
* Rename and simplify helpers. (Borislav Petkov)
* Add and fix the code comment and the variable name. (Borislav Petkov)
* Use cpu_feature_enabled() instead of boot_cpu_has(). (Borislav Petkov)
* Use fpu->state_mask to restrict which states are written in
copy_uabi_to_xstate(); moved from Patch11. (Borislav Petkov)
Changes from v5:
* Added code to ensure the XSAVES format for current in
fpu_reset_fpstate(), for the new base code.
Changes from v3:
* Updated code comments. (Borislav Petkov)
* Used vzalloc() instead of vmalloc() with memset(). (Borislav Petkov)
* Removed the max size check for >64KB. (Borislav Petkov)
* Removed the allocation size check in the helper. (Borislav Petkov)
* Switched the function description to the kernel-doc style.
* Used them for buffer initialization -- moved from the next patch.
Changes from v2:
* Updated the changelog with task->fpu removed. (Borislav Petkov)
* Replaced 'area' with 'buffer' in the comments and the changelog.
* Updated the code comments.
Changes from v1:
* Removed unneeded interrupt masking (Andy Lutomirski)
* Added vmalloc() error tracing (Dave Hansen, PeterZ, and Andy Lutomirski)
---
arch/x86/include/asm/fpu/types.h | 7 ++
arch/x86/include/asm/fpu/xstate.h | 3 +
arch/x86/kernel/fpu/core.c | 19 +++--
arch/x86/kernel/fpu/xstate.c | 125 ++++++++++++++++++++++++++++++
4 files changed, 147 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index ad5cbf922e30..0cc9f6c5a10c 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -336,6 +336,13 @@ struct fpu {
*/
unsigned long avx512_timestamp;
+ /*
+ * @state_mask:
+ *
+ * The bitmap represents state components to be saved in ->state.
+ */
+ u64 state_mask;
+
/*
* @state:
*
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index c4a0914b7717..9574ee20c6aa 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -153,7 +153,10 @@ struct fpu_xstate_buffer_config {
extern struct fpu_xstate_buffer_config fpu_buf_cfg;
+unsigned int calculate_xstate_buf_size_from_mask(u64 mask);
void *get_xsave_addr(struct fpu *fpu, int xfeature_nr);
+int realloc_xstate_buffer(struct fpu *fpu, u64 mask);
+void free_xstate_buffer(union fpregs_state *state);
int xfeature_size(int xfeature_nr);
int copy_uabi_from_kernel_to_xstate(struct fpu *fpu, const void *kbuf);
int copy_sigframe_from_user_to_xstate(struct fpu *fpu, const void __user *ubuf);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 6b55b8c651f6..2941d03912db 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -236,9 +236,8 @@ void fpstate_init(struct fpu *fpu)
if (likely(fpu)) {
state = fpu->state;
- /* The dynamic user states are not prepared yet. */
- mask = xfeatures_mask_all & ~xfeatures_mask_user_dynamic;
- size = fpu_buf_cfg.min_size;
+ mask = fpu->state_mask;
+ size = calculate_xstate_buf_size_from_mask(fpu->state_mask);
} else {
state = &init_fpstate;
mask = xfeatures_mask_all;
@@ -274,14 +273,16 @@ int fpu_clone(struct task_struct *dst)
if (!cpu_feature_enabled(X86_FEATURE_FPU))
return 0;
+ /*
+ * The child does not inherit the dynamic states. Thus, use the
+ * buffer embedded in struct task_struct, which has the minimum
+ * size.
+ */
+ dst_fpu->state_mask = (xfeatures_mask_all & ~xfeatures_mask_user_dynamic);
dst_fpu->state = &dst_fpu->__default_state;
-
/*
* Don't let 'init optimized' areas of the XSAVE area
* leak into the child task:
- *
- * The child does not inherit the dynamic states. So,
- * the xstate buffer has the minimum size.
*/
memset(&dst_fpu->state->xsave, 0, fpu_buf_cfg.min_size);
@@ -380,6 +381,10 @@ static void fpu_reset_fpstate(void)
* flush_thread().
*/
memcpy(fpu->state, &init_fpstate, init_fpstate_copy_size());
+ /* Adjust the xstate buffer format for current. */
+ if (cpu_feature_enabled(X86_FEATURE_XSAVES))
+ fpstate_init_xstate(&fpu->state->xsave, fpu->state_mask);
+
set_thread_flag(TIF_NEED_FPU_LOAD);
fpregs_unlock();
}
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 4496750208a8..eafedb58b23b 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -10,6 +10,7 @@
#include <linux/pkeys.h>
#include <linux/seq_file.h>
#include <linux/proc_fs.h>
+#include <linux/vmalloc.h>
#include <asm/fpu/api.h>
#include <asm/fpu/internal.h>
@@ -19,6 +20,7 @@
#include <asm/tlbflush.h>
#include <asm/cpufeature.h>
+#include <asm/trace/fpu.h>
/*
* Although we spell it out in here, the Processor Trace
@@ -76,6 +78,12 @@ static unsigned int xstate_comp_offsets[XFEATURE_MAX] __ro_after_init =
{ [ 0 ... XFEATURE_MAX - 1] = -1};
static unsigned int xstate_supervisor_only_offsets[XFEATURE_MAX] __ro_after_init =
{ [ 0 ... XFEATURE_MAX - 1] = -1};
+/*
+ * True if the buffer of the corresponding XFEATURE is located on the next 64
+ * byte boundary. Otherwise, it follows the preceding component immediately.
+ */
+static bool xstate_64byte_aligned[XFEATURE_MAX] __ro_after_init =
+ { [ 0 ... XFEATURE_MAX - 1] = false};
struct fpu_xstate_buffer_config fpu_buf_cfg __ro_after_init;
EXPORT_SYMBOL_GPL(fpu_buf_cfg);
@@ -131,6 +139,60 @@ static bool xfeature_is_supervisor(int xfeature_nr)
return ecx & 1;
}
+/**
+ * calculate_xstate_buf_size_from_mask - Calculate the amount of space
+ * needed to store an xstate buffer
+ * with the given features
+ * @mask: The set of components for which the space is needed.
+ *
+ * Consults values populated in setup_xstate_features(). Must be called
+ * after that setup.
+ *
+ * Returns: The buffer size
+ */
+unsigned int calculate_xstate_buf_size_from_mask(u64 mask)
+{
+ unsigned int size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
+ int i, last_feature_nr;
+
+ if (!mask)
+ return 0;
+
+ /*
+ * The minimum buffer size excludes the dynamic user state. When a
+ * task uses the state, the buffer can grow up to the max size.
+ */
+ if (mask == (xfeatures_mask_all & ~xfeatures_mask_user_dynamic))
+ return fpu_buf_cfg.min_size;
+ else if (mask == xfeatures_mask_all)
+ return fpu_buf_cfg.max_size;
+
+ last_feature_nr = fls64(mask) - 1;
+ if (last_feature_nr < FIRST_EXTENDED_XFEATURE)
+ return size;
+
+ /*
+ * Each state offset in the non-compacted format is fixed. Take the
+ * size from the last feature 'nr'.
+ */
+ if (!cpu_feature_enabled(X86_FEATURE_XSAVES))
+ return xstate_offsets[last_feature_nr] + xstate_sizes[last_feature_nr];
+
+ /*
+ * With the given mask, no relevant size is found so far. So,
+ * calculate it by summing up each state size.
+ */
+ for (i = FIRST_EXTENDED_XFEATURE; i <= last_feature_nr; i++) {
+ if (!(mask & BIT_ULL(i)))
+ continue;
+
+ if (xstate_64byte_aligned[i])
+ size = ALIGN(size, 64);
+ size += xstate_sizes[i];
+ }
+ return size;
+}
+
/*
* Enable the extended processor state save/restore feature.
* Called once per CPU onlining.
@@ -202,6 +264,7 @@ static void __init setup_xstate_features(void)
continue;
xstate_offsets[i] = ebx;
+ xstate_64byte_aligned[i] = (ecx & 2) ? true : false;
/*
* In our xstate size checks, we assume that the highest-numbered
@@ -805,6 +868,12 @@ void __init fpu__init_system_xstate(void)
if (err)
goto out_disable;
+ /*
+ * Initially, the FPU buffer used is the static one, without
+ * dynamic states.
+ */
+ current->thread.fpu.state_mask = (xfeatures_mask_all & ~xfeatures_mask_user_dynamic);
+
/*
* Update info used for ptrace frames; use standard-format size and no
* supervisor xstates:
@@ -995,6 +1064,60 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
}
#endif /* ! CONFIG_ARCH_HAS_PKEYS */
+void free_xstate_buffer(union fpregs_state *state)
+{
+ vfree(state);
+}
+
+/**
+ * realloc_xstate_buffer - Re-alloc a buffer with the size calculated from
+ * @mask.
+ *
+ * @fpu: A struct fpu * pointer
+ * @mask: The bitmap tells which components to be saved in the new
+ * buffer.
+ *
+ * It deals with enlarging the xstate buffer with dynamic states.
+ *
+ * Use vzalloc() simply here. If the task with a vzalloc()-allocated buffer
+ * tends to terminate quickly, vfree()-induced IPIs may be a concern.
+ * Caching may be helpful for this. But the task with large state is likely
+ * to live longer.
+ *
+ * Also, this method does not shrink or reclaim the buffer.
+ *
+ * Returns 0 on success, -ENOMEM on allocation error.
+ */
+int realloc_xstate_buffer(struct fpu *fpu, u64 mask)
+{
+ union fpregs_state *state;
+ u64 state_mask;
+
+ state_mask = fpu->state_mask | mask;
+ if ((state_mask & fpu->state_mask) == state_mask)
+ return 0;
+
+ state = vzalloc(calculate_xstate_buf_size_from_mask(state_mask));
+ if (!state)
+ return -ENOMEM;
+
+ if (cpu_feature_enabled(X86_FEATURE_XSAVES))
+ fpstate_init_xstate(&state->xsave, state_mask);
+
+ /* Free the old buffer */
+ if (fpu->state != &fpu->__default_state)
+ free_xstate_buffer(fpu->state);
+
+ /*
+ * As long as the register state is intact, save the xstate in the
+ * new buffer at the next context switch or ptrace's context
+ * injection.
+ */
+ fpu->state = state;
+ fpu->state_mask = state_mask;
+ return 0;
+}
+
static void copy_feature(bool from_xstate, struct membuf *to, void *xstate,
void *init_xstate, unsigned int size)
{
@@ -1147,6 +1270,8 @@ static int copy_uabi_to_xstate(struct fpu *fpu, const void *kbuf,
if (validate_user_xstate_header(&hdr))
return -EINVAL;
+ hdr.xfeatures &= fpu->state_mask;
+
/* Validate MXCSR when any of the related features is in use */
mask = XFEATURE_MASK_FP | XFEATURE_MASK_SSE | XFEATURE_MASK_YMM;
if (hdr.xfeatures & mask) {
--
2.17.1