* [PATCH v3 Bugfix 0/6] xstate/fpu bug fixes
@ 2015-05-08 21:30 Fenghua Yu
2015-05-08 21:31 ` [PATCH v3 Bugfix 1/6] x86/xsave.c: Fix xstate offsets and sizes enumeration Fenghua Yu
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: Fenghua Yu @ 2015-05-08 21:30 UTC (permalink / raw)
To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Dave Hansen,
Asit K Mallick, Glenn Williamson
Cc: linux-kernel, x86, Fenghua Yu
From: Fenghua Yu <fenghua.yu@intel.com>
This patchset is supposed to fix some xsave/xsaves/fpu related issues.
We may hit the issues on either existing platforms or upcoming platforms.
We had better to have the patches in upstream and backport them to stable
kernel and distros.
The patch 1/6 fixes an xstate offsets and sizes enumeration issue. During
enumerating offsets and sizes starting from 2 to the last enabled feature,
if one xstate's size is 0, current code thinks there is no other xstate
after this xstate and breaks from enumeration. This is not true because
architecturally it's possible to have a few xstates disabled between
xstate 2 and the last enabled xstate. The offsets and sizes of
the xstates that are not enumerated after the disabled xstate will be
consumed and cause issues in runtime.
The patch 2/6 introduces a new global variable "user_xstate_size". This
variable is used for standard formatted xsave area size in signal frame.
Current code incorrectly uses the smaller compacted formatted xsave area
size for signal frame and will cause issues in xstate access in signal
frame.
The patch 3/6 is not fixing a bug. But it renames "xstate_size" to
"kernel_xstate_size" to explicitly distinguish between xstate size in
kernel space and the one in user space. It just makes kernel code more
clear.
The patch 4/6 claims that the structure of xsave_struct is
non-architectural and fields/xstates in the structure is not defined
in compilation time. No new states should be added in xsave_struct.
The xsave area should be constructed during kernel booting time.
The patch 5/6 clears xstate_bv so that init optimization in hardware
can take action. Without the patch, some xstates are always not in
init status and this will impact badly on performance of context
switch.
The patch 6/6 introduces a correct check for user_has_fpu check.
Changes in v3:
1/6: In description, add that Ingo has a same patch in his xstate/fpu
overall clean up patchset.
2/6: Remove copy_to_user_xstate(). Now copy compact format xsave
area directly from processor to user buffer in 6/6.
Initialize user_xstate_size in init_thread_xstate().
3/6: Add Dave Hansen's credit in description.
5/6: Add this new patch for performance issue.
6/6: Add this new patch for a new user_has_fpu check to allow copy
compact format xsave area directly from processor to user buffer.
Fenghua Yu (6):
x86/xsave.c: Fix xstate offsets and sizes enumeration
x86/xsaves: Define and use user_xstate_size for xstate size in signal
context
x86/xsaves: Rename xstate_size to kernel_xstate_size to explicitly
distinguish xstate size in kernel from user space
x86/xsave: Don't add new states in xsave_struct
x86/xsaves: Keep xstate_bv in init_xstate_buf header as zero for init
optimimization
x86/xsave.c: Introduce a new check that allows correct xstates copy
from kernel to user directly
arch/x86/include/asm/fpu-internal.h | 7 +--
arch/x86/include/asm/processor.h | 23 +++-----
arch/x86/include/asm/xsave.h | 1 -
arch/x86/kernel/i387.c | 21 ++++----
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/xsave.c | 105 ++++++++++++++++++++++++++----------
6 files changed, 102 insertions(+), 57 deletions(-)
--
1.8.1.2
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v3 Bugfix 1/6] x86/xsave.c: Fix xstate offsets and sizes enumeration
2015-05-08 21:30 [PATCH v3 Bugfix 0/6] xstate/fpu bug fixes Fenghua Yu
@ 2015-05-08 21:31 ` Fenghua Yu
2015-05-08 21:31 ` [PATCH v3 Bugfix 2/6] x86/xsaves: Define and use user_xstate_size for xstate size in signal context Fenghua Yu
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Fenghua Yu @ 2015-05-08 21:31 UTC (permalink / raw)
To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Dave Hansen,
Asit K Mallick, Glenn Williamson
Cc: linux-kernel, x86, Fenghua Yu
From: Fenghua Yu <fenghua.yu@intel.com>
When enumerating xstate offsets and sizes from cpuid (eax=0x0d, ecx>=2),
it's possible that state m is not implemented while state n (n>m)
is implemented. So enumeration shouldn't stop at state m.
There is no platform configured like above yet. But this could be a problem
in the future. For example, suppose XCR0=0xe7, that means FP, SSE, AVX, and
AVX-512 states are enabled and MPX states (bit 3 and 4) are not enabled.
Then in setup_xstate_features(), after finding BNDREGS size is 0 (i.e. eax
from CPUID xstate subleaf 3, break from the for loop. That stops finding
xstate_offsets and xstate_sizes for AVX-512. Later on incorrect
xstate_offsets and xstate_sizes for AVX-512 will be used in a few places
and will causes issues.
This patch enumerates xstate offsets and sizes for all kernel supported
xstates. If a state is not implemented in hardware or not enabled in XCR0,
its size is set as zero and its offset is read from cpuid.
Ingo is rewriting fpu/xstate and his big patchset includes this patch at:
https://lkml.org/lkml/2015/5/5/892
I still send this patch in the patchset because we need to fix this bug
in upstream ASAP.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
---
arch/x86/kernel/xsave.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kernel/xsave.c b/arch/x86/kernel/xsave.c
index 87a815b..3c0a9d1 100644
--- a/arch/x86/kernel/xsave.c
+++ b/arch/x86/kernel/xsave.c
@@ -465,23 +465,18 @@ static inline void xstate_enable(void)
*/
static void __init setup_xstate_features(void)
{
- int eax, ebx, ecx, edx, leaf = 0x2;
+ int eax, ebx, ecx, edx, leaf;
xstate_features = fls64(pcntxt_mask);
xstate_offsets = alloc_bootmem(xstate_features * sizeof(int));
xstate_sizes = alloc_bootmem(xstate_features * sizeof(int));
- do {
+ for (leaf = 2; leaf < xstate_features; leaf++) {
cpuid_count(XSTATE_CPUID, leaf, &eax, &ebx, &ecx, &edx);
- if (eax == 0)
- break;
-
xstate_offsets[leaf] = ebx;
xstate_sizes[leaf] = eax;
-
- leaf++;
- } while (1);
+ }
}
/*
--
1.8.1.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v3 Bugfix 2/6] x86/xsaves: Define and use user_xstate_size for xstate size in signal context
2015-05-08 21:30 [PATCH v3 Bugfix 0/6] xstate/fpu bug fixes Fenghua Yu
2015-05-08 21:31 ` [PATCH v3 Bugfix 1/6] x86/xsave.c: Fix xstate offsets and sizes enumeration Fenghua Yu
@ 2015-05-08 21:31 ` Fenghua Yu
2015-05-08 21:31 ` [PATCH v3 Bugfix 3/6] x86/xsaves: Rename xstate_size to kernel_xstate_size to explicitly distinguish xstate size in kernel from user space Fenghua Yu
` (4 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Fenghua Yu @ 2015-05-08 21:31 UTC (permalink / raw)
To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Dave Hansen,
Asit K Mallick, Glenn Williamson
Cc: linux-kernel, x86, Fenghua Yu
From: Fenghua Yu <fenghua.yu@intel.com>
If "xsaves" is enabled, kernel always uses compact format of xsave area.
But user space still uses standard format of xsave area. Thus, xstate size
in kernel's xsave area is smaller than xstate size in user's xsave area.
xstate in signal frame should be in standard format for user's signal
handler to access.
In no "xsaves" case, xsave area in both user space and kernel space are in
standard format. Therefore, user's and kernel's xstate sizes are equal.
In "xsaves" case, xsave area in user space is in standard format while
xsave area in kernel space is in compact format. Therefore, kernel's
xstate size is less than user's xstate size.
So here is the problem: currently kernel uses the kernel's xstate size
for xstate size in signal frame. This is not a problem in no "xsaves" case.
But it is an issue in "xsaves" case because kernel's xstate size is smaller
than user's xstate size. When setting up signal math frame in
alloc_ mathframe(), the fpstate is in standard format; but a smaller size
of fpstate buffer is allocated in signal frame for standard format
xstate. Then kernel saves only part of xstate registers into this smaller
user's fpstate buffer and user will see part of the xstate registers in
signal context. Similar issue happens after returning from signal handler:
kernel will only restore part of xstate registers from user's fpstate
buffer in signal frame.
This patch defines and uses user_xstate_size for xstate size in signal
frame. It's read from returned value in ebx from CPUID leaf 0x0D subleaf
0x0. This is maximum size required by enabled states in XCR0 and may be
different from ecx when states at the end of the xsave area are not
enabled. This value indicates the size required for XSAVE to save all
supported user states in legacy/standard format.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
---
arch/x86/include/asm/fpu-internal.h | 3 ++-
arch/x86/include/asm/processor.h | 1 +
arch/x86/include/asm/xsave.h | 1 -
arch/x86/kernel/i387.c | 3 +++
arch/x86/kernel/xsave.c | 29 +++++++++++++++++++++--------
5 files changed, 27 insertions(+), 10 deletions(-)
diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index da5e967..c00c769 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -496,7 +496,8 @@ extern int __restore_xstate_sig(void __user *buf, void __user *fx, int size);
static inline int xstate_sigframe_size(void)
{
- return use_xsave() ? xstate_size + FP_XSTATE_MAGIC2_SIZE : xstate_size;
+ return use_xsave() ? user_xstate_size + FP_XSTATE_MAGIC2_SIZE :
+ user_xstate_size;
}
static inline int restore_xstate_sig(void __user *buf, int ia32_frame)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 23ba676..576ff8c 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -483,6 +483,7 @@ DECLARE_PER_CPU(struct irq_stack *, softirq_stack);
#endif /* X86_64 */
extern unsigned int xstate_size;
+extern unsigned int user_xstate_size;
extern void free_thread_xstate(struct task_struct *);
extern struct kmem_cache *task_xstate_cachep;
diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index c9a6d68..7799d18 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -44,7 +44,6 @@
#define REX_PREFIX
#endif
-extern unsigned int xstate_size;
extern u64 pcntxt_mask;
extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];
extern struct xsave_struct *init_xstate_buf;
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 00918327..8a7b96b 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -166,6 +166,7 @@ static void init_thread_xstate(void)
setup_clear_cpu_cap(X86_FEATURE_XSAVE);
setup_clear_cpu_cap(X86_FEATURE_XSAVEOPT);
xstate_size = sizeof(struct i387_soft_struct);
+ user_xstate_size = xstate_size;
return;
}
@@ -173,6 +174,8 @@ static void init_thread_xstate(void)
xstate_size = sizeof(struct i387_fxsave_struct);
else
xstate_size = sizeof(struct i387_fsave_struct);
+
+ user_xstate_size = xstate_size;
}
/*
diff --git a/arch/x86/kernel/xsave.c b/arch/x86/kernel/xsave.c
index 3c0a9d1..f99a6b7 100644
--- a/arch/x86/kernel/xsave.c
+++ b/arch/x86/kernel/xsave.c
@@ -29,6 +29,7 @@ static struct _fpx_sw_bytes fx_sw_reserved, fx_sw_reserved_ia32;
static unsigned int *xstate_offsets, *xstate_sizes;
static unsigned int xstate_comp_offsets[sizeof(pcntxt_mask)*8];
static unsigned int xstate_features;
+unsigned int user_xstate_size;
/*
* If a processor implementation discern that a processor state component is
@@ -85,7 +86,7 @@ void __sanitize_i387_state(struct task_struct *tsk)
*/
while (xstate_bv) {
if (xstate_bv & 0x1) {
- int offset = xstate_offsets[feature_bit];
+ int offset = xstate_comp_offsets[feature_bit];
int size = xstate_sizes[feature_bit];
memcpy(((void *) fx) + offset,
@@ -116,7 +117,7 @@ static inline int check_for_xstate(struct i387_fxsave_struct __user *buf,
/* Check for the first magic field and other error scenarios. */
if (fx_sw->magic1 != FP_XSTATE_MAGIC1 ||
fx_sw->xstate_size < min_xstate_size ||
- fx_sw->xstate_size > xstate_size ||
+ fx_sw->xstate_size > user_xstate_size ||
fx_sw->xstate_size > fx_sw->extended_size)
return -1;
@@ -173,7 +174,8 @@ static inline int save_xstate_epilog(void __user *buf, int ia32_frame)
if (!use_xsave())
return err;
- err |= __put_user(FP_XSTATE_MAGIC2, (__u32 *)(buf + xstate_size));
+ err |= __put_user(FP_XSTATE_MAGIC2,
+ (__u32 *)(buf + user_xstate_size));
/*
* Read the xstate_bv which we copied (directly from the cpu or
@@ -210,7 +212,7 @@ static inline int save_user_xstate(struct xsave_struct __user *buf)
else
err = fsave_user((struct i387_fsave_struct __user *) buf);
- if (unlikely(err) && __clear_user(buf, xstate_size))
+ if (unlikely(err) && __clear_user(buf, user_xstate_size))
err = -EFAULT;
return err;
}
@@ -260,8 +262,16 @@ int save_xstate_sig(void __user *buf, void __user *buf_fx, int size)
if (ia32_fxstate)
fpu_fxsave(&tsk->thread.fpu);
} else {
+ /*
+ * When we come here, kernel uses standard format for
+ * xsave area and xstate size should be the same for
+ * both kernel and user space. Warn if it's not the
+ * case.
+ */
+ WARN_ONCE(user_xstate_size != kernel_xstate_size,
+ "Wrong kernel xstate size!\n");
sanitize_i387_state(tsk);
- if (__copy_to_user(buf_fx, xsave, xstate_size))
+ if (__copy_to_user(buf_fx, xsave, user_xstate_size))
return -1;
}
@@ -434,7 +444,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
static void prepare_fx_sw_frame(void)
{
int fsave_header_size = sizeof(struct i387_fsave_struct);
- int size = xstate_size + FP_XSTATE_MAGIC2_SIZE;
+ int size = user_xstate_size + FP_XSTATE_MAGIC2_SIZE;
if (config_enabled(CONFIG_X86_32))
size += fsave_header_size;
@@ -442,7 +452,7 @@ static void prepare_fx_sw_frame(void)
fx_sw_reserved.magic1 = FP_XSTATE_MAGIC1;
fx_sw_reserved.extended_size = size;
fx_sw_reserved.xstate_bv = pcntxt_mask;
- fx_sw_reserved.xstate_size = xstate_size;
+ fx_sw_reserved.xstate_size = user_xstate_size;
if (config_enabled(CONFIG_IA32_EMULATION)) {
fx_sw_reserved_ia32 = fx_sw_reserved;
@@ -576,14 +586,18 @@ __setup("eagerfpu=", eager_fpu_setup);
/*
* Calculate total size of enabled xstates in XCR0/pcntxt_mask.
+ *
+ * XCR0/pcntxt_mask should not be changed after this.
*/
static void __init init_xstate_size(void)
{
unsigned int eax, ebx, ecx, edx;
int i;
+ cpuid_count(XSTATE_CPUID, 0, &eax, &ebx, &ecx, &edx);
+ user_xstate_size = ebx;
+
if (!cpu_has_xsaves) {
- cpuid_count(XSTATE_CPUID, 0, &eax, &ebx, &ecx, &edx);
xstate_size = ebx;
return;
}
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v3 Bugfix 3/6] x86/xsaves: Rename xstate_size to kernel_xstate_size to explicitly distinguish xstate size in kernel from user space
2015-05-08 21:30 [PATCH v3 Bugfix 0/6] xstate/fpu bug fixes Fenghua Yu
2015-05-08 21:31 ` [PATCH v3 Bugfix 1/6] x86/xsave.c: Fix xstate offsets and sizes enumeration Fenghua Yu
2015-05-08 21:31 ` [PATCH v3 Bugfix 2/6] x86/xsaves: Define and use user_xstate_size for xstate size in signal context Fenghua Yu
@ 2015-05-08 21:31 ` Fenghua Yu
2015-05-08 21:31 ` [PATCH v3 Bugfix 4/6] x86/xsave: Don't add new states in xsave_struct Fenghua Yu
` (3 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Fenghua Yu @ 2015-05-08 21:31 UTC (permalink / raw)
To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Dave Hansen,
Asit K Mallick, Glenn Williamson
Cc: linux-kernel, x86, Fenghua Yu
From: Fenghua Yu <fenghua.yu@intel.com>
User space uses standard format xsave area. fpstate in signal frame should
have standard format size.
To explicitly distinguish between xstate size in kernel space and the one
in user space, we rename xstate_size to kernel_xstate_size. This patch is
not fixing a bug. It just makes kernel code more clear.
So we define the xsave area sizes in two global variables:
kernel_xstate_size (previous xstate_size): the xsave area size used in
xsave area allocated in kernel
user_xstate_size: the xsave area size used in xsave area used by user.
In no "xsaves" case, xsave area in both user space and kernel space are in
standard format. Therefore, kernel_xstate_size and user_xstate_size are
equal.
In "xsaves" case, xsave area in user space is in standard format while
xsave area in kernel space is in compact format. Therefore, kernel's
xstate size is less than user's xstate size.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
---
arch/x86/include/asm/fpu-internal.h | 4 ++--
arch/x86/include/asm/processor.h | 2 +-
arch/x86/kernel/i387.c | 22 +++++++++++-----------
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/xsave.c | 14 +++++++-------
5 files changed, 22 insertions(+), 22 deletions(-)
diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index c00c769..5d9ba0c 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -597,14 +597,14 @@ static inline void fpu_free(struct fpu *fpu)
static inline void fpu_copy(struct task_struct *dst, struct task_struct *src)
{
if (use_eager_fpu()) {
- memset(&dst->thread.fpu.state->xsave, 0, xstate_size);
+ memset(&dst->thread.fpu.state->xsave, 0, kernel_xstate_size);
__save_fpu(dst);
} else {
struct fpu *dfpu = &dst->thread.fpu;
struct fpu *sfpu = &src->thread.fpu;
unlazy_fpu(src);
- memcpy(dfpu->state, sfpu->state, xstate_size);
+ memcpy(dfpu->state, sfpu->state, kernel_xstate_size);
}
}
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 576ff8c..f26051b 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -482,7 +482,7 @@ DECLARE_PER_CPU(struct irq_stack *, hardirq_stack);
DECLARE_PER_CPU(struct irq_stack *, softirq_stack);
#endif /* X86_64 */
-extern unsigned int xstate_size;
+extern unsigned int kernel_xstate_size;
extern unsigned int user_xstate_size;
extern void free_thread_xstate(struct task_struct *);
extern struct kmem_cache *task_xstate_cachep;
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 8a7b96b..1eba4f2 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -133,8 +133,8 @@ void unlazy_fpu(struct task_struct *tsk)
EXPORT_SYMBOL(unlazy_fpu);
unsigned int mxcsr_feature_mask __read_mostly = 0xffffffffu;
-unsigned int xstate_size;
-EXPORT_SYMBOL_GPL(xstate_size);
+unsigned int kernel_xstate_size;
+EXPORT_SYMBOL_GPL(kernel_xstate_size);
static struct i387_fxsave_struct fx_scratch;
static void mxcsr_feature_mask_init(void)
@@ -154,7 +154,7 @@ static void mxcsr_feature_mask_init(void)
static void init_thread_xstate(void)
{
/*
- * Note that xstate_size might be overwriten later during
+ * Note that kernel_xstate_size might be overwriten later during
* xsave_init().
*/
@@ -165,17 +165,17 @@ static void init_thread_xstate(void)
*/
setup_clear_cpu_cap(X86_FEATURE_XSAVE);
setup_clear_cpu_cap(X86_FEATURE_XSAVEOPT);
- xstate_size = sizeof(struct i387_soft_struct);
- user_xstate_size = xstate_size;
+ kernel_xstate_size = sizeof(struct i387_soft_struct);
+ user_xstate_size = kernel_xstate_size;
return;
}
if (cpu_has_fxsr)
- xstate_size = sizeof(struct i387_fxsave_struct);
+ kernel_xstate_size = sizeof(struct i387_fxsave_struct);
else
- xstate_size = sizeof(struct i387_fsave_struct);
+ kernel_xstate_size = sizeof(struct i387_fsave_struct);
- user_xstate_size = xstate_size;
+ user_xstate_size = kernel_xstate_size;
}
/*
@@ -211,9 +211,9 @@ void fpu_init(void)
/*
* init_thread_xstate is only called once to avoid overriding
- * xstate_size during boot time or during CPU hotplug.
+ * kernel_xstate_size during boot time or during CPU hotplug.
*/
- if (xstate_size == 0)
+ if (kernel_xstate_size == 0)
init_thread_xstate();
mxcsr_feature_mask_init();
@@ -228,7 +228,7 @@ void fpu_finit(struct fpu *fpu)
return;
}
- memset(fpu->state, 0, xstate_size);
+ memset(fpu->state, 0, kernel_xstate_size);
if (cpu_has_fxsr) {
fx_finit(&fpu->state->fxsave);
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 8213da6..ded2c82 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -113,7 +113,7 @@ void arch_release_task_struct(struct task_struct *tsk)
void arch_task_cache_init(void)
{
task_xstate_cachep =
- kmem_cache_create("task_xstate", xstate_size,
+ kmem_cache_create("task_xstate", kernel_xstate_size,
__alignof__(union thread_xstate),
SLAB_PANIC | SLAB_NOTRACK, NULL);
setup_xstate_comp();
diff --git a/arch/x86/kernel/xsave.c b/arch/x86/kernel/xsave.c
index 9a9f8a7..4217bec 100644
--- a/arch/x86/kernel/xsave.c
+++ b/arch/x86/kernel/xsave.c
@@ -343,7 +343,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
{
int ia32_fxstate = (buf != buf_fx);
struct task_struct *tsk = current;
- int state_size = xstate_size;
+ int state_size = kernel_xstate_size;
u64 xstate_bv = 0;
int fx_only = 0;
@@ -543,7 +543,7 @@ static void __init setup_init_fpu_buf(void)
* Setup init_xstate_buf to represent the init state of
* all the features managed by the xsave
*/
- init_xstate_buf = alloc_bootmem_align(xstate_size,
+ init_xstate_buf = alloc_bootmem_align(kernel_xstate_size,
__alignof__(struct xsave_struct));
fx_finit(&init_xstate_buf->i387);
@@ -597,15 +597,15 @@ static void __init init_xstate_size(void)
user_xstate_size = ebx;
if (!cpu_has_xsaves) {
- xstate_size = ebx;
+ kernel_xstate_size = ebx;
return;
}
- xstate_size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
+ kernel_xstate_size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
for (i = 2; i < 64; i++) {
if (test_bit(i, (unsigned long *)&pcntxt_mask)) {
cpuid_count(XSTATE_CPUID, i, &eax, &ebx, &ecx, &edx);
- xstate_size += eax;
+ kernel_xstate_size += eax;
}
}
}
@@ -643,7 +643,7 @@ static void __init xstate_enable_boot_cpu(void)
*/
init_xstate_size();
- update_regset_xstate_info(xstate_size, pcntxt_mask);
+ update_regset_xstate_info(kernel_xstate_size, pcntxt_mask);
prepare_fx_sw_frame();
setup_init_fpu_buf();
@@ -662,7 +662,7 @@ static void __init xstate_enable_boot_cpu(void)
}
pr_info("enabled xstate_bv 0x%llx, cntxt size 0x%x using %s\n",
- pcntxt_mask, xstate_size,
+ pcntxt_mask, kernel_xstate_size,
cpu_has_xsaves ? "compacted form" : "standard form");
}
--
1.8.1.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v3 Bugfix 4/6] x86/xsave: Don't add new states in xsave_struct
2015-05-08 21:30 [PATCH v3 Bugfix 0/6] xstate/fpu bug fixes Fenghua Yu
` (2 preceding siblings ...)
2015-05-08 21:31 ` [PATCH v3 Bugfix 3/6] x86/xsaves: Rename xstate_size to kernel_xstate_size to explicitly distinguish xstate size in kernel from user space Fenghua Yu
@ 2015-05-08 21:31 ` Fenghua Yu
2015-05-08 21:31 ` [PATCH v3 Bugfix 5/6] x86/xsaves: Keep xstate_bv in init_xstate_buf header as zero for init optimimization Fenghua Yu
` (2 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Fenghua Yu @ 2015-05-08 21:31 UTC (permalink / raw)
To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Dave Hansen,
Asit K Mallick, Glenn Williamson
Cc: linux-kernel, x86, Fenghua Yu
From: Fenghua Yu <fenghua.yu@intel.com>
The structure of xsave_struct is non-architectural. Some xstates could be
disabled and leave some holes in the xsave area. In compact format,
offsets of xstates in the xsave area are decided during booting time.
So the fields in xsave_struct are not static and fixed during compilation
time. The offsets and sizes of the fields in the structure should be
detected from cpuid during runtime.
Therefore, we don't add new states in xsave_struct except legacy fpu/sse
states and header fields whose offsets and sizes are defined
architecturally.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
---
arch/x86/include/asm/processor.h | 20 +++++---------------
1 file changed, 5 insertions(+), 15 deletions(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index f26051b..163defc 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -386,16 +386,6 @@ struct i387_soft_struct {
u32 entry_eip;
};
-struct ymmh_struct {
- /* 16 * 16 bytes for each YMMH-reg = 256 bytes */
- u32 ymmh_space[64];
-};
-
-/* We don't support LWP yet: */
-struct lwp_struct {
- u8 reserved[128];
-};
-
struct bndreg {
u64 lower_bound;
u64 upper_bound;
@@ -415,11 +405,11 @@ struct xsave_hdr_struct {
struct xsave_struct {
struct i387_fxsave_struct i387;
struct xsave_hdr_struct xsave_hdr;
- struct ymmh_struct ymmh;
- struct lwp_struct lwp;
- struct bndreg bndreg[4];
- struct bndcsr bndcsr;
- /* new processor state extensions will go here */
+ /*
+ * Please don't add more states here. They are non-architectural.
+ * Offset and size of each state should be calculated during boot time.
+ * So adding states here is meanless.
+ */
} __attribute__ ((packed, aligned (64)));
union thread_xstate {
--
1.8.1.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v3 Bugfix 5/6] x86/xsaves: Keep xstate_bv in init_xstate_buf header as zero for init optimimization
2015-05-08 21:30 [PATCH v3 Bugfix 0/6] xstate/fpu bug fixes Fenghua Yu
` (3 preceding siblings ...)
2015-05-08 21:31 ` [PATCH v3 Bugfix 4/6] x86/xsave: Don't add new states in xsave_struct Fenghua Yu
@ 2015-05-08 21:31 ` Fenghua Yu
2015-05-08 21:31 ` [PATCH v3 Bugfix 6/6] x86/xsave.c: Introduce a new check that allows correct xstates copy from kernel to user directly Fenghua Yu
2015-05-09 6:09 ` [PATCH v3 Bugfix 0/6] xstate/fpu bug fixes Ingo Molnar
6 siblings, 0 replies; 8+ messages in thread
From: Fenghua Yu @ 2015-05-08 21:31 UTC (permalink / raw)
To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Dave Hansen,
Asit K Mallick, Glenn Williamson
Cc: linux-kernel, x86, Fenghua Yu
From: Fenghua Yu <fenghua.yu@intel.com>
Keep xstate_bv in init_xstate_buf header as zero for init optimization.
This is important for init optimization that is implemented in processor.
If a bit corresponding to an xstate in xstate_bv is 0, it means the
xstate is in init status and will not be read from memory to the processor
during xrestor* instruction. This largely impacts context switch
performance.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
---
arch/x86/kernel/xsave.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/xsave.c b/arch/x86/kernel/xsave.c
index 4217bec..547f293 100644
--- a/arch/x86/kernel/xsave.c
+++ b/arch/x86/kernel/xsave.c
@@ -545,6 +545,12 @@ static void __init setup_init_fpu_buf(void)
*/
init_xstate_buf = alloc_bootmem_align(kernel_xstate_size,
__alignof__(struct xsave_struct));
+
+ /*
+ * Make sure xstate_bv is zero to allow init optimization work.
+ */
+ init_xstate_buf->xsave_hdr.xstate_bv = 0;
+
fx_finit(&init_xstate_buf->i387);
if (!cpu_has_xsave)
@@ -552,11 +558,9 @@ static void __init setup_init_fpu_buf(void)
setup_xstate_features();
- if (cpu_has_xsaves) {
+ if (cpu_has_xsaves)
init_xstate_buf->xsave_hdr.xcomp_bv =
(u64)1 << 63 | pcntxt_mask;
- init_xstate_buf->xsave_hdr.xstate_bv = pcntxt_mask;
- }
/*
* Init all the features state with header_bv being 0x0
--
1.8.1.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v3 Bugfix 6/6] x86/xsave.c: Introduce a new check that allows correct xstates copy from kernel to user directly
2015-05-08 21:30 [PATCH v3 Bugfix 0/6] xstate/fpu bug fixes Fenghua Yu
` (4 preceding siblings ...)
2015-05-08 21:31 ` [PATCH v3 Bugfix 5/6] x86/xsaves: Keep xstate_bv in init_xstate_buf header as zero for init optimimization Fenghua Yu
@ 2015-05-08 21:31 ` Fenghua Yu
2015-05-09 6:09 ` [PATCH v3 Bugfix 0/6] xstate/fpu bug fixes Ingo Molnar
6 siblings, 0 replies; 8+ messages in thread
From: Fenghua Yu @ 2015-05-08 21:31 UTC (permalink / raw)
To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Dave Hansen,
Asit K Mallick, Glenn Williamson
Cc: linux-kernel, x86, Fenghua Yu
From: Fenghua Yu <fenghua.yu@intel.com>
There are two formats of XSAVE buffers: compact and standard.
We avoid copying the compact format out to userspace since it
might break existing userspace which is not aware of the new
compacted format.
This means that save_xstate_sig() can not simply copy_to_user()
the kernel buffer if it is in compact format, ever. Add a
heavily-commented function explaining this.
Note that all the paths to save_user_xstate() do currently
check for used_math(), but add a WARN_ONCE() in there for any
future possible use.
Dave Hansen proposes this method to simplify copy xstate directly
to user.
Signed-off-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
---
arch/x86/kernel/xsave.c | 41 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 40 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/xsave.c b/arch/x86/kernel/xsave.c
index 547f293..69a1847 100644
--- a/arch/x86/kernel/xsave.c
+++ b/arch/x86/kernel/xsave.c
@@ -217,6 +217,45 @@ static inline int save_user_xstate(struct xsave_struct __user *buf)
return err;
}
+static int should_save_registers_directly(void)
+{
+ /*
+ * This should only ever be called when we are actually
+ * using the FPU. Otherwise, we run the risk of using
+ * some FPU instructions for saving the registers, and
+ * inflating thread.fpu_counter, making us think that
+ * the _task_ is using the FPU when in fact it was the
+ * kernel.
+ */
+ WARN_ONCE(!used_math(), "direct FPU save with no math use\n");
+
+ /*
+ * In the case that we are using a compacted kernel
+ * xsave area, we can not copy the thread.fpu.state
+ * directly to userspace and *must* save it from the
+ * registers directly.
+ */
+ if (cpu_has_xsaves)
+ return 1;
+
+ /*
+ * user_has_fpu() means "Can I use the FPU hardware
+ * without taking a device-not-available exception?" This
+ * means that saving the registers directly will be
+ * cheaper than copying their contents out of
+ * thread.fpu.state.
+ *
+ * Note that user_has_fpu() is inherently racy and may
+ * become false at any time. If this race happens, we
+ * will take a harmless device-not-available exception
+ * when we attempt the FPU save instruction.
+ */
+ if (user_has_fpu())
+ return 1;
+
+ return 0;
+}
+
/*
* Save the fpu, extended register state to the user signal frame.
*
@@ -254,7 +293,7 @@ int save_xstate_sig(void __user *buf, void __user *buf_fx, int size)
sizeof(struct user_i387_ia32_struct), NULL,
(struct _fpstate_ia32 __user *) buf) ? -1 : 1;
- if (user_has_fpu()) {
+ if (should_save_registers_directly()) {
/* Save the live register state to the user directly. */
if (save_user_xstate(buf_fx))
return -1;
--
1.8.1.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v3 Bugfix 0/6] xstate/fpu bug fixes
2015-05-08 21:30 [PATCH v3 Bugfix 0/6] xstate/fpu bug fixes Fenghua Yu
` (5 preceding siblings ...)
2015-05-08 21:31 ` [PATCH v3 Bugfix 6/6] x86/xsave.c: Introduce a new check that allows correct xstates copy from kernel to user directly Fenghua Yu
@ 2015-05-09 6:09 ` Ingo Molnar
6 siblings, 0 replies; 8+ messages in thread
From: Ingo Molnar @ 2015-05-09 6:09 UTC (permalink / raw)
To: Fenghua Yu
Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Dave Hansen,
Asit K Mallick, Glenn Williamson, linux-kernel, x86
* Fenghua Yu <fenghua.yu@intel.com> wrote:
> From: Fenghua Yu <fenghua.yu@intel.com>
>
> This patchset is supposed to fix some xsave/xsaves/fpu related issues.
So as I mentioned it to Dave Hansen already yesterday, I think the
right, easily backportable fix would be to switch the kernel to
'standard format'. It's very close to the compacted format and its
behavior is much more obvious and it doesn't change the context size
on us.
Then, on top of a known working kernel, we can re-introduce compacted
format without pressure, with all the necessary signal frame
conversion problems solved properly.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2015-05-09 6:10 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-05-08 21:30 [PATCH v3 Bugfix 0/6] xstate/fpu bug fixes Fenghua Yu
2015-05-08 21:31 ` [PATCH v3 Bugfix 1/6] x86/xsave.c: Fix xstate offsets and sizes enumeration Fenghua Yu
2015-05-08 21:31 ` [PATCH v3 Bugfix 2/6] x86/xsaves: Define and use user_xstate_size for xstate size in signal context Fenghua Yu
2015-05-08 21:31 ` [PATCH v3 Bugfix 3/6] x86/xsaves: Rename xstate_size to kernel_xstate_size to explicitly distinguish xstate size in kernel from user space Fenghua Yu
2015-05-08 21:31 ` [PATCH v3 Bugfix 4/6] x86/xsave: Don't add new states in xsave_struct Fenghua Yu
2015-05-08 21:31 ` [PATCH v3 Bugfix 5/6] x86/xsaves: Keep xstate_bv in init_xstate_buf header as zero for init optimimization Fenghua Yu
2015-05-08 21:31 ` [PATCH v3 Bugfix 6/6] x86/xsave.c: Introduce a new check that allows correct xstates copy from kernel to user directly Fenghua Yu
2015-05-09 6:09 ` [PATCH v3 Bugfix 0/6] xstate/fpu bug fixes Ingo Molnar
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox