* [PATCH v2 0/2] x86: FPU handling cleanup
@ 2024-08-08 13:41 Alejandro Vallejo
2024-08-08 13:41 ` [PATCH v2 1/2] x86/fpu: Combine fpu_ctxt and xsave_area in arch_vcpu Alejandro Vallejo
2024-08-08 13:41 ` [PATCH v2 2/2] x86/fpu: Split fpu_setup_fpu() in two Alejandro Vallejo
0 siblings, 2 replies; 8+ messages in thread
From: Alejandro Vallejo @ 2024-08-08 13:41 UTC (permalink / raw)
To: Xen-devel
Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
Roger Pau Monné
v1: https://lore.kernel.org/xen-devel/cover.1720538832.git.alejandro.vallejo@cloud.com/T/#t
v1 -> v2: v1/patch1 and v1/patch2 are already in staging.
=============================== Original cover letter =========================
I want to eventually reach a position in which the FPU state can be allocated
from the domheap and hidden via the same core mechanism proposed in Elias'
directmap removal series. Doing so is complicated by the presence of 2 aliased
pointers (v->arch.fpu_ctxt and v->arch.xsave_area) and the rather complicated
semantics of vcpu_setup_fpu(). This series tries to simplify the code so moving
to a "map/modify/unmap" model is more tractable.
Patches 1 and 2 are trivial refactors.
Patch 3 unifies FPU state so an XSAVE area is allocated per vCPU regardless of
the host supporting it or not. The rationale is that the memory savings are
negligible and not worth the extra complexity.
Patch 4 is a non-trivial split of the vcpu_setup_fpu() into 2 separate
functions. One to override x87/SSE state, and another to set a reset state.
===============================================================================
Alejandro Vallejo (2):
x86/fpu: Combine fpu_ctxt and xsave_area in arch_vcpu
x86/fpu: Split fpu_setup_fpu() in two
xen/arch/x86/domain.c | 7 +-
xen/arch/x86/domctl.c | 5 +-
xen/arch/x86/hvm/emulate.c | 4 +-
xen/arch/x86/hvm/hvm.c | 32 +++++++---
xen/arch/x86/i387.c | 103 ++++++++++--------------------
xen/arch/x86/include/asm/domain.h | 8 +--
xen/arch/x86/include/asm/i387.h | 28 ++++++--
xen/arch/x86/include/asm/xstate.h | 1 +
xen/arch/x86/x86_emulate/blk.c | 3 +-
xen/arch/x86/xstate.c | 13 +++-
10 files changed, 108 insertions(+), 96 deletions(-)
--
2.45.2
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v2 1/2] x86/fpu: Combine fpu_ctxt and xsave_area in arch_vcpu
2024-08-08 13:41 [PATCH v2 0/2] x86: FPU handling cleanup Alejandro Vallejo
@ 2024-08-08 13:41 ` Alejandro Vallejo
2024-08-12 15:16 ` Jan Beulich
2024-08-08 13:41 ` [PATCH v2 2/2] x86/fpu: Split fpu_setup_fpu() in two Alejandro Vallejo
1 sibling, 1 reply; 8+ messages in thread
From: Alejandro Vallejo @ 2024-08-08 13:41 UTC (permalink / raw)
To: Xen-devel
Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
Roger Pau Monné
fpu_ctxt is either a pointer to the legacy x87/SSE save area (used by FXSAVE) or
a pointer aliased with xsave_area that points to its fpu_sse subfield. Such
subfield is at the base and is identical in size and layout to the legacy
buffer.
This patch merges the 2 pointers in the arch_vcpu into a single XSAVE area. In
the very rare case in which the host doesn't support XSAVE all we're doing is
wasting a tiny amount of memory and trading those for a lot more simplicity in
the code.
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v2:
* Added BUILD_BUG_ON(sizeof(x) != sizeof(fpusse_t)) on forceful casts
involving fpusse_t.
* Reworded comment on top of vcpu_arch->user_regs
* Added missing whitespace in x86_emulate/blk.c
---
xen/arch/x86/domctl.c | 5 +++-
xen/arch/x86/hvm/emulate.c | 4 +--
xen/arch/x86/hvm/hvm.c | 5 +++-
xen/arch/x86/i387.c | 45 +++++--------------------------
xen/arch/x86/include/asm/domain.h | 8 +++---
xen/arch/x86/x86_emulate/blk.c | 3 ++-
xen/arch/x86/xstate.c | 13 ++++++---
7 files changed, 32 insertions(+), 51 deletions(-)
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 68b5b46d1a83..bceff6be0ff3 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1344,7 +1344,10 @@ void arch_get_info_guest(struct vcpu *v, vcpu_guest_context_u c)
#define c(fld) (c.nat->fld)
#endif
- memcpy(&c.nat->fpu_ctxt, v->arch.fpu_ctxt, sizeof(c.nat->fpu_ctxt));
+ memcpy(&c.nat->fpu_ctxt, &v->arch.xsave_area->fpu_sse,
+ sizeof(c.nat->fpu_ctxt));
+ BUILD_BUG_ON(sizeof(c.nat->fpu_ctxt) != sizeof(fpusse_t));
+
if ( is_pv_domain(d) )
c(flags = v->arch.pv.vgc_flags & ~(VGCF_i387_valid|VGCF_in_kernel));
else
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index feb4792cc567..03020542c3ba 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -2363,7 +2363,7 @@ static int cf_check hvmemul_get_fpu(
alternative_vcall(hvm_funcs.fpu_dirty_intercept);
else if ( type == X86EMUL_FPU_fpu )
{
- const fpusse_t *fpu_ctxt = curr->arch.fpu_ctxt;
+ const fpusse_t *fpu_ctxt = &curr->arch.xsave_area->fpu_sse;
/*
* Latch current register state so that we can back out changes
@@ -2403,7 +2403,7 @@ static void cf_check hvmemul_put_fpu(
if ( aux )
{
- fpusse_t *fpu_ctxt = curr->arch.fpu_ctxt;
+ fpusse_t *fpu_ctxt = &curr->arch.xsave_area->fpu_sse;
bool dval = aux->dval;
int mode = hvm_guest_x86_mode(curr);
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index f49e29faf753..6607dba562a4 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -916,7 +916,10 @@ static int cf_check hvm_save_cpu_ctxt(struct vcpu *v, hvm_domain_context_t *h)
if ( v->fpu_initialised )
{
- memcpy(ctxt.fpu_regs, v->arch.fpu_ctxt, sizeof(ctxt.fpu_regs));
+ memcpy(ctxt.fpu_regs, &v->arch.xsave_area->fpu_sse,
+ sizeof(ctxt.fpu_regs));
+ BUILD_BUG_ON(sizeof(ctxt.fpu_regs) != sizeof(fpusse_t));
+
ctxt.flags = XEN_X86_FPU_INITIALISED;
}
diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c
index 134e0bece519..fbb9d3584a3d 100644
--- a/xen/arch/x86/i387.c
+++ b/xen/arch/x86/i387.c
@@ -39,7 +39,7 @@ static inline void fpu_xrstor(struct vcpu *v, uint64_t mask)
/* Restore x87 FPU, MMX, SSE and SSE2 state */
static inline void fpu_fxrstor(struct vcpu *v)
{
- const fpusse_t *fpu_ctxt = v->arch.fpu_ctxt;
+ const fpusse_t *fpu_ctxt = &v->arch.xsave_area->fpu_sse;
/*
* Some CPUs don't save/restore FDP/FIP/FOP unless an exception
@@ -151,7 +151,7 @@ static inline void fpu_xsave(struct vcpu *v)
/* Save x87 FPU, MMX, SSE and SSE2 state */
static inline void fpu_fxsave(struct vcpu *v)
{
- fpusse_t *fpu_ctxt = v->arch.fpu_ctxt;
+ fpusse_t *fpu_ctxt = &v->arch.xsave_area->fpu_sse;
unsigned int fip_width = v->domain->arch.x87_fip_width;
if ( fip_width != 4 )
@@ -212,7 +212,7 @@ void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts)
* above) we also need to restore full state, to prevent subsequently
* saving state belonging to another vCPU.
*/
- if ( v->arch.fully_eager_fpu || (v->arch.xsave_area && xstate_all(v)) )
+ if ( v->arch.fully_eager_fpu || xstate_all(v) )
{
if ( cpu_has_xsave )
fpu_xrstor(v, XSTATE_ALL);
@@ -299,44 +299,14 @@ void save_fpu_enable(void)
/* Initialize FPU's context save area */
int vcpu_init_fpu(struct vcpu *v)
{
- int rc;
-
v->arch.fully_eager_fpu = opt_eager_fpu;
-
- if ( (rc = xstate_alloc_save_area(v)) != 0 )
- return rc;
-
- if ( v->arch.xsave_area )
- v->arch.fpu_ctxt = &v->arch.xsave_area->fpu_sse;
- else
- {
- BUILD_BUG_ON(__alignof(v->arch.xsave_area->fpu_sse) < 16);
- v->arch.fpu_ctxt = _xzalloc(sizeof(v->arch.xsave_area->fpu_sse),
- __alignof(v->arch.xsave_area->fpu_sse));
- if ( v->arch.fpu_ctxt )
- {
- fpusse_t *fpu_sse = v->arch.fpu_ctxt;
-
- fpu_sse->fcw = FCW_DEFAULT;
- fpu_sse->mxcsr = MXCSR_DEFAULT;
- }
- else
- rc = -ENOMEM;
- }
-
- return rc;
+ return xstate_alloc_save_area(v);
}
void vcpu_setup_fpu(struct vcpu *v, struct xsave_struct *xsave_area,
const void *data, unsigned int fcw_default)
{
- /*
- * For the entire function please note that vcpu_init_fpu() (above) points
- * v->arch.fpu_ctxt into v->arch.xsave_area when XSAVE is available. Hence
- * accesses through both pointers alias one another, and the shorter form
- * is used here.
- */
- fpusse_t *fpu_sse = v->arch.fpu_ctxt;
+ fpusse_t *fpu_sse = &v->arch.xsave_area->fpu_sse;
ASSERT(!xsave_area || xsave_area == v->arch.xsave_area);
@@ -373,10 +343,7 @@ void vcpu_setup_fpu(struct vcpu *v, struct xsave_struct *xsave_area,
/* Free FPU's context save area */
void vcpu_destroy_fpu(struct vcpu *v)
{
- if ( v->arch.xsave_area )
- xstate_free_save_area(v);
- else
- xfree(v->arch.fpu_ctxt);
+ xstate_free_save_area(v);
}
/*
diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h
index bca3258d69ac..3da60af2a44a 100644
--- a/xen/arch/x86/include/asm/domain.h
+++ b/xen/arch/x86/include/asm/domain.h
@@ -592,11 +592,11 @@ struct pv_vcpu
struct arch_vcpu
{
/*
- * guest context (mirroring struct vcpu_guest_context) common
- * between pv and hvm guests
+ * Guest context common between PV and HVM guests. Includes general purpose
+ * registers, segment registers and other parts of the exception frame.
+ *
+ * It doesn't contain FPU state, as that lives in xsave_area instead.
*/
-
- void *fpu_ctxt;
struct cpu_user_regs user_regs;
/* Debug registers. */
diff --git a/xen/arch/x86/x86_emulate/blk.c b/xen/arch/x86/x86_emulate/blk.c
index e790f4f90056..28b54f26fe29 100644
--- a/xen/arch/x86/x86_emulate/blk.c
+++ b/xen/arch/x86/x86_emulate/blk.c
@@ -11,7 +11,8 @@
!defined(X86EMUL_NO_SIMD)
# ifdef __XEN__
# include <asm/xstate.h>
-# define FXSAVE_AREA current->arch.fpu_ctxt
+# define FXSAVE_AREA ((struct x86_fxsr *) \
+ (void *)¤t->arch.xsave_area->fpu_sse)
# else
# define FXSAVE_AREA get_fpu_save_area()
# endif
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 5c4144d55e89..850ee31bd18c 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -507,9 +507,16 @@ int xstate_alloc_save_area(struct vcpu *v)
unsigned int size;
if ( !cpu_has_xsave )
- return 0;
-
- if ( !is_idle_vcpu(v) || !cpu_has_xsavec )
+ {
+ /*
+ * This is bigger than FXSAVE_SIZE by 64 bytes, but it helps treating
+ * the FPU state uniformly as an XSAVE buffer even if XSAVE is not
+ * available in the host. Note the alignment restriction of the XSAVE
+ * area are stricter than those of the FXSAVE area.
+ */
+ size = XSTATE_AREA_MIN_SIZE;
+ }
+ else if ( !is_idle_vcpu(v) || !cpu_has_xsavec )
{
size = xsave_cntxt_size;
BUG_ON(size < XSTATE_AREA_MIN_SIZE);
--
2.45.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v2 2/2] x86/fpu: Split fpu_setup_fpu() in two
2024-08-08 13:41 [PATCH v2 0/2] x86: FPU handling cleanup Alejandro Vallejo
2024-08-08 13:41 ` [PATCH v2 1/2] x86/fpu: Combine fpu_ctxt and xsave_area in arch_vcpu Alejandro Vallejo
@ 2024-08-08 13:41 ` Alejandro Vallejo
2024-08-12 15:23 ` Jan Beulich
1 sibling, 1 reply; 8+ messages in thread
From: Alejandro Vallejo @ 2024-08-08 13:41 UTC (permalink / raw)
To: Xen-devel
Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
Roger Pau Monné
It was trying to do too many things at once and there was no clear way of
defining what it was meant to do.This commit splits the function in two.
1. A reset function, parameterized by the FCW value. FCW_RESET means to reset
the state to power-on reset values, while FCW_DEFAULT means to reset to the
default values present during vCPU creation.
2. A x87/SSE state loader (equivalent to the old function when it took a data
pointer).
While at it, make sure the abridged tag is consistent with the manuals and
start as 0xFF.
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v2:
* Reworded comment about pre-Xen 4.1 migrations.
* Reset abridged FTW to -1 (tag=0x5555), as per the manuals.
* Undo const cast-away.
* Split vcpu_reset_fpu() into vcpu_default_fpu()
* vcpu_init_fpu() already exists.
* Removed backticks from comments
---
xen/arch/x86/domain.c | 7 ++--
xen/arch/x86/hvm/hvm.c | 27 ++++++++++----
xen/arch/x86/i387.c | 60 +++++++++++++++----------------
xen/arch/x86/include/asm/i387.h | 28 ++++++++++++---
xen/arch/x86/include/asm/xstate.h | 1 +
5 files changed, 77 insertions(+), 46 deletions(-)
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index d977ec71ca20..5af9e3e7a8b4 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1186,9 +1186,10 @@ int arch_set_info_guest(
is_pv_64bit_domain(d) )
v->arch.flags &= ~TF_kernel_mode;
- vcpu_setup_fpu(v, v->arch.xsave_area,
- flags & VGCF_I387_VALID ? &c.nat->fpu_ctxt : NULL,
- FCW_DEFAULT);
+ if ( flags & VGCF_I387_VALID )
+ vcpu_setup_fpu(v, &c.nat->fpu_ctxt);
+ else
+ vcpu_default_fpu(v);
if ( !compat )
{
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 6607dba562a4..83cb21884ce6 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1164,10 +1164,25 @@ static int cf_check hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
seg.attr = ctxt.ldtr_arbytes;
hvm_set_segment_register(v, x86_seg_ldtr, &seg);
- /* Cover xsave-absent save file restoration on xsave-capable host. */
- vcpu_setup_fpu(v, xsave_enabled(v) ? NULL : v->arch.xsave_area,
- ctxt.flags & XEN_X86_FPU_INITIALISED ? ctxt.fpu_regs : NULL,
- FCW_RESET);
+ /*
+ * On Xen 4.1 and later the FPU state is restored on later HVM context in
+ * the migrate stream, so what we're doing here is initialising the FPU
+ * state for guests from even older versions of Xen.
+ *
+ * In particular:
+ * 1. If there's an XSAVE context later in the stream what we do here for
+ * the FPU doesn't matter because it'll be overriden later.
+ * 2. If there isn't and the guest didn't use extended states it's still
+ * fine because we have all the information we need here.
+ * 3. If there isn't and the guest DID use extended states (could've
+ * happened prior to Xen 4.1) then we're in a pickle because we have
+ * to make up non-existing state. For this case we initialise the FPU
+ * as using x87/SSE only because the rest of the state is gone.
+ */
+ if ( ctxt.flags & XEN_X86_FPU_INITIALISED )
+ vcpu_setup_fpu(v, &ctxt.fpu_regs);
+ else
+ vcpu_reset_fpu(v);
v->arch.user_regs.rax = ctxt.rax;
v->arch.user_regs.rbx = ctxt.rbx;
@@ -4007,9 +4022,7 @@ void hvm_vcpu_reset_state(struct vcpu *v, uint16_t cs, uint16_t ip)
v->arch.guest_table = pagetable_null();
}
- if ( v->arch.xsave_area )
- v->arch.xsave_area->xsave_hdr.xstate_bv = 0;
- vcpu_setup_fpu(v, v->arch.xsave_area, NULL, FCW_RESET);
+ vcpu_reset_fpu(v);
arch_vcpu_regs_init(v);
v->arch.user_regs.rip = ip;
diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c
index fbb9d3584a3d..af5ae805998a 100644
--- a/xen/arch/x86/i387.c
+++ b/xen/arch/x86/i387.c
@@ -303,41 +303,37 @@ int vcpu_init_fpu(struct vcpu *v)
return xstate_alloc_save_area(v);
}
-void vcpu_setup_fpu(struct vcpu *v, struct xsave_struct *xsave_area,
- const void *data, unsigned int fcw_default)
+void vcpu_reset_fpu(struct vcpu *v)
{
- fpusse_t *fpu_sse = &v->arch.xsave_area->fpu_sse;
-
- ASSERT(!xsave_area || xsave_area == v->arch.xsave_area);
-
- v->fpu_initialised = !!data;
+ v->fpu_initialised = false;
+ *v->arch.xsave_area = (struct xsave_struct) {
+ .fpu_sse = {
+ .mxcsr = MXCSR_DEFAULT,
+ .fcw = FCW_RESET,
+ .ftw = FTW_RESET,
+ },
+ .xsave_hdr.xstate_bv = fcw == X86_XCR0_X87,
+ };
+}
- if ( data )
- {
- memcpy(fpu_sse, data, sizeof(*fpu_sse));
- if ( xsave_area )
- xsave_area->xsave_hdr.xstate_bv = XSTATE_FP_SSE;
- }
- else if ( xsave_area && fcw_default == FCW_DEFAULT )
- {
- xsave_area->xsave_hdr.xstate_bv = 0;
- fpu_sse->mxcsr = MXCSR_DEFAULT;
- }
- else
- {
- memset(fpu_sse, 0, sizeof(*fpu_sse));
- fpu_sse->fcw = fcw_default;
- fpu_sse->mxcsr = MXCSR_DEFAULT;
- if ( v->arch.xsave_area )
- {
- v->arch.xsave_area->xsave_hdr.xstate_bv &= ~XSTATE_FP_SSE;
- if ( fcw_default != FCW_DEFAULT )
- v->arch.xsave_area->xsave_hdr.xstate_bv |= X86_XCR0_X87;
- }
- }
+void vcpu_default_fpu(struct vcpu *v)
+{
+ v->fpu_initialised = false;
+ *v->arch.xsave_area = (struct xsave_struct) {
+ .fpu_sse = {
+ .mxcsr = MXCSR_DEFAULT,
+ .fcw = FCW_DEFAULT,
+ },
+ };
+}
- if ( xsave_area )
- xsave_area->xsave_hdr.xcomp_bv = 0;
+void vcpu_setup_fpu(struct vcpu *v, const void *data)
+{
+ v->fpu_initialised = true;
+ *v->arch.xsave_area = (struct xsave_struct) {
+ .fpu_sse = *(const fpusse_t*)data,
+ .xsave_hdr.xstate_bv = XSTATE_FP_SSE,
+ };
}
/* Free FPU's context save area */
diff --git a/xen/arch/x86/include/asm/i387.h b/xen/arch/x86/include/asm/i387.h
index a783549db991..7a69577de45b 100644
--- a/xen/arch/x86/include/asm/i387.h
+++ b/xen/arch/x86/include/asm/i387.h
@@ -31,10 +31,30 @@ void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts);
void vcpu_restore_fpu_lazy(struct vcpu *v);
void vcpu_save_fpu(struct vcpu *v);
void save_fpu_enable(void);
-
int vcpu_init_fpu(struct vcpu *v);
-struct xsave_struct;
-void vcpu_setup_fpu(struct vcpu *v, struct xsave_struct *xsave_area,
- const void *data, unsigned int fcw_default);
void vcpu_destroy_fpu(struct vcpu *v);
+
+/*
+ * Restore v's FPU to power-on reset values
+ *
+ * @param v vCPU containing the FPU
+ */
+void vcpu_reset_fpu(struct vcpu *v);
+
+/*
+ * Restore v's FPU to default values
+ *
+ * @param v vCPU containing the FPU
+ */
+void vcpu_default_fpu(struct vcpu *v);
+
+/*
+ * Load x87/SSE state into v's FPU
+ *
+ * Overrides the XSAVE header to set the state components to be x87 and SSE.
+ *
+ * @param v vCPU containing the FPU
+ * @param data 512-octet blob for x87/SSE state
+ */
+void vcpu_setup_fpu(struct vcpu *v, const void *data);
#endif /* __ASM_I386_I387_H */
diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h
index ebeb2a3dcaf9..6144ed6f8551 100644
--- a/xen/arch/x86/include/asm/xstate.h
+++ b/xen/arch/x86/include/asm/xstate.h
@@ -14,6 +14,7 @@
#define FCW_DEFAULT 0x037f
#define FCW_RESET 0x0040
+#define FTW_RESET 0xFF
#define MXCSR_DEFAULT 0x1f80
extern uint32_t mxcsr_mask;
--
2.45.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] x86/fpu: Combine fpu_ctxt and xsave_area in arch_vcpu
2024-08-08 13:41 ` [PATCH v2 1/2] x86/fpu: Combine fpu_ctxt and xsave_area in arch_vcpu Alejandro Vallejo
@ 2024-08-12 15:16 ` Jan Beulich
2024-08-13 12:31 ` Alejandro Vallejo
0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2024-08-12 15:16 UTC (permalink / raw)
To: Alejandro Vallejo; +Cc: Andrew Cooper, Roger Pau Monné, Xen-devel
On 08.08.2024 15:41, Alejandro Vallejo wrote:
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -1344,7 +1344,10 @@ void arch_get_info_guest(struct vcpu *v, vcpu_guest_context_u c)
> #define c(fld) (c.nat->fld)
> #endif
>
> - memcpy(&c.nat->fpu_ctxt, v->arch.fpu_ctxt, sizeof(c.nat->fpu_ctxt));
> + memcpy(&c.nat->fpu_ctxt, &v->arch.xsave_area->fpu_sse,
> + sizeof(c.nat->fpu_ctxt));
> + BUILD_BUG_ON(sizeof(c.nat->fpu_ctxt) != sizeof(fpusse_t));
While it may seem unlikely that it would change going forward, I think
that such build-time checks should make no implications at all. I.e.
here the right side ought to be sizeof(v->arch.xsave_area->fpu_sse)
even if that's longer.
Personally I also think that BUILD_BUG_ON(), just like BUG_ON(), would
better live ahead of the construct they're for.
Same again in at least one more place.
Jan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/2] x86/fpu: Split fpu_setup_fpu() in two
2024-08-08 13:41 ` [PATCH v2 2/2] x86/fpu: Split fpu_setup_fpu() in two Alejandro Vallejo
@ 2024-08-12 15:23 ` Jan Beulich
2024-08-13 12:40 ` Alejandro Vallejo
0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2024-08-12 15:23 UTC (permalink / raw)
To: Alejandro Vallejo; +Cc: Andrew Cooper, Roger Pau Monné, Xen-devel
On 08.08.2024 15:41, Alejandro Vallejo wrote:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -1164,10 +1164,25 @@ static int cf_check hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
> seg.attr = ctxt.ldtr_arbytes;
> hvm_set_segment_register(v, x86_seg_ldtr, &seg);
>
> - /* Cover xsave-absent save file restoration on xsave-capable host. */
> - vcpu_setup_fpu(v, xsave_enabled(v) ? NULL : v->arch.xsave_area,
> - ctxt.flags & XEN_X86_FPU_INITIALISED ? ctxt.fpu_regs : NULL,
> - FCW_RESET);
> + /*
> + * On Xen 4.1 and later the FPU state is restored on later HVM context in
> + * the migrate stream, so what we're doing here is initialising the FPU
> + * state for guests from even older versions of Xen.
> + *
> + * In particular:
> + * 1. If there's an XSAVE context later in the stream what we do here for
> + * the FPU doesn't matter because it'll be overriden later.
> + * 2. If there isn't and the guest didn't use extended states it's still
> + * fine because we have all the information we need here.
> + * 3. If there isn't and the guest DID use extended states (could've
> + * happened prior to Xen 4.1) then we're in a pickle because we have
> + * to make up non-existing state. For this case we initialise the FPU
> + * as using x87/SSE only because the rest of the state is gone.
Was this really possible to happen? Guests wouldn't have been able to
turn on CR4.OSXSAVE, would they?
Jan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] x86/fpu: Combine fpu_ctxt and xsave_area in arch_vcpu
2024-08-12 15:16 ` Jan Beulich
@ 2024-08-13 12:31 ` Alejandro Vallejo
0 siblings, 0 replies; 8+ messages in thread
From: Alejandro Vallejo @ 2024-08-13 12:31 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, Roger Pau Monné, Xen-devel
On Mon Aug 12, 2024 at 4:16 PM BST, Jan Beulich wrote:
> On 08.08.2024 15:41, Alejandro Vallejo wrote:
> > --- a/xen/arch/x86/domctl.c
> > +++ b/xen/arch/x86/domctl.c
> > @@ -1344,7 +1344,10 @@ void arch_get_info_guest(struct vcpu *v, vcpu_guest_context_u c)
> > #define c(fld) (c.nat->fld)
> > #endif
> >
> > - memcpy(&c.nat->fpu_ctxt, v->arch.fpu_ctxt, sizeof(c.nat->fpu_ctxt));
> > + memcpy(&c.nat->fpu_ctxt, &v->arch.xsave_area->fpu_sse,
> > + sizeof(c.nat->fpu_ctxt));
> > + BUILD_BUG_ON(sizeof(c.nat->fpu_ctxt) != sizeof(fpusse_t));
>
> While it may seem unlikely that it would change going forward, I think
> that such build-time checks should make no implications at all. I.e.
> here the right side ought to be sizeof(v->arch.xsave_area->fpu_sse)
> even if that's longer.
Sounds sensible.
>
> Personally I also think that BUILD_BUG_ON(), just like BUG_ON(), would
> better live ahead of the construct they're for.
>
> Same again in at least one more place.
>
> Jan
Ack, sure.
Cheers,
Alejandro
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/2] x86/fpu: Split fpu_setup_fpu() in two
2024-08-12 15:23 ` Jan Beulich
@ 2024-08-13 12:40 ` Alejandro Vallejo
2024-08-13 12:47 ` Jan Beulich
0 siblings, 1 reply; 8+ messages in thread
From: Alejandro Vallejo @ 2024-08-13 12:40 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, Roger Pau Monné, Xen-devel
On Mon Aug 12, 2024 at 4:23 PM BST, Jan Beulich wrote:
> On 08.08.2024 15:41, Alejandro Vallejo wrote:
> > --- a/xen/arch/x86/hvm/hvm.c
> > +++ b/xen/arch/x86/hvm/hvm.c
> > @@ -1164,10 +1164,25 @@ static int cf_check hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
> > seg.attr = ctxt.ldtr_arbytes;
> > hvm_set_segment_register(v, x86_seg_ldtr, &seg);
> >
> > - /* Cover xsave-absent save file restoration on xsave-capable host. */
> > - vcpu_setup_fpu(v, xsave_enabled(v) ? NULL : v->arch.xsave_area,
> > - ctxt.flags & XEN_X86_FPU_INITIALISED ? ctxt.fpu_regs : NULL,
> > - FCW_RESET);
> > + /*
> > + * On Xen 4.1 and later the FPU state is restored on later HVM context in
> > + * the migrate stream, so what we're doing here is initialising the FPU
> > + * state for guests from even older versions of Xen.
> > + *
> > + * In particular:
> > + * 1. If there's an XSAVE context later in the stream what we do here for
> > + * the FPU doesn't matter because it'll be overriden later.
> > + * 2. If there isn't and the guest didn't use extended states it's still
> > + * fine because we have all the information we need here.
> > + * 3. If there isn't and the guest DID use extended states (could've
> > + * happened prior to Xen 4.1) then we're in a pickle because we have
> > + * to make up non-existing state. For this case we initialise the FPU
> > + * as using x87/SSE only because the rest of the state is gone.
>
> Was this really possible to happen? Guests wouldn't have been able to
> turn on CR4.OSXSAVE, would they?
>
> Jan
You may be right, but my reading of the comment and the code was that
xsave_enabled(v) might be set and the XSAVE hvm context might be missing in the
stream. The archives didn't shed a lot more light than what the code already
gives away.
Otherwise it would've been far simpler to unconditionally pass
v->arch.xsave_area to the second parameter and let the xsave area to be
overriden by the follow-up HVM context with its actual state.
If my understanding is wrong, I'm happy to remove (3), as I don't think it
affects the code anyway. I thought however that it was a relevant data point
to leave paper trail for.
Cheers,
Alejandro
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/2] x86/fpu: Split fpu_setup_fpu() in two
2024-08-13 12:40 ` Alejandro Vallejo
@ 2024-08-13 12:47 ` Jan Beulich
0 siblings, 0 replies; 8+ messages in thread
From: Jan Beulich @ 2024-08-13 12:47 UTC (permalink / raw)
To: Alejandro Vallejo; +Cc: Andrew Cooper, Roger Pau Monné, Xen-devel
On 13.08.2024 14:40, Alejandro Vallejo wrote:
> On Mon Aug 12, 2024 at 4:23 PM BST, Jan Beulich wrote:
>> On 08.08.2024 15:41, Alejandro Vallejo wrote:
>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -1164,10 +1164,25 @@ static int cf_check hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
>>> seg.attr = ctxt.ldtr_arbytes;
>>> hvm_set_segment_register(v, x86_seg_ldtr, &seg);
>>>
>>> - /* Cover xsave-absent save file restoration on xsave-capable host. */
>>> - vcpu_setup_fpu(v, xsave_enabled(v) ? NULL : v->arch.xsave_area,
>>> - ctxt.flags & XEN_X86_FPU_INITIALISED ? ctxt.fpu_regs : NULL,
>>> - FCW_RESET);
>>> + /*
>>> + * On Xen 4.1 and later the FPU state is restored on later HVM context in
>>> + * the migrate stream, so what we're doing here is initialising the FPU
>>> + * state for guests from even older versions of Xen.
>>> + *
>>> + * In particular:
>>> + * 1. If there's an XSAVE context later in the stream what we do here for
>>> + * the FPU doesn't matter because it'll be overriden later.
>>> + * 2. If there isn't and the guest didn't use extended states it's still
>>> + * fine because we have all the information we need here.
>>> + * 3. If there isn't and the guest DID use extended states (could've
>>> + * happened prior to Xen 4.1) then we're in a pickle because we have
>>> + * to make up non-existing state. For this case we initialise the FPU
>>> + * as using x87/SSE only because the rest of the state is gone.
>>
>> Was this really possible to happen? Guests wouldn't have been able to
>> turn on CR4.OSXSAVE, would they?
>
> You may be right, but my reading of the comment and the code was that
> xsave_enabled(v) might be set and the XSAVE hvm context might be missing in the
> stream. The archives didn't shed a lot more light than what the code already
> gives away.
>
> Otherwise it would've been far simpler to unconditionally pass
> v->arch.xsave_area to the second parameter and let the xsave area to be
> overriden by the follow-up HVM context with its actual state.
>
> If my understanding is wrong, I'm happy to remove (3), as I don't think it
> affects the code anyway. I thought however that it was a relevant data point
> to leave paper trail for.
I would certainly agree - as long as it describes (past) reality. If it
doesn't, I consider it misleading.
Jan
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2024-08-13 12:47 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-08 13:41 [PATCH v2 0/2] x86: FPU handling cleanup Alejandro Vallejo
2024-08-08 13:41 ` [PATCH v2 1/2] x86/fpu: Combine fpu_ctxt and xsave_area in arch_vcpu Alejandro Vallejo
2024-08-12 15:16 ` Jan Beulich
2024-08-13 12:31 ` Alejandro Vallejo
2024-08-08 13:41 ` [PATCH v2 2/2] x86/fpu: Split fpu_setup_fpu() in two Alejandro Vallejo
2024-08-12 15:23 ` Jan Beulich
2024-08-13 12:40 ` Alejandro Vallejo
2024-08-13 12:47 ` Jan Beulich
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.