All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/14] x86: Address Space Isolation FPU preparations
@ 2024-10-28 15:49 Alejandro Vallejo
  2024-10-28 15:49 ` [PATCH 01/14] x86/xstate: Update stale assertions in fpu_x{rstor,save}() Alejandro Vallejo
                   ` (13 more replies)
  0 siblings, 14 replies; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-28 15:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
	Roger Pau Monné, Elias El Yandouzi, Julien Grall

In a Xen build with Address Space Isolation the FPU state cannot come from the
xenheap, as that means the FPU state of vCPU_A may be speculatively accesible
from any pCPU running a hypercall on behalf of vCPU_B. This series prepares
code that manipulates the FPU state to use wrappers that fetch said state from
"elsewhere"[1]. Those wrappers will crystalise into something more than dummy
accesors after existing ASI efforts are merged. So far, they are:

  a) Remove the directmap (Elias El Yadouzi):
  https://lore.kernel.org/xen-devel/20240513134046.82605-1-eliasely@amazon.com/

    Removes all confidential data pages from the directmap and sets up the
    infrastructure to access them. Its trust boundary is the domain and builds
    the foundations of the secret hiding API around {un,}map_domain_page().

  b) x86: adventures in Address Space Isolation (Roger Pau Monne):
  https://lore.kernel.org/xen-devel/20240726152206.28411-1-roger.pau@citrix.com/

    Extends (a) to put the trust boundary at the vCPU instead so the threat
    model covers mutually distrustful vCPUs of the same domain. Extends the
    API for secret hiding to provide private pCPU-local resources. And an
    efficient means of accessing resources of the "current" vCPU.

In essence, the idea is to stop directly accessing a pointer in the vCPU
structure and instead collect it indirectly via a macro invocation. The
proposed API is a map/unmap pair in order to tame the complexity involved in
the various cases uniformly (Does the domain run with ASI enabled? Is the vCPU
"current"? Are we lazy-switching?).

The series is somewhat long, but each patch is fairly trivial. If need be, I
can fold back a lot of these onto single commits to make it shorter.

  * Patch 1 refreshes of a couple of asserts back into something helpful. Can
    be folded onto patches 12 and 13 if deemed too silly for a Fixes tag.

  * Patch 2 is the introduction of the wrappers in isolation.

  * Patches 3 - 10 are split for ease of review, but are conceptually the same
    thing over and over (to stop using direct v->arch.xsave_area and to use
    wrappers instead).

  * Patch 11 cleans the idle vcpu state after using it as dumping groung. It's
    not strictly required for this series, but I'm bound to forget to do it
    later after we _do_ care and does no harm to do it now. It's otherwise
    independent of the other patches (it clashes with 10, but only due to both
    modifying the same code; it's conceptually independent).

  * Patches 12 and 13 bite the bullet and enlightens the (f)xsave and (f)xrstor
    abstractions to use the wrappers rather than direct access.

  * Patches 14 is the last remaining direct use xsave area. It's too tricky to
    introduce ahead of patches 12 and 13 because they need state passed not
    available until those have gone in.

[1] That "elsewhere" will be with high likelihood either the directmap (on
non-ASI), some perma-mapped vCPU-local area (see series (b) at the top) or
implemented as a transient mapping in the style of {un,}map_domain_page() for
glacially cold accesses to non-current vCPUs. Importantly, writing the final
macros involve the other series going in.

Alejandro Vallejo (14):
  x86/xstate: Update stale assertions in fpu_x{rstor,save}()
  x86/xstate: Create map/unmap primitives for xsave areas
  x86/hvm: Map/unmap xsave area in hvm_save_cpu_ctxt()
  x86/fpu: Map/umap xsave area in vcpu_{reset,setup}_fpu()
  x86/xstate: Map/unmap xsave area in xstate_set_init() and
    handle_setbv()
  x86/hvm: Map/unmap xsave area in hvmemul_{get,put}_fpu()
  x86/domctl: Map/unmap xsave area in arch_get_info_guest()
  x86/xstate: Map/unmap xsave area in {compress,expand}_xsave_states()
  x86/emulator: Refactor FXSAVE_AREA to use wrappers
  x86/mpx: Map/unmap xsave area in in read_bndcfgu()
  x86/mpx: Adjust read_bndcfgu() to clean after itself
  x86/fpu: Pass explicit xsave areas to fpu_(f)xsave()
  x86/fpu: Pass explicit xsave areas to fpu_(f)xrstor()
  x86/xstate: Make xstate_all() and vcpu_xsave_mask() take explicit
    xstate

 xen/arch/x86/domctl.c             |  9 +++--
 xen/arch/x86/hvm/emulate.c        | 10 ++++-
 xen/arch/x86/hvm/hvm.c            |  8 ++--
 xen/arch/x86/i387.c               | 67 ++++++++++++++++++++-----------
 xen/arch/x86/include/asm/xstate.h | 29 +++++++++++--
 xen/arch/x86/x86_emulate/blk.c    | 10 ++++-
 xen/arch/x86/xstate.c             | 51 ++++++++++++++++-------
 7 files changed, 130 insertions(+), 54 deletions(-)

-- 
2.47.0



^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 01/14] x86/xstate: Update stale assertions in fpu_x{rstor,save}()
  2024-10-28 15:49 [PATCH 00/14] x86: Address Space Isolation FPU preparations Alejandro Vallejo
@ 2024-10-28 15:49 ` Alejandro Vallejo
  2024-10-28 17:16   ` Andrew Cooper
  2024-10-28 15:49 ` [PATCH 02/14] x86/xstate: Create map/unmap primitives for xsave areas Alejandro Vallejo
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-28 15:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
	Roger Pau Monné

The asserts' intent was to establish whether the xsave instruction was
usable or not, which at the time was strictly given by the presence of
the xsave area. After edb48e76458b("x86/fpu: Combine fpu_ctxt and
xsave_area in arch_vcpu"), that area is always present a more relevant
assert is that the host supports XSAVE.

Fixes: edb48e76458b("x86/fpu: Combine fpu_ctxt and xsave_area in arch_vcpu")
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
I'd also be ok with removing the assertions altogether. They serve very
little purpose there after the merge of xsave and fpu_ctxt.
---
 xen/arch/x86/i387.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c
index 83f9b2502bff..375a8274f632 100644
--- a/xen/arch/x86/i387.c
+++ b/xen/arch/x86/i387.c
@@ -24,7 +24,7 @@ static inline void fpu_xrstor(struct vcpu *v, uint64_t mask)
 {
     bool ok;
 
-    ASSERT(v->arch.xsave_area);
+    ASSERT(cpu_has_xsave);
     /*
      * XCR0 normally represents what guest OS set. In case of Xen itself,
      * we set the accumulated feature mask before doing save/restore.
@@ -136,7 +136,7 @@ static inline void fpu_xsave(struct vcpu *v)
     uint64_t mask = vcpu_xsave_mask(v);
 
     ASSERT(mask);
-    ASSERT(v->arch.xsave_area);
+    ASSERT(cpu_has_xsave);
     /*
      * XCR0 normally represents what guest OS set. In case of Xen itself,
      * we set the accumulated feature mask before doing save/restore.
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 02/14] x86/xstate: Create map/unmap primitives for xsave areas
  2024-10-28 15:49 [PATCH 00/14] x86: Address Space Isolation FPU preparations Alejandro Vallejo
  2024-10-28 15:49 ` [PATCH 01/14] x86/xstate: Update stale assertions in fpu_x{rstor,save}() Alejandro Vallejo
@ 2024-10-28 15:49 ` Alejandro Vallejo
  2024-10-28 17:20   ` Andrew Cooper
  2024-10-29  8:19   ` Jan Beulich
  2024-10-28 15:49 ` [PATCH 03/14] x86/hvm: Map/unmap xsave area in hvm_save_cpu_ctxt() Alejandro Vallejo
                   ` (11 subsequent siblings)
  13 siblings, 2 replies; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-28 15:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
	Roger Pau Monné

Add infrastructure to simplify ASI handling. With ASI in the picture
we'll have several different means of accessing the XSAVE area of a
given vCPU, depending on whether a domain is covered by ASI or not and
whether the vCPU is question is scheduled on the current pCPU or not.

Having these complexities exposed at the call sites becomes unwieldy
very fast. These wrappers are intended to be used in a similar way to
map_domain_page() and unmap_domain_page(); The map operation will
dispatch the appropriate pointer for each case in a future patch, while
unmap will remain a no-op where no unmap is required (e.g: when there's
no ASI) and remove the transient maping if one was required.

Follow-up patches replace all uses of raw v->arch.xsave_area by this
mechanism in preparation to add the beforementioned dispatch logic to be
added at a later time.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
 xen/arch/x86/include/asm/xstate.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h
index 07017cc4edfd..36260459667c 100644
--- a/xen/arch/x86/include/asm/xstate.h
+++ b/xen/arch/x86/include/asm/xstate.h
@@ -143,4 +143,24 @@ static inline bool xstate_all(const struct vcpu *v)
            (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE);
 }
 
+/*
+ * Fetch a pointer to the XSAVE area of a vCPU
+ *
+ * If ASI is enabled for the domain, this mapping is pCPU-local.
+ *
+ * @param v Owner of the XSAVE area
+ */
+#define vcpu_map_xsave_area(v) ((v)->arch.xsave_area)
+
+/*
+ * Drops the XSAVE area of a vCPU and nullifies its pointer on exit.
+ *
+ * If ASI is enabled and v is not the currently scheduled vCPU then the
+ * per-pCPU mapping is removed from the address space.
+ *
+ * @param v           vCPU logically owning xsave_area
+ * @param xsave_area  XSAVE blob of v
+ */
+#define vcpu_unmap_xsave_area(v, x) ({ (x) = NULL; })
+
 #endif /* __ASM_XSTATE_H */
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 03/14] x86/hvm: Map/unmap xsave area in hvm_save_cpu_ctxt()
  2024-10-28 15:49 [PATCH 00/14] x86: Address Space Isolation FPU preparations Alejandro Vallejo
  2024-10-28 15:49 ` [PATCH 01/14] x86/xstate: Update stale assertions in fpu_x{rstor,save}() Alejandro Vallejo
  2024-10-28 15:49 ` [PATCH 02/14] x86/xstate: Create map/unmap primitives for xsave areas Alejandro Vallejo
@ 2024-10-28 15:49 ` Alejandro Vallejo
  2024-10-28 15:49 ` [PATCH 04/14] x86/fpu: Map/umap xsave area in vcpu_{reset,setup}_fpu() Alejandro Vallejo
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-28 15:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
	Roger Pau Monné

No functional change.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
 xen/arch/x86/hvm/hvm.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 018d44a08b6b..77b975f07f32 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -914,11 +914,11 @@ static int cf_check hvm_save_cpu_ctxt(struct vcpu *v, hvm_domain_context_t *h)
 
     if ( v->fpu_initialised )
     {
-        BUILD_BUG_ON(sizeof(ctxt.fpu_regs) !=
-                     sizeof(v->arch.xsave_area->fpu_sse));
-        memcpy(ctxt.fpu_regs, &v->arch.xsave_area->fpu_sse,
-               sizeof(ctxt.fpu_regs));
+        const struct xsave_struct *xsave_area = vcpu_map_xsave_area(v);
 
+        BUILD_BUG_ON(sizeof(ctxt.fpu_regs) != sizeof(xsave_area->fpu_sse));
+        memcpy(ctxt.fpu_regs, &xsave_area->fpu_sse, sizeof(ctxt.fpu_regs));
+        vcpu_unmap_xsave_area(v, xsave_area);
         ctxt.flags = XEN_X86_FPU_INITIALISED;
     }
 
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 04/14] x86/fpu: Map/umap xsave area in vcpu_{reset,setup}_fpu()
  2024-10-28 15:49 [PATCH 00/14] x86: Address Space Isolation FPU preparations Alejandro Vallejo
                   ` (2 preceding siblings ...)
  2024-10-28 15:49 ` [PATCH 03/14] x86/hvm: Map/unmap xsave area in hvm_save_cpu_ctxt() Alejandro Vallejo
@ 2024-10-28 15:49 ` Alejandro Vallejo
  2024-10-28 15:49 ` [PATCH 05/14] x86/xstate: Map/unmap xsave area in xstate_set_init() and handle_setbv() Alejandro Vallejo
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-28 15:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
	Roger Pau Monné

No functional change.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
 xen/arch/x86/i387.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c
index 375a8274f632..a571bcb23c91 100644
--- a/xen/arch/x86/i387.c
+++ b/xen/arch/x86/i387.c
@@ -306,8 +306,10 @@ int vcpu_init_fpu(struct vcpu *v)
 
 void vcpu_reset_fpu(struct vcpu *v)
 {
+    struct xsave_struct *xsave_area = vcpu_map_xsave_area(v);
+
     v->fpu_initialised = false;
-    *v->arch.xsave_area = (struct xsave_struct) {
+    *xsave_area = (struct xsave_struct) {
         .fpu_sse = {
             .mxcsr = MXCSR_DEFAULT,
             .fcw = FCW_RESET,
@@ -315,15 +317,21 @@ void vcpu_reset_fpu(struct vcpu *v)
         },
         .xsave_hdr.xstate_bv = X86_XCR0_X87,
     };
+
+    vcpu_unmap_xsave_area(v, xsave_area);
 }
 
 void vcpu_setup_fpu(struct vcpu *v, const void *data)
 {
+    struct xsave_struct *xsave_area = vcpu_map_xsave_area(v);
+
     v->fpu_initialised = true;
-    *v->arch.xsave_area = (struct xsave_struct) {
+    *xsave_area = (struct xsave_struct) {
         .fpu_sse = *(const fpusse_t*)data,
         .xsave_hdr.xstate_bv = XSTATE_FP_SSE,
     };
+
+    vcpu_unmap_xsave_area(v, xsave_area);
 }
 
 /* Free FPU's context save area */
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 05/14] x86/xstate: Map/unmap xsave area in xstate_set_init() and handle_setbv()
  2024-10-28 15:49 [PATCH 00/14] x86: Address Space Isolation FPU preparations Alejandro Vallejo
                   ` (3 preceding siblings ...)
  2024-10-28 15:49 ` [PATCH 04/14] x86/fpu: Map/umap xsave area in vcpu_{reset,setup}_fpu() Alejandro Vallejo
@ 2024-10-28 15:49 ` Alejandro Vallejo
  2024-10-29  8:26   ` Jan Beulich
  2024-10-28 15:49 ` [PATCH 06/14] x86/hvm: Map/unmap xsave area in hvmemul_{get,put}_fpu() Alejandro Vallejo
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-28 15:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
	Roger Pau Monné

No functional change.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
 xen/arch/x86/xstate.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index af9e345a7ace..60e752a245ca 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -993,7 +993,12 @@ int handle_xsetbv(u32 index, u64 new_bv)
 
         clts();
         if ( curr->fpu_dirtied )
-            asm ( "stmxcsr %0" : "=m" (curr->arch.xsave_area->fpu_sse.mxcsr) );
+        {
+            struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr);
+
+            asm ( "stmxcsr %0" : "=m" (xsave_area->fpu_sse.mxcsr) );
+            vcpu_unmap_xsave_area(curr, xsave_area);
+        }
         else if ( xstate_all(curr) )
         {
             /* See the comment in i387.c:vcpu_restore_fpu_eager(). */
@@ -1048,7 +1053,7 @@ void xstate_set_init(uint64_t mask)
     unsigned long cr0 = read_cr0();
     unsigned long xcr0 = this_cpu(xcr0);
     struct vcpu *v = idle_vcpu[smp_processor_id()];
-    struct xsave_struct *xstate = v->arch.xsave_area;
+    struct xsave_struct *xstate;
 
     if ( ~xfeature_mask & mask )
     {
@@ -1061,8 +1066,10 @@ void xstate_set_init(uint64_t mask)
 
     clts();
 
+    xstate = vcpu_map_xsave_area(v);
     memset(&xstate->xsave_hdr, 0, sizeof(xstate->xsave_hdr));
     xrstor(v, mask);
+    vcpu_unmap_xsave_area(v, xstate);
 
     if ( cr0 & X86_CR0_TS )
         write_cr0(cr0);
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 06/14] x86/hvm: Map/unmap xsave area in hvmemul_{get,put}_fpu()
  2024-10-28 15:49 [PATCH 00/14] x86: Address Space Isolation FPU preparations Alejandro Vallejo
                   ` (4 preceding siblings ...)
  2024-10-28 15:49 ` [PATCH 05/14] x86/xstate: Map/unmap xsave area in xstate_set_init() and handle_setbv() Alejandro Vallejo
@ 2024-10-28 15:49 ` Alejandro Vallejo
  2024-10-29  8:30   ` Jan Beulich
  2024-10-28 15:49 ` [PATCH 07/14] x86/domctl: Map/unmap xsave area in arch_get_info_guest() Alejandro Vallejo
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-28 15:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
	Roger Pau Monné

No functional change.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
 xen/arch/x86/hvm/emulate.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index f2bc6967dfcb..a6ddc9928f16 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -2371,7 +2371,8 @@ static int cf_check hvmemul_get_fpu(
         alternative_vcall(hvm_funcs.fpu_dirty_intercept);
     else if ( type == X86EMUL_FPU_fpu )
     {
-        const fpusse_t *fpu_ctxt = &curr->arch.xsave_area->fpu_sse;
+        const struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr);
+        const fpusse_t *fpu_ctxt = &xsave_area->fpu_sse;
 
         /*
          * Latch current register state so that we can back out changes
@@ -2397,6 +2398,8 @@ static int cf_check hvmemul_get_fpu(
             else
                 ASSERT(fcw == fpu_ctxt->fcw);
         }
+
+        vcpu_unmap_xsave_area(curr, xsave_area);
     }
 
     return X86EMUL_OKAY;
@@ -2411,7 +2414,8 @@ static void cf_check hvmemul_put_fpu(
 
     if ( aux )
     {
-        fpusse_t *fpu_ctxt = &curr->arch.xsave_area->fpu_sse;
+        struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr);
+        fpusse_t *fpu_ctxt = &xsave_area->fpu_sse;
         bool dval = aux->dval;
         int mode = hvm_guest_x86_mode(curr);
 
@@ -2465,6 +2469,8 @@ static void cf_check hvmemul_put_fpu(
 
         fpu_ctxt->fop = aux->op;
 
+        vcpu_unmap_xsave_area(curr, xsave_area);
+
         /* Re-use backout code below. */
         backout = X86EMUL_FPU_fpu;
     }
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 07/14] x86/domctl: Map/unmap xsave area in arch_get_info_guest()
  2024-10-28 15:49 [PATCH 00/14] x86: Address Space Isolation FPU preparations Alejandro Vallejo
                   ` (5 preceding siblings ...)
  2024-10-28 15:49 ` [PATCH 06/14] x86/hvm: Map/unmap xsave area in hvmemul_{get,put}_fpu() Alejandro Vallejo
@ 2024-10-28 15:49 ` Alejandro Vallejo
  2024-10-28 15:49 ` [PATCH 08/14] x86/xstate: Map/unmap xsave area in {compress,expand}_xsave_states() Alejandro Vallejo
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-28 15:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
	Roger Pau Monné

No functional change.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
 xen/arch/x86/domctl.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 5f01111619da..8f6075bc84b8 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1377,16 +1377,17 @@ void arch_get_info_guest(struct vcpu *v, vcpu_guest_context_u c)
     unsigned int i;
     const struct domain *d = v->domain;
     bool compat = is_pv_32bit_domain(d);
+    const struct xsave_struct *xsave_area;
 #ifdef CONFIG_COMPAT
 #define c(fld) (!compat ? (c.nat->fld) : (c.cmp->fld))
 #else
 #define c(fld) (c.nat->fld)
 #endif
 
-    BUILD_BUG_ON(sizeof(c.nat->fpu_ctxt) !=
-                 sizeof(v->arch.xsave_area->fpu_sse));
-    memcpy(&c.nat->fpu_ctxt, &v->arch.xsave_area->fpu_sse,
-           sizeof(c.nat->fpu_ctxt));
+    xsave_area = vcpu_map_xsave_area(v);
+    BUILD_BUG_ON(sizeof(c.nat->fpu_ctxt) != sizeof(xsave_area->fpu_sse));
+    memcpy(&c.nat->fpu_ctxt, &xsave_area->fpu_sse, sizeof(c.nat->fpu_ctxt));
+    vcpu_unmap_xsave_area(v, xsave_area);
 
     if ( is_pv_domain(d) )
         c(flags = v->arch.pv.vgc_flags & ~(VGCF_i387_valid|VGCF_in_kernel));
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 08/14] x86/xstate: Map/unmap xsave area in {compress,expand}_xsave_states()
  2024-10-28 15:49 [PATCH 00/14] x86: Address Space Isolation FPU preparations Alejandro Vallejo
                   ` (6 preceding siblings ...)
  2024-10-28 15:49 ` [PATCH 07/14] x86/domctl: Map/unmap xsave area in arch_get_info_guest() Alejandro Vallejo
@ 2024-10-28 15:49 ` Alejandro Vallejo
  2024-10-28 15:49 ` [PATCH 09/14] x86/emulator: Refactor FXSAVE_AREA to use wrappers Alejandro Vallejo
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-28 15:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
	Roger Pau Monné

No functional change.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
 xen/arch/x86/xstate.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 60e752a245ca..4019ca4aae83 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -177,7 +177,7 @@ static void setup_xstate_comp(uint16_t *comp_offsets,
  */
 void expand_xsave_states(const struct vcpu *v, void *dest, unsigned int size)
 {
-    const struct xsave_struct *xstate = v->arch.xsave_area;
+    const struct xsave_struct *xstate = vcpu_map_xsave_area(v);
     const void *src;
     uint16_t comp_offsets[sizeof(xfeature_mask)*8];
     u64 xstate_bv = xstate->xsave_hdr.xstate_bv;
@@ -228,6 +228,8 @@ void expand_xsave_states(const struct vcpu *v, void *dest, unsigned int size)
 
         valid &= ~feature;
     }
+
+    vcpu_unmap_xsave_area(v, xstate);
 }
 
 /*
@@ -242,7 +244,7 @@ void expand_xsave_states(const struct vcpu *v, void *dest, unsigned int size)
  */
 void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size)
 {
-    struct xsave_struct *xstate = v->arch.xsave_area;
+    struct xsave_struct *xstate = vcpu_map_xsave_area(v);
     void *dest;
     uint16_t comp_offsets[sizeof(xfeature_mask)*8];
     u64 xstate_bv, valid;
@@ -294,6 +296,8 @@ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size)
 
         valid &= ~feature;
     }
+
+    vcpu_unmap_xsave_area(v, xstate);
 }
 
 void xsave(struct vcpu *v, uint64_t mask)
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 09/14] x86/emulator: Refactor FXSAVE_AREA to use wrappers
  2024-10-28 15:49 [PATCH 00/14] x86: Address Space Isolation FPU preparations Alejandro Vallejo
                   ` (7 preceding siblings ...)
  2024-10-28 15:49 ` [PATCH 08/14] x86/xstate: Map/unmap xsave area in {compress,expand}_xsave_states() Alejandro Vallejo
@ 2024-10-28 15:49 ` Alejandro Vallejo
  2024-10-28 15:49 ` [PATCH 10/14] x86/mpx: Map/unmap xsave area in in read_bndcfgu() Alejandro Vallejo
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-28 15:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
	Roger Pau Monné

Adds an UNMAP primitive to make use of vcpu_unmap_xsave_area() when
linked into xen. unmap is a no-op during tests.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
 xen/arch/x86/x86_emulate/blk.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/x86_emulate/blk.c b/xen/arch/x86/x86_emulate/blk.c
index 08a05f8453f7..d5b59333823f 100644
--- a/xen/arch/x86/x86_emulate/blk.c
+++ b/xen/arch/x86/x86_emulate/blk.c
@@ -11,9 +11,11 @@
     !defined(X86EMUL_NO_SIMD)
 # ifdef __XEN__
 #  include <asm/xstate.h>
-#  define FXSAVE_AREA ((void *)&current->arch.xsave_area->fpu_sse)
+#  define FXSAVE_AREA ((void *)vcpu_map_xsave_area(current))
+#  define UNMAP_FXSAVE_AREA(x) vcpu_unmap_xsave_area(currt ent, x)
 # else
 #  define FXSAVE_AREA get_fpu_save_area()
+#  define UNMAP_FXSAVE_AREA(x) ((void)x)
 # endif
 #endif
 
@@ -292,6 +294,9 @@ int x86_emul_blk(
         }
         else
             asm volatile ( "fxrstor %0" :: "m" (*fxsr) );
+
+        UNMAP_FXSAVE_AREA(fxsr);
+
         break;
     }
 
@@ -320,6 +325,9 @@ int x86_emul_blk(
 
         if ( fxsr != ptr ) /* i.e. s->op_bytes < sizeof(*fxsr) */
             memcpy(ptr, fxsr, s->op_bytes);
+
+        UNMAP_FXSAVE_AREA(fxsr);
+
         break;
     }
 
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 10/14] x86/mpx: Map/unmap xsave area in in read_bndcfgu()
  2024-10-28 15:49 [PATCH 00/14] x86: Address Space Isolation FPU preparations Alejandro Vallejo
                   ` (8 preceding siblings ...)
  2024-10-28 15:49 ` [PATCH 09/14] x86/emulator: Refactor FXSAVE_AREA to use wrappers Alejandro Vallejo
@ 2024-10-28 15:49 ` Alejandro Vallejo
  2024-10-29  8:27   ` Jan Beulich
  2024-10-28 15:49 ` [PATCH 11/14] x86/mpx: Adjust read_bndcfgu() to clean after itself Alejandro Vallejo
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-28 15:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
	Roger Pau Monné

No functional change.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
 xen/arch/x86/xstate.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 4019ca4aae83..2a54da2823cf 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -1021,9 +1021,10 @@ int handle_xsetbv(u32 index, u64 new_bv)
 
 uint64_t read_bndcfgu(void)
 {
+    uint64_t ret = 0;
     unsigned long cr0 = read_cr0();
-    struct xsave_struct *xstate
-        = idle_vcpu[smp_processor_id()]->arch.xsave_area;
+    struct vcpu *v = idle_vcpu[smp_processor_id()];
+    struct xsave_struct *xstate = vcpu_map_xsave_area(v);
     const struct xstate_bndcsr *bndcsr;
 
     ASSERT(cpu_has_mpx);
@@ -1049,7 +1050,12 @@ uint64_t read_bndcfgu(void)
     if ( cr0 & X86_CR0_TS )
         write_cr0(cr0);
 
-    return xstate->xsave_hdr.xstate_bv & X86_XCR0_BNDCSR ? bndcsr->bndcfgu : 0;
+    if ( xstate->xsave_hdr.xstate_bv & X86_XCR0_BNDCSR )
+        ret = bndcsr->bndcfgu;
+
+    vcpu_unmap_xsave_area(v, xstate);
+
+    return ret;
 }
 
 void xstate_set_init(uint64_t mask)
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 11/14] x86/mpx: Adjust read_bndcfgu() to clean after itself
  2024-10-28 15:49 [PATCH 00/14] x86: Address Space Isolation FPU preparations Alejandro Vallejo
                   ` (9 preceding siblings ...)
  2024-10-28 15:49 ` [PATCH 10/14] x86/mpx: Map/unmap xsave area in in read_bndcfgu() Alejandro Vallejo
@ 2024-10-28 15:49 ` Alejandro Vallejo
  2024-10-29  8:32   ` Jan Beulich
  2024-10-28 15:49 ` [PATCH 12/14] x86/fpu: Pass explicit xsave areas to fpu_(f)xsave() Alejandro Vallejo
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-28 15:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
	Roger Pau Monné

Overwrite the MPX data dumped in the idle XSAVE area to avoid leaking
it. While it's not very sensitive, better to err on the side of caution.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
Depending on whether the idle domain is considered ASI or non-ASI this
might or might not be enough. If the idle domain is not ASI the XSAVE
area would be in the directmap, which would render the zap ineffective
because it would still be transiently readable from another pCPU.
---
 xen/arch/x86/xstate.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 2a54da2823cf..a9a7ee2cd1e6 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -1025,7 +1025,7 @@ uint64_t read_bndcfgu(void)
     unsigned long cr0 = read_cr0();
     struct vcpu *v = idle_vcpu[smp_processor_id()];
     struct xsave_struct *xstate = vcpu_map_xsave_area(v);
-    const struct xstate_bndcsr *bndcsr;
+    struct xstate_bndcsr *bndcsr;
 
     ASSERT(cpu_has_mpx);
     clts();
@@ -1051,7 +1051,10 @@ uint64_t read_bndcfgu(void)
         write_cr0(cr0);
 
     if ( xstate->xsave_hdr.xstate_bv & X86_XCR0_BNDCSR )
+    {
         ret = bndcsr->bndcfgu;
+        *bndcsr = (struct xstate_bndcsr){};
+    }
 
     vcpu_unmap_xsave_area(v, xstate);
 
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 12/14] x86/fpu: Pass explicit xsave areas to fpu_(f)xsave()
  2024-10-28 15:49 [PATCH 00/14] x86: Address Space Isolation FPU preparations Alejandro Vallejo
                   ` (10 preceding siblings ...)
  2024-10-28 15:49 ` [PATCH 11/14] x86/mpx: Adjust read_bndcfgu() to clean after itself Alejandro Vallejo
@ 2024-10-28 15:49 ` Alejandro Vallejo
  2024-10-29  8:37   ` Jan Beulich
  2024-10-28 15:49 ` [PATCH 13/14] x86/fpu: Pass explicit xsave areas to fpu_(f)xrstor() Alejandro Vallejo
  2024-10-28 15:49 ` [PATCH 14/14] x86/xstate: Make xstate_all() and vcpu_xsave_mask() take explicit xstate Alejandro Vallejo
  13 siblings, 1 reply; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-28 15:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
	Roger Pau Monné

No functional change.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
 xen/arch/x86/i387.c               | 16 ++++++++++------
 xen/arch/x86/include/asm/xstate.h |  2 +-
 xen/arch/x86/xstate.c             |  3 +--
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c
index a571bcb23c91..5950fbcf272e 100644
--- a/xen/arch/x86/i387.c
+++ b/xen/arch/x86/i387.c
@@ -130,7 +130,7 @@ static inline uint64_t vcpu_xsave_mask(const struct vcpu *v)
 }
 
 /* Save x87 extended state */
-static inline void fpu_xsave(struct vcpu *v)
+static inline void fpu_xsave(struct vcpu *v, struct xsave_struct *xsave_area)
 {
     bool ok;
     uint64_t mask = vcpu_xsave_mask(v);
@@ -143,15 +143,14 @@ static inline void fpu_xsave(struct vcpu *v)
      */
     ok = set_xcr0(v->arch.xcr0_accum | XSTATE_FP_SSE);
     ASSERT(ok);
-    xsave(v, mask);
+    xsave(v, xsave_area, mask);
     ok = set_xcr0(v->arch.xcr0 ?: XSTATE_FP_SSE);
     ASSERT(ok);
 }
 
 /* Save x87 FPU, MMX, SSE and SSE2 state */
-static inline void fpu_fxsave(struct vcpu *v)
+static inline void fpu_fxsave(struct vcpu *v, fpusse_t *fpu_ctxt)
 {
-    fpusse_t *fpu_ctxt = &v->arch.xsave_area->fpu_sse;
     unsigned int fip_width = v->domain->arch.x87_fip_width;
 
     if ( fip_width != 4 )
@@ -266,6 +265,8 @@ void vcpu_restore_fpu_lazy(struct vcpu *v)
  */
 static bool _vcpu_save_fpu(struct vcpu *v)
 {
+    struct xsave_struct *xsave_area;
+
     if ( !v->fpu_dirtied && !v->arch.nonlazy_xstate_used )
         return false;
 
@@ -274,11 +275,14 @@ static bool _vcpu_save_fpu(struct vcpu *v)
     /* This can happen, if a paravirtualised guest OS has set its CR0.TS. */
     clts();
 
+    xsave_area = vcpu_map_xsave_area(v);
+
     if ( cpu_has_xsave )
-        fpu_xsave(v);
+        fpu_xsave(v, xsave_area);
     else
-        fpu_fxsave(v);
+        fpu_fxsave(v, &xsave_area->fpu_sse);
 
+    vcpu_unmap_xsave_area(v, xsave_area);
     v->fpu_dirtied = 0;
 
     return true;
diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h
index 36260459667c..104fe0d44173 100644
--- a/xen/arch/x86/include/asm/xstate.h
+++ b/xen/arch/x86/include/asm/xstate.h
@@ -97,7 +97,7 @@ uint64_t get_xcr0(void);
 void set_msr_xss(u64 xss);
 uint64_t get_msr_xss(void);
 uint64_t read_bndcfgu(void);
-void xsave(struct vcpu *v, uint64_t mask);
+void xsave(struct vcpu *v, struct xsave_struct *ptr, uint64_t mask);
 void xrstor(struct vcpu *v, uint64_t mask);
 void xstate_set_init(uint64_t mask);
 bool xsave_enabled(const struct vcpu *v);
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index a9a7ee2cd1e6..518388e6e272 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -300,9 +300,8 @@ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size)
     vcpu_unmap_xsave_area(v, xstate);
 }
 
-void xsave(struct vcpu *v, uint64_t mask)
+void xsave(struct vcpu *v, struct xsave_struct *ptr, uint64_t mask)
 {
-    struct xsave_struct *ptr = v->arch.xsave_area;
     uint32_t hmask = mask >> 32;
     uint32_t lmask = mask;
     unsigned int fip_width = v->domain->arch.x87_fip_width;
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 13/14] x86/fpu: Pass explicit xsave areas to fpu_(f)xrstor()
  2024-10-28 15:49 [PATCH 00/14] x86: Address Space Isolation FPU preparations Alejandro Vallejo
                   ` (11 preceding siblings ...)
  2024-10-28 15:49 ` [PATCH 12/14] x86/fpu: Pass explicit xsave areas to fpu_(f)xsave() Alejandro Vallejo
@ 2024-10-28 15:49 ` Alejandro Vallejo
  2024-10-29  8:40   ` Jan Beulich
  2024-10-28 15:49 ` [PATCH 14/14] x86/xstate: Make xstate_all() and vcpu_xsave_mask() take explicit xstate Alejandro Vallejo
  13 siblings, 1 reply; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-28 15:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
	Roger Pau Monné

No functional change.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
 xen/arch/x86/i387.c               | 26 ++++++++++++++++----------
 xen/arch/x86/include/asm/xstate.h |  2 +-
 xen/arch/x86/xstate.c             | 10 ++++++----
 3 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c
index 5950fbcf272e..7e1fb8ad8779 100644
--- a/xen/arch/x86/i387.c
+++ b/xen/arch/x86/i387.c
@@ -20,7 +20,8 @@
 /*     FPU Restore Functions   */
 /*******************************/
 /* Restore x87 extended state */
-static inline void fpu_xrstor(struct vcpu *v, uint64_t mask)
+static inline void fpu_xrstor(struct vcpu *v, struct xsave_struct *xsave_area,
+                              uint64_t mask)
 {
     bool ok;
 
@@ -31,16 +32,14 @@ static inline void fpu_xrstor(struct vcpu *v, uint64_t mask)
      */
     ok = set_xcr0(v->arch.xcr0_accum | XSTATE_FP_SSE);
     ASSERT(ok);
-    xrstor(v, mask);
+    xrstor(v, xsave_area, mask);
     ok = set_xcr0(v->arch.xcr0 ?: XSTATE_FP_SSE);
     ASSERT(ok);
 }
 
 /* Restore x87 FPU, MMX, SSE and SSE2 state */
-static inline void fpu_fxrstor(struct vcpu *v)
+static inline void fpu_fxrstor(struct vcpu *v, const fpusse_t *fpu_ctxt)
 {
-    const fpusse_t *fpu_ctxt = &v->arch.xsave_area->fpu_sse;
-
     /*
      * Some CPUs don't save/restore FDP/FIP/FOP unless an exception
      * is pending. Clear the x87 state here by setting it to fixed
@@ -197,6 +196,8 @@ static inline void fpu_fxsave(struct vcpu *v, fpusse_t *fpu_ctxt)
 /* Restore FPU state whenever VCPU is schduled in. */
 void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts)
 {
+    struct xsave_struct *xsave_area;
+
     /* Restore nonlazy extended state (i.e. parts not tracked by CR0.TS). */
     if ( !v->arch.fully_eager_fpu && !v->arch.nonlazy_xstate_used )
         goto maybe_stts;
@@ -211,12 +212,13 @@ void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts)
      * above) we also need to restore full state, to prevent subsequently
      * saving state belonging to another vCPU.
      */
+    xsave_area = vcpu_map_xsave_area(v);
     if ( v->arch.fully_eager_fpu || xstate_all(v) )
     {
         if ( cpu_has_xsave )
-            fpu_xrstor(v, XSTATE_ALL);
+            fpu_xrstor(v, xsave_area, XSTATE_ALL);
         else
-            fpu_fxrstor(v);
+            fpu_fxrstor(v, &xsave_area->fpu_sse);
 
         v->fpu_initialised = 1;
         v->fpu_dirtied = 1;
@@ -226,9 +228,10 @@ void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts)
     }
     else
     {
-        fpu_xrstor(v, XSTATE_NONLAZY);
+        fpu_xrstor(v, xsave_area, XSTATE_NONLAZY);
         need_stts = true;
     }
+    vcpu_unmap_xsave_area(v, xsave_area);
 
  maybe_stts:
     if ( need_stts )
@@ -240,6 +243,7 @@ void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts)
  */
 void vcpu_restore_fpu_lazy(struct vcpu *v)
 {
+    struct xsave_struct *xsave_area;
     ASSERT(!is_idle_vcpu(v));
 
     /* Avoid recursion. */
@@ -250,10 +254,12 @@ void vcpu_restore_fpu_lazy(struct vcpu *v)
 
     ASSERT(!v->arch.fully_eager_fpu);
 
+    xsave_area = vcpu_map_xsave_area(v);
     if ( cpu_has_xsave )
-        fpu_xrstor(v, XSTATE_LAZY);
+        fpu_xrstor(v, xsave_area, XSTATE_LAZY);
     else
-        fpu_fxrstor(v);
+        fpu_fxrstor(v, &xsave_area->fpu_sse);
+    vcpu_unmap_xsave_area(v, xsave_area);
 
     v->fpu_initialised = 1;
     v->fpu_dirtied = 1;
diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h
index 104fe0d44173..43f7731c2b17 100644
--- a/xen/arch/x86/include/asm/xstate.h
+++ b/xen/arch/x86/include/asm/xstate.h
@@ -98,7 +98,7 @@ void set_msr_xss(u64 xss);
 uint64_t get_msr_xss(void);
 uint64_t read_bndcfgu(void);
 void xsave(struct vcpu *v, struct xsave_struct *ptr, uint64_t mask);
-void xrstor(struct vcpu *v, uint64_t mask);
+void xrstor(struct vcpu *v, struct xsave_struct *ptr, uint64_t mask);
 void xstate_set_init(uint64_t mask);
 bool xsave_enabled(const struct vcpu *v);
 int __must_check validate_xstate(const struct domain *d,
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 518388e6e272..aa5c062f7e51 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -374,11 +374,10 @@ void xsave(struct vcpu *v, struct xsave_struct *ptr, uint64_t mask)
         ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] = fip_width;
 }
 
-void xrstor(struct vcpu *v, uint64_t mask)
+void xrstor(struct vcpu *v, struct xsave_struct *ptr, uint64_t mask)
 {
     uint32_t hmask = mask >> 32;
     uint32_t lmask = mask;
-    struct xsave_struct *ptr = v->arch.xsave_area;
     unsigned int faults, prev_faults;
 
     /*
@@ -992,6 +991,7 @@ int handle_xsetbv(u32 index, u64 new_bv)
     mask &= curr->fpu_dirtied ? ~XSTATE_FP_SSE : XSTATE_NONLAZY;
     if ( mask )
     {
+        struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr);
         unsigned long cr0 = read_cr0();
 
         clts();
@@ -1010,7 +1010,9 @@ int handle_xsetbv(u32 index, u64 new_bv)
             curr->fpu_dirtied = 1;
             cr0 &= ~X86_CR0_TS;
         }
-        xrstor(curr, mask);
+        xrstor(curr, xsave_area, mask);
+        vcpu_unmap_xsave_area(curr, xsave_area);
+
         if ( cr0 & X86_CR0_TS )
             write_cr0(cr0);
     }
@@ -1080,7 +1082,7 @@ void xstate_set_init(uint64_t mask)
 
     xstate = vcpu_map_xsave_area(v);
     memset(&xstate->xsave_hdr, 0, sizeof(xstate->xsave_hdr));
-    xrstor(v, mask);
+    xrstor(v, xstate, mask);
     vcpu_unmap_xsave_area(v, xstate);
 
     if ( cr0 & X86_CR0_TS )
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 14/14] x86/xstate: Make xstate_all() and vcpu_xsave_mask() take explicit xstate
  2024-10-28 15:49 [PATCH 00/14] x86: Address Space Isolation FPU preparations Alejandro Vallejo
                   ` (12 preceding siblings ...)
  2024-10-28 15:49 ` [PATCH 13/14] x86/fpu: Pass explicit xsave areas to fpu_(f)xrstor() Alejandro Vallejo
@ 2024-10-28 15:49 ` Alejandro Vallejo
  13 siblings, 0 replies; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-28 15:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
	Roger Pau Monné

No functional change.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
 xen/arch/x86/i387.c               | 9 +++++----
 xen/arch/x86/include/asm/xstate.h | 5 +++--
 xen/arch/x86/xstate.c             | 2 +-
 3 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c
index 7e1fb8ad8779..87b44dc11b55 100644
--- a/xen/arch/x86/i387.c
+++ b/xen/arch/x86/i387.c
@@ -108,7 +108,8 @@ static inline void fpu_fxrstor(struct vcpu *v, const fpusse_t *fpu_ctxt)
 /*      FPU Save Functions     */
 /*******************************/
 
-static inline uint64_t vcpu_xsave_mask(const struct vcpu *v)
+static inline uint64_t vcpu_xsave_mask(const struct vcpu *v,
+                                       const struct xsave_struct *xsave_area)
 {
     if ( v->fpu_dirtied )
         return v->arch.nonlazy_xstate_used ? XSTATE_ALL : XSTATE_LAZY;
@@ -125,14 +126,14 @@ static inline uint64_t vcpu_xsave_mask(const struct vcpu *v)
      * XSTATE_FP_SSE), vcpu_xsave_mask will return XSTATE_ALL. Otherwise
      * return XSTATE_NONLAZY.
      */
-    return xstate_all(v) ? XSTATE_ALL : XSTATE_NONLAZY;
+    return xstate_all(v, xsave_area) ? XSTATE_ALL : XSTATE_NONLAZY;
 }
 
 /* Save x87 extended state */
 static inline void fpu_xsave(struct vcpu *v, struct xsave_struct *xsave_area)
 {
     bool ok;
-    uint64_t mask = vcpu_xsave_mask(v);
+    uint64_t mask = vcpu_xsave_mask(v, xsave_area);
 
     ASSERT(mask);
     ASSERT(cpu_has_xsave);
@@ -213,7 +214,7 @@ void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts)
      * saving state belonging to another vCPU.
      */
     xsave_area = vcpu_map_xsave_area(v);
-    if ( v->arch.fully_eager_fpu || xstate_all(v) )
+    if ( v->arch.fully_eager_fpu || xstate_all(v, xsave_area) )
     {
         if ( cpu_has_xsave )
             fpu_xrstor(v, xsave_area, XSTATE_ALL);
diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h
index 43f7731c2b17..81350d0105bb 100644
--- a/xen/arch/x86/include/asm/xstate.h
+++ b/xen/arch/x86/include/asm/xstate.h
@@ -132,14 +132,15 @@ xsave_area_compressed(const struct xsave_struct *xsave_area)
     return xsave_area->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED;
 }
 
-static inline bool xstate_all(const struct vcpu *v)
+static inline bool xstate_all(const struct vcpu *v,
+                              const struct xsave_struct *xsave_area)
 {
     /*
      * XSTATE_FP_SSE may be excluded, because the offsets of XSTATE_FP_SSE
      * (in the legacy region of xsave area) are fixed, so saving
      * XSTATE_FP_SSE will not cause overwriting problem with XSAVES/XSAVEC.
      */
-    return xsave_area_compressed(v->arch.xsave_area) &&
+    return xsave_area_compressed(xsave_area) &&
            (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE);
 }
 
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index aa5c062f7e51..cbe56eba89eb 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -1002,7 +1002,7 @@ int handle_xsetbv(u32 index, u64 new_bv)
             asm ( "stmxcsr %0" : "=m" (xsave_area->fpu_sse.mxcsr) );
             vcpu_unmap_xsave_area(curr, xsave_area);
         }
-        else if ( xstate_all(curr) )
+        else if ( xstate_all(curr, xsave_area) )
         {
             /* See the comment in i387.c:vcpu_restore_fpu_eager(). */
             mask |= XSTATE_LAZY;
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/14] x86/xstate: Update stale assertions in fpu_x{rstor,save}()
  2024-10-28 15:49 ` [PATCH 01/14] x86/xstate: Update stale assertions in fpu_x{rstor,save}() Alejandro Vallejo
@ 2024-10-28 17:16   ` Andrew Cooper
  2024-10-29  8:13     ` Jan Beulich
  0 siblings, 1 reply; 36+ messages in thread
From: Andrew Cooper @ 2024-10-28 17:16 UTC (permalink / raw)
  To: Alejandro Vallejo, xen-devel; +Cc: Jan Beulich, Roger Pau Monné

On 28/10/2024 3:49 pm, Alejandro Vallejo wrote:
> The asserts' intent was to establish whether the xsave instruction was
> usable or not, which at the time was strictly given by the presence of
> the xsave area. After edb48e76458b("x86/fpu: Combine fpu_ctxt and
> xsave_area in arch_vcpu"), that area is always present a more relevant
> assert is that the host supports XSAVE.
>
> Fixes: edb48e76458b("x86/fpu: Combine fpu_ctxt and xsave_area in arch_vcpu")
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> ---
> I'd also be ok with removing the assertions altogether. They serve very
> little purpose there after the merge of xsave and fpu_ctxt.

I'd be fine with dropping them.  If they're violated, the use of
XSAVE/XRSTOR immediately afterwards will be fatal too.

~Andrew


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 02/14] x86/xstate: Create map/unmap primitives for xsave areas
  2024-10-28 15:49 ` [PATCH 02/14] x86/xstate: Create map/unmap primitives for xsave areas Alejandro Vallejo
@ 2024-10-28 17:20   ` Andrew Cooper
  2024-10-29 11:57     ` Alejandro Vallejo
  2024-10-29  8:19   ` Jan Beulich
  1 sibling, 1 reply; 36+ messages in thread
From: Andrew Cooper @ 2024-10-28 17:20 UTC (permalink / raw)
  To: Alejandro Vallejo, xen-devel; +Cc: Jan Beulich, Roger Pau Monné

On 28/10/2024 3:49 pm, Alejandro Vallejo wrote:
> diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h
> index 07017cc4edfd..36260459667c 100644
> --- a/xen/arch/x86/include/asm/xstate.h
> +++ b/xen/arch/x86/include/asm/xstate.h
> @@ -143,4 +143,24 @@ static inline bool xstate_all(const struct vcpu *v)
>             (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE);
>  }
>  
> +/*
> + * Fetch a pointer to the XSAVE area of a vCPU
> + *
> + * If ASI is enabled for the domain, this mapping is pCPU-local.
> + *
> + * @param v Owner of the XSAVE area
> + */
> +#define vcpu_map_xsave_area(v) ((v)->arch.xsave_area)
> +
> +/*
> + * Drops the XSAVE area of a vCPU and nullifies its pointer on exit.
> + *
> + * If ASI is enabled and v is not the currently scheduled vCPU then the
> + * per-pCPU mapping is removed from the address space.
> + *
> + * @param v           vCPU logically owning xsave_area
> + * @param xsave_area  XSAVE blob of v
> + */
> +#define vcpu_unmap_xsave_area(v, x) ({ (x) = NULL; })
> +

Is there a preview of how these will end up looking with the real ASI
bits in place?

Having a macro-that-reads-like-a-function mutating x by name, rather
than by pointer, is somewhat rude.  This is why we capitalise
XFREE()/etc which have a similar pattern; to make it clear it's a macro
and potentially doing weird things with scopes.

~Andrew


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/14] x86/xstate: Update stale assertions in fpu_x{rstor,save}()
  2024-10-28 17:16   ` Andrew Cooper
@ 2024-10-29  8:13     ` Jan Beulich
  2024-10-29 10:56       ` Alejandro Vallejo
  0 siblings, 1 reply; 36+ messages in thread
From: Jan Beulich @ 2024-10-29  8:13 UTC (permalink / raw)
  To: Andrew Cooper, Alejandro Vallejo; +Cc: Roger Pau Monné, xen-devel

On 28.10.2024 18:16, Andrew Cooper wrote:
> On 28/10/2024 3:49 pm, Alejandro Vallejo wrote:
>> The asserts' intent was to establish whether the xsave instruction was
>> usable or not, which at the time was strictly given by the presence of
>> the xsave area. After edb48e76458b("x86/fpu: Combine fpu_ctxt and
>> xsave_area in arch_vcpu"), that area is always present a more relevant
>> assert is that the host supports XSAVE.
>>
>> Fixes: edb48e76458b("x86/fpu: Combine fpu_ctxt and xsave_area in arch_vcpu")
>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
>> ---
>> I'd also be ok with removing the assertions altogether. They serve very
>> little purpose there after the merge of xsave and fpu_ctxt.
> 
> I'd be fine with dropping them.

+1

Jan

>  If they're violated, the use of
> XSAVE/XRSTOR immediately afterwards will be fatal too.
> 
> ~Andrew



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 02/14] x86/xstate: Create map/unmap primitives for xsave areas
  2024-10-28 15:49 ` [PATCH 02/14] x86/xstate: Create map/unmap primitives for xsave areas Alejandro Vallejo
  2024-10-28 17:20   ` Andrew Cooper
@ 2024-10-29  8:19   ` Jan Beulich
  2024-10-29 10:55     ` Alejandro Vallejo
  1 sibling, 1 reply; 36+ messages in thread
From: Jan Beulich @ 2024-10-29  8:19 UTC (permalink / raw)
  To: Alejandro Vallejo; +Cc: Andrew Cooper, Roger Pau Monné, xen-devel

On 28.10.2024 16:49, Alejandro Vallejo wrote:
> --- a/xen/arch/x86/include/asm/xstate.h
> +++ b/xen/arch/x86/include/asm/xstate.h
> @@ -143,4 +143,24 @@ static inline bool xstate_all(const struct vcpu *v)
>             (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE);
>  }
>  
> +/*
> + * Fetch a pointer to the XSAVE area of a vCPU
> + *
> + * If ASI is enabled for the domain, this mapping is pCPU-local.

Taking the umap commentary into account, I think this needs to expand
some, to also symmetrically cover what the unmap comment says regarding
"v is [not] the currently scheduled vCPU". This may then also help
better see the further outlook, as Andrew was asking for.

> + * @param v Owner of the XSAVE area
> + */
> +#define vcpu_map_xsave_area(v) ((v)->arch.xsave_area)
> +
> +/*
> + * Drops the XSAVE area of a vCPU and nullifies its pointer on exit.

Nit: I expect it drops the mapping, not the area.

> + * If ASI is enabled and v is not the currently scheduled vCPU then the
> + * per-pCPU mapping is removed from the address space.
> + *
> + * @param v           vCPU logically owning xsave_area
> + * @param xsave_area  XSAVE blob of v
> + */
> +#define vcpu_unmap_xsave_area(v, x) ({ (x) = NULL; })
> +
>  #endif /* __ASM_XSTATE_H */



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 05/14] x86/xstate: Map/unmap xsave area in xstate_set_init() and handle_setbv()
  2024-10-28 15:49 ` [PATCH 05/14] x86/xstate: Map/unmap xsave area in xstate_set_init() and handle_setbv() Alejandro Vallejo
@ 2024-10-29  8:26   ` Jan Beulich
  2024-10-29 13:00     ` Alejandro Vallejo
  0 siblings, 1 reply; 36+ messages in thread
From: Jan Beulich @ 2024-10-29  8:26 UTC (permalink / raw)
  To: Alejandro Vallejo; +Cc: Andrew Cooper, Roger Pau Monné, xen-devel

On 28.10.2024 16:49, Alejandro Vallejo wrote:
> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -993,7 +993,12 @@ int handle_xsetbv(u32 index, u64 new_bv)
>  
>          clts();
>          if ( curr->fpu_dirtied )
> -            asm ( "stmxcsr %0" : "=m" (curr->arch.xsave_area->fpu_sse.mxcsr) );
> +        {
> +            struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr);
> +
> +            asm ( "stmxcsr %0" : "=m" (xsave_area->fpu_sse.mxcsr) );
> +            vcpu_unmap_xsave_area(curr, xsave_area);
> +        }

Since it's curr that we're dealing with, is this largely a cosmetic change? I.e.
there's no going to be any actual map/unmap operation in that case? Otherwise
I'd be inclined to say that an actual map/unmap is pretty high overhead for a
mere store of a 32-bit value.

Jan


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 10/14] x86/mpx: Map/unmap xsave area in in read_bndcfgu()
  2024-10-28 15:49 ` [PATCH 10/14] x86/mpx: Map/unmap xsave area in in read_bndcfgu() Alejandro Vallejo
@ 2024-10-29  8:27   ` Jan Beulich
  0 siblings, 0 replies; 36+ messages in thread
From: Jan Beulich @ 2024-10-29  8:27 UTC (permalink / raw)
  To: Alejandro Vallejo; +Cc: Andrew Cooper, Roger Pau Monné, xen-devel

On 28.10.2024 16:49, Alejandro Vallejo wrote:
> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -1021,9 +1021,10 @@ int handle_xsetbv(u32 index, u64 new_bv)
>  
>  uint64_t read_bndcfgu(void)
>  {
> +    uint64_t ret = 0;

Seeing the purpose of the variable, imo it would better be named bndcfgu.

Jan

>      unsigned long cr0 = read_cr0();
> -    struct xsave_struct *xstate
> -        = idle_vcpu[smp_processor_id()]->arch.xsave_area;
> +    struct vcpu *v = idle_vcpu[smp_processor_id()];
> +    struct xsave_struct *xstate = vcpu_map_xsave_area(v);
>      const struct xstate_bndcsr *bndcsr;
>  
>      ASSERT(cpu_has_mpx);
> @@ -1049,7 +1050,12 @@ uint64_t read_bndcfgu(void)
>      if ( cr0 & X86_CR0_TS )
>          write_cr0(cr0);
>  
> -    return xstate->xsave_hdr.xstate_bv & X86_XCR0_BNDCSR ? bndcsr->bndcfgu : 0;
> +    if ( xstate->xsave_hdr.xstate_bv & X86_XCR0_BNDCSR )
> +        ret = bndcsr->bndcfgu;
> +
> +    vcpu_unmap_xsave_area(v, xstate);
> +
> +    return ret;
>  }
>  
>  void xstate_set_init(uint64_t mask)



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 06/14] x86/hvm: Map/unmap xsave area in hvmemul_{get,put}_fpu()
  2024-10-28 15:49 ` [PATCH 06/14] x86/hvm: Map/unmap xsave area in hvmemul_{get,put}_fpu() Alejandro Vallejo
@ 2024-10-29  8:30   ` Jan Beulich
  0 siblings, 0 replies; 36+ messages in thread
From: Jan Beulich @ 2024-10-29  8:30 UTC (permalink / raw)
  To: Alejandro Vallejo; +Cc: Andrew Cooper, Roger Pau Monné, xen-devel

On 28.10.2024 16:49, Alejandro Vallejo wrote:
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -2371,7 +2371,8 @@ static int cf_check hvmemul_get_fpu(
>          alternative_vcall(hvm_funcs.fpu_dirty_intercept);
>      else if ( type == X86EMUL_FPU_fpu )
>      {
> -        const fpusse_t *fpu_ctxt = &curr->arch.xsave_area->fpu_sse;
> +        const struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr);
> +        const fpusse_t *fpu_ctxt = &xsave_area->fpu_sse;
>  
>          /*
>           * Latch current register state so that we can back out changes
> @@ -2397,6 +2398,8 @@ static int cf_check hvmemul_get_fpu(
>              else
>                  ASSERT(fcw == fpu_ctxt->fcw);
>          }
> +
> +        vcpu_unmap_xsave_area(curr, xsave_area);
>      }

Same question as for the other patch: Mainly a cosmetic change, with no
actual map/unmap?

> @@ -2411,7 +2414,8 @@ static void cf_check hvmemul_put_fpu(
>  
>      if ( aux )
>      {
> -        fpusse_t *fpu_ctxt = &curr->arch.xsave_area->fpu_sse;
> +        struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr);
> +        fpusse_t *fpu_ctxt = &xsave_area->fpu_sse;
>          bool dval = aux->dval;
>          int mode = hvm_guest_x86_mode(curr);
>  
> @@ -2465,6 +2469,8 @@ static void cf_check hvmemul_put_fpu(
>  
>          fpu_ctxt->fop = aux->op;
>  
> +        vcpu_unmap_xsave_area(curr, xsave_area);
> +
>          /* Re-use backout code below. */
>          backout = X86EMUL_FPU_fpu;
>      }

Same here. Because of the overhead concern, such places may be worthwhile to
gain brief comments.

Jan


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 11/14] x86/mpx: Adjust read_bndcfgu() to clean after itself
  2024-10-28 15:49 ` [PATCH 11/14] x86/mpx: Adjust read_bndcfgu() to clean after itself Alejandro Vallejo
@ 2024-10-29  8:32   ` Jan Beulich
  0 siblings, 0 replies; 36+ messages in thread
From: Jan Beulich @ 2024-10-29  8:32 UTC (permalink / raw)
  To: Alejandro Vallejo; +Cc: Andrew Cooper, Roger Pau Monné, xen-devel

On 28.10.2024 16:49, Alejandro Vallejo wrote:
> Overwrite the MPX data dumped in the idle XSAVE area to avoid leaking
> it. While it's not very sensitive, better to err on the side of caution.
> 
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> ---
> Depending on whether the idle domain is considered ASI or non-ASI this
> might or might not be enough. If the idle domain is not ASI the XSAVE
> area would be in the directmap, which would render the zap ineffective
> because it would still be transiently readable from another pCPU.

Yet that needs to be known / decided before this change can sensibly be
acked.

Jan


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 12/14] x86/fpu: Pass explicit xsave areas to fpu_(f)xsave()
  2024-10-28 15:49 ` [PATCH 12/14] x86/fpu: Pass explicit xsave areas to fpu_(f)xsave() Alejandro Vallejo
@ 2024-10-29  8:37   ` Jan Beulich
  2024-10-29 10:59     ` Alejandro Vallejo
  0 siblings, 1 reply; 36+ messages in thread
From: Jan Beulich @ 2024-10-29  8:37 UTC (permalink / raw)
  To: Alejandro Vallejo; +Cc: Andrew Cooper, Roger Pau Monné, xen-devel

On 28.10.2024 16:49, Alejandro Vallejo wrote:
> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -300,9 +300,8 @@ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size)
>      vcpu_unmap_xsave_area(v, xstate);
>  }
>  
> -void xsave(struct vcpu *v, uint64_t mask)
> +void xsave(struct vcpu *v, struct xsave_struct *ptr, uint64_t mask)
>  {
> -    struct xsave_struct *ptr = v->arch.xsave_area;
>      uint32_t hmask = mask >> 32;
>      uint32_t lmask = mask;
>      unsigned int fip_width = v->domain->arch.x87_fip_width;

Imo this change wants to constify v at the same time, to demonstrate that
nothing is changed through v anymore. The comment may extend to other functions
as well that are being altered here; I only closely looks at this one.

Jan


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 13/14] x86/fpu: Pass explicit xsave areas to fpu_(f)xrstor()
  2024-10-28 15:49 ` [PATCH 13/14] x86/fpu: Pass explicit xsave areas to fpu_(f)xrstor() Alejandro Vallejo
@ 2024-10-29  8:40   ` Jan Beulich
  0 siblings, 0 replies; 36+ messages in thread
From: Jan Beulich @ 2024-10-29  8:40 UTC (permalink / raw)
  To: Alejandro Vallejo; +Cc: Andrew Cooper, Roger Pau Monné, xen-devel

On 28.10.2024 16:49, Alejandro Vallejo wrote:
> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -374,11 +374,10 @@ void xsave(struct vcpu *v, struct xsave_struct *ptr, uint64_t mask)
>          ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] = fip_width;
>  }
>  
> -void xrstor(struct vcpu *v, uint64_t mask)
> +void xrstor(struct vcpu *v, struct xsave_struct *ptr, uint64_t mask)
>  {
>      uint32_t hmask = mask >> 32;
>      uint32_t lmask = mask;
> -    struct xsave_struct *ptr = v->arch.xsave_area;
>      unsigned int faults, prev_faults;

Same remark here as on the previous patch as to constification of v.

Jan


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 02/14] x86/xstate: Create map/unmap primitives for xsave areas
  2024-10-29  8:19   ` Jan Beulich
@ 2024-10-29 10:55     ` Alejandro Vallejo
  0 siblings, 0 replies; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-29 10:55 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Roger Pau Monné, xen-devel

On Tue Oct 29, 2024 at 8:19 AM GMT, Jan Beulich wrote:
> On 28.10.2024 16:49, Alejandro Vallejo wrote:
> > --- a/xen/arch/x86/include/asm/xstate.h
> > +++ b/xen/arch/x86/include/asm/xstate.h
> > @@ -143,4 +143,24 @@ static inline bool xstate_all(const struct vcpu *v)
> >             (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE);
> >  }
> >  
> > +/*
> > + * Fetch a pointer to the XSAVE area of a vCPU
> > + *
> > + * If ASI is enabled for the domain, this mapping is pCPU-local.
>
> Taking the umap commentary into account, I think this needs to expand
> some, to also symmetrically cover what the unmap comment says regarding
> "v is [not] the currently scheduled vCPU".

Yes, that's fair.

> This may then also help
> better see the further outlook, as Andrew was asking for.

Sure, I'll answer his comment in a jiffy with a rough approximation of what I
expect them to contain.

>
> > + * @param v Owner of the XSAVE area
> > + */
> > +#define vcpu_map_xsave_area(v) ((v)->arch.xsave_area)
> > +
> > +/*
> > + * Drops the XSAVE area of a vCPU and nullifies its pointer on exit.
>
> Nit: I expect it drops the mapping, not the area.

Yes, although even the mapping might not be dropped if we can credibly avoid
it. Regardless, yes this needs rewriting.

The particulars are murky and should become easier to see with the pseudo-code
I'm about to answer Andrew with

>
> > + * If ASI is enabled and v is not the currently scheduled vCPU then the
> > + * per-pCPU mapping is removed from the address space.
> > + *
> > + * @param v           vCPU logically owning xsave_area
> > + * @param xsave_area  XSAVE blob of v
> > + */
> > +#define vcpu_unmap_xsave_area(v, x) ({ (x) = NULL; })
> > +
> >  #endif /* __ASM_XSTATE_H */

Cheers,
Alejandro


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/14] x86/xstate: Update stale assertions in fpu_x{rstor,save}()
  2024-10-29  8:13     ` Jan Beulich
@ 2024-10-29 10:56       ` Alejandro Vallejo
  0 siblings, 0 replies; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-29 10:56 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper; +Cc: Roger Pau Monné, xen-devel

On Tue Oct 29, 2024 at 8:13 AM GMT, Jan Beulich wrote:
> On 28.10.2024 18:16, Andrew Cooper wrote:
> > On 28/10/2024 3:49 pm, Alejandro Vallejo wrote:
> >> The asserts' intent was to establish whether the xsave instruction was
> >> usable or not, which at the time was strictly given by the presence of
> >> the xsave area. After edb48e76458b("x86/fpu: Combine fpu_ctxt and
> >> xsave_area in arch_vcpu"), that area is always present a more relevant
> >> assert is that the host supports XSAVE.
> >>
> >> Fixes: edb48e76458b("x86/fpu: Combine fpu_ctxt and xsave_area in arch_vcpu")
> >> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> >> ---
> >> I'd also be ok with removing the assertions altogether. They serve very
> >> little purpose there after the merge of xsave and fpu_ctxt.
> > 
> > I'd be fine with dropping them.
>
> +1
>
> Jan
>
> >  If they're violated, the use of
> > XSAVE/XRSTOR immediately afterwards will be fatal too.
> > 
> > ~Andrew

Ok then, I'll re-send this one as a removal.

Cheers,
Alejandro


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 12/14] x86/fpu: Pass explicit xsave areas to fpu_(f)xsave()
  2024-10-29  8:37   ` Jan Beulich
@ 2024-10-29 10:59     ` Alejandro Vallejo
  0 siblings, 0 replies; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-29 10:59 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Roger Pau Monné, xen-devel

On Tue Oct 29, 2024 at 8:37 AM GMT, Jan Beulich wrote:
> On 28.10.2024 16:49, Alejandro Vallejo wrote:
> > --- a/xen/arch/x86/xstate.c
> > +++ b/xen/arch/x86/xstate.c
> > @@ -300,9 +300,8 @@ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size)
> >      vcpu_unmap_xsave_area(v, xstate);
> >  }
> >  
> > -void xsave(struct vcpu *v, uint64_t mask)
> > +void xsave(struct vcpu *v, struct xsave_struct *ptr, uint64_t mask)
> >  {
> > -    struct xsave_struct *ptr = v->arch.xsave_area;
> >      uint32_t hmask = mask >> 32;
> >      uint32_t lmask = mask;
> >      unsigned int fip_width = v->domain->arch.x87_fip_width;
>
> Imo this change wants to constify v at the same time, to demonstrate that
> nothing is changed through v anymore. The comment may extend to other functions
> as well that are being altered here; I only closely looks at this one.
>
> Jan

I didn't think of that angle... I'll have a look and take it into account for
v2.

Cheers,
Alejandro


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 02/14] x86/xstate: Create map/unmap primitives for xsave areas
  2024-10-28 17:20   ` Andrew Cooper
@ 2024-10-29 11:57     ` Alejandro Vallejo
  2024-10-29 13:24       ` Frediano Ziglio
  2024-10-29 13:28       ` Jan Beulich
  0 siblings, 2 replies; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-29 11:57 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Jan Beulich, Roger Pau Monné

Hi,

On Mon Oct 28, 2024 at 5:20 PM GMT, Andrew Cooper wrote:
> On 28/10/2024 3:49 pm, Alejandro Vallejo wrote:
> > diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h
> > index 07017cc4edfd..36260459667c 100644
> > --- a/xen/arch/x86/include/asm/xstate.h
> > +++ b/xen/arch/x86/include/asm/xstate.h
> > @@ -143,4 +143,24 @@ static inline bool xstate_all(const struct vcpu *v)
> >             (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE);
> >  }
> >  
> > +/*
> > + * Fetch a pointer to the XSAVE area of a vCPU
> > + *
> > + * If ASI is enabled for the domain, this mapping is pCPU-local.
> > + *
> > + * @param v Owner of the XSAVE area
> > + */
> > +#define vcpu_map_xsave_area(v) ((v)->arch.xsave_area)
> > +
> > +/*
> > + * Drops the XSAVE area of a vCPU and nullifies its pointer on exit.
> > + *
> > + * If ASI is enabled and v is not the currently scheduled vCPU then the
> > + * per-pCPU mapping is removed from the address space.
> > + *
> > + * @param v           vCPU logically owning xsave_area
> > + * @param xsave_area  XSAVE blob of v
> > + */
> > +#define vcpu_unmap_xsave_area(v, x) ({ (x) = NULL; })
> > +
>
> Is there a preview of how these will end up looking with the real ASI
> bits in place?

I expect the contents to be something along these lines (in function form for
clarity):

  struct xsave_struct *vcpu_map_xsave_area(struct vcpu *v)
  {
      if ( !v->domain->asi )
          return v->arch.xsave_area;

      if ( likely(v == current) )
          return percpu_fixmap(v, PCPU_FIX_XSAVE_AREA);

      /* Likely some new vmap-like abstraction after AMX */
      return map_domain_page(v->arch.xsave_area_pg);
  }

Where:
  1. v->arch.xsave_area is a pointer to the XSAVE area on non-ASI domains.
  2. v->arch.xsave_area_pg an mfn (or a pointer to a page_info, converted)
  3. percpu_fixmap(v, PCPU_FIX_XSAVE_AREA) is a slot in a per-vCPU fixmap, that
     changes as we context switch from vCPU to vCPU.

  /*
   * NOTE: Being a function this doesn't nullify the xsave_area pointer, but
   * it would in a macro. It's unimportant for the overall logic though.
   */
  void vcpu_unmap_xsave_area(struct vcpu *v, struct xsave_struct *xsave_area)
  {
      /* Catch mismatched areas when ASI is disabled */
      ASSERT(v->domain->asi || xsave_area == v->arch.xsave_area);

      /* Likely some new vunmap-like abstraction after AMX */
      if ( v->domain->asi && v != current )
          unmap_domain_page(xsave_area);
  }

Of course, many of these details hang in the balance of what happens to the ASI
series from Roger. In any case, the takeaway is that map/unmap must have
fastpaths for "current" that don't involve mapping. The assumption is that
non-current vCPUs are cold paths. In particular, context switches will undergo
some refactoring in order to make save/restore not require additional
map/unmaps besides the page table switch and yet another change to further
align "current" with the currently running page tables. Paths like the
instruction emulator go through these wrappers later on for ease of
auditability, but are early-returns that cause no major overhead.

My expectation is that these macros are general enough to be tweakable in
whatever way is most suitable, thus allowing the refactor of the codebase at
large to make it ASI-friendly before the details of the ASI infra are merged,
or even finalised.

>
> Having a macro-that-reads-like-a-function mutating x by name, rather
> than by pointer, is somewhat rude.  This is why we capitalise
> XFREE()/etc which have a similar pattern; to make it clear it's a macro
> and potentially doing weird things with scopes.
>
> ~Andrew

That magic trick on unmap warrants uppercase, agreed. Initially it was all
function calls and after macrofying them I was lazy to change their users.

Cheers,
Alejandro


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 05/14] x86/xstate: Map/unmap xsave area in xstate_set_init() and handle_setbv()
  2024-10-29  8:26   ` Jan Beulich
@ 2024-10-29 13:00     ` Alejandro Vallejo
  2024-10-29 13:31       ` Jan Beulich
  0 siblings, 1 reply; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-29 13:00 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Roger Pau Monné, xen-devel

On Tue Oct 29, 2024 at 8:26 AM GMT, Jan Beulich wrote:
> On 28.10.2024 16:49, Alejandro Vallejo wrote:
> > --- a/xen/arch/x86/xstate.c
> > +++ b/xen/arch/x86/xstate.c
> > @@ -993,7 +993,12 @@ int handle_xsetbv(u32 index, u64 new_bv)
> >  
> >          clts();
> >          if ( curr->fpu_dirtied )
> > -            asm ( "stmxcsr %0" : "=m" (curr->arch.xsave_area->fpu_sse.mxcsr) );
> > +        {
> > +            struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr);
> > +
> > +            asm ( "stmxcsr %0" : "=m" (xsave_area->fpu_sse.mxcsr) );
> > +            vcpu_unmap_xsave_area(curr, xsave_area);
> > +        }
>
> Since it's curr that we're dealing with, is this largely a cosmetic change? I.e.
> there's no going to be any actual map/unmap operation in that case? Otherwise
> I'd be inclined to say that an actual map/unmap is pretty high overhead for a
> mere store of a 32-bit value.
>
> Jan

Somewhat.

See the follow-up reply to patch2 with something resembling what I expect the
wrappers to have. In short, yes, I expect "current" to not require
mapping/unmapping; but I still would rather see those sites using the same
wrappers for auditability. After we settle on a particular interface, we can
let the implementation details creep out if that happens to be clearer, but
it's IMO easier to work this way for the time being until those details
crystalise.

Cheers,
Alejandro


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 02/14] x86/xstate: Create map/unmap primitives for xsave areas
  2024-10-29 11:57     ` Alejandro Vallejo
@ 2024-10-29 13:24       ` Frediano Ziglio
  2024-10-29 14:23         ` Alejandro Vallejo
  2024-10-29 13:28       ` Jan Beulich
  1 sibling, 1 reply; 36+ messages in thread
From: Frediano Ziglio @ 2024-10-29 13:24 UTC (permalink / raw)
  To: Alejandro Vallejo
  Cc: Andrew Cooper, xen-devel, Jan Beulich, Roger Pau Monné

On Tue, Oct 29, 2024 at 11:58 AM Alejandro Vallejo
<alejandro.vallejo@cloud.com> wrote:
>
> Hi,
>
> On Mon Oct 28, 2024 at 5:20 PM GMT, Andrew Cooper wrote:
> > On 28/10/2024 3:49 pm, Alejandro Vallejo wrote:
> > > diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h
> > > index 07017cc4edfd..36260459667c 100644
> > > --- a/xen/arch/x86/include/asm/xstate.h
> > > +++ b/xen/arch/x86/include/asm/xstate.h
> > > @@ -143,4 +143,24 @@ static inline bool xstate_all(const struct vcpu *v)
> > >             (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE);
> > >  }
> > >
> > > +/*
> > > + * Fetch a pointer to the XSAVE area of a vCPU
> > > + *
> > > + * If ASI is enabled for the domain, this mapping is pCPU-local.
> > > + *
> > > + * @param v Owner of the XSAVE area
> > > + */
> > > +#define vcpu_map_xsave_area(v) ((v)->arch.xsave_area)
> > > +
> > > +/*
> > > + * Drops the XSAVE area of a vCPU and nullifies its pointer on exit.
> > > + *
> > > + * If ASI is enabled and v is not the currently scheduled vCPU then the
> > > + * per-pCPU mapping is removed from the address space.
> > > + *
> > > + * @param v           vCPU logically owning xsave_area
> > > + * @param xsave_area  XSAVE blob of v
> > > + */
> > > +#define vcpu_unmap_xsave_area(v, x) ({ (x) = NULL; })
> > > +
> >
> > Is there a preview of how these will end up looking with the real ASI
> > bits in place?
>
> I expect the contents to be something along these lines (in function form for
> clarity):
>
>   struct xsave_struct *vcpu_map_xsave_area(struct vcpu *v)
>   {
>       if ( !v->domain->asi )
>           return v->arch.xsave_area;
>
>       if ( likely(v == current) )
>           return percpu_fixmap(v, PCPU_FIX_XSAVE_AREA);
>
>       /* Likely some new vmap-like abstraction after AMX */
>       return map_domain_page(v->arch.xsave_area_pg);
>   }
>
> Where:
>   1. v->arch.xsave_area is a pointer to the XSAVE area on non-ASI domains.
>   2. v->arch.xsave_area_pg an mfn (or a pointer to a page_info, converted)
>   3. percpu_fixmap(v, PCPU_FIX_XSAVE_AREA) is a slot in a per-vCPU fixmap, that
>      changes as we context switch from vCPU to vCPU.
>
>   /*
>    * NOTE: Being a function this doesn't nullify the xsave_area pointer, but
>    * it would in a macro. It's unimportant for the overall logic though.
>    */
>   void vcpu_unmap_xsave_area(struct vcpu *v, struct xsave_struct *xsave_area)
>   {
>       /* Catch mismatched areas when ASI is disabled */
>       ASSERT(v->domain->asi || xsave_area == v->arch.xsave_area);
>
>       /* Likely some new vunmap-like abstraction after AMX */
>       if ( v->domain->asi && v != current )
>           unmap_domain_page(xsave_area);
>   }
>
> Of course, many of these details hang in the balance of what happens to the ASI
> series from Roger. In any case, the takeaway is that map/unmap must have
> fastpaths for "current" that don't involve mapping. The assumption is that
> non-current vCPUs are cold paths. In particular, context switches will undergo
> some refactoring in order to make save/restore not require additional
> map/unmaps besides the page table switch and yet another change to further
> align "current" with the currently running page tables. Paths like the
> instruction emulator go through these wrappers later on for ease of
> auditability, but are early-returns that cause no major overhead.
>
> My expectation is that these macros are general enough to be tweakable in
> whatever way is most suitable, thus allowing the refactor of the codebase at
> large to make it ASI-friendly before the details of the ASI infra are merged,
> or even finalised.
>
> >
> > Having a macro-that-reads-like-a-function mutating x by name, rather
> > than by pointer, is somewhat rude.  This is why we capitalise
> > XFREE()/etc which have a similar pattern; to make it clear it's a macro
> > and potentially doing weird things with scopes.
> >
> > ~Andrew
>
> That magic trick on unmap warrants uppercase, agreed. Initially it was all
> function calls and after macrofying them I was lazy to change their users.
>
> Cheers,
> Alejandro
>

Why not using static inline functions?

On the documentation, I found weird that "v" is described quite
differently for the 2 macros:
1) @param v Owner of the XSAVE area;
2) @param v           vCPU logically owning xsave_area

For "x" the documentation is "@param xsave_area  XSAVE blob of v", but
there's no "xsave_area" parameter.

(very minors, you can ignore)

Frediano


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 02/14] x86/xstate: Create map/unmap primitives for xsave areas
  2024-10-29 11:57     ` Alejandro Vallejo
  2024-10-29 13:24       ` Frediano Ziglio
@ 2024-10-29 13:28       ` Jan Beulich
  2024-10-29 14:12         ` Alejandro Vallejo
  1 sibling, 1 reply; 36+ messages in thread
From: Jan Beulich @ 2024-10-29 13:28 UTC (permalink / raw)
  To: Alejandro Vallejo; +Cc: Roger Pau Monné, Andrew Cooper, xen-devel

On 29.10.2024 12:57, Alejandro Vallejo wrote:
> On Mon Oct 28, 2024 at 5:20 PM GMT, Andrew Cooper wrote:
>> On 28/10/2024 3:49 pm, Alejandro Vallejo wrote:
>>> diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h
>>> index 07017cc4edfd..36260459667c 100644
>>> --- a/xen/arch/x86/include/asm/xstate.h
>>> +++ b/xen/arch/x86/include/asm/xstate.h
>>> @@ -143,4 +143,24 @@ static inline bool xstate_all(const struct vcpu *v)
>>>             (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE);
>>>  }
>>>  
>>> +/*
>>> + * Fetch a pointer to the XSAVE area of a vCPU
>>> + *
>>> + * If ASI is enabled for the domain, this mapping is pCPU-local.
>>> + *
>>> + * @param v Owner of the XSAVE area
>>> + */
>>> +#define vcpu_map_xsave_area(v) ((v)->arch.xsave_area)
>>> +
>>> +/*
>>> + * Drops the XSAVE area of a vCPU and nullifies its pointer on exit.
>>> + *
>>> + * If ASI is enabled and v is not the currently scheduled vCPU then the
>>> + * per-pCPU mapping is removed from the address space.
>>> + *
>>> + * @param v           vCPU logically owning xsave_area
>>> + * @param xsave_area  XSAVE blob of v
>>> + */
>>> +#define vcpu_unmap_xsave_area(v, x) ({ (x) = NULL; })
>>> +
>>
>> Is there a preview of how these will end up looking with the real ASI
>> bits in place?
> 
> I expect the contents to be something along these lines (in function form for
> clarity):
> 
>   struct xsave_struct *vcpu_map_xsave_area(struct vcpu *v)
>   {
>       if ( !v->domain->asi )
>           return v->arch.xsave_area;
> 
>       if ( likely(v == current) )
>           return percpu_fixmap(v, PCPU_FIX_XSAVE_AREA);
> 
>       /* Likely some new vmap-like abstraction after AMX */
>       return map_domain_page(v->arch.xsave_area_pg);
>   }

I'd like to ask that map_domain_page() be avoided here from the beginning, to
take AMX into account right away. I've been sitting on the AMX series for
years, and I'd consider it pretty unfair if it was me to take care of such an
aspect, when instead the series should (imo) long have landed.

Jan


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 05/14] x86/xstate: Map/unmap xsave area in xstate_set_init() and handle_setbv()
  2024-10-29 13:00     ` Alejandro Vallejo
@ 2024-10-29 13:31       ` Jan Beulich
  2024-10-29 14:14         ` Alejandro Vallejo
  0 siblings, 1 reply; 36+ messages in thread
From: Jan Beulich @ 2024-10-29 13:31 UTC (permalink / raw)
  To: Alejandro Vallejo; +Cc: Andrew Cooper, Roger Pau Monné, xen-devel

On 29.10.2024 14:00, Alejandro Vallejo wrote:
> On Tue Oct 29, 2024 at 8:26 AM GMT, Jan Beulich wrote:
>> On 28.10.2024 16:49, Alejandro Vallejo wrote:
>>> --- a/xen/arch/x86/xstate.c
>>> +++ b/xen/arch/x86/xstate.c
>>> @@ -993,7 +993,12 @@ int handle_xsetbv(u32 index, u64 new_bv)
>>>  
>>>          clts();
>>>          if ( curr->fpu_dirtied )
>>> -            asm ( "stmxcsr %0" : "=m" (curr->arch.xsave_area->fpu_sse.mxcsr) );
>>> +        {
>>> +            struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr);
>>> +
>>> +            asm ( "stmxcsr %0" : "=m" (xsave_area->fpu_sse.mxcsr) );
>>> +            vcpu_unmap_xsave_area(curr, xsave_area);
>>> +        }
>>
>> Since it's curr that we're dealing with, is this largely a cosmetic change? I.e.
>> there's no going to be any actual map/unmap operation in that case? Otherwise
>> I'd be inclined to say that an actual map/unmap is pretty high overhead for a
>> mere store of a 32-bit value.
> 
> Somewhat.
> 
> See the follow-up reply to patch2 with something resembling what I expect the
> wrappers to have. In short, yes, I expect "current" to not require
> mapping/unmapping; but I still would rather see those sites using the same
> wrappers for auditability. After we settle on a particular interface, we can
> let the implementation details creep out if that happens to be clearer, but
> it's IMO easier to work this way for the time being until those details
> crystalise.

Sure. As expressed in a later reply on the same topic, what I'm after are brief
comments indicating that despite the function names involved, no actual mapping
operations will be carried out in these cases, thus addressing concerns towards
the overhead involved.

Jan


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 02/14] x86/xstate: Create map/unmap primitives for xsave areas
  2024-10-29 13:28       ` Jan Beulich
@ 2024-10-29 14:12         ` Alejandro Vallejo
  0 siblings, 0 replies; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-29 14:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roger Pau Monné, Andrew Cooper, xen-devel

On Tue Oct 29, 2024 at 1:28 PM GMT, Jan Beulich wrote:
> On 29.10.2024 12:57, Alejandro Vallejo wrote:
> > On Mon Oct 28, 2024 at 5:20 PM GMT, Andrew Cooper wrote:
> >> On 28/10/2024 3:49 pm, Alejandro Vallejo wrote:
> >>> diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h
> >>> index 07017cc4edfd..36260459667c 100644
> >>> --- a/xen/arch/x86/include/asm/xstate.h
> >>> +++ b/xen/arch/x86/include/asm/xstate.h
> >>> @@ -143,4 +143,24 @@ static inline bool xstate_all(const struct vcpu *v)
> >>>             (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE);
> >>>  }
> >>>  
> >>> +/*
> >>> + * Fetch a pointer to the XSAVE area of a vCPU
> >>> + *
> >>> + * If ASI is enabled for the domain, this mapping is pCPU-local.
> >>> + *
> >>> + * @param v Owner of the XSAVE area
> >>> + */
> >>> +#define vcpu_map_xsave_area(v) ((v)->arch.xsave_area)
> >>> +
> >>> +/*
> >>> + * Drops the XSAVE area of a vCPU and nullifies its pointer on exit.
> >>> + *
> >>> + * If ASI is enabled and v is not the currently scheduled vCPU then the
> >>> + * per-pCPU mapping is removed from the address space.
> >>> + *
> >>> + * @param v           vCPU logically owning xsave_area
> >>> + * @param xsave_area  XSAVE blob of v
> >>> + */
> >>> +#define vcpu_unmap_xsave_area(v, x) ({ (x) = NULL; })
> >>> +
> >>
> >> Is there a preview of how these will end up looking with the real ASI
> >> bits in place?
> > 
> > I expect the contents to be something along these lines (in function form for
> > clarity):
> > 
> >   struct xsave_struct *vcpu_map_xsave_area(struct vcpu *v)
> >   {
> >       if ( !v->domain->asi )
> >           return v->arch.xsave_area;
> > 
> >       if ( likely(v == current) )
> >           return percpu_fixmap(v, PCPU_FIX_XSAVE_AREA);
> > 
> >       /* Likely some new vmap-like abstraction after AMX */
> >       return map_domain_page(v->arch.xsave_area_pg);
> >   }
>
> I'd like to ask that map_domain_page() be avoided here from the beginning, to
> take AMX into account right away. I've been sitting on the AMX series for
> years, and I'd consider it pretty unfair if it was me to take care of such an
> aspect, when instead the series should (imo) long have landed.
>
> Jan

Of course. This is just pseudo-code for explanation purposes, but I didn't want
to introduce imaginary functions. In the final thing we'll want to map an array
of MFNs if the XSAVE area is large enough.

I am already accounting for the XSAVE area to possibly exceed a single page (3
after AMX, I think?). Part of this abstraction stems from that want, in fact,
as otherwise I could simply stash it all away under map_domain_page() and let
that take care of everything. We'll want map_domain_pages_contig() or something
along those lines that takes an array of mfns we've previously stored in
arch_vcpu. But that's a tomorrow problem for when we do have a secret area to
create those mappings on.

For today, I'd be happy with most code to stop assuming there will be a pointer
in the vcpu.

Cheers,



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 05/14] x86/xstate: Map/unmap xsave area in xstate_set_init() and handle_setbv()
  2024-10-29 13:31       ` Jan Beulich
@ 2024-10-29 14:14         ` Alejandro Vallejo
  0 siblings, 0 replies; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-29 14:14 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Roger Pau Monné, xen-devel

On Tue Oct 29, 2024 at 1:31 PM GMT, Jan Beulich wrote:
> On 29.10.2024 14:00, Alejandro Vallejo wrote:
> > On Tue Oct 29, 2024 at 8:26 AM GMT, Jan Beulich wrote:
> >> On 28.10.2024 16:49, Alejandro Vallejo wrote:
> >>> --- a/xen/arch/x86/xstate.c
> >>> +++ b/xen/arch/x86/xstate.c
> >>> @@ -993,7 +993,12 @@ int handle_xsetbv(u32 index, u64 new_bv)
> >>>  
> >>>          clts();
> >>>          if ( curr->fpu_dirtied )
> >>> -            asm ( "stmxcsr %0" : "=m" (curr->arch.xsave_area->fpu_sse.mxcsr) );
> >>> +        {
> >>> +            struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr);
> >>> +
> >>> +            asm ( "stmxcsr %0" : "=m" (xsave_area->fpu_sse.mxcsr) );
> >>> +            vcpu_unmap_xsave_area(curr, xsave_area);
> >>> +        }
> >>
> >> Since it's curr that we're dealing with, is this largely a cosmetic change? I.e.
> >> there's no going to be any actual map/unmap operation in that case? Otherwise
> >> I'd be inclined to say that an actual map/unmap is pretty high overhead for a
> >> mere store of a 32-bit value.
> > 
> > Somewhat.
> > 
> > See the follow-up reply to patch2 with something resembling what I expect the
> > wrappers to have. In short, yes, I expect "current" to not require
> > mapping/unmapping; but I still would rather see those sites using the same
> > wrappers for auditability. After we settle on a particular interface, we can
> > let the implementation details creep out if that happens to be clearer, but
> > it's IMO easier to work this way for the time being until those details
> > crystalise.
>
> Sure. As expressed in a later reply on the same topic, what I'm after are brief
> comments indicating that despite the function names involved, no actual mapping
> operations will be carried out in these cases, thus addressing concerns towards
> the overhead involved.
>
> Jan

Right, I can add those to the sites using exclusively "current". That's no
problem.

Cheers,
Alejandro


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 02/14] x86/xstate: Create map/unmap primitives for xsave areas
  2024-10-29 13:24       ` Frediano Ziglio
@ 2024-10-29 14:23         ` Alejandro Vallejo
  0 siblings, 0 replies; 36+ messages in thread
From: Alejandro Vallejo @ 2024-10-29 14:23 UTC (permalink / raw)
  To: Frediano Ziglio
  Cc: Andrew Cooper, xen-devel, Jan Beulich, Roger Pau Monné

On Tue Oct 29, 2024 at 1:24 PM GMT, Frediano Ziglio wrote:
> On Tue, Oct 29, 2024 at 11:58 AM Alejandro Vallejo
> <alejandro.vallejo@cloud.com> wrote:
> >
> > Hi,
> >
> > On Mon Oct 28, 2024 at 5:20 PM GMT, Andrew Cooper wrote:
> > > On 28/10/2024 3:49 pm, Alejandro Vallejo wrote:
> > > > diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h
> > > > index 07017cc4edfd..36260459667c 100644
> > > > --- a/xen/arch/x86/include/asm/xstate.h
> > > > +++ b/xen/arch/x86/include/asm/xstate.h
> > > > @@ -143,4 +143,24 @@ static inline bool xstate_all(const struct vcpu *v)
> > > >             (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE);
> > > >  }
> > > >
> > > > +/*
> > > > + * Fetch a pointer to the XSAVE area of a vCPU
> > > > + *
> > > > + * If ASI is enabled for the domain, this mapping is pCPU-local.
> > > > + *
> > > > + * @param v Owner of the XSAVE area
> > > > + */
> > > > +#define vcpu_map_xsave_area(v) ((v)->arch.xsave_area)
> > > > +
> > > > +/*
> > > > + * Drops the XSAVE area of a vCPU and nullifies its pointer on exit.
> > > > + *
> > > > + * If ASI is enabled and v is not the currently scheduled vCPU then the
> > > > + * per-pCPU mapping is removed from the address space.
> > > > + *
> > > > + * @param v           vCPU logically owning xsave_area
> > > > + * @param xsave_area  XSAVE blob of v
> > > > + */
> > > > +#define vcpu_unmap_xsave_area(v, x) ({ (x) = NULL; })
> > > > +
> > >
> > > Is there a preview of how these will end up looking with the real ASI
> > > bits in place?
> >
> > I expect the contents to be something along these lines (in function form for
> > clarity):
> >
> >   struct xsave_struct *vcpu_map_xsave_area(struct vcpu *v)
> >   {
> >       if ( !v->domain->asi )
> >           return v->arch.xsave_area;
> >
> >       if ( likely(v == current) )
> >           return percpu_fixmap(v, PCPU_FIX_XSAVE_AREA);
> >
> >       /* Likely some new vmap-like abstraction after AMX */
> >       return map_domain_page(v->arch.xsave_area_pg);
> >   }
> >
> > Where:
> >   1. v->arch.xsave_area is a pointer to the XSAVE area on non-ASI domains.
> >   2. v->arch.xsave_area_pg an mfn (or a pointer to a page_info, converted)
> >   3. percpu_fixmap(v, PCPU_FIX_XSAVE_AREA) is a slot in a per-vCPU fixmap, that
> >      changes as we context switch from vCPU to vCPU.
> >
> >   /*
> >    * NOTE: Being a function this doesn't nullify the xsave_area pointer, but
> >    * it would in a macro. It's unimportant for the overall logic though.
> >    */
> >   void vcpu_unmap_xsave_area(struct vcpu *v, struct xsave_struct *xsave_area)
> >   {
> >       /* Catch mismatched areas when ASI is disabled */
> >       ASSERT(v->domain->asi || xsave_area == v->arch.xsave_area);
> >
> >       /* Likely some new vunmap-like abstraction after AMX */
> >       if ( v->domain->asi && v != current )
> >           unmap_domain_page(xsave_area);
> >   }
> >
> > Of course, many of these details hang in the balance of what happens to the ASI
> > series from Roger. In any case, the takeaway is that map/unmap must have
> > fastpaths for "current" that don't involve mapping. The assumption is that
> > non-current vCPUs are cold paths. In particular, context switches will undergo
> > some refactoring in order to make save/restore not require additional
> > map/unmaps besides the page table switch and yet another change to further
> > align "current" with the currently running page tables. Paths like the
> > instruction emulator go through these wrappers later on for ease of
> > auditability, but are early-returns that cause no major overhead.
> >
> > My expectation is that these macros are general enough to be tweakable in
> > whatever way is most suitable, thus allowing the refactor of the codebase at
> > large to make it ASI-friendly before the details of the ASI infra are merged,
> > or even finalised.
> >
> > >
> > > Having a macro-that-reads-like-a-function mutating x by name, rather
> > > than by pointer, is somewhat rude.  This is why we capitalise
> > > XFREE()/etc which have a similar pattern; to make it clear it's a macro
> > > and potentially doing weird things with scopes.
> > >
> > > ~Andrew
> >
> > That magic trick on unmap warrants uppercase, agreed. Initially it was all
> > function calls and after macrofying them I was lazy to change their users.
> >
> > Cheers,
> > Alejandro
> >
>
> Why not using static inline functions?

I'd rather use regular function in fact for the final thing. These ones aren't
to avoid headaches with const parameters and to allow nullifying the pointer
on exit without requiring a double pointer (which is doubly fun with const
involved).

As they gain more logic it's not impossible for them to be split in most
everything to be done in a function and the rest by the macro to avoid tons of
duplicate codegen everywhere.

Anyhow, all of that can be decided later driven by bloat checkers, benchmarks
and the like. My preference is also towards type-safety where possible.

>
> On the documentation, I found weird that "v" is described quite
> differently for the 2 macros:
> 1) @param v Owner of the XSAVE area;
> 2) @param v           vCPU logically owning xsave_area

(2) should have the content of (1). Will do in v2.

>
> For "x" the documentation is "@param xsave_area  XSAVE blob of v", but
> there's no "xsave_area" parameter.

True. I created these macros from previous functions, and there the parameter
was called xsave_area. It should be "x" here.

>
> (very minors, you can ignore)
>
> Frediano



^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2024-10-29 14:24 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-28 15:49 [PATCH 00/14] x86: Address Space Isolation FPU preparations Alejandro Vallejo
2024-10-28 15:49 ` [PATCH 01/14] x86/xstate: Update stale assertions in fpu_x{rstor,save}() Alejandro Vallejo
2024-10-28 17:16   ` Andrew Cooper
2024-10-29  8:13     ` Jan Beulich
2024-10-29 10:56       ` Alejandro Vallejo
2024-10-28 15:49 ` [PATCH 02/14] x86/xstate: Create map/unmap primitives for xsave areas Alejandro Vallejo
2024-10-28 17:20   ` Andrew Cooper
2024-10-29 11:57     ` Alejandro Vallejo
2024-10-29 13:24       ` Frediano Ziglio
2024-10-29 14:23         ` Alejandro Vallejo
2024-10-29 13:28       ` Jan Beulich
2024-10-29 14:12         ` Alejandro Vallejo
2024-10-29  8:19   ` Jan Beulich
2024-10-29 10:55     ` Alejandro Vallejo
2024-10-28 15:49 ` [PATCH 03/14] x86/hvm: Map/unmap xsave area in hvm_save_cpu_ctxt() Alejandro Vallejo
2024-10-28 15:49 ` [PATCH 04/14] x86/fpu: Map/umap xsave area in vcpu_{reset,setup}_fpu() Alejandro Vallejo
2024-10-28 15:49 ` [PATCH 05/14] x86/xstate: Map/unmap xsave area in xstate_set_init() and handle_setbv() Alejandro Vallejo
2024-10-29  8:26   ` Jan Beulich
2024-10-29 13:00     ` Alejandro Vallejo
2024-10-29 13:31       ` Jan Beulich
2024-10-29 14:14         ` Alejandro Vallejo
2024-10-28 15:49 ` [PATCH 06/14] x86/hvm: Map/unmap xsave area in hvmemul_{get,put}_fpu() Alejandro Vallejo
2024-10-29  8:30   ` Jan Beulich
2024-10-28 15:49 ` [PATCH 07/14] x86/domctl: Map/unmap xsave area in arch_get_info_guest() Alejandro Vallejo
2024-10-28 15:49 ` [PATCH 08/14] x86/xstate: Map/unmap xsave area in {compress,expand}_xsave_states() Alejandro Vallejo
2024-10-28 15:49 ` [PATCH 09/14] x86/emulator: Refactor FXSAVE_AREA to use wrappers Alejandro Vallejo
2024-10-28 15:49 ` [PATCH 10/14] x86/mpx: Map/unmap xsave area in in read_bndcfgu() Alejandro Vallejo
2024-10-29  8:27   ` Jan Beulich
2024-10-28 15:49 ` [PATCH 11/14] x86/mpx: Adjust read_bndcfgu() to clean after itself Alejandro Vallejo
2024-10-29  8:32   ` Jan Beulich
2024-10-28 15:49 ` [PATCH 12/14] x86/fpu: Pass explicit xsave areas to fpu_(f)xsave() Alejandro Vallejo
2024-10-29  8:37   ` Jan Beulich
2024-10-29 10:59     ` Alejandro Vallejo
2024-10-28 15:49 ` [PATCH 13/14] x86/fpu: Pass explicit xsave areas to fpu_(f)xrstor() Alejandro Vallejo
2024-10-29  8:40   ` Jan Beulich
2024-10-28 15:49 ` [PATCH 14/14] x86/xstate: Make xstate_all() and vcpu_xsave_mask() take explicit xstate Alejandro Vallejo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.