All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 00/10] Make PAT handling less brittle
@ 2022-12-20  1:07 Demi Marie Obenour
  2022-12-20  1:07 ` [PATCH v5 01/10] x86: Add memory type constants Demi Marie Obenour
                   ` (10 more replies)
  0 siblings, 11 replies; 28+ messages in thread
From: Demi Marie Obenour @ 2022-12-20  1:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Demi Marie Obenour, Marek Marczykowski-Górecki, Jan Beulich,
	Andrew Cooper, Roger Pau Monné, Wei Liu, Jun Nakajima,
	Kevin Tian, George Dunlap, Tim Deegan

While working on Qubes OS Marek found out that there were some PAT hacks
in the Linux i195 driver.  I decided to make Xen use Linux’s PAT to see
if it solved the graphics glitches that were observed; it did.  This
required a substantial amount of preliminary work that is useful even
without using Linux’s PAT.

Patches 1 through 9 are the preliminary work and I would like them to be
accepted into upstream Xen.  Patch 9 does break ABI by rejecting the
unused PAT entries, but this will only impact buggy PV guests and can be
disabled with a Xen command-line option.  Patch 10 actually switches to
Linux’s PAT and is NOT intended to be merged (at least for now) as it
would at a minimum break migration of PV guests from hosts that do not
have the patch.

Only patches 9 and 10 actually change Xen’s observable behavior.
Patches 1, 2, and 7 are prerequisites, and patches 3 through 6 are
cleanups.  Patch 8 makes changing the PAT much less error-prone, as
problems with the PAT or with the associated _PAGE_* constants will
be detected at compile time.

Demi Marie Obenour (10):
  x86: Add memory type constants
  x86/mm: Avoid hard-coding PAT in get_page_from_l1e()
  x86: Replace PAT_* with X86_MT_*
  x86: Replace MTRR_* constants with X86_MT_* constants
  x86: Replace EPT_EMT_* constants with X86_MT_*
  x86: Remove MEMORY_NUM_TYPES and NO_HARDCODE_MEM_TYPE
  x86: Derive XEN_MSR_PAT from its individual entries
  x86/mm: make code robust to future PAT changes
  x86/mm: Reject invalid cacheability in PV guests by default
  x86: Use Linux's PAT

 xen/arch/x86/cpu/mtrr/generic.c         |  10 +-
 xen/arch/x86/cpu/mtrr/main.c            |  26 +++---
 xen/arch/x86/e820.c                     |   4 +-
 xen/arch/x86/hvm/hvm.c                  |  12 +--
 xen/arch/x86/hvm/mtrr.c                 | 100 ++++++++++----------
 xen/arch/x86/hvm/vmx/vmcs.c             |   2 +-
 xen/arch/x86/hvm/vmx/vmx.c              |  18 ++--
 xen/arch/x86/include/asm/hvm/vmx/vmcs.h |   2 +-
 xen/arch/x86/include/asm/hvm/vmx/vmx.h  |   9 --
 xen/arch/x86/include/asm/mtrr.h         |  22 +----
 xen/arch/x86/include/asm/page.h         |   4 +-
 xen/arch/x86/include/asm/processor.h    |  11 ++-
 xen/arch/x86/include/asm/x86-defns.h    |  11 +++
 xen/arch/x86/mm.c                       | 118 ++++++++++++++++++++++--
 xen/arch/x86/mm/hap/nested_ept.c        |   4 +-
 xen/arch/x86/mm/p2m-ept.c               |  51 +++++-----
 xen/arch/x86/mm/shadow/multi.c          |   8 +-
 17 files changed, 253 insertions(+), 159 deletions(-)

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab



^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v5 01/10] x86: Add memory type constants
  2022-12-20  1:07 [PATCH v5 00/10] Make PAT handling less brittle Demi Marie Obenour
@ 2022-12-20  1:07 ` Demi Marie Obenour
  2022-12-20  1:07 ` [PATCH v5 02/10] x86/mm: Avoid hard-coding PAT in get_page_from_l1e() Demi Marie Obenour
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 28+ messages in thread
From: Demi Marie Obenour @ 2022-12-20  1:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Demi Marie Obenour, Marek Marczykowski-Górecki, Jan Beulich,
	Andrew Cooper, Roger Pau Monné, Wei Liu, Jun Nakajima,
	Kevin Tian, George Dunlap, Tim Deegan

These are not currently used, so there is no functional change.  Future
patches will use these constants.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes since v4:
- Add Jan Beulich’s Acked-by

Changes since v3:
- Name the reserved values X86_MT_RSVD_2 and X86_MT_RSVD_3, to
  match the architectural values of 0x02 and 0x03.

Changes since v2:
- Avoid using _AC where not required.
- Define reserved memory types inline
---
 xen/arch/x86/include/asm/x86-defns.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/xen/arch/x86/include/asm/x86-defns.h b/xen/arch/x86/include/asm/x86-defns.h
index 28628807cb9897cf6fa8e266f71f5f220813984d..42b5f382d438d21ac97b6438e8c810c7b964cf6d 100644
--- a/xen/arch/x86/include/asm/x86-defns.h
+++ b/xen/arch/x86/include/asm/x86-defns.h
@@ -153,4 +153,15 @@
      (1u << X86_EXC_AC) | (1u << X86_EXC_CP) |                      \
      (1u << X86_EXC_VC) | (1u << X86_EXC_SX))
 
+/* Memory types */
+#define X86_MT_UC     0x00 /* uncachable */
+#define X86_MT_WC     0x01 /* write-combined */
+#define X86_MT_RSVD_2 0x02 /* reserved */
+#define X86_MT_RSVD_3 0x03 /* reserved */
+#define X86_MT_WT     0x04 /* write-through */
+#define X86_MT_WP     0x05 /* write-protect */
+#define X86_MT_WB     0x06 /* write-back */
+#define X86_MT_UCM    0x07 /* UC- */
+#define X86_NUM_MT    0x08
+
 #endif	/* __XEN_X86_DEFNS_H__ */
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v5 02/10] x86/mm: Avoid hard-coding PAT in get_page_from_l1e()
  2022-12-20  1:07 [PATCH v5 00/10] Make PAT handling less brittle Demi Marie Obenour
  2022-12-20  1:07 ` [PATCH v5 01/10] x86: Add memory type constants Demi Marie Obenour
@ 2022-12-20  1:07 ` Demi Marie Obenour
  2022-12-20  8:19   ` Jan Beulich
  2022-12-20  1:07 ` [PATCH v5 03/10] x86: Replace PAT_* with X86_MT_* Demi Marie Obenour
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Demi Marie Obenour @ 2022-12-20  1:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Demi Marie Obenour, Marek Marczykowski-Górecki, Jan Beulich,
	Andrew Cooper, Roger Pau Monné, Wei Liu, Jun Nakajima,
	Kevin Tian, George Dunlap, Tim Deegan

get_page_from_l1e() relied on Xen's choice of PAT, which is brittle in
the face of future PAT changes.  Instead, compute the actual cacheability
used by the CPU and switch on that, as this will work no matter what PAT
Xen uses.

No functional change intended.  This code is itself questionable and may
be removed in the future, but removing it would be an observable
behavior change and so is out of scope for this patch series.

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
---
Changes since v4:
- Do not add new pte_flags_to_cacheability() helper, as this code may be
  removed in the near future and so adding a new helper for it is a bad
  idea.
- Do not BUG() in the event of an unexpected cacheability.  This cannot
  happen, but it is simpler to force such types to UC than to prove that
  the BUG() is not reachable.

Changes since v3:
- Compute and use the actual cacheability as seen by the processor.

Changes since v2:
- Improve commit message.
---
 xen/arch/x86/mm.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 78b1972e4170cacccc9c37c6e64e76e66a7da87f..dba6c77ef2f7ed7fcb7f7e526583ccadd35e62cc 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -959,14 +959,16 @@ get_page_from_l1e(
             flip = _PAGE_RW;
         }
 
-        switch ( l1f & PAGE_CACHE_ATTRS )
+        switch ( 0xFF & (XEN_MSR_PAT >> (8 * pte_flags_to_cacheattr(l1f))) )
         {
-        case 0: /* WB */
-            flip |= _PAGE_PWT | _PAGE_PCD;
+        case X86_MT_UC:
+        case X86_MT_UCM:
+        case X86_MT_WC:
+            /* not cacheable */
             break;
-        case _PAGE_PWT: /* WT */
-        case _PAGE_PWT | _PAGE_PAT: /* WP */
-            flip |= _PAGE_PCD | (l1f & _PAGE_PAT);
+        default:
+            /* cacheable */
+            flip |= ((l1f & PAGE_CACHE_ATTRS) ^ _PAGE_UC);
             break;
         }
 
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v5 03/10] x86: Replace PAT_* with X86_MT_*
  2022-12-20  1:07 [PATCH v5 00/10] Make PAT handling less brittle Demi Marie Obenour
  2022-12-20  1:07 ` [PATCH v5 01/10] x86: Add memory type constants Demi Marie Obenour
  2022-12-20  1:07 ` [PATCH v5 02/10] x86/mm: Avoid hard-coding PAT in get_page_from_l1e() Demi Marie Obenour
@ 2022-12-20  1:07 ` Demi Marie Obenour
  2022-12-20  1:07 ` [PATCH v5 04/10] x86: Replace MTRR_* constants with X86_MT_* constants Demi Marie Obenour
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 28+ messages in thread
From: Demi Marie Obenour @ 2022-12-20  1:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Demi Marie Obenour, Marek Marczykowski-Górecki, Jan Beulich,
	Andrew Cooper, Roger Pau Monné, Wei Liu, Jun Nakajima,
	Kevin Tian, George Dunlap, Tim Deegan

This allows eliminating the former.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes since v2: Style adjustments
---
 xen/arch/x86/hvm/hvm.c          | 12 ++++----
 xen/arch/x86/hvm/mtrr.c         | 52 ++++++++++++++++-----------------
 xen/arch/x86/hvm/vmx/vmx.c      | 16 +++++-----
 xen/arch/x86/include/asm/mtrr.h | 12 +-------
 xen/arch/x86/mm/p2m-ept.c       |  4 +--
 xen/arch/x86/mm/shadow/multi.c  |  4 +--
 6 files changed, 45 insertions(+), 55 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index ae4368ec4b338cf8c6cb14d383f612c91c98e800..00b3fa56e25e2934e2870e11fd19b120daff2715 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -307,12 +307,12 @@ int hvm_set_guest_pat(struct vcpu *v, uint64_t guest_pat)
     for ( i = 0, tmp = guest_pat; i < 8; i++, tmp >>= 8 )
         switch ( tmp & 0xff )
         {
-        case PAT_TYPE_UC_MINUS:
-        case PAT_TYPE_UNCACHABLE:
-        case PAT_TYPE_WRBACK:
-        case PAT_TYPE_WRCOMB:
-        case PAT_TYPE_WRPROT:
-        case PAT_TYPE_WRTHROUGH:
+        case X86_MT_UCM:
+        case X86_MT_UC:
+        case X86_MT_WB:
+        case X86_MT_WC:
+        case X86_MT_WP:
+        case X86_MT_WT:
             break;
         default:
             return 0;
diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c
index 4d2aa6def86de45aeeaade7a1a7815c5ef2b3d7a..242623f3c239ee18a44f882ecb3910a00c615825 100644
--- a/xen/arch/x86/hvm/mtrr.c
+++ b/xen/arch/x86/hvm/mtrr.c
@@ -37,7 +37,7 @@ static const uint8_t pat_entry_2_pte_flags[8] = {
     _PAGE_PAT | _PAGE_PCD, _PAGE_PAT | _PAGE_PCD | _PAGE_PWT };
 
 /* Effective mm type lookup table, according to MTRR and PAT. */
-static const uint8_t mm_type_tbl[MTRR_NUM_TYPES][PAT_TYPE_NUMS] = {
+static const uint8_t mm_type_tbl[MTRR_NUM_TYPES][X86_NUM_MT] = {
 #define RS MEMORY_NUM_TYPES
 #define UC MTRR_TYPE_UNCACHABLE
 #define WB MTRR_TYPE_WRBACK
@@ -72,8 +72,8 @@ static uint8_t __read_mostly mtrr_epat_tbl[MTRR_NUM_TYPES][MEMORY_NUM_TYPES] =
     };
 
 /* Lookup table for PAT entry of a given PAT value in host PAT. */
-static uint8_t __read_mostly pat_entry_tbl[PAT_TYPE_NUMS] =
-    { [0 ... PAT_TYPE_NUMS-1] = INVALID_MEM_TYPE };
+static uint8_t __read_mostly pat_entry_tbl[X86_NUM_MT] =
+    { [0 ... X86_NUM_MT - 1] = INVALID_MEM_TYPE };
 
 static int __init cf_check hvm_mtrr_pat_init(void)
 {
@@ -81,7 +81,7 @@ static int __init cf_check hvm_mtrr_pat_init(void)
 
     for ( i = 0; i < MTRR_NUM_TYPES; i++ )
     {
-        for ( j = 0; j < PAT_TYPE_NUMS; j++ )
+        for ( j = 0; j < X86_NUM_MT; j++ )
         {
             unsigned int tmp = mm_type_tbl[i][j];
 
@@ -90,9 +90,9 @@ static int __init cf_check hvm_mtrr_pat_init(void)
         }
     }
 
-    for ( i = 0; i < PAT_TYPE_NUMS; i++ )
+    for ( i = 0; i < X86_NUM_MT; i++ )
     {
-        for ( j = 0; j < PAT_TYPE_NUMS; j++ )
+        for ( j = 0; j < X86_NUM_MT; j++ )
         {
             if ( pat_cr_2_paf(XEN_MSR_PAT, j) == i )
             {
@@ -115,7 +115,7 @@ uint8_t pat_type_2_pte_flags(uint8_t pat_type)
      * given pat_type. If host PAT covers all the PAT types, it can't happen.
      */
     if ( unlikely(pat_entry == INVALID_MEM_TYPE) )
-        pat_entry = pat_entry_tbl[PAT_TYPE_UNCACHABLE];
+        pat_entry = pat_entry_tbl[X86_MT_UC];
 
     return pat_entry_2_pte_flags[pat_entry];
 }
@@ -145,14 +145,14 @@ int hvm_vcpu_cacheattr_init(struct vcpu *v)
     m->mtrr_cap = (1u << 10) | (1u << 8) | num_var_ranges;
 
     v->arch.hvm.pat_cr =
-        ((uint64_t)PAT_TYPE_WRBACK) |               /* PAT0: WB */
-        ((uint64_t)PAT_TYPE_WRTHROUGH << 8) |       /* PAT1: WT */
-        ((uint64_t)PAT_TYPE_UC_MINUS << 16) |       /* PAT2: UC- */
-        ((uint64_t)PAT_TYPE_UNCACHABLE << 24) |     /* PAT3: UC */
-        ((uint64_t)PAT_TYPE_WRBACK << 32) |         /* PAT4: WB */
-        ((uint64_t)PAT_TYPE_WRTHROUGH << 40) |      /* PAT5: WT */
-        ((uint64_t)PAT_TYPE_UC_MINUS << 48) |       /* PAT6: UC- */
-        ((uint64_t)PAT_TYPE_UNCACHABLE << 56);      /* PAT7: UC */
+        ((uint64_t)X86_MT_WB) |           /* PAT0: WB */
+        ((uint64_t)X86_MT_WT << 8) |      /* PAT1: WT */
+        ((uint64_t)X86_MT_UCM << 16) |    /* PAT2: UC- */
+        ((uint64_t)X86_MT_UC << 24) |     /* PAT3: UC */
+        ((uint64_t)X86_MT_WB << 32) |     /* PAT4: WB */
+        ((uint64_t)X86_MT_WT << 40) |     /* PAT5: WT */
+        ((uint64_t)X86_MT_UCM << 48) |    /* PAT6: UC- */
+        ((uint64_t)X86_MT_UC << 56);      /* PAT7: UC */
 
     if ( is_hardware_domain(v->domain) )
     {
@@ -356,7 +356,7 @@ uint32_t get_pat_flags(struct vcpu *v,
      */
     pat_entry_value = mtrr_epat_tbl[shadow_mtrr_type][guest_eff_mm_type];
     /* If conflit occurs(e.g host MTRR is UC, guest memory type is
-     * WB),set UC as effective memory. Here, returning PAT_TYPE_UNCACHABLE will
+     * WB), set UC as effective memory. Here, returning X86_MT_UC will
      * always set effective memory as UC.
      */
     if ( pat_entry_value == INVALID_MEM_TYPE )
@@ -371,7 +371,7 @@ uint32_t get_pat_flags(struct vcpu *v,
                     "because the host mtrr type is:%d\n",
                     gl1e_flags, (uint64_t)gpaddr, guest_eff_mm_type,
                     shadow_mtrr_type);
-        pat_entry_value = PAT_TYPE_UNCACHABLE;
+        pat_entry_value = X86_MT_UC;
     }
     /* 4. Get the pte flags */
     return pat_type_2_pte_flags(pat_entry_value);
@@ -620,13 +620,13 @@ int hvm_set_mem_pinned_cacheattr(struct domain *d, uint64_t gfn_start,
                 p2m_memory_type_changed(d);
                 switch ( type )
                 {
-                case PAT_TYPE_UC_MINUS:
+                case X86_MT_UCM:
                     /*
                      * For EPT we can also avoid the flush in this case;
                      * see epte_get_entry_emt().
                      */
                     if ( hap_enabled(d) && cpu_has_vmx )
-                case PAT_TYPE_UNCACHABLE:
+                case X86_MT_UC:
                         break;
                     /* fall through */
                 default:
@@ -638,12 +638,12 @@ int hvm_set_mem_pinned_cacheattr(struct domain *d, uint64_t gfn_start,
         rcu_read_unlock(&pinned_cacheattr_rcu_lock);
         return -ENOENT;
 
-    case PAT_TYPE_UC_MINUS:
-    case PAT_TYPE_UNCACHABLE:
-    case PAT_TYPE_WRBACK:
-    case PAT_TYPE_WRCOMB:
-    case PAT_TYPE_WRPROT:
-    case PAT_TYPE_WRTHROUGH:
+    case X86_MT_UCM:
+    case X86_MT_UC:
+    case X86_MT_WB:
+    case X86_MT_WC:
+    case X86_MT_WP:
+    case X86_MT_WT:
         break;
 
     default:
@@ -681,7 +681,7 @@ int hvm_set_mem_pinned_cacheattr(struct domain *d, uint64_t gfn_start,
 
     list_add_rcu(&range->list, &d->arch.hvm.pinned_cacheattr_ranges);
     p2m_memory_type_changed(d);
-    if ( type != PAT_TYPE_WRBACK )
+    if ( type != X86_MT_WB )
         flush_all(FLUSH_CACHE);
 
     return 0;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 7c81b80710f99e08fe8291d3e413c449322b777d..b543c3983d77ae807e8bd97330691a79d8d39bae 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1231,14 +1231,14 @@ static void cf_check vmx_handle_cd(struct vcpu *v, unsigned long value)
              * memory type are all UC.
              */
             u64 uc_pat =
-                ((uint64_t)PAT_TYPE_UNCACHABLE)       |       /* PAT0 */
-                ((uint64_t)PAT_TYPE_UNCACHABLE << 8)  |       /* PAT1 */
-                ((uint64_t)PAT_TYPE_UNCACHABLE << 16) |       /* PAT2 */
-                ((uint64_t)PAT_TYPE_UNCACHABLE << 24) |       /* PAT3 */
-                ((uint64_t)PAT_TYPE_UNCACHABLE << 32) |       /* PAT4 */
-                ((uint64_t)PAT_TYPE_UNCACHABLE << 40) |       /* PAT5 */
-                ((uint64_t)PAT_TYPE_UNCACHABLE << 48) |       /* PAT6 */
-                ((uint64_t)PAT_TYPE_UNCACHABLE << 56);        /* PAT7 */
+                ((uint64_t)X86_MT_UC)       |       /* PAT0 */
+                ((uint64_t)X86_MT_UC << 8)  |       /* PAT1 */
+                ((uint64_t)X86_MT_UC << 16) |       /* PAT2 */
+                ((uint64_t)X86_MT_UC << 24) |       /* PAT3 */
+                ((uint64_t)X86_MT_UC << 32) |       /* PAT4 */
+                ((uint64_t)X86_MT_UC << 40) |       /* PAT5 */
+                ((uint64_t)X86_MT_UC << 48) |       /* PAT6 */
+                ((uint64_t)X86_MT_UC << 56);        /* PAT7 */
 
             vmx_get_guest_pat(v, pat);
             vmx_set_guest_pat(v, uc_pat);
diff --git a/xen/arch/x86/include/asm/mtrr.h b/xen/arch/x86/include/asm/mtrr.h
index 7733800b798fc2c72ba87e4ce6500e4183553d04..92fc930c692039b6c709d6a04f6553593f40aa55 100644
--- a/xen/arch/x86/include/asm/mtrr.h
+++ b/xen/arch/x86/include/asm/mtrr.h
@@ -16,17 +16,7 @@
 #define NORMAL_CACHE_MODE          0
 #define NO_FILL_CACHE_MODE         2
 
-enum {
-    PAT_TYPE_UNCACHABLE=0,
-    PAT_TYPE_WRCOMB=1,
-    PAT_TYPE_WRTHROUGH=4,
-    PAT_TYPE_WRPROT=5,
-    PAT_TYPE_WRBACK=6,
-    PAT_TYPE_UC_MINUS=7,
-    PAT_TYPE_NUMS
-};
-
-#define INVALID_MEM_TYPE PAT_TYPE_NUMS
+#define INVALID_MEM_TYPE X86_NUM_MT
 
 /* In the Intel processor's MTRR interface, the MTRR type is always held in
    an 8 bit field: */
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index d61d66c20e4180f8cbe21bcd97b568519e0b738e..126437285d8a9f222fca6a7b6ff4434b60637847 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -573,8 +573,8 @@ int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
     if ( gmtrr_mtype >= 0 )
     {
         *ipat = true;
-        return gmtrr_mtype != PAT_TYPE_UC_MINUS ? gmtrr_mtype
-                                                : MTRR_TYPE_UNCACHABLE;
+        return gmtrr_mtype != X86_MT_UCM ? gmtrr_mtype
+                                         : MTRR_TYPE_UNCACHABLE;
     }
     if ( gmtrr_mtype == -EADDRNOTAVAIL )
         return -1;
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 6bb564b0145285afc93b72a60b7797fcfe8696dc..b64bba70fc17906236872a017ad48ce91fd30803 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -561,7 +561,7 @@ _sh_propagate(struct vcpu *v,
              (type = hvm_get_mem_pinned_cacheattr(d, target_gfn, 0)) >= 0 )
             sflags |= pat_type_2_pte_flags(type);
         else if ( d->arch.hvm.is_in_uc_mode )
-            sflags |= pat_type_2_pte_flags(PAT_TYPE_UNCACHABLE);
+            sflags |= pat_type_2_pte_flags(X86_MT_UC);
         else
             if ( iomem_access_permitted(d, mfn_x(target_mfn), mfn_x(target_mfn)) )
             {
@@ -572,7 +572,7 @@ _sh_propagate(struct vcpu *v,
                             mfn_to_maddr(target_mfn),
                             MTRR_TYPE_UNCACHABLE);
                 else if ( iommu_snoop )
-                    sflags |= pat_type_2_pte_flags(PAT_TYPE_WRBACK);
+                    sflags |= pat_type_2_pte_flags(X86_MT_WB);
                 else
                     sflags |= get_pat_flags(v,
                             gflags,
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v5 04/10] x86: Replace MTRR_* constants with X86_MT_* constants
  2022-12-20  1:07 [PATCH v5 00/10] Make PAT handling less brittle Demi Marie Obenour
                   ` (2 preceding siblings ...)
  2022-12-20  1:07 ` [PATCH v5 03/10] x86: Replace PAT_* with X86_MT_* Demi Marie Obenour
@ 2022-12-20  1:07 ` Demi Marie Obenour
  2022-12-20  1:07 ` [PATCH v5 05/10] x86: Replace EPT_EMT_* constants with X86_MT_* Demi Marie Obenour
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 28+ messages in thread
From: Demi Marie Obenour @ 2022-12-20  1:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Demi Marie Obenour, Marek Marczykowski-Górecki, Jan Beulich,
	Andrew Cooper, Roger Pau Monné, Wei Liu, Jun Nakajima,
	Kevin Tian, George Dunlap, Tim Deegan

This allows eliminating of the former, with the exception of
MTRR_NUM_TYPES.  MTRR_NUM_TYPES is kept, as due to a quirk of the x86
architecture X86_MT_UCM (7) is not valid in an MTRR.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes since v2:

- Improve commit message
- Do not replace MTRR_NUM_TYPES with X86_MT_UCM
- State explicitly that MTRR_NUM_TYPES is kept
---
 xen/arch/x86/cpu/mtrr/generic.c         | 10 ++---
 xen/arch/x86/cpu/mtrr/main.c            | 26 ++++++-------
 xen/arch/x86/e820.c                     |  4 +-
 xen/arch/x86/hvm/mtrr.c                 | 30 +++++++--------
 xen/arch/x86/hvm/vmx/vmcs.c             |  2 +-
 xen/arch/x86/hvm/vmx/vmx.c              |  2 +-
 xen/arch/x86/include/asm/hvm/vmx/vmcs.h |  2 +-
 xen/arch/x86/include/asm/mtrr.h         | 10 +----
 xen/arch/x86/mm/p2m-ept.c               | 51 ++++++++++++-------------
 xen/arch/x86/mm/shadow/multi.c          |  2 +-
 10 files changed, 66 insertions(+), 73 deletions(-)

diff --git a/xen/arch/x86/cpu/mtrr/generic.c b/xen/arch/x86/cpu/mtrr/generic.c
index 47aaf76226e0a8a0712b7211ed339a4a032ab3f3..660ae26c2350b3436a471155fc0426699ba8ac1d 100644
--- a/xen/arch/x86/cpu/mtrr/generic.c
+++ b/xen/arch/x86/cpu/mtrr/generic.c
@@ -127,11 +127,11 @@ static const char *__init mtrr_attrib_to_str(mtrr_type x)
 {
 	static const char __initconst strings[MTRR_NUM_TYPES][16] =
 	{
-		[MTRR_TYPE_UNCACHABLE]     = "uncachable",
-		[MTRR_TYPE_WRCOMB]         = "write-combining",
-		[MTRR_TYPE_WRTHROUGH]      = "write-through",
-		[MTRR_TYPE_WRPROT]         = "write-protect",
-		[MTRR_TYPE_WRBACK]         = "write-back",
+		[X86_MT_UC] = "uncachable",
+		[X86_MT_WC] = "write-combining",
+		[X86_MT_WT] = "write-through",
+		[X86_MT_WP] = "write-protect",
+		[X86_MT_WB] = "write-back",
 	};
 
 	return (x < ARRAY_SIZE(strings) && strings[x][0]) ? strings[x] : "?";
diff --git a/xen/arch/x86/cpu/mtrr/main.c b/xen/arch/x86/cpu/mtrr/main.c
index 4e01c8d6f9df6562b94438f265d79a0a6fca8de6..2946003b84938f3b83c98b62dfaa3ace90822983 100644
--- a/xen/arch/x86/cpu/mtrr/main.c
+++ b/xen/arch/x86/cpu/mtrr/main.c
@@ -163,10 +163,10 @@ static void cf_check ipi_handler(void *info)
 }
 
 static inline int types_compatible(mtrr_type type1, mtrr_type type2) {
-	return type1 == MTRR_TYPE_UNCACHABLE ||
-	       type2 == MTRR_TYPE_UNCACHABLE ||
-	       (type1 == MTRR_TYPE_WRTHROUGH && type2 == MTRR_TYPE_WRBACK) ||
-	       (type1 == MTRR_TYPE_WRBACK && type2 == MTRR_TYPE_WRTHROUGH);
+	return type1 == X86_MT_UC ||
+	       type2 == X86_MT_UC ||
+	       (type1 == X86_MT_WT && type2 == X86_MT_WB) ||
+	       (type1 == X86_MT_WB && type2 == X86_MT_WT);
 }
 
 /**
@@ -297,13 +297,13 @@ static void set_mtrr(unsigned int reg, unsigned long base,
  *
  *	The available types are
  *
- *	%MTRR_TYPE_UNCACHABLE	-	No caching
+ *	%X86_MT_UC	-	No caching
  *
- *	%MTRR_TYPE_WRBACK	-	Write data back in bursts whenever
+ *	%X86_MT_WB	-	Write data back in bursts whenever
  *
- *	%MTRR_TYPE_WRCOMB	-	Write data back soon but allow bursts
+ *	%X86_MT_WC	-	Write data back soon but allow bursts
  *
- *	%MTRR_TYPE_WRTHROUGH	-	Cache reads but not writes
+ *	%X86_MT_WT	-	Cache reads but not writes
  *
  *	BUGS: Needs a quiet flag for the cases where drivers do not mind
  *	failures and do not wish system log messages to be sent.
@@ -328,7 +328,7 @@ int mtrr_add_page(unsigned long base, unsigned long size,
 	}
 
 	/*  If the type is WC, check that this processor supports it  */
-	if ((type == MTRR_TYPE_WRCOMB) && !have_wrcomb()) {
+	if ((type == X86_MT_WC) && !have_wrcomb()) {
 		printk(KERN_WARNING
 		       "mtrr: your processor doesn't support write-combining\n");
 		return -EOPNOTSUPP;
@@ -442,13 +442,13 @@ static int mtrr_check(unsigned long base, unsigned long size)
  *
  *	The available types are
  *
- *	%MTRR_TYPE_UNCACHABLE	-	No caching
+ *	%X86_MT_UC	-	No caching
  *
- *	%MTRR_TYPE_WRBACK	-	Write data back in bursts whenever
+ *	%X86_MT_WB	-	Write data back in bursts whenever
  *
- *	%MTRR_TYPE_WRCOMB	-	Write data back soon but allow bursts
+ *	%X86_MT_WC	-	Write data back soon but allow bursts
  *
- *	%MTRR_TYPE_WRTHROUGH	-	Cache reads but not writes
+ *	%X86_MT_WT	-	Cache reads but not writes
  *
  *	BUGS: Needs a quiet flag for the cases where drivers do not mind
  *	failures and do not wish system log messages to be sent.
diff --git a/xen/arch/x86/e820.c b/xen/arch/x86/e820.c
index b653a19c93afb98c2d64330384cb4fa7b4d2e1ec..c5911cf48dc4a281c03ddef35f23b19bc7af42eb 100644
--- a/xen/arch/x86/e820.c
+++ b/xen/arch/x86/e820.c
@@ -459,7 +459,7 @@ static uint64_t __init mtrr_top_of_ram(void)
         printk(" MTRR cap: %"PRIx64" type: %"PRIx64"\n", mtrr_cap, mtrr_def);
 
     /* MTRRs enabled, and default memory type is not writeback? */
-    if ( !test_bit(11, &mtrr_def) || ((uint8_t)mtrr_def == MTRR_TYPE_WRBACK) )
+    if ( !test_bit(11, &mtrr_def) || ((uint8_t)mtrr_def == X86_MT_WB) )
         return 0;
 
     /*
@@ -476,7 +476,7 @@ static uint64_t __init mtrr_top_of_ram(void)
             printk(" MTRR[%d]: base %"PRIx64" mask %"PRIx64"\n",
                    i, base, mask);
 
-        if ( !test_bit(11, &mask) || ((uint8_t)base != MTRR_TYPE_WRBACK) )
+        if ( !test_bit(11, &mask) || ((uint8_t)base != X86_MT_WB) )
             continue;
         base &= addr_mask;
         mask &= addr_mask;
diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c
index 242623f3c239ee18a44f882ecb3910a00c615825..093103f6c768cf64f880d1b20e1c14f5918c1250 100644
--- a/xen/arch/x86/hvm/mtrr.c
+++ b/xen/arch/x86/hvm/mtrr.c
@@ -39,11 +39,11 @@ static const uint8_t pat_entry_2_pte_flags[8] = {
 /* Effective mm type lookup table, according to MTRR and PAT. */
 static const uint8_t mm_type_tbl[MTRR_NUM_TYPES][X86_NUM_MT] = {
 #define RS MEMORY_NUM_TYPES
-#define UC MTRR_TYPE_UNCACHABLE
-#define WB MTRR_TYPE_WRBACK
-#define WC MTRR_TYPE_WRCOMB
-#define WP MTRR_TYPE_WRPROT
-#define WT MTRR_TYPE_WRTHROUGH
+#define UC X86_MT_UC
+#define WB X86_MT_WB
+#define WC X86_MT_WC
+#define WP X86_MT_WP
+#define WT X86_MT_WT
 
 /*          PAT(UC, WC, RS, RS, WT, WP, WB, UC-) */
 /* MTRR(UC) */ {UC, WC, RS, RS, UC, UC, UC, UC},
@@ -202,7 +202,7 @@ int mtrr_get_type(const struct mtrr_state *m, paddr_t pa, unsigned int order)
    unsigned int seg, num_var_ranges = MASK_EXTR(m->mtrr_cap, MTRRcap_VCNT);
 
    if ( unlikely(!m->enabled) )
-       return MTRR_TYPE_UNCACHABLE;
+       return X86_MT_UC;
 
    pa &= mask;
    if ( (pa < 0x100000) && m->fixed_enabled )
@@ -277,13 +277,13 @@ int mtrr_get_type(const struct mtrr_state *m, paddr_t pa, unsigned int order)
        return -1;
 
    /* Two or more matches, one being UC? */
-   if ( overlap_mtrr & (1 << MTRR_TYPE_UNCACHABLE) )
-       return MTRR_TYPE_UNCACHABLE;
+   if ( overlap_mtrr & (1 << X86_MT_UC) )
+       return X86_MT_UC;
 
    /* Two or more matches, all of them WT and WB? */
    if ( overlap_mtrr ==
-        ((1 << MTRR_TYPE_WRTHROUGH) | (1 << MTRR_TYPE_WRBACK)) )
-       return MTRR_TYPE_WRTHROUGH;
+        ((1 << X86_MT_WT) | (1 << X86_MT_WB)) )
+       return X86_MT_WT;
 
    /* Behaviour is undefined, but return the last overlapped type. */
    return overlap_mtrr_pos;
@@ -381,11 +381,11 @@ static inline bool_t valid_mtrr_type(uint8_t type)
 {
     switch ( type )
     {
-    case MTRR_TYPE_UNCACHABLE:
-    case MTRR_TYPE_WRBACK:
-    case MTRR_TYPE_WRCOMB:
-    case MTRR_TYPE_WRPROT:
-    case MTRR_TYPE_WRTHROUGH:
+    case X86_MT_UC:
+    case X86_MT_WB:
+    case X86_MT_WC:
+    case X86_MT_WP:
+    case X86_MT_WT:
         return 1;
     }
     return 0;
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 84dbb88d33b76111833a37339186199f8bc03b5e..f0825216d722d978f221bb34a797d8de5505cb80 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -555,7 +555,7 @@ static int vmx_init_vmcs_config(bool bsp)
     /* Require Write-Back (WB) memory type for VMCS accesses. */
     opt = (vmx_basic_msr_high & (VMX_BASIC_MEMORY_TYPE_MASK >> 32)) /
           ((VMX_BASIC_MEMORY_TYPE_MASK & -VMX_BASIC_MEMORY_TYPE_MASK) >> 32);
-    if ( opt != MTRR_TYPE_WRBACK )
+    if ( opt != X86_MT_WB )
     {
         printk("VMX: CPU%d has unexpected VMCS access type %u\n",
                smp_processor_id(), opt);
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index b543c3983d77ae807e8bd97330691a79d8d39bae..4ae7dd56c9981d32ac545d6e7b7c126b15f68969 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -434,7 +434,7 @@ static void cf_check domain_creation_finished(struct domain *d)
         return;
 
     ASSERT(epte_get_entry_emt(d, gfn, apic_access_mfn, 0, &ipat,
-                              p2m_mmio_direct) == MTRR_TYPE_WRBACK);
+                              p2m_mmio_direct) == X86_MT_WB);
     ASSERT(ipat);
 
     if ( set_mmio_p2m_entry(d, gfn, apic_access_mfn, PAGE_ORDER_4K) )
diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
index 75f9928abfad28e3895fe3dd4058b2b0a6e145c3..65e9e27b5437adff59abc46976f73a9f2cc587da 100644
--- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
@@ -38,7 +38,7 @@ struct vmx_msr_entry {
     u64 data;
 };
 
-#define EPT_DEFAULT_MT      MTRR_TYPE_WRBACK
+#define EPT_DEFAULT_MT      X86_MT_WB
 
 struct ept_data {
     union {
diff --git a/xen/arch/x86/include/asm/mtrr.h b/xen/arch/x86/include/asm/mtrr.h
index 92fc930c692039b6c709d6a04f6553593f40aa55..e4f6ca6048334b2094a1836cc2f298453641232f 100644
--- a/xen/arch/x86/include/asm/mtrr.h
+++ b/xen/arch/x86/include/asm/mtrr.h
@@ -3,15 +3,9 @@
 
 #include <xen/mm.h>
 
-/* These are the region types. They match the architectural specification. */
-#define MTRR_TYPE_UNCACHABLE 0
-#define MTRR_TYPE_WRCOMB     1
-#define MTRR_TYPE_WRTHROUGH  4
-#define MTRR_TYPE_WRPROT     5
-#define MTRR_TYPE_WRBACK     6
-#define MTRR_NUM_TYPES       7
+#define MTRR_NUM_TYPES       X86_MT_UCM
 #define MEMORY_NUM_TYPES     MTRR_NUM_TYPES
-#define NO_HARDCODE_MEM_TYPE    MTRR_NUM_TYPES
+#define NO_HARDCODE_MEM_TYPE MTRR_NUM_TYPES
 
 #define NORMAL_CACHE_MODE          0
 #define NO_FILL_CACHE_MODE         2
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 126437285d8a9f222fca6a7b6ff4434b60637847..bb143c6c42c69db4e054b9156aad9a18ea0b2378 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -506,7 +506,7 @@ int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
                                                mfn_x(mfn) | ((1UL << order) - 1)) )
         {
             *ipat = true;
-            return MTRR_TYPE_UNCACHABLE;
+            return X86_MT_UC;
         }
         /* Force invalid memory type so resolve_misconfig() will split it */
         return -1;
@@ -515,7 +515,7 @@ int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
     if ( !mfn_valid(mfn) )
     {
         *ipat = true;
-        return MTRR_TYPE_UNCACHABLE;
+        return X86_MT_UC;
     }
 
     /*
@@ -526,7 +526,7 @@ int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
          !cache_flush_permitted(d) )
     {
         *ipat = true;
-        return MTRR_TYPE_WRBACK;
+        return X86_MT_WB;
     }
 
     for ( special_pgs = i = 0; i < (1ul << order); i++ )
@@ -539,13 +539,13 @@ int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
             return -1;
 
         *ipat = true;
-        return MTRR_TYPE_WRBACK;
+        return X86_MT_WB;
     }
 
     switch ( type )
     {
     case p2m_mmio_direct:
-        return MTRR_TYPE_UNCACHABLE;
+        return X86_MT_UC;
 
     case p2m_grant_map_ro:
     case p2m_grant_map_rw:
@@ -563,7 +563,7 @@ int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
          * diverges. See p2m_type_to_flags for the AMD attributes.
          */
         *ipat = true;
-        return MTRR_TYPE_WRBACK;
+        return X86_MT_WB;
 
     default:
         break;
@@ -573,15 +573,14 @@ int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
     if ( gmtrr_mtype >= 0 )
     {
         *ipat = true;
-        return gmtrr_mtype != X86_MT_UCM ? gmtrr_mtype
-                                         : MTRR_TYPE_UNCACHABLE;
+        return gmtrr_mtype != X86_MT_UCM ? gmtrr_mtype : X86_MT_UC;
     }
     if ( gmtrr_mtype == -EADDRNOTAVAIL )
         return -1;
 
     gmtrr_mtype = v ? mtrr_get_type(&v->arch.hvm.mtrr,
                                     gfn_x(gfn) << PAGE_SHIFT, order)
-                    : MTRR_TYPE_WRBACK;
+                    : X86_MT_WB;
     hmtrr_mtype = mtrr_get_type(&mtrr_state, mfn_x(mfn) << PAGE_SHIFT,
                                 order);
     if ( gmtrr_mtype < 0 || hmtrr_mtype < 0 )
@@ -592,14 +591,14 @@ int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
         return hmtrr_mtype;
 
     /* If either type is UC, we have to go with that one. */
-    if ( gmtrr_mtype == MTRR_TYPE_UNCACHABLE ||
-         hmtrr_mtype == MTRR_TYPE_UNCACHABLE )
-        return MTRR_TYPE_UNCACHABLE;
+    if ( gmtrr_mtype == X86_MT_UC ||
+         hmtrr_mtype == X86_MT_UC )
+        return X86_MT_UC;
 
     /* If either type is WB, we have to go with the other one. */
-    if ( gmtrr_mtype == MTRR_TYPE_WRBACK )
+    if ( gmtrr_mtype == X86_MT_WB )
         return hmtrr_mtype;
-    if ( hmtrr_mtype == MTRR_TYPE_WRBACK )
+    if ( hmtrr_mtype == X86_MT_WB )
         return gmtrr_mtype;
 
     /*
@@ -610,13 +609,13 @@ int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
      * permit this), while WT and WP require writes to go straight to memory
      * (WC can buffer them).
      */
-    if ( (gmtrr_mtype == MTRR_TYPE_WRTHROUGH &&
-          hmtrr_mtype == MTRR_TYPE_WRPROT) ||
-         (gmtrr_mtype == MTRR_TYPE_WRPROT &&
-          hmtrr_mtype == MTRR_TYPE_WRTHROUGH) )
-        return MTRR_TYPE_WRPROT;
+    if ( (gmtrr_mtype == X86_MT_WT &&
+          hmtrr_mtype == X86_MT_WP) ||
+         (gmtrr_mtype == X86_MT_WP &&
+          hmtrr_mtype == X86_MT_WT) )
+        return X86_MT_WP;
 
-    return MTRR_TYPE_UNCACHABLE;
+    return X86_MT_UC;
 }
 
 /*
@@ -1426,12 +1425,12 @@ void ept_p2m_uninit(struct p2m_domain *p2m)
 static const char *memory_type_to_str(unsigned int x)
 {
     static const char memory_types[8][3] = {
-        [MTRR_TYPE_UNCACHABLE]     = "UC",
-        [MTRR_TYPE_WRCOMB]         = "WC",
-        [MTRR_TYPE_WRTHROUGH]      = "WT",
-        [MTRR_TYPE_WRPROT]         = "WP",
-        [MTRR_TYPE_WRBACK]         = "WB",
-        [MTRR_NUM_TYPES]           = "??"
+        [X86_MT_UC]      = "UC",
+        [X86_MT_WC]      = "WC",
+        [X86_MT_WT]      = "WT",
+        [X86_MT_WP]      = "WP",
+        [X86_MT_WB]      = "WB",
+        [MTRR_NUM_TYPES] = "??",
     };
 
     ASSERT(x < ARRAY_SIZE(memory_types));
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index b64bba70fc17906236872a017ad48ce91fd30803..f5f7ff021bd9e057c5b6f6329de7acb5ef05d58f 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -570,7 +570,7 @@ _sh_propagate(struct vcpu *v,
                             gflags,
                             gfn_to_paddr(target_gfn),
                             mfn_to_maddr(target_mfn),
-                            MTRR_TYPE_UNCACHABLE);
+                            X86_MT_UC);
                 else if ( iommu_snoop )
                     sflags |= pat_type_2_pte_flags(X86_MT_WB);
                 else
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v5 05/10] x86: Replace EPT_EMT_* constants with X86_MT_*
  2022-12-20  1:07 [PATCH v5 00/10] Make PAT handling less brittle Demi Marie Obenour
                   ` (3 preceding siblings ...)
  2022-12-20  1:07 ` [PATCH v5 04/10] x86: Replace MTRR_* constants with X86_MT_* constants Demi Marie Obenour
@ 2022-12-20  1:07 ` Demi Marie Obenour
  2022-12-20  8:21   ` Jan Beulich
  2022-12-20  1:07 ` [PATCH v5 06/10] x86: Remove MEMORY_NUM_TYPES and NO_HARDCODE_MEM_TYPE Demi Marie Obenour
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Demi Marie Obenour @ 2022-12-20  1:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Demi Marie Obenour, Marek Marczykowski-Górecki, Jan Beulich,
	Andrew Cooper, Roger Pau Monné, Wei Liu, Jun Nakajima,
	Kevin Tian, George Dunlap, Tim Deegan

This allows eliminating the former.  No functional change intended.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
---
 xen/arch/x86/include/asm/hvm/vmx/vmx.h | 9 ---------
 xen/arch/x86/mm/hap/nested_ept.c       | 4 ++--
 2 files changed, 2 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmx.h b/xen/arch/x86/include/asm/hvm/vmx/vmx.h
index 8eedf59155e01ec1ca84dcc6b30961f9c884cb3b..49fe9822fac5eae15b67f0cfd3d0cb96347dc7ed 100644
--- a/xen/arch/x86/include/asm/hvm/vmx/vmx.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmx.h
@@ -80,15 +80,6 @@ typedef enum {
 #define EPTE_RWX_MASK           0x7
 #define EPTE_FLAG_MASK          0x7f
 
-#define EPT_EMT_UC              0
-#define EPT_EMT_WC              1
-#define EPT_EMT_RSV0            2
-#define EPT_EMT_RSV1            3
-#define EPT_EMT_WT              4
-#define EPT_EMT_WP              5
-#define EPT_EMT_WB              6
-#define EPT_EMT_RSV2            7
-
 #define PI_xAPIC_NDST_MASK      0xFF00
 
 void vmx_asm_vmexit_handler(struct cpu_user_regs);
diff --git a/xen/arch/x86/mm/hap/nested_ept.c b/xen/arch/x86/mm/hap/nested_ept.c
index 1cb7fefc37091bf7d92a277203e652add5611871..23fb3889b7605be62805731218c314819d5027b5 100644
--- a/xen/arch/x86/mm/hap/nested_ept.c
+++ b/xen/arch/x86/mm/hap/nested_ept.c
@@ -84,8 +84,8 @@ static bool_t nept_emt_bits_check(ept_entry_t e, uint32_t level)
 {
     if ( e.sp || level == 1 )
     {
-        if ( e.emt == EPT_EMT_RSV0 || e.emt == EPT_EMT_RSV1 ||
-             e.emt == EPT_EMT_RSV2 )
+        if ( e.emt == X86_MT_RSVD_2 || e.emt == X86_MT_RSVD_3 ||
+             e.emt == X86_MT_UCM )
             return 1;
     }
     return 0;
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v5 06/10] x86: Remove MEMORY_NUM_TYPES and NO_HARDCODE_MEM_TYPE
  2022-12-20  1:07 [PATCH v5 00/10] Make PAT handling less brittle Demi Marie Obenour
                   ` (4 preceding siblings ...)
  2022-12-20  1:07 ` [PATCH v5 05/10] x86: Replace EPT_EMT_* constants with X86_MT_* Demi Marie Obenour
@ 2022-12-20  1:07 ` Demi Marie Obenour
  2022-12-20  1:07 ` [PATCH v5 07/10] x86: Derive XEN_MSR_PAT from its individual entries Demi Marie Obenour
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 28+ messages in thread
From: Demi Marie Obenour @ 2022-12-20  1:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Demi Marie Obenour, Marek Marczykowski-Górecki, Jan Beulich,
	Andrew Cooper, Roger Pau Monné, Wei Liu, Jun Nakajima,
	Kevin Tian, George Dunlap, Tim Deegan

No functional change intended.

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
---
This patch is optional.  Subsequent patches should not depend on it.

Changes since v2:

- Keep MTRR_NUM_TYPES and adjust commit message accordingly
---
 xen/arch/x86/hvm/mtrr.c         | 18 +++++++++---------
 xen/arch/x86/include/asm/mtrr.h |  2 --
 xen/arch/x86/mm/shadow/multi.c  |  2 +-
 3 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c
index 093103f6c768cf64f880d1b20e1c14f5918c1250..05e978041d62fd0d559462de181a04bef8a5bca9 100644
--- a/xen/arch/x86/hvm/mtrr.c
+++ b/xen/arch/x86/hvm/mtrr.c
@@ -38,7 +38,7 @@ static const uint8_t pat_entry_2_pte_flags[8] = {
 
 /* Effective mm type lookup table, according to MTRR and PAT. */
 static const uint8_t mm_type_tbl[MTRR_NUM_TYPES][X86_NUM_MT] = {
-#define RS MEMORY_NUM_TYPES
+#define RS MTRR_NUM_TYPES
 #define UC X86_MT_UC
 #define WB X86_MT_WB
 #define WC X86_MT_WC
@@ -66,9 +66,9 @@ static const uint8_t mm_type_tbl[MTRR_NUM_TYPES][X86_NUM_MT] = {
  * Reverse lookup table, to find a pat type according to MTRR and effective
  * memory type. This table is dynamically generated.
  */
-static uint8_t __read_mostly mtrr_epat_tbl[MTRR_NUM_TYPES][MEMORY_NUM_TYPES] =
-    { [0 ... MTRR_NUM_TYPES-1] =
-        { [0 ... MEMORY_NUM_TYPES-1] = INVALID_MEM_TYPE }
+static uint8_t __read_mostly mtrr_epat_tbl[MTRR_NUM_TYPES][MTRR_NUM_TYPES] =
+    { [0 ... MTRR_NUM_TYPES - 1] =
+        { [0 ... MTRR_NUM_TYPES - 1] = INVALID_MEM_TYPE }
     };
 
 /* Lookup table for PAT entry of a given PAT value in host PAT. */
@@ -85,7 +85,7 @@ static int __init cf_check hvm_mtrr_pat_init(void)
         {
             unsigned int tmp = mm_type_tbl[i][j];
 
-            if ( tmp < MEMORY_NUM_TYPES )
+            if ( tmp < MTRR_NUM_TYPES )
                 mtrr_epat_tbl[i][tmp] = j;
         }
     }
@@ -317,11 +317,11 @@ static uint8_t effective_mm_type(struct mtrr_state *m,
                                  uint8_t gmtrr_mtype)
 {
     uint8_t mtrr_mtype, pat_value;
-   
+
     /* if get_pat_flags() gives a dedicated MTRR type,
      * just use it
-     */ 
-    if ( gmtrr_mtype == NO_HARDCODE_MEM_TYPE )
+     */
+    if ( gmtrr_mtype == MTRR_NUM_TYPES )
         mtrr_mtype = mtrr_get_type(m, gpa, 0);
     else
         mtrr_mtype = gmtrr_mtype;
@@ -346,7 +346,7 @@ uint32_t get_pat_flags(struct vcpu *v,
     /* 1. Get the effective memory type of guest physical address,
      * with the pair of guest MTRR and PAT
      */
-    guest_eff_mm_type = effective_mm_type(g, pat, gpaddr, 
+    guest_eff_mm_type = effective_mm_type(g, pat, gpaddr,
                                           gl1e_flags, gmtrr_mtype);
     /* 2. Get the memory type of host physical address, with MTRR */
     shadow_mtrr_type = mtrr_get_type(&mtrr_state, spaddr, 0);
diff --git a/xen/arch/x86/include/asm/mtrr.h b/xen/arch/x86/include/asm/mtrr.h
index e4f6ca6048334b2094a1836cc2f298453641232f..4b7f840a965954cc4b59698327a37e47026893a4 100644
--- a/xen/arch/x86/include/asm/mtrr.h
+++ b/xen/arch/x86/include/asm/mtrr.h
@@ -4,8 +4,6 @@
 #include <xen/mm.h>
 
 #define MTRR_NUM_TYPES       X86_MT_UCM
-#define MEMORY_NUM_TYPES     MTRR_NUM_TYPES
-#define NO_HARDCODE_MEM_TYPE MTRR_NUM_TYPES
 
 #define NORMAL_CACHE_MODE          0
 #define NO_FILL_CACHE_MODE         2
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index f5f7ff021bd9e057c5b6f6329de7acb5ef05d58f..1faf9940db6b0afefc5977c00c00fb6a39cd27d2 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -578,7 +578,7 @@ _sh_propagate(struct vcpu *v,
                             gflags,
                             gfn_to_paddr(target_gfn),
                             mfn_to_maddr(target_mfn),
-                            NO_HARDCODE_MEM_TYPE);
+                            MTRR_NUM_TYPES);
             }
     }
 
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v5 07/10] x86: Derive XEN_MSR_PAT from its individual entries
  2022-12-20  1:07 [PATCH v5 00/10] Make PAT handling less brittle Demi Marie Obenour
                   ` (5 preceding siblings ...)
  2022-12-20  1:07 ` [PATCH v5 06/10] x86: Remove MEMORY_NUM_TYPES and NO_HARDCODE_MEM_TYPE Demi Marie Obenour
@ 2022-12-20  1:07 ` Demi Marie Obenour
  2022-12-20  1:07 ` [PATCH v5 08/10] x86/mm: make code robust to future PAT changes Demi Marie Obenour
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 28+ messages in thread
From: Demi Marie Obenour @ 2022-12-20  1:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Demi Marie Obenour, Marek Marczykowski-Górecki, Jan Beulich,
	Andrew Cooper, Roger Pau Monné, Wei Liu, Jun Nakajima,
	Kevin Tian, George Dunlap, Tim Deegan

This avoids it being a magic constant that is difficult for humans to
decode.  Use BUILD_BUG_ON to check that the old and new values are
identical.

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes since v4:
- Explain that changing XEN_MSR_PAT breaks guests that rely on the API
  in xen.h instead of reading the PAT from Xen.
---
 xen/arch/x86/include/asm/processor.h |  9 ++++++++-
 xen/arch/x86/mm.c                    | 11 +++++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/include/asm/processor.h b/xen/arch/x86/include/asm/processor.h
index 8e2816fae9b97bd4e153a30cc3802971fe0355af..60b902060914584957db8afa5c7c1e6abdad4d13 100644
--- a/xen/arch/x86/include/asm/processor.h
+++ b/xen/arch/x86/include/asm/processor.h
@@ -96,7 +96,14 @@
  * Host IA32_CR_PAT value to cover all memory types.  This is not the default
  * MSR_PAT value, and is an ABI with PV guests.
  */
-#define XEN_MSR_PAT _AC(0x050100070406, ULL)
+#define XEN_MSR_PAT ((_AC(X86_MT_WB,  ULL) << 0x00) | \
+                     (_AC(X86_MT_WT,  ULL) << 0x08) | \
+                     (_AC(X86_MT_UCM, ULL) << 0x10) | \
+                     (_AC(X86_MT_UC,  ULL) << 0x18) | \
+                     (_AC(X86_MT_WC,  ULL) << 0x20) | \
+                     (_AC(X86_MT_WP,  ULL) << 0x28) | \
+                     (_AC(X86_MT_UC,  ULL) << 0x30) | \
+                     (_AC(X86_MT_UC,  ULL) << 0x38))
 
 #ifndef __ASSEMBLY__
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index dba6c77ef2f7ed7fcb7f7e526583ccadd35e62cc..b40a575b61418ea1137299e68b64f7efd9efeced 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -6352,6 +6352,17 @@ unsigned long get_upper_mfn_bound(void)
     return min(max_mfn, 1UL << (paddr_bits - PAGE_SHIFT)) - 1;
 }
 
+static void __init __maybe_unused build_assertions(void)
+{
+    /*
+     * If this trips, any guest that blindly rely on the public API in xen.h
+     * (instead of reading the PAT from Xen, as Linux 3.19+ does) will be
+     * broken.  Furthermore, live migration of PV guests between Xen versions
+     * using different PATs will not work.
+     */
+    BUILD_BUG_ON(XEN_MSR_PAT != 0x050100070406ULL);
+}
+
 /*
  * Local variables:
  * mode: C
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v5 08/10] x86/mm: make code robust to future PAT changes
  2022-12-20  1:07 [PATCH v5 00/10] Make PAT handling less brittle Demi Marie Obenour
                   ` (6 preceding siblings ...)
  2022-12-20  1:07 ` [PATCH v5 07/10] x86: Derive XEN_MSR_PAT from its individual entries Demi Marie Obenour
@ 2022-12-20  1:07 ` Demi Marie Obenour
  2022-12-22  9:35   ` Jan Beulich
  2022-12-20  1:07 ` [PATCH v5 09/10] x86/mm: Reject invalid cacheability in PV guests by default Demi Marie Obenour
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Demi Marie Obenour @ 2022-12-20  1:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Demi Marie Obenour, Marek Marczykowski-Górecki, Jan Beulich,
	Andrew Cooper, Roger Pau Monné, Wei Liu, Jun Nakajima,
	Kevin Tian, George Dunlap, Tim Deegan

It may be desirable to change Xen's PAT for various reasons.  This
requires changes to several _PAGE_* macros as well.  Add static
assertions to check that XEN_MSR_PAT is consistent with the _PAGE_*
macros, and that _PAGE_WB is 0 as required by Linux.

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
---
Changes since v4:
- Add lots of comments explaining what the various BUILD_BUG_ON()s mean.

Changes since v3:
- Refactor some macros
- Avoid including a string literal in BUILD_BUG_ON
---
 xen/arch/x86/mm.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index b40a575b61418ea1137299e68b64f7efd9efeced..a72556668633ee57b77c9a57d3a13dd5a12d9bbf 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -6352,6 +6352,11 @@ unsigned long get_upper_mfn_bound(void)
     return min(max_mfn, 1UL << (paddr_bits - PAGE_SHIFT)) - 1;
 }
 
+
+/*
+ * A bunch of static assertions to check that the XEN_MSR_PAT is valid
+ * and consistent with the _PAGE_* macros, and that _PAGE_WB is zero.
+ */
 static void __init __maybe_unused build_assertions(void)
 {
     /*
@@ -6361,6 +6366,72 @@ static void __init __maybe_unused build_assertions(void)
      * using different PATs will not work.
      */
     BUILD_BUG_ON(XEN_MSR_PAT != 0x050100070406ULL);
+
+    /*
+     * _PAGE_WB must be zero for several reasons, not least because Linux
+     * assumes it.
+     */
+    BUILD_BUG_ON(_PAGE_WB);
+
+    /* A macro to convert from cache attributes to actual cacheability */
+#define PAT_ENTRY(v) (0xFF & (XEN_MSR_PAT >> (8 * (v))))
+
+    /* Validate at compile-time that v is a valid value for a PAT entry */
+#define CHECK_PAT_ENTRY_VALUE(v)                                               \
+    BUILD_BUG_ON((v) < 0 || (v) > 7 ||                                         \
+                 (v) == X86_MT_RSVD_2 || (v) == X86_MT_RSVD_3)
+
+    /* Validate at compile-time that PAT entry v is valid */
+#define CHECK_PAT_ENTRY(v) do {                                                \
+    BUILD_BUG_ON((v) < 0 || (v) > 7);                                          \
+    CHECK_PAT_ENTRY_VALUE(PAT_ENTRY(v));                                       \
+} while (0);
+
+    /*
+     * If one of these trips, the corresponding entry in XEN_MSR_PAT is invalid.
+     * This would cause Xen to crash (with #GP) at startup.
+     */
+    CHECK_PAT_ENTRY(0);
+    CHECK_PAT_ENTRY(1);
+    CHECK_PAT_ENTRY(2);
+    CHECK_PAT_ENTRY(3);
+    CHECK_PAT_ENTRY(4);
+    CHECK_PAT_ENTRY(5);
+    CHECK_PAT_ENTRY(6);
+    CHECK_PAT_ENTRY(7);
+
+#undef CHECK_PAT_ENTRY
+#undef CHECK_PAT_ENTRY_VALUE
+
+    /* Macro version of page_flags_to_cacheattr(), for use in BUILD_BUG_ON()s */
+#define PAGE_FLAGS_TO_CACHEATTR(page_value)                                    \
+    ((((page_value) >> 5) & 4) | (((page_value) >> 3) & 3))
+
+    /* Check that a PAT-related _PAGE_* macro is correct */
+#define CHECK_PAGE_VALUE(page_value) do {                                      \
+    /* Check that the _PAGE_* macros only use bits from PAGE_CACHE_ATTRS */    \
+    BUILD_BUG_ON(((_PAGE_##page_value) & PAGE_CACHE_ATTRS) !=                  \
+                  (_PAGE_##page_value));                                       \
+    /* Check that the _PAGE_* are consistent with XEN_MSR_PAT */               \
+    BUILD_BUG_ON(PAT_ENTRY(PAGE_FLAGS_TO_CACHEATTR(_PAGE_##page_value)) !=     \
+                 (X86_MT_##page_value));                                       \
+} while (0)
+
+    /*
+     * If one of these trips, the corresponding _PAGE_* macro is inconsistent
+     * with XEN_MSR_PAT.  This would cause Xen to use incorrect cacheability
+     * flags, with results that are undefined and probably harmful.
+     */
+    CHECK_PAGE_VALUE(WT);
+    CHECK_PAGE_VALUE(WB);
+    CHECK_PAGE_VALUE(WC);
+    CHECK_PAGE_VALUE(UC);
+    CHECK_PAGE_VALUE(UCM);
+    CHECK_PAGE_VALUE(WP);
+
+#undef CHECK_PAGE_VALUE
+#undef PAGE_FLAGS_TO_CACHEATTR
+#undef PAT_ENTRY
 }
 
 /*
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v5 09/10] x86/mm: Reject invalid cacheability in PV guests by default
  2022-12-20  1:07 [PATCH v5 00/10] Make PAT handling less brittle Demi Marie Obenour
                   ` (7 preceding siblings ...)
  2022-12-20  1:07 ` [PATCH v5 08/10] x86/mm: make code robust to future PAT changes Demi Marie Obenour
@ 2022-12-20  1:07 ` Demi Marie Obenour
  2022-12-22  9:48   ` Jan Beulich
  2022-12-20  1:07 ` [PATCH v5 10/10] x86: Use Linux's PAT Demi Marie Obenour
  2022-12-20 16:13 ` [PATCH v5 11/10] hvmloader: use memory type constants Jan Beulich
  10 siblings, 1 reply; 28+ messages in thread
From: Demi Marie Obenour @ 2022-12-20  1:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Demi Marie Obenour, Marek Marczykowski-Górecki, Jan Beulich,
	Andrew Cooper, Roger Pau Monné, Wei Liu, Jun Nakajima,
	Kevin Tian, George Dunlap, Tim Deegan

Setting cacheability flags that are not ones specified by Xen is a bug
in the guest.  By default, inject #GP into any guest that does this.
allow_invalid_cacheability can be used on the Xen command line to
disable this check.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
---
Changes since v4:
- Remove pointless BUILD_BUG_ON().
- Add comment explaining why an exception is being injected.

Changes since v3:
- Add Andrew Cooper’s Suggested-by
---
 xen/arch/x86/mm.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index a72556668633ee57b77c9a57d3a13dd5a12d9bbf..69ce597c7cd5283ae4b5f3bc0a6dfa0bb3228d3d 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -145,6 +145,8 @@
 
 #ifdef CONFIG_PV
 #include "pv/mm.h"
+bool allow_invalid_cacheability;
+boolean_param("allow_invalid_cacheability", allow_invalid_cacheability);
 #endif
 
 /* Override macros from asm/page.h to make them work with mfn_t */
@@ -1343,7 +1345,33 @@ static int promote_l1_table(struct page_info *page)
         }
         else
         {
-            switch ( ret = get_page_from_l1e(pl1e[i], d, d) )
+            l1_pgentry_t l1e = pl1e[i];
+
+            if ( !allow_invalid_cacheability )
+            {
+                switch ( l1e.l1 & PAGE_CACHE_ATTRS )
+                {
+                case _PAGE_WB:
+                case _PAGE_UC:
+                case _PAGE_UCM:
+                case _PAGE_WC:
+                case _PAGE_WT:
+                case _PAGE_WP:
+                    break;
+                default:
+                    /*
+                     * If we get here, a PV guest tried to use one of the
+                     * reserved values in Xen's PAT.  This indicates a bug in
+                     * the guest, so inject #GP to cause the guest to log a
+                     * stack trace.
+                     */
+                    pv_inject_hw_exception(TRAP_gp_fault, 0);
+                    ret = -EINVAL;
+                    goto fail;
+                }
+            }
+
+            switch ( ret = get_page_from_l1e(l1e, d, d) )
             {
             default:
                 goto fail;
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v5 10/10] x86: Use Linux's PAT
  2022-12-20  1:07 [PATCH v5 00/10] Make PAT handling less brittle Demi Marie Obenour
                   ` (8 preceding siblings ...)
  2022-12-20  1:07 ` [PATCH v5 09/10] x86/mm: Reject invalid cacheability in PV guests by default Demi Marie Obenour
@ 2022-12-20  1:07 ` Demi Marie Obenour
  2022-12-20 16:13 ` [PATCH v5 11/10] hvmloader: use memory type constants Jan Beulich
  10 siblings, 0 replies; 28+ messages in thread
From: Demi Marie Obenour @ 2022-12-20  1:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Demi Marie Obenour, Marek Marczykowski-Górecki, Jan Beulich,
	Andrew Cooper, Roger Pau Monné, Wei Liu, Jun Nakajima,
	Kevin Tian, George Dunlap, Tim Deegan

This is purely for testing, to see if it works around a bug in i915.  It
is not intended to be merged.

NOT-signed-off-by: DO NOT MERGE
---
 xen/arch/x86/include/asm/page.h      |  4 ++--
 xen/arch/x86/include/asm/processor.h | 10 +++++-----
 xen/arch/x86/mm.c                    |  8 --------
 3 files changed, 7 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/include/asm/page.h b/xen/arch/x86/include/asm/page.h
index b585235d064a567082582c8e92a4e8283fd949ca..ab9b46f1d0901e50a83fd035ff28d1bda0b781a2 100644
--- a/xen/arch/x86/include/asm/page.h
+++ b/xen/arch/x86/include/asm/page.h
@@ -333,11 +333,11 @@ void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t);
 
 /* Memory types, encoded under Xen's choice of MSR_PAT. */
 #define _PAGE_WB         (                                0)
-#define _PAGE_WT         (                        _PAGE_PWT)
+#define _PAGE_WC         (                        _PAGE_PWT)
 #define _PAGE_UCM        (            _PAGE_PCD            )
 #define _PAGE_UC         (            _PAGE_PCD | _PAGE_PWT)
-#define _PAGE_WC         (_PAGE_PAT                        )
 #define _PAGE_WP         (_PAGE_PAT |             _PAGE_PWT)
+#define _PAGE_WT         (_PAGE_PAT | _PAGE_PCD | _PAGE_PWT)
 
 /*
  * Debug option: Ensure that granted mappings are not implicitly unmapped.
diff --git a/xen/arch/x86/include/asm/processor.h b/xen/arch/x86/include/asm/processor.h
index 60b902060914584957db8afa5c7c1e6abdad4d13..3993d5638626f0948bb7ac8192d2eda187eb1bdb 100644
--- a/xen/arch/x86/include/asm/processor.h
+++ b/xen/arch/x86/include/asm/processor.h
@@ -94,16 +94,16 @@
 
 /*
  * Host IA32_CR_PAT value to cover all memory types.  This is not the default
- * MSR_PAT value, and is an ABI with PV guests.
+ * MSR_PAT value, and is needed by the Linux i915 driver.
  */
 #define XEN_MSR_PAT ((_AC(X86_MT_WB,  ULL) << 0x00) | \
-                     (_AC(X86_MT_WT,  ULL) << 0x08) | \
+                     (_AC(X86_MT_WC,  ULL) << 0x08) | \
                      (_AC(X86_MT_UCM, ULL) << 0x10) | \
                      (_AC(X86_MT_UC,  ULL) << 0x18) | \
-                     (_AC(X86_MT_WC,  ULL) << 0x20) | \
+                     (_AC(X86_MT_WB,  ULL) << 0x20) | \
                      (_AC(X86_MT_WP,  ULL) << 0x28) | \
-                     (_AC(X86_MT_UC,  ULL) << 0x30) | \
-                     (_AC(X86_MT_UC,  ULL) << 0x38))
+                     (_AC(X86_MT_UCM, ULL) << 0x30) | \
+                     (_AC(X86_MT_WT,  ULL) << 0x38))
 
 #ifndef __ASSEMBLY__
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 69ce597c7cd5283ae4b5f3bc0a6dfa0bb3228d3d..c536f7807a418c160366c22b6c4f937a5023f14b 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -6387,14 +6387,6 @@ unsigned long get_upper_mfn_bound(void)
  */
 static void __init __maybe_unused build_assertions(void)
 {
-    /*
-     * If this trips, any guest that blindly rely on the public API in xen.h
-     * (instead of reading the PAT from Xen, as Linux 3.19+ does) will be
-     * broken.  Furthermore, live migration of PV guests between Xen versions
-     * using different PATs will not work.
-     */
-    BUILD_BUG_ON(XEN_MSR_PAT != 0x050100070406ULL);
-
     /*
      * _PAGE_WB must be zero for several reasons, not least because Linux
      * assumes it.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 02/10] x86/mm: Avoid hard-coding PAT in get_page_from_l1e()
  2022-12-20  1:07 ` [PATCH v5 02/10] x86/mm: Avoid hard-coding PAT in get_page_from_l1e() Demi Marie Obenour
@ 2022-12-20  8:19   ` Jan Beulich
  2022-12-22  9:51     ` Jan Beulich
  0 siblings, 1 reply; 28+ messages in thread
From: Jan Beulich @ 2022-12-20  8:19 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: Marek Marczykowski-Górecki, Andrew Cooper,
	Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian,
	George Dunlap, Tim Deegan, xen-devel

On 20.12.2022 02:07, Demi Marie Obenour wrote:
> get_page_from_l1e() relied on Xen's choice of PAT, which is brittle in
> the face of future PAT changes.  Instead, compute the actual cacheability
> used by the CPU and switch on that, as this will work no matter what PAT
> Xen uses.
> 
> No functional change intended.  This code is itself questionable and may
> be removed in the future, but removing it would be an observable
> behavior change and so is out of scope for this patch series.
> 
> Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
> ---
> Changes since v4:
> - Do not add new pte_flags_to_cacheability() helper, as this code may be
>   removed in the near future and so adding a new helper for it is a bad
>   idea.
> - Do not BUG() in the event of an unexpected cacheability.  This cannot
>   happen, but it is simpler to force such types to UC than to prove that
>   the BUG() is not reachable.
> 
> Changes since v3:
> - Compute and use the actual cacheability as seen by the processor.
> 
> Changes since v2:
> - Improve commit message.
> ---
>  xen/arch/x86/mm.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index 78b1972e4170cacccc9c37c6e64e76e66a7da87f..dba6c77ef2f7ed7fcb7f7e526583ccadd35e62cc 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -959,14 +959,16 @@ get_page_from_l1e(
>              flip = _PAGE_RW;
>          }
>  
> -        switch ( l1f & PAGE_CACHE_ATTRS )
> +        switch ( 0xFF & (XEN_MSR_PAT >> (8 * pte_flags_to_cacheattr(l1f))) )
>          {
> -        case 0: /* WB */
> -            flip |= _PAGE_PWT | _PAGE_PCD;
> +        case X86_MT_UC:
> +        case X86_MT_UCM:
> +        case X86_MT_WC:
> +            /* not cacheable */
>              break;
> -        case _PAGE_PWT: /* WT */
> -        case _PAGE_PWT | _PAGE_PAT: /* WP */
> -            flip |= _PAGE_PCD | (l1f & _PAGE_PAT);
> +        default:
> +            /* cacheable */
> +            flip |= ((l1f & PAGE_CACHE_ATTRS) ^ _PAGE_UC);
>              break;

In v4 the comment here was "cacheable, force to UC". The latter aspect is
quite relevant (and iirc also what Andrew had asked for to have as a
comment). But with this now being the default case, the comment (in either
this or the earlier form) would become stale. A forward compatible way of
wording this would e.g. be "force any other type to UC". With an adjustment
along these lines (which I think could be done while committing)
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 05/10] x86: Replace EPT_EMT_* constants with X86_MT_*
  2022-12-20  1:07 ` [PATCH v5 05/10] x86: Replace EPT_EMT_* constants with X86_MT_* Demi Marie Obenour
@ 2022-12-20  8:21   ` Jan Beulich
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Beulich @ 2022-12-20  8:21 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: Marek Marczykowski-Górecki, Andrew Cooper,
	Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian,
	George Dunlap, Tim Deegan, xen-devel

On 20.12.2022 02:07, Demi Marie Obenour wrote:
> This allows eliminating the former.  No functional change intended.
> 
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v5 11/10] hvmloader: use memory type constants
  2022-12-20  1:07 [PATCH v5 00/10] Make PAT handling less brittle Demi Marie Obenour
                   ` (9 preceding siblings ...)
  2022-12-20  1:07 ` [PATCH v5 10/10] x86: Use Linux's PAT Demi Marie Obenour
@ 2022-12-20 16:13 ` Jan Beulich
  2022-12-20 16:36   ` Andrew Cooper
  2022-12-20 17:45   ` Demi Marie Obenour
  10 siblings, 2 replies; 28+ messages in thread
From: Jan Beulich @ 2022-12-20 16:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Demi Marie Obenour

Now that we have them available in a header which is okay to use from
hvmloader sources, do away with respective literal numbers and silent
assumptions.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/tools/firmware/hvmloader/cacheattr.c
+++ b/tools/firmware/hvmloader/cacheattr.c
@@ -22,6 +22,8 @@
 #include "util.h"
 #include "config.h"
 
+#include <xen/asm/x86-defns.h>
+
 #define MSR_MTRRphysBase(reg) (0x200 + 2 * (reg))
 #define MSR_MTRRphysMask(reg) (0x200 + 2 * (reg) + 1)
 #define MSR_MTRRcap          0x00fe
@@ -71,23 +73,32 @@ void cacheattr_init(void)
 
     addr_mask = ((1ull << phys_bits) - 1) & ~((1ull << 12) - 1);
     mtrr_cap = rdmsr(MSR_MTRRcap);
-    mtrr_def = (1u << 11) | 6; /* E, default type WB */
+    mtrr_def = (1u << 11) | X86_MT_WB; /* E, default type WB */
 
     /* Fixed-range MTRRs supported? */
     if ( mtrr_cap & (1u << 8) )
     {
+#define BCST2(mt) ((mt) | ((mt) << 8))
+#define BCST4(mt) (BCST2(mt) | (BCST2(mt) << 16))
+#define BCST8(mt) (BCST4(mt) | ((uint64_t)BCST4(mt) << 32))
         /* 0x00000-0x9ffff: Write Back (WB) */
-        content = 0x0606060606060606ull;
+        content = BCST8(X86_MT_WB);
         wrmsr(MSR_MTRRfix64K_00000, content);
         wrmsr(MSR_MTRRfix16K_80000, content);
+
         /* 0xa0000-0xbffff: Write Combining (WC) */
         if ( mtrr_cap & (1u << 10) ) /* WC supported? */
-            content = 0x0101010101010101ull;
+            content = BCST8(X86_MT_WC);
         wrmsr(MSR_MTRRfix16K_A0000, content);
+
         /* 0xc0000-0xfffff: Write Back (WB) */
-        content = 0x0606060606060606ull;
+        content = BCST8(X86_MT_WB);
         for ( i = 0; i < 8; i++ )
             wrmsr(MSR_MTRRfix4K_C0000 + i, content);
+#undef BCST8
+#undef BCST4
+#undef BCST2
+
         mtrr_def |= 1u << 10; /* FE */
         printf("fixed MTRRs ... ");
     }
@@ -106,7 +117,7 @@ void cacheattr_init(void)
             while ( ((base + size) < base) || ((base + size) > pci_mem_end) )
                 size >>= 1;
 
-            wrmsr(MSR_MTRRphysBase(i), base);
+            wrmsr(MSR_MTRRphysBase(i), base | X86_MT_UC);
             wrmsr(MSR_MTRRphysMask(i), (~(size - 1) & addr_mask) | (1u << 11));
 
             base += size;
@@ -121,7 +132,7 @@ void cacheattr_init(void)
             while ( (base + size < base) || (base + size > pci_hi_mem_end) )
                 size >>= 1;
 
-            wrmsr(MSR_MTRRphysBase(i), base);
+            wrmsr(MSR_MTRRphysBase(i), base | X86_MT_UC);
             wrmsr(MSR_MTRRphysMask(i), (~(size - 1) & addr_mask) | (1u << 11));
 
             base += size;



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 11/10] hvmloader: use memory type constants
  2022-12-20 16:13 ` [PATCH v5 11/10] hvmloader: use memory type constants Jan Beulich
@ 2022-12-20 16:36   ` Andrew Cooper
  2022-12-20 16:50     ` Jan Beulich
  2022-12-20 17:45   ` Demi Marie Obenour
  1 sibling, 1 reply; 28+ messages in thread
From: Andrew Cooper @ 2022-12-20 16:36 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Demi Marie Obenour

On 20/12/2022 4:13 pm, Jan Beulich wrote:
> Now that we have them available in a header which is okay to use from
> hvmloader sources, do away with respective literal numbers and silent
> assumptions.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/tools/firmware/hvmloader/cacheattr.c
> +++ b/tools/firmware/hvmloader/cacheattr.c
> @@ -22,6 +22,8 @@
>  #include "util.h"
>  #include "config.h"
>  
> +#include <xen/asm/x86-defns.h>
> +
>  #define MSR_MTRRphysBase(reg) (0x200 + 2 * (reg))
>  #define MSR_MTRRphysMask(reg) (0x200 + 2 * (reg) + 1)
>  #define MSR_MTRRcap          0x00fe
> @@ -71,23 +73,32 @@ void cacheattr_init(void)
>  
>      addr_mask = ((1ull << phys_bits) - 1) & ~((1ull << 12) - 1);

Does this want to be PAGE_SHIFT ?

>      mtrr_cap = rdmsr(MSR_MTRRcap);
> -    mtrr_def = (1u << 11) | 6; /* E, default type WB */
> +    mtrr_def = (1u << 11) | X86_MT_WB; /* E, default type WB */
>  
>      /* Fixed-range MTRRs supported? */
>      if ( mtrr_cap & (1u << 8) )
>      {
> +#define BCST2(mt) ((mt) | ((mt) << 8))
> +#define BCST4(mt) (BCST2(mt) | (BCST2(mt) << 16))
> +#define BCST8(mt) (BCST4(mt) | ((uint64_t)BCST4(mt) << 32))

IMO this is clearer as

#define BCST8(mt) ((mt) * 0x0101010101010101ULL)

which saves several macros.

>          /* 0x00000-0x9ffff: Write Back (WB) */
> -        content = 0x0606060606060606ull;
> +        content = BCST8(X86_MT_WB);
>          wrmsr(MSR_MTRRfix64K_00000, content);
>          wrmsr(MSR_MTRRfix16K_80000, content);
> +
>          /* 0xa0000-0xbffff: Write Combining (WC) */
>          if ( mtrr_cap & (1u << 10) ) /* WC supported? */
> -            content = 0x0101010101010101ull;
> +            content = BCST8(X86_MT_WC);
>          wrmsr(MSR_MTRRfix16K_A0000, content);

This looks like it's latently buggy.  We'll end up with WB if WC isn't
supported, when it ought to be UC.

I suppose it doesn't actually matter because if there is a VGA region
there, it will actually be holes in the P2M and trap for emulation.

Also, Xen being 64-bit, WC is always available.

> +
>          /* 0xc0000-0xfffff: Write Back (WB) */
> -        content = 0x0606060606060606ull;
> +        content = BCST8(X86_MT_WB);
>          for ( i = 0; i < 8; i++ )
>              wrmsr(MSR_MTRRfix4K_C0000 + i, content);
> +#undef BCST8
> +#undef BCST4
> +#undef BCST2
> +
>          mtrr_def |= 1u << 10; /* FE */
>          printf("fixed MTRRs ... ");
>      }
> @@ -106,7 +117,7 @@ void cacheattr_init(void)
>              while ( ((base + size) < base) || ((base + size) > pci_mem_end) )
>                  size >>= 1;
>  
> -            wrmsr(MSR_MTRRphysBase(i), base);
> +            wrmsr(MSR_MTRRphysBase(i), base | X86_MT_UC);
>              wrmsr(MSR_MTRRphysMask(i), (~(size - 1) & addr_mask) | (1u << 11));

If you're re-spinning for page_size or the macros, any chance of also
gaining some constants for E/FE to avoid these naked shifts?

~Andrew

>  
>              base += size;
> @@ -121,7 +132,7 @@ void cacheattr_init(void)
>              while ( (base + size < base) || (base + size > pci_hi_mem_end) )
>                  size >>= 1;
>  
> -            wrmsr(MSR_MTRRphysBase(i), base);
> +            wrmsr(MSR_MTRRphysBase(i), base | X86_MT_UC);
>              wrmsr(MSR_MTRRphysMask(i), (~(size - 1) & addr_mask) | (1u << 11));
>  
>              base += size;
>
>



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 11/10] hvmloader: use memory type constants
  2022-12-20 16:36   ` Andrew Cooper
@ 2022-12-20 16:50     ` Jan Beulich
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Beulich @ 2022-12-20 16:50 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Demi Marie Obenour,
	xen-devel

On 20.12.2022 17:36, Andrew Cooper wrote:
> On 20/12/2022 4:13 pm, Jan Beulich wrote:
>> --- a/tools/firmware/hvmloader/cacheattr.c
>> +++ b/tools/firmware/hvmloader/cacheattr.c
>> @@ -22,6 +22,8 @@
>>  #include "util.h"
>>  #include "config.h"
>>  
>> +#include <xen/asm/x86-defns.h>
>> +
>>  #define MSR_MTRRphysBase(reg) (0x200 + 2 * (reg))
>>  #define MSR_MTRRphysMask(reg) (0x200 + 2 * (reg) + 1)
>>  #define MSR_MTRRcap          0x00fe
>> @@ -71,23 +73,32 @@ void cacheattr_init(void)
>>  
>>      addr_mask = ((1ull << phys_bits) - 1) & ~((1ull << 12) - 1);
> 
> Does this want to be PAGE_SHIFT ?

Not really (it's rather an MTRR layout related constant which we ought
to use here, which only happens to match PAGE_{SIZE,SHIFT}) and not in
this patch (see the title).

>>      mtrr_cap = rdmsr(MSR_MTRRcap);
>> -    mtrr_def = (1u << 11) | 6; /* E, default type WB */
>> +    mtrr_def = (1u << 11) | X86_MT_WB; /* E, default type WB */
>>  
>>      /* Fixed-range MTRRs supported? */
>>      if ( mtrr_cap & (1u << 8) )
>>      {
>> +#define BCST2(mt) ((mt) | ((mt) << 8))
>> +#define BCST4(mt) (BCST2(mt) | (BCST2(mt) << 16))
>> +#define BCST8(mt) (BCST4(mt) | ((uint64_t)BCST4(mt) << 32))
> 
> IMO this is clearer as
> 
> #define BCST8(mt) ((mt) * 0x0101010101010101ULL)
> 
> which saves several macros.

Ah, yes, will switch (and then name the thing just BCST()).

>>          /* 0x00000-0x9ffff: Write Back (WB) */
>> -        content = 0x0606060606060606ull;
>> +        content = BCST8(X86_MT_WB);
>>          wrmsr(MSR_MTRRfix64K_00000, content);
>>          wrmsr(MSR_MTRRfix16K_80000, content);
>> +
>>          /* 0xa0000-0xbffff: Write Combining (WC) */
>>          if ( mtrr_cap & (1u << 10) ) /* WC supported? */
>> -            content = 0x0101010101010101ull;
>> +            content = BCST8(X86_MT_WC);
>>          wrmsr(MSR_MTRRfix16K_A0000, content);
> 
> This looks like it's latently buggy.  We'll end up with WB if WC isn't
> supported, when it ought to be UC.
> 
> I suppose it doesn't actually matter because if there is a VGA region
> there, it will actually be holes in the P2M and trap for emulation.
> 
> Also, Xen being 64-bit, WC is always available.

Right, but we (or the admin) may elect to not surface availability to
the guest.

>> @@ -106,7 +117,7 @@ void cacheattr_init(void)
>>              while ( ((base + size) < base) || ((base + size) > pci_mem_end) )
>>                  size >>= 1;
>>  
>> -            wrmsr(MSR_MTRRphysBase(i), base);
>> +            wrmsr(MSR_MTRRphysBase(i), base | X86_MT_UC);
>>              wrmsr(MSR_MTRRphysMask(i), (~(size - 1) & addr_mask) | (1u << 11));
> 
> If you're re-spinning for page_size or the macros, any chance of also
> gaining some constants for E/FE to avoid these naked shifts?

Again, yes in principle, but not in this patch. This requires shuffling
more stuff into x86-defns.h first.

Jan


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 11/10] hvmloader: use memory type constants
  2022-12-20 16:13 ` [PATCH v5 11/10] hvmloader: use memory type constants Jan Beulich
  2022-12-20 16:36   ` Andrew Cooper
@ 2022-12-20 17:45   ` Demi Marie Obenour
  2022-12-21  6:56     ` Jan Beulich
  1 sibling, 1 reply; 28+ messages in thread
From: Demi Marie Obenour @ 2022-12-20 17:45 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu

[-- Attachment #1: Type: text/plain, Size: 3295 bytes --]

On Tue, Dec 20, 2022 at 05:13:04PM +0100, Jan Beulich wrote:
> Now that we have them available in a header which is okay to use from
> hvmloader sources, do away with respective literal numbers and silent
> assumptions.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/tools/firmware/hvmloader/cacheattr.c
> +++ b/tools/firmware/hvmloader/cacheattr.c
> @@ -22,6 +22,8 @@
>  #include "util.h"
>  #include "config.h"
>  
> +#include <xen/asm/x86-defns.h>
> +
>  #define MSR_MTRRphysBase(reg) (0x200 + 2 * (reg))
>  #define MSR_MTRRphysMask(reg) (0x200 + 2 * (reg) + 1)
>  #define MSR_MTRRcap          0x00fe
> @@ -71,23 +73,32 @@ void cacheattr_init(void)
>  
>      addr_mask = ((1ull << phys_bits) - 1) & ~((1ull << 12) - 1);
>      mtrr_cap = rdmsr(MSR_MTRRcap);
> -    mtrr_def = (1u << 11) | 6; /* E, default type WB */
> +    mtrr_def = (1u << 11) | X86_MT_WB; /* E, default type WB */
>  
>      /* Fixed-range MTRRs supported? */
>      if ( mtrr_cap & (1u << 8) )
>      {
> +#define BCST2(mt) ((mt) | ((mt) << 8))
> +#define BCST4(mt) (BCST2(mt) | (BCST2(mt) << 16))

This should include a cast to uint32_t, just like BCST8 includes a cast
to uint64_t.  It doesn’t have any functional impact since none of the
memory types have the high bit set, but it makes the correctness of the
code much more obvious to readers.

> +#define BCST8(mt) (BCST4(mt) | ((uint64_t)BCST4(mt) << 32))
>          /* 0x00000-0x9ffff: Write Back (WB) */
> -        content = 0x0606060606060606ull;
> +        content = BCST8(X86_MT_WB);
>          wrmsr(MSR_MTRRfix64K_00000, content);
>          wrmsr(MSR_MTRRfix16K_80000, content);
> +
>          /* 0xa0000-0xbffff: Write Combining (WC) */
>          if ( mtrr_cap & (1u << 10) ) /* WC supported? */
> -            content = 0x0101010101010101ull;
> +            content = BCST8(X86_MT_WC);
>          wrmsr(MSR_MTRRfix16K_A0000, content);
> +
>          /* 0xc0000-0xfffff: Write Back (WB) */
> -        content = 0x0606060606060606ull;
> +        content = BCST8(X86_MT_WB);
>          for ( i = 0; i < 8; i++ )
>              wrmsr(MSR_MTRRfix4K_C0000 + i, content);
> +#undef BCST8
> +#undef BCST4
> +#undef BCST2
> +
>          mtrr_def |= 1u << 10; /* FE */
>          printf("fixed MTRRs ... ");
>      }
> @@ -106,7 +117,7 @@ void cacheattr_init(void)
>              while ( ((base + size) < base) || ((base + size) > pci_mem_end) )
>                  size >>= 1;
>  
> -            wrmsr(MSR_MTRRphysBase(i), base);
> +            wrmsr(MSR_MTRRphysBase(i), base | X86_MT_UC);
>              wrmsr(MSR_MTRRphysMask(i), (~(size - 1) & addr_mask) | (1u << 11));
>  
>              base += size;
> @@ -121,7 +132,7 @@ void cacheattr_init(void)
>              while ( (base + size < base) || (base + size > pci_hi_mem_end) )
>                  size >>= 1;
>  
> -            wrmsr(MSR_MTRRphysBase(i), base);
> +            wrmsr(MSR_MTRRphysBase(i), base | X86_MT_UC);
>              wrmsr(MSR_MTRRphysMask(i), (~(size - 1) & addr_mask) | (1u << 11));
>  
>              base += size;
> 

With the suggested change:

Acked-by: Demi Marie Obenour <demi@invisiblethingslab.com>
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 11/10] hvmloader: use memory type constants
  2022-12-20 17:45   ` Demi Marie Obenour
@ 2022-12-21  6:56     ` Jan Beulich
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Beulich @ 2022-12-21  6:56 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On 20.12.2022 18:45, Demi Marie Obenour wrote:
> On Tue, Dec 20, 2022 at 05:13:04PM +0100, Jan Beulich wrote:
>> --- a/tools/firmware/hvmloader/cacheattr.c
>> +++ b/tools/firmware/hvmloader/cacheattr.c
>> @@ -22,6 +22,8 @@
>>  #include "util.h"
>>  #include "config.h"
>>  
>> +#include <xen/asm/x86-defns.h>
>> +
>>  #define MSR_MTRRphysBase(reg) (0x200 + 2 * (reg))
>>  #define MSR_MTRRphysMask(reg) (0x200 + 2 * (reg) + 1)
>>  #define MSR_MTRRcap          0x00fe
>> @@ -71,23 +73,32 @@ void cacheattr_init(void)
>>  
>>      addr_mask = ((1ull << phys_bits) - 1) & ~((1ull << 12) - 1);
>>      mtrr_cap = rdmsr(MSR_MTRRcap);
>> -    mtrr_def = (1u << 11) | 6; /* E, default type WB */
>> +    mtrr_def = (1u << 11) | X86_MT_WB; /* E, default type WB */
>>  
>>      /* Fixed-range MTRRs supported? */
>>      if ( mtrr_cap & (1u << 8) )
>>      {
>> +#define BCST2(mt) ((mt) | ((mt) << 8))
>> +#define BCST4(mt) (BCST2(mt) | (BCST2(mt) << 16))
> 
> This should include a cast to uint32_t, just like BCST8 includes a cast
> to uint64_t.  It doesn’t have any functional impact since none of the
> memory types have the high bit set, but it makes the correctness of the
> code much more obvious to readers.

I've already followed Andrew's suggestion.

> With the suggested change:
> 
> Acked-by: Demi Marie Obenour <demi@invisiblethingslab.com>

Thanks. Since I've addressed this differently, I'll hold back applying
of this.

From a formal perspective I'd also like to point out that an Acked-by:
from a person not being maintainer of any code being changed by a patch
has no meaning at all. Only Reviewed-by: has a meaning (but of course
implies more thorough looking at the change than Acked-by: does).

Jan


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 08/10] x86/mm: make code robust to future PAT changes
  2022-12-20  1:07 ` [PATCH v5 08/10] x86/mm: make code robust to future PAT changes Demi Marie Obenour
@ 2022-12-22  9:35   ` Jan Beulich
  2022-12-22  9:50     ` Demi Marie Obenour
  0 siblings, 1 reply; 28+ messages in thread
From: Jan Beulich @ 2022-12-22  9:35 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: Marek Marczykowski-Górecki, Andrew Cooper,
	Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian,
	George Dunlap, Tim Deegan, xen-devel

On 20.12.2022 02:07, Demi Marie Obenour wrote:
> It may be desirable to change Xen's PAT for various reasons.  This
> requires changes to several _PAGE_* macros as well.  Add static
> assertions to check that XEN_MSR_PAT is consistent with the _PAGE_*
> macros, and that _PAGE_WB is 0 as required by Linux.

In line with the code comment, perhaps add (not just)"?

> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -6352,6 +6352,11 @@ unsigned long get_upper_mfn_bound(void)
>      return min(max_mfn, 1UL << (paddr_bits - PAGE_SHIFT)) - 1;
>  }
>  
> +
> +/*
> + * A bunch of static assertions to check that the XEN_MSR_PAT is valid
> + * and consistent with the _PAGE_* macros, and that _PAGE_WB is zero.
> + */
>  static void __init __maybe_unused build_assertions(void)
>  {
>      /*
> @@ -6361,6 +6366,72 @@ static void __init __maybe_unused build_assertions(void)
>       * using different PATs will not work.
>       */
>      BUILD_BUG_ON(XEN_MSR_PAT != 0x050100070406ULL);
> +
> +    /*
> +     * _PAGE_WB must be zero for several reasons, not least because Linux
> +     * assumes it.
> +     */
> +    BUILD_BUG_ON(_PAGE_WB);

Strictly speaking this is a requirement only for PV guests. We may not
want to go as far as putting "#ifdef CONFIG_PV" around it, but at least
the code comment (and then also the part of the description referring
to this) will imo want to say so.

> +    /* A macro to convert from cache attributes to actual cacheability */
> +#define PAT_ENTRY(v) (0xFF & (XEN_MSR_PAT >> (8 * (v))))

I don't think the comment is appropriate here. All you do is extract a
slot from the hard-coded PAT value we use.

> +    /* Validate at compile-time that v is a valid value for a PAT entry */
> +#define CHECK_PAT_ENTRY_VALUE(v)                                               \
> +    BUILD_BUG_ON((v) < 0 || (v) > 7 ||                                         \

PAT_ENTRY() won't produce negative values, so I don't think "(v) < 0" is
really needed here. And the "(v) > 7" will imo want replacing by
"(v) >= X86_NUM_MT".

> +                 (v) == X86_MT_RSVD_2 || (v) == X86_MT_RSVD_3)
> +
> +    /* Validate at compile-time that PAT entry v is valid */
> +#define CHECK_PAT_ENTRY(v) do {                                                \
> +    BUILD_BUG_ON((v) < 0 || (v) > 7);                                          \

I think this would better be part of PAT_ENTRY(), as the validity of the
shift there depends on it. If this check is needed in the first place,
seeing that the macro is #undef-ed right after use below, and hence the
checks only cover the eight invocations of the macro right here.

> +    CHECK_PAT_ENTRY_VALUE(PAT_ENTRY(v));                                       \
> +} while (0);

Nit (style): Missing blanks around 0 and perhaps better nowadays to use
"false" in such constructs. (Because of other comments this may go away
here anyway, but there's another similar instance below).

> +    /*
> +     * If one of these trips, the corresponding entry in XEN_MSR_PAT is invalid.
> +     * This would cause Xen to crash (with #GP) at startup.
> +     */
> +    CHECK_PAT_ENTRY(0);
> +    CHECK_PAT_ENTRY(1);
> +    CHECK_PAT_ENTRY(2);
> +    CHECK_PAT_ENTRY(3);
> +    CHECK_PAT_ENTRY(4);
> +    CHECK_PAT_ENTRY(5);
> +    CHECK_PAT_ENTRY(6);
> +    CHECK_PAT_ENTRY(7);
> +
> +#undef CHECK_PAT_ENTRY
> +#undef CHECK_PAT_ENTRY_VALUE
> +
> +    /* Macro version of page_flags_to_cacheattr(), for use in BUILD_BUG_ON()s */

DYM pte_flags_to_cacheattr()? At which point the macro name wants to
match and its parameter may also better be named "pte_value"?

> +#define PAGE_FLAGS_TO_CACHEATTR(page_value)                                    \
> +    ((((page_value) >> 5) & 4) | (((page_value) >> 3) & 3))

Hmm, yet more uses of magic literal numbers.

> +    /* Check that a PAT-related _PAGE_* macro is correct */
> +#define CHECK_PAGE_VALUE(page_value) do {                                      \
> +    /* Check that the _PAGE_* macros only use bits from PAGE_CACHE_ATTRS */    \
> +    BUILD_BUG_ON(((_PAGE_##page_value) & PAGE_CACHE_ATTRS) !=                  \
> +                  (_PAGE_##page_value));                                       \

Nit (style): One too many blanks used for indentation.

> +    /* Check that the _PAGE_* are consistent with XEN_MSR_PAT */               \
> +    BUILD_BUG_ON(PAT_ENTRY(PAGE_FLAGS_TO_CACHEATTR(_PAGE_##page_value)) !=     \
> +                 (X86_MT_##page_value));                                       \

Nit (style): Nowadays we tend to consider ## a binary operator just like
e.g. +, and hence it wants to be surrounded by blanks.

> +} while (0)
> +
> +    /*
> +     * If one of these trips, the corresponding _PAGE_* macro is inconsistent
> +     * with XEN_MSR_PAT.  This would cause Xen to use incorrect cacheability
> +     * flags, with results that are undefined and probably harmful.

Why "undefined"? They may be wrong / broken, but the result is still well-
defined afaict.

Jan


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 09/10] x86/mm: Reject invalid cacheability in PV guests by default
  2022-12-20  1:07 ` [PATCH v5 09/10] x86/mm: Reject invalid cacheability in PV guests by default Demi Marie Obenour
@ 2022-12-22  9:48   ` Jan Beulich
  2022-12-22  9:55     ` Demi Marie Obenour
  0 siblings, 1 reply; 28+ messages in thread
From: Jan Beulich @ 2022-12-22  9:48 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: Marek Marczykowski-Górecki, Andrew Cooper,
	Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian,
	George Dunlap, Tim Deegan, xen-devel

On 20.12.2022 02:07, Demi Marie Obenour wrote:
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -145,6 +145,8 @@
>  
>  #ifdef CONFIG_PV
>  #include "pv/mm.h"
> +bool allow_invalid_cacheability;

static and __ro_after_init please (the former not the least with Misra
in mind).

> +boolean_param("allow_invalid_cacheability", allow_invalid_cacheability);
>  #endif

Any new command line option will need to come with an entry in
docs/misc/xen-command-line.pandoc. Furthermore we're trying to avoid
underscores in command line option names, when dashes serve the
purpose quite fine.

Also please add a blank line at least between #include and your
addition. Personally I would find it more natural if such a single-use
control was defined next to the place it is used, i.e. 

> @@ -1343,7 +1345,33 @@ static int promote_l1_table(struct page_info *page)
>          }
>          else
>          {

... here.

> -            switch ( ret = get_page_from_l1e(pl1e[i], d, d) )
> +            l1_pgentry_t l1e = pl1e[i];
> +
> +            if ( !allow_invalid_cacheability )
> +            {
> +                switch ( l1e.l1 & PAGE_CACHE_ATTRS )
> +                {
> +                case _PAGE_WB:
> +                case _PAGE_UC:
> +                case _PAGE_UCM:
> +                case _PAGE_WC:
> +                case _PAGE_WT:
> +                case _PAGE_WP:
> +                    break;
> +                default:

Nit (style): Blank line please between non-fall-through case blocks.

> +                    /*
> +                     * If we get here, a PV guest tried to use one of the
> +                     * reserved values in Xen's PAT.  This indicates a bug in
> +                     * the guest, so inject #GP to cause the guest to log a
> +                     * stack trace.
> +                     */

And likely make it die. Which, yes, ...

> +                    pv_inject_hw_exception(TRAP_gp_fault, 0);
> +                    ret = -EINVAL;

... also may or may not be the result of returning failure here (if the
guest "checks" for errors by using a BUG()-like construct), but that's
then more of its own fault than not coping with a non-architectural #GP.

Also wasn't there talk of having this behavior in debug hypervisors
only? Without that restriction I'm yet less happy with the change ...

Jan


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 08/10] x86/mm: make code robust to future PAT changes
  2022-12-22  9:35   ` Jan Beulich
@ 2022-12-22  9:50     ` Demi Marie Obenour
  2022-12-22  9:54       ` Jan Beulich
  0 siblings, 1 reply; 28+ messages in thread
From: Demi Marie Obenour @ 2022-12-22  9:50 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Marek Marczykowski-Górecki, Andrew Cooper,
	Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian,
	George Dunlap, Tim Deegan, xen-devel

[-- Attachment #1: Type: text/plain, Size: 6030 bytes --]

On Thu, Dec 22, 2022 at 10:35:08AM +0100, Jan Beulich wrote:
> On 20.12.2022 02:07, Demi Marie Obenour wrote:
> > It may be desirable to change Xen's PAT for various reasons.  This
> > requires changes to several _PAGE_* macros as well.  Add static
> > assertions to check that XEN_MSR_PAT is consistent with the _PAGE_*
> > macros, and that _PAGE_WB is 0 as required by Linux.
> 
> In line with the code comment, perhaps add (not just)"?

Will reword in v6.

> > --- a/xen/arch/x86/mm.c
> > +++ b/xen/arch/x86/mm.c
> > @@ -6352,6 +6352,11 @@ unsigned long get_upper_mfn_bound(void)
> >      return min(max_mfn, 1UL << (paddr_bits - PAGE_SHIFT)) - 1;
> >  }
> >  
> > +
> > +/*
> > + * A bunch of static assertions to check that the XEN_MSR_PAT is valid
> > + * and consistent with the _PAGE_* macros, and that _PAGE_WB is zero.
> > + */
> >  static void __init __maybe_unused build_assertions(void)
> >  {
> >      /*
> > @@ -6361,6 +6366,72 @@ static void __init __maybe_unused build_assertions(void)
> >       * using different PATs will not work.
> >       */
> >      BUILD_BUG_ON(XEN_MSR_PAT != 0x050100070406ULL);
> > +
> > +    /*
> > +     * _PAGE_WB must be zero for several reasons, not least because Linux
> > +     * assumes it.
> > +     */
> > +    BUILD_BUG_ON(_PAGE_WB);
> 
> Strictly speaking this is a requirement only for PV guests. We may not
> want to go as far as putting "#ifdef CONFIG_PV" around it, but at least
> the code comment (and then also the part of the description referring
> to this) will imo want to say so.

Does Xen itself depend on this?

> > +    /* A macro to convert from cache attributes to actual cacheability */
> > +#define PAT_ENTRY(v) (0xFF & (XEN_MSR_PAT >> (8 * (v))))
> 
> I don't think the comment is appropriate here. All you do is extract a
> slot from the hard-coded PAT value we use.

Will drop in v6.

> > +    /* Validate at compile-time that v is a valid value for a PAT entry */
> > +#define CHECK_PAT_ENTRY_VALUE(v)                                               \
> > +    BUILD_BUG_ON((v) < 0 || (v) > 7 ||                                         \
> 
> PAT_ENTRY() won't produce negative values, so I don't think "(v) < 0" is
> really needed here. And the "(v) > 7" will imo want replacing by
> "(v) >= X86_NUM_MT".

Will fix in v6.

> > +                 (v) == X86_MT_RSVD_2 || (v) == X86_MT_RSVD_3)
> > +
> > +    /* Validate at compile-time that PAT entry v is valid */
> > +#define CHECK_PAT_ENTRY(v) do {                                                \
> > +    BUILD_BUG_ON((v) < 0 || (v) > 7);                                          \
> 
> I think this would better be part of PAT_ENTRY(), as the validity of the
> shift there depends on it. If this check is needed in the first place,
> seeing that the macro is #undef-ed right after use below, and hence the
> checks only cover the eight invocations of the macro right here.
> 
> > +    CHECK_PAT_ENTRY_VALUE(PAT_ENTRY(v));                                       \
> > +} while (0);
> 
> Nit (style): Missing blanks around 0 and perhaps better nowadays to use
> "false" in such constructs. (Because of other comments this may go away
> here anyway, but there's another similar instance below).

Will fix in v6.

> > +    /*
> > +     * If one of these trips, the corresponding entry in XEN_MSR_PAT is invalid.
> > +     * This would cause Xen to crash (with #GP) at startup.
> > +     */
> > +    CHECK_PAT_ENTRY(0);
> > +    CHECK_PAT_ENTRY(1);
> > +    CHECK_PAT_ENTRY(2);
> > +    CHECK_PAT_ENTRY(3);
> > +    CHECK_PAT_ENTRY(4);
> > +    CHECK_PAT_ENTRY(5);
> > +    CHECK_PAT_ENTRY(6);
> > +    CHECK_PAT_ENTRY(7);
> > +
> > +#undef CHECK_PAT_ENTRY
> > +#undef CHECK_PAT_ENTRY_VALUE
> > +
> > +    /* Macro version of page_flags_to_cacheattr(), for use in BUILD_BUG_ON()s */
> 
> DYM pte_flags_to_cacheattr()? At which point the macro name wants to
> match and its parameter may also better be named "pte_value"?

Indeed so.

> > +#define PAGE_FLAGS_TO_CACHEATTR(page_value)                                    \
> > +    ((((page_value) >> 5) & 4) | (((page_value) >> 3) & 3))
> 
> Hmm, yet more uses of magic literal numbers.

I wanted to keep the definition as close to that of
pte_flags_to_cacheattr() as possible.

> > +    /* Check that a PAT-related _PAGE_* macro is correct */
> > +#define CHECK_PAGE_VALUE(page_value) do {                                      \
> > +    /* Check that the _PAGE_* macros only use bits from PAGE_CACHE_ATTRS */    \
> > +    BUILD_BUG_ON(((_PAGE_##page_value) & PAGE_CACHE_ATTRS) !=                  \
> > +                  (_PAGE_##page_value));                                       \
> 
> Nit (style): One too many blanks used for indentation.

Will fix in v6.

> > +    /* Check that the _PAGE_* are consistent with XEN_MSR_PAT */               \
> > +    BUILD_BUG_ON(PAT_ENTRY(PAGE_FLAGS_TO_CACHEATTR(_PAGE_##page_value)) !=     \
> > +                 (X86_MT_##page_value));                                       \
> 
> Nit (style): Nowadays we tend to consider ## a binary operator just like
> e.g. +, and hence it wants to be surrounded by blanks.

Will fix in v6.

> > +} while (0)
> > +
> > +    /*
> > +     * If one of these trips, the corresponding _PAGE_* macro is inconsistent
> > +     * with XEN_MSR_PAT.  This would cause Xen to use incorrect cacheability
> > +     * flags, with results that are undefined and probably harmful.
> 
> Why "undefined"? They may be wrong / broken, but the result is still well-
> defined afaict.

“undefined” is meant as “one has violated a core assumption that
higher-level stuff depends on, so things can go arbitrarily wrong,
including e.g. corrupting memory or data”.  Is this accurate?  Should I
drop the dependent clause, or do you have a suggestion for something
better?
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 02/10] x86/mm: Avoid hard-coding PAT in get_page_from_l1e()
  2022-12-20  8:19   ` Jan Beulich
@ 2022-12-22  9:51     ` Jan Beulich
  2022-12-22  9:58       ` Demi Marie Obenour
  0 siblings, 1 reply; 28+ messages in thread
From: Jan Beulich @ 2022-12-22  9:51 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: Marek Marczykowski-Górecki, Andrew Cooper,
	Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian,
	George Dunlap, Tim Deegan, xen-devel

On 20.12.2022 09:19, Jan Beulich wrote:
> On 20.12.2022 02:07, Demi Marie Obenour wrote:
>> get_page_from_l1e() relied on Xen's choice of PAT, which is brittle in
>> the face of future PAT changes.  Instead, compute the actual cacheability
>> used by the CPU and switch on that, as this will work no matter what PAT
>> Xen uses.
>>
>> No functional change intended.  This code is itself questionable and may
>> be removed in the future, but removing it would be an observable
>> behavior change and so is out of scope for this patch series.
>>
>> Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
>> ---
>> Changes since v4:
>> - Do not add new pte_flags_to_cacheability() helper, as this code may be
>>   removed in the near future and so adding a new helper for it is a bad
>>   idea.
>> - Do not BUG() in the event of an unexpected cacheability.  This cannot
>>   happen, but it is simpler to force such types to UC than to prove that
>>   the BUG() is not reachable.
>>
>> Changes since v3:
>> - Compute and use the actual cacheability as seen by the processor.
>>
>> Changes since v2:
>> - Improve commit message.
>> ---
>>  xen/arch/x86/mm.c | 14 ++++++++------
>>  1 file changed, 8 insertions(+), 6 deletions(-)
>>
>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
>> index 78b1972e4170cacccc9c37c6e64e76e66a7da87f..dba6c77ef2f7ed7fcb7f7e526583ccadd35e62cc 100644
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -959,14 +959,16 @@ get_page_from_l1e(
>>              flip = _PAGE_RW;
>>          }
>>  
>> -        switch ( l1f & PAGE_CACHE_ATTRS )
>> +        switch ( 0xFF & (XEN_MSR_PAT >> (8 * pte_flags_to_cacheattr(l1f))) )
>>          {
>> -        case 0: /* WB */
>> -            flip |= _PAGE_PWT | _PAGE_PCD;
>> +        case X86_MT_UC:
>> +        case X86_MT_UCM:
>> +        case X86_MT_WC:
>> +            /* not cacheable */
>>              break;
>> -        case _PAGE_PWT: /* WT */
>> -        case _PAGE_PWT | _PAGE_PAT: /* WP */
>> -            flip |= _PAGE_PCD | (l1f & _PAGE_PAT);
>> +        default:
>> +            /* cacheable */
>> +            flip |= ((l1f & PAGE_CACHE_ATTRS) ^ _PAGE_UC);
>>              break;
> 
> In v4 the comment here was "cacheable, force to UC". The latter aspect is
> quite relevant (and iirc also what Andrew had asked for to have as a
> comment). But with this now being the default case, the comment (in either
> this or the earlier form) would become stale. A forward compatible way of
> wording this would e.g. be "force any other type to UC". With an adjustment
> along these lines (which I think could be done while committing)
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

If you had replied signaling your consent (and perhaps the preferred by you
wording), I would have committed this. Now it's going to be v6 afaic ...

Jan


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 08/10] x86/mm: make code robust to future PAT changes
  2022-12-22  9:50     ` Demi Marie Obenour
@ 2022-12-22  9:54       ` Jan Beulich
  2022-12-22 10:00         ` Demi Marie Obenour
  0 siblings, 1 reply; 28+ messages in thread
From: Jan Beulich @ 2022-12-22  9:54 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: Marek Marczykowski-Górecki, Andrew Cooper,
	Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian,
	George Dunlap, Tim Deegan, xen-devel

On 22.12.2022 10:50, Demi Marie Obenour wrote:
> On Thu, Dec 22, 2022 at 10:35:08AM +0100, Jan Beulich wrote:
>> On 20.12.2022 02:07, Demi Marie Obenour wrote:
>>> --- a/xen/arch/x86/mm.c
>>> +++ b/xen/arch/x86/mm.c
>>> @@ -6352,6 +6352,11 @@ unsigned long get_upper_mfn_bound(void)
>>>      return min(max_mfn, 1UL << (paddr_bits - PAGE_SHIFT)) - 1;
>>>  }
>>>  
>>> +
>>> +/*
>>> + * A bunch of static assertions to check that the XEN_MSR_PAT is valid
>>> + * and consistent with the _PAGE_* macros, and that _PAGE_WB is zero.
>>> + */
>>>  static void __init __maybe_unused build_assertions(void)
>>>  {
>>>      /*
>>> @@ -6361,6 +6366,72 @@ static void __init __maybe_unused build_assertions(void)
>>>       * using different PATs will not work.
>>>       */
>>>      BUILD_BUG_ON(XEN_MSR_PAT != 0x050100070406ULL);
>>> +
>>> +    /*
>>> +     * _PAGE_WB must be zero for several reasons, not least because Linux
>>> +     * assumes it.
>>> +     */
>>> +    BUILD_BUG_ON(_PAGE_WB);
>>
>> Strictly speaking this is a requirement only for PV guests. We may not
>> want to go as far as putting "#ifdef CONFIG_PV" around it, but at least
>> the code comment (and then also the part of the description referring
>> to this) will imo want to say so.
> 
> Does Xen itself depend on this?

With the wording in the description I was going from the assumption that
you have audited code and found that we properly use _PAGE_* constants
everywhere.

>>> +} while (0)
>>> +
>>> +    /*
>>> +     * If one of these trips, the corresponding _PAGE_* macro is inconsistent
>>> +     * with XEN_MSR_PAT.  This would cause Xen to use incorrect cacheability
>>> +     * flags, with results that are undefined and probably harmful.
>>
>> Why "undefined"? They may be wrong / broken, but the result is still well-
>> defined afaict.
> 
> “undefined” is meant as “one has violated a core assumption that
> higher-level stuff depends on, so things can go arbitrarily wrong,
> including e.g. corrupting memory or data”.  Is this accurate?  Should I
> drop the dependent clause, or do you have a suggestion for something
> better?

s/undefined/unknown/ ?

Jan


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 09/10] x86/mm: Reject invalid cacheability in PV guests by default
  2022-12-22  9:48   ` Jan Beulich
@ 2022-12-22  9:55     ` Demi Marie Obenour
  2022-12-22 10:06       ` Jan Beulich
  0 siblings, 1 reply; 28+ messages in thread
From: Demi Marie Obenour @ 2022-12-22  9:55 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Marek Marczykowski-Górecki, Andrew Cooper,
	Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian,
	George Dunlap, Tim Deegan, xen-devel

[-- Attachment #1: Type: text/plain, Size: 3092 bytes --]

On Thu, Dec 22, 2022 at 10:48:02AM +0100, Jan Beulich wrote:
> On 20.12.2022 02:07, Demi Marie Obenour wrote:
> > --- a/xen/arch/x86/mm.c
> > +++ b/xen/arch/x86/mm.c
> > @@ -145,6 +145,8 @@
> >  
> >  #ifdef CONFIG_PV
> >  #include "pv/mm.h"
> > +bool allow_invalid_cacheability;
> 
> static and __ro_after_init please (the former not the least with Misra
> in mind).

Will fix in v6.

> > +boolean_param("allow_invalid_cacheability", allow_invalid_cacheability);
> >  #endif
> 
> Any new command line option will need to come with an entry in
> docs/misc/xen-command-line.pandoc. Furthermore we're trying to avoid
> underscores in command line option names, when dashes serve the
> purpose quite fine.

Will fix in v6.

> Also please add a blank line at least between #include and your
> addition. Personally I would find it more natural if such a single-use
> control was defined next to the place it is used, i.e. 

Will fix in v6.

> > @@ -1343,7 +1345,33 @@ static int promote_l1_table(struct page_info *page)
> >          }
> >          else
> >          {
> 
> ... here.

Will move in v6.

> > -            switch ( ret = get_page_from_l1e(pl1e[i], d, d) )
> > +            l1_pgentry_t l1e = pl1e[i];
> > +
> > +            if ( !allow_invalid_cacheability )
> > +            {
> > +                switch ( l1e.l1 & PAGE_CACHE_ATTRS )
> > +                {
> > +                case _PAGE_WB:
> > +                case _PAGE_UC:
> > +                case _PAGE_UCM:
> > +                case _PAGE_WC:
> > +                case _PAGE_WT:
> > +                case _PAGE_WP:
> > +                    break;
> > +                default:
> 
> Nit (style): Blank line please between non-fall-through case blocks.

Will fix in v6.

> > +                    /*
> > +                     * If we get here, a PV guest tried to use one of the
> > +                     * reserved values in Xen's PAT.  This indicates a bug in
> > +                     * the guest, so inject #GP to cause the guest to log a
> > +                     * stack trace.
> > +                     */
> 
> And likely make it die. Which, yes, ...
> 
> > +                    pv_inject_hw_exception(TRAP_gp_fault, 0);
> > +                    ret = -EINVAL;
> 
> ... also may or may not be the result of returning failure here (if the
> guest "checks" for errors by using a BUG()-like construct), but that's
> then more of its own fault than not coping with a non-architectural #GP.

Is there really any architectural behavior here?

> Also wasn't there talk of having this behavior in debug hypervisors
> only? Without that restriction I'm yet less happy with the change ...

My understanding was that Andrew preferred the behavior to be turned on
in release hypervisors too.  I am inclined to agree with Andrew, if for
no other reason than that those working on guest OSs do not generally
use debug hypervisors if they are not also working on Xen itself.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 02/10] x86/mm: Avoid hard-coding PAT in get_page_from_l1e()
  2022-12-22  9:51     ` Jan Beulich
@ 2022-12-22  9:58       ` Demi Marie Obenour
  0 siblings, 0 replies; 28+ messages in thread
From: Demi Marie Obenour @ 2022-12-22  9:58 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Marek Marczykowski-Górecki, Andrew Cooper,
	Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian,
	George Dunlap, Tim Deegan, xen-devel

[-- Attachment #1: Type: text/plain, Size: 3370 bytes --]

On Thu, Dec 22, 2022 at 10:51:08AM +0100, Jan Beulich wrote:
> On 20.12.2022 09:19, Jan Beulich wrote:
> > On 20.12.2022 02:07, Demi Marie Obenour wrote:
> >> get_page_from_l1e() relied on Xen's choice of PAT, which is brittle in
> >> the face of future PAT changes.  Instead, compute the actual cacheability
> >> used by the CPU and switch on that, as this will work no matter what PAT
> >> Xen uses.
> >>
> >> No functional change intended.  This code is itself questionable and may
> >> be removed in the future, but removing it would be an observable
> >> behavior change and so is out of scope for this patch series.
> >>
> >> Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
> >> ---
> >> Changes since v4:
> >> - Do not add new pte_flags_to_cacheability() helper, as this code may be
> >>   removed in the near future and so adding a new helper for it is a bad
> >>   idea.
> >> - Do not BUG() in the event of an unexpected cacheability.  This cannot
> >>   happen, but it is simpler to force such types to UC than to prove that
> >>   the BUG() is not reachable.
> >>
> >> Changes since v3:
> >> - Compute and use the actual cacheability as seen by the processor.
> >>
> >> Changes since v2:
> >> - Improve commit message.
> >> ---
> >>  xen/arch/x86/mm.c | 14 ++++++++------
> >>  1 file changed, 8 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> >> index 78b1972e4170cacccc9c37c6e64e76e66a7da87f..dba6c77ef2f7ed7fcb7f7e526583ccadd35e62cc 100644
> >> --- a/xen/arch/x86/mm.c
> >> +++ b/xen/arch/x86/mm.c
> >> @@ -959,14 +959,16 @@ get_page_from_l1e(
> >>              flip = _PAGE_RW;
> >>          }
> >>  
> >> -        switch ( l1f & PAGE_CACHE_ATTRS )
> >> +        switch ( 0xFF & (XEN_MSR_PAT >> (8 * pte_flags_to_cacheattr(l1f))) )
> >>          {
> >> -        case 0: /* WB */
> >> -            flip |= _PAGE_PWT | _PAGE_PCD;
> >> +        case X86_MT_UC:
> >> +        case X86_MT_UCM:
> >> +        case X86_MT_WC:
> >> +            /* not cacheable */
> >>              break;
> >> -        case _PAGE_PWT: /* WT */
> >> -        case _PAGE_PWT | _PAGE_PAT: /* WP */
> >> -            flip |= _PAGE_PCD | (l1f & _PAGE_PAT);
> >> +        default:
> >> +            /* cacheable */
> >> +            flip |= ((l1f & PAGE_CACHE_ATTRS) ^ _PAGE_UC);
> >>              break;
> > 
> > In v4 the comment here was "cacheable, force to UC". The latter aspect is
> > quite relevant (and iirc also what Andrew had asked for to have as a
> > comment). But with this now being the default case, the comment (in either
> > this or the earlier form) would become stale. A forward compatible way of
> > wording this would e.g. be "force any other type to UC". With an adjustment
> > along these lines (which I think could be done while committing)
> > Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
> If you had replied signaling your consent (and perhaps the preferred by you
> wording), I would have committed this. Now it's going to be v6 afaic ...
> 
> Jan

Sorry about that.  "potentially cacheable, force to UC" is the wording
I have planned for v6, along with "not cacheable, allow" in the other
case.  Feel free to go ahead and commit if you want.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 08/10] x86/mm: make code robust to future PAT changes
  2022-12-22  9:54       ` Jan Beulich
@ 2022-12-22 10:00         ` Demi Marie Obenour
  2022-12-22 10:03           ` Jan Beulich
  0 siblings, 1 reply; 28+ messages in thread
From: Demi Marie Obenour @ 2022-12-22 10:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Marek Marczykowski-Górecki, Andrew Cooper,
	Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian,
	George Dunlap, Tim Deegan, xen-devel

[-- Attachment #1: Type: text/plain, Size: 2690 bytes --]

On Thu, Dec 22, 2022 at 10:54:48AM +0100, Jan Beulich wrote:
> On 22.12.2022 10:50, Demi Marie Obenour wrote:
> > On Thu, Dec 22, 2022 at 10:35:08AM +0100, Jan Beulich wrote:
> >> On 20.12.2022 02:07, Demi Marie Obenour wrote:
> >>> --- a/xen/arch/x86/mm.c
> >>> +++ b/xen/arch/x86/mm.c
> >>> @@ -6352,6 +6352,11 @@ unsigned long get_upper_mfn_bound(void)
> >>>      return min(max_mfn, 1UL << (paddr_bits - PAGE_SHIFT)) - 1;
> >>>  }
> >>>  
> >>> +
> >>> +/*
> >>> + * A bunch of static assertions to check that the XEN_MSR_PAT is valid
> >>> + * and consistent with the _PAGE_* macros, and that _PAGE_WB is zero.
> >>> + */
> >>>  static void __init __maybe_unused build_assertions(void)
> >>>  {
> >>>      /*
> >>> @@ -6361,6 +6366,72 @@ static void __init __maybe_unused build_assertions(void)
> >>>       * using different PATs will not work.
> >>>       */
> >>>      BUILD_BUG_ON(XEN_MSR_PAT != 0x050100070406ULL);
> >>> +
> >>> +    /*
> >>> +     * _PAGE_WB must be zero for several reasons, not least because Linux
> >>> +     * assumes it.
> >>> +     */
> >>> +    BUILD_BUG_ON(_PAGE_WB);
> >>
> >> Strictly speaking this is a requirement only for PV guests. We may not
> >> want to go as far as putting "#ifdef CONFIG_PV" around it, but at least
> >> the code comment (and then also the part of the description referring
> >> to this) will imo want to say so.
> > 
> > Does Xen itself depend on this?
> 
> With the wording in the description I was going from the assumption that
> you have audited code and found that we properly use _PAGE_* constants
> everywhere.

There could be other hard-coded uses of magic numbers I haven’t found,
and _PAGE_WB is currently zero so I would be quite surpised if no code
in Xen omits it.  Linux also has a BUILD_BUG_ON() to check the same
thing.

> >>> +} while (0)
> >>> +
> >>> +    /*
> >>> +     * If one of these trips, the corresponding _PAGE_* macro is inconsistent
> >>> +     * with XEN_MSR_PAT.  This would cause Xen to use incorrect cacheability
> >>> +     * flags, with results that are undefined and probably harmful.
> >>
> >> Why "undefined"? They may be wrong / broken, but the result is still well-
> >> defined afaict.
> > 
> > “undefined” is meant as “one has violated a core assumption that
> > higher-level stuff depends on, so things can go arbitrarily wrong,
> > including e.g. corrupting memory or data”.  Is this accurate?  Should I
> > drop the dependent clause, or do you have a suggestion for something
> > better?
> 
> s/undefined/unknown/ ?

Will fix in v6.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 08/10] x86/mm: make code robust to future PAT changes
  2022-12-22 10:00         ` Demi Marie Obenour
@ 2022-12-22 10:03           ` Jan Beulich
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Beulich @ 2022-12-22 10:03 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: Marek Marczykowski-Górecki, Andrew Cooper,
	Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian,
	George Dunlap, Tim Deegan, xen-devel

On 22.12.2022 11:00, Demi Marie Obenour wrote:
> On Thu, Dec 22, 2022 at 10:54:48AM +0100, Jan Beulich wrote:
>> On 22.12.2022 10:50, Demi Marie Obenour wrote:
>>> On Thu, Dec 22, 2022 at 10:35:08AM +0100, Jan Beulich wrote:
>>>> On 20.12.2022 02:07, Demi Marie Obenour wrote:
>>>>> @@ -6361,6 +6366,72 @@ static void __init __maybe_unused build_assertions(void)
>>>>>       * using different PATs will not work.
>>>>>       */
>>>>>      BUILD_BUG_ON(XEN_MSR_PAT != 0x050100070406ULL);
>>>>> +
>>>>> +    /*
>>>>> +     * _PAGE_WB must be zero for several reasons, not least because Linux
>>>>> +     * assumes it.
>>>>> +     */
>>>>> +    BUILD_BUG_ON(_PAGE_WB);
>>>>
>>>> Strictly speaking this is a requirement only for PV guests. We may not
>>>> want to go as far as putting "#ifdef CONFIG_PV" around it, but at least
>>>> the code comment (and then also the part of the description referring
>>>> to this) will imo want to say so.
>>>
>>> Does Xen itself depend on this?
>>
>> With the wording in the description I was going from the assumption that
>> you have audited code and found that we properly use _PAGE_* constants
>> everywhere.
> 
> There could be other hard-coded uses of magic numbers I haven’t found,
> and _PAGE_WB is currently zero so I would be quite surpised if no code
> in Xen omits it.  Linux also has a BUILD_BUG_ON() to check the same
> thing.

Fair enough - please adjust description and comment text accordingly then.

Jan


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 09/10] x86/mm: Reject invalid cacheability in PV guests by default
  2022-12-22  9:55     ` Demi Marie Obenour
@ 2022-12-22 10:06       ` Jan Beulich
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Beulich @ 2022-12-22 10:06 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: Marek Marczykowski-Górecki, Andrew Cooper,
	Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian,
	George Dunlap, Tim Deegan, xen-devel

On 22.12.2022 10:55, Demi Marie Obenour wrote:
> On Thu, Dec 22, 2022 at 10:48:02AM +0100, Jan Beulich wrote:
>> Also wasn't there talk of having this behavior in debug hypervisors
>> only? Without that restriction I'm yet less happy with the change ...
> 
> My understanding was that Andrew preferred the behavior to be turned on
> in release hypervisors too.  I am inclined to agree with Andrew, if for
> no other reason than that those working on guest OSs do not generally
> use debug hypervisors if they are not also working on Xen itself.

That's likely a mistake then, at least for work on PV guest OSes. In
particular PV memory management code is sprinkled with gdprintk(),
in an attempt to help those people understand why some operation has
failed.

Jan


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2022-12-22 10:07 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-12-20  1:07 [PATCH v5 00/10] Make PAT handling less brittle Demi Marie Obenour
2022-12-20  1:07 ` [PATCH v5 01/10] x86: Add memory type constants Demi Marie Obenour
2022-12-20  1:07 ` [PATCH v5 02/10] x86/mm: Avoid hard-coding PAT in get_page_from_l1e() Demi Marie Obenour
2022-12-20  8:19   ` Jan Beulich
2022-12-22  9:51     ` Jan Beulich
2022-12-22  9:58       ` Demi Marie Obenour
2022-12-20  1:07 ` [PATCH v5 03/10] x86: Replace PAT_* with X86_MT_* Demi Marie Obenour
2022-12-20  1:07 ` [PATCH v5 04/10] x86: Replace MTRR_* constants with X86_MT_* constants Demi Marie Obenour
2022-12-20  1:07 ` [PATCH v5 05/10] x86: Replace EPT_EMT_* constants with X86_MT_* Demi Marie Obenour
2022-12-20  8:21   ` Jan Beulich
2022-12-20  1:07 ` [PATCH v5 06/10] x86: Remove MEMORY_NUM_TYPES and NO_HARDCODE_MEM_TYPE Demi Marie Obenour
2022-12-20  1:07 ` [PATCH v5 07/10] x86: Derive XEN_MSR_PAT from its individual entries Demi Marie Obenour
2022-12-20  1:07 ` [PATCH v5 08/10] x86/mm: make code robust to future PAT changes Demi Marie Obenour
2022-12-22  9:35   ` Jan Beulich
2022-12-22  9:50     ` Demi Marie Obenour
2022-12-22  9:54       ` Jan Beulich
2022-12-22 10:00         ` Demi Marie Obenour
2022-12-22 10:03           ` Jan Beulich
2022-12-20  1:07 ` [PATCH v5 09/10] x86/mm: Reject invalid cacheability in PV guests by default Demi Marie Obenour
2022-12-22  9:48   ` Jan Beulich
2022-12-22  9:55     ` Demi Marie Obenour
2022-12-22 10:06       ` Jan Beulich
2022-12-20  1:07 ` [PATCH v5 10/10] x86: Use Linux's PAT Demi Marie Obenour
2022-12-20 16:13 ` [PATCH v5 11/10] hvmloader: use memory type constants Jan Beulich
2022-12-20 16:36   ` Andrew Cooper
2022-12-20 16:50     ` Jan Beulich
2022-12-20 17:45   ` Demi Marie Obenour
2022-12-21  6:56     ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.