* [PATCH 0/5] target/i386: Fix physical address masking bugs
@ 2023-12-22 17:59 Paolo Bonzini
2023-12-22 17:59 ` [PATCH 1/5] target/i386: mask high bits of CR3 in 32-bit mode Paolo Bonzini
` (4 more replies)
0 siblings, 5 replies; 12+ messages in thread
From: Paolo Bonzini @ 2023-12-22 17:59 UTC (permalink / raw)
To: qemu-devel; +Cc: richard.henderson, mcb30
The address translation logic in get_physical_address() will currently
truncate physical addresses to 32 bits unless long mode is enabled.
This is incorrect when using physical address extensions (PAE) outside
of long mode, with the result that a 32-bit operating system using PAE
to access memory above 4G will experience undefined behaviour.
Instead, truncation must be applied only in non-paging mode; all
paths that go through page table accesses already produce a
correctly-masked address.
Furthermore, when inspecting the code I noticed that the A20 mask is
applied incorrectly when NPT is active. The mask should not be applied
to the addresses that are looked up in the NPT, only to the physical
addresses. Obviously no hypervisor is going to leave A20 masking on,
but the code actually becomes simpler so let's do it.
Patches 1 and 2 fix cases in which the addresses must be masked,
or overflow is otherwise invalid, for MMU_PHYS_IDX accesses.
Patch 3 fixes the bug, by limiting the masking to the case of CR0.PG=0.
Patches 4 and 5 further clean up the MMU functions to centralize
application of the A20 mask and fix bugs in the process.
Untested except for running the SVM tests from kvm-unit-tests
(which is still better than nothing).
Supersedes: <0102018c8d11471f-9a6d73eb-0c34-4f61-8d37-5a4418f9e0d7-000000@eu-west-1.amazonses.com>
Paolo Bonzini (5):
target/i386: mask high bits of CR3 in 32-bit mode
target/i386: check validity of VMCB addresses
target/i386: Fix physical address truncation
target/i386: remove unnecessary/wrong application of the A20 mask
target/i386: leave the A20 bit set in the final NPT walk
target/i386/tcg/sysemu/excp_helper.c | 44 ++++++++++++----------------
target/i386/tcg/sysemu/misc_helper.c | 3 ++
target/i386/tcg/sysemu/svm_helper.c | 27 +++++++++++++----
3 files changed, 43 insertions(+), 31 deletions(-)
--
2.43.0
^ permalink raw reply
* [PATCH 1/5] target/i386: mask high bits of CR3 in 32-bit mode
2023-12-22 17:59 [PATCH 0/5] target/i386: Fix physical address masking bugs Paolo Bonzini
@ 2023-12-22 17:59 ` Paolo Bonzini
2023-12-25 20:33 ` Richard Henderson
2024-01-18 8:04 ` Michael Tokarev
2023-12-22 17:59 ` [PATCH 2/5] target/i386: check validity of VMCB addresses Paolo Bonzini
` (3 subsequent siblings)
4 siblings, 2 replies; 12+ messages in thread
From: Paolo Bonzini @ 2023-12-22 17:59 UTC (permalink / raw)
To: qemu-devel; +Cc: richard.henderson, mcb30, qemu-stable
CR3 bits 63:32 are ignored in 32-bit mode (either legacy 2-level
paging or PAE paging). Do this in mmu_translate() to remove
the last case where get_physical_address() meaningfully drops the
high bits of the address.
Cc: qemu-stable@nongnu.org
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Fixes: 4a1e9d4d11c ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/tcg/sysemu/excp_helper.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
index 5b86f439add..11126c860d4 100644
--- a/target/i386/tcg/sysemu/excp_helper.c
+++ b/target/i386/tcg/sysemu/excp_helper.c
@@ -238,7 +238,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in,
/*
* Page table level 3
*/
- pte_addr = ((in->cr3 & ~0x1f) + ((addr >> 27) & 0x18)) & a20_mask;
+ pte_addr = ((in->cr3 & 0xffffffe0ULL) + ((addr >> 27) & 0x18)) & a20_mask;
if (!ptw_translate(&pte_trans, pte_addr)) {
return false;
}
@@ -306,7 +306,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in,
/*
* Page table level 2
*/
- pte_addr = ((in->cr3 & ~0xfff) + ((addr >> 20) & 0xffc)) & a20_mask;
+ pte_addr = ((in->cr3 & 0xfffff000ULL) + ((addr >> 20) & 0xffc)) & a20_mask;
if (!ptw_translate(&pte_trans, pte_addr)) {
return false;
}
--
2.43.0
* [PATCH 2/5] target/i386: check validity of VMCB addresses
2023-12-22 17:59 [PATCH 0/5] target/i386: Fix physical address masking bugs Paolo Bonzini
2023-12-22 17:59 ` [PATCH 1/5] target/i386: mask high bits of CR3 in 32-bit mode Paolo Bonzini
@ 2023-12-22 17:59 ` Paolo Bonzini
2023-12-22 17:59 ` [PATCH 3/5] target/i386: Fix physical address truncation Paolo Bonzini
` (2 subsequent siblings)
4 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2023-12-22 17:59 UTC (permalink / raw)
To: qemu-devel; +Cc: richard.henderson, mcb30, qemu-stable
MSR_VM_HSAVE_PA bits 0-11 are reserved, as are the bits above the
maximum physical address width of the processor. Setting them to
1 causes a #GP (see "15.30.4 VM_HSAVE_PA MSR" in the AMD manual).
The same is true of VMCB addresses passed to VMRUN/VMLOAD/VMSAVE,
even though the manual is not clear on that.
Cc: qemu-stable@nongnu.org
Fixes: 4a1e9d4d11c ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/tcg/sysemu/misc_helper.c | 3 +++
target/i386/tcg/sysemu/svm_helper.c | 27 +++++++++++++++++++++------
2 files changed, 24 insertions(+), 6 deletions(-)
diff --git a/target/i386/tcg/sysemu/misc_helper.c b/target/i386/tcg/sysemu/misc_helper.c
index e1528b7f80b..1901712ecef 100644
--- a/target/i386/tcg/sysemu/misc_helper.c
+++ b/target/i386/tcg/sysemu/misc_helper.c
@@ -201,6 +201,9 @@ void helper_wrmsr(CPUX86State *env)
tlb_flush(cs);
break;
case MSR_VM_HSAVE_PA:
+ if (val & (0xfff | ((~0ULL) << env_archcpu(env)->phys_bits))) {
+ goto error;
+ }
env->vm_hsave = val;
break;
#ifdef TARGET_X86_64
diff --git a/target/i386/tcg/sysemu/svm_helper.c b/target/i386/tcg/sysemu/svm_helper.c
index 32ff0dbb13c..5d6de2294fa 100644
--- a/target/i386/tcg/sysemu/svm_helper.c
+++ b/target/i386/tcg/sysemu/svm_helper.c
@@ -164,14 +164,19 @@ void helper_vmrun(CPUX86State *env, int aflag, int next_eip_addend)
uint64_t new_cr3;
uint64_t new_cr4;
- cpu_svm_check_intercept_param(env, SVM_EXIT_VMRUN, 0, GETPC());
-
if (aflag == 2) {
addr = env->regs[R_EAX];
} else {
addr = (uint32_t)env->regs[R_EAX];
}
+ /* Exceptions are checked before the intercept. */
+ if (addr & (0xfff | ((~0ULL) << env_archcpu(env)->phys_bits))) {
+ raise_exception_err_ra(env, EXCP0D_GPF, 0, GETPC());
+ }
+
+ cpu_svm_check_intercept_param(env, SVM_EXIT_VMRUN, 0, GETPC());
+
qemu_log_mask(CPU_LOG_TB_IN_ASM, "vmrun! " TARGET_FMT_lx "\n", addr);
env->vm_vmcb = addr;
@@ -463,14 +468,19 @@ void helper_vmload(CPUX86State *env, int aflag)
int mmu_idx = MMU_PHYS_IDX;
target_ulong addr;
- cpu_svm_check_intercept_param(env, SVM_EXIT_VMLOAD, 0, GETPC());
-
if (aflag == 2) {
addr = env->regs[R_EAX];
} else {
addr = (uint32_t)env->regs[R_EAX];
}
+ /* Exceptions are checked before the intercept. */
+ if (addr & (0xfff | ((~0ULL) << env_archcpu(env)->phys_bits))) {
+ raise_exception_err_ra(env, EXCP0D_GPF, 0, GETPC());
+ }
+
+ cpu_svm_check_intercept_param(env, SVM_EXIT_VMLOAD, 0, GETPC());
+
if (virtual_vm_load_save_enabled(env, SVM_EXIT_VMLOAD, GETPC())) {
mmu_idx = MMU_NESTED_IDX;
}
@@ -519,14 +529,19 @@ void helper_vmsave(CPUX86State *env, int aflag)
int mmu_idx = MMU_PHYS_IDX;
target_ulong addr;
- cpu_svm_check_intercept_param(env, SVM_EXIT_VMSAVE, 0, GETPC());
-
if (aflag == 2) {
addr = env->regs[R_EAX];
} else {
addr = (uint32_t)env->regs[R_EAX];
}
+ /* Exceptions are checked before the intercept. */
+ if (addr & (0xfff | ((~0ULL) << env_archcpu(env)->phys_bits))) {
+ raise_exception_err_ra(env, EXCP0D_GPF, 0, GETPC());
+ }
+
+ cpu_svm_check_intercept_param(env, SVM_EXIT_VMSAVE, 0, GETPC());
+
if (virtual_vm_load_save_enabled(env, SVM_EXIT_VMSAVE, GETPC())) {
mmu_idx = MMU_NESTED_IDX;
}
--
2.43.0
* [PATCH 3/5] target/i386: Fix physical address truncation
2023-12-22 17:59 [PATCH 0/5] target/i386: Fix physical address masking bugs Paolo Bonzini
2023-12-22 17:59 ` [PATCH 1/5] target/i386: mask high bits of CR3 in 32-bit mode Paolo Bonzini
2023-12-22 17:59 ` [PATCH 2/5] target/i386: check validity of VMCB addresses Paolo Bonzini
@ 2023-12-22 17:59 ` Paolo Bonzini
2023-12-23 10:34 ` Michael Brown
2023-12-22 17:59 ` [PATCH 4/5] target/i386: remove unnecessary/wrong application of the A20 mask Paolo Bonzini
2023-12-22 17:59 ` [PATCH 5/5] target/i386: leave the A20 bit set in the final NPT walk Paolo Bonzini
4 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2023-12-22 17:59 UTC (permalink / raw)
To: qemu-devel; +Cc: richard.henderson, mcb30, qemu-stable
The address translation logic in get_physical_address() will currently
truncate physical addresses to 32 bits unless long mode is enabled.
This is incorrect when using physical address extensions (PAE) outside
of long mode, with the result that a 32-bit operating system using PAE
to access memory above 4G will experience undefined behaviour.
The truncation code was originally introduced in commit 33dfdb5 ("x86:
only allow real mode to access 32bit without LMA"), where it applied
only to translations performed while paging is disabled (and so cannot
affect guests using PAE).
Commit 9828198 ("target/i386: Add MMU_PHYS_IDX and MMU_NESTED_IDX")
rearranged the code such that the truncation also applied to the use
of MMU_PHYS_IDX and MMU_NESTED_IDX. Commit 4a1e9d4 ("target/i386: Use
atomic operations for pte updates") brought this truncation into scope
for page table entry accesses, and is the first commit for which a
Windows 10 32-bit guest will reliably fail to boot if memory above 4G
is present.
The truncation code however is not completely redundant. Even though the
maximum address size for any executed instruction is 32 bits, helpers for
operations such as BOUND, FSAVE or XSAVE may ask get_physical_address()
to translate an address outside of the 32-bit range, if invoked with an
argument that is close to the 4G boundary.
So, move the address truncation in get_physical_address() to the
CR0.PG=0 case.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2040
Fixes: 4a1e9d4d11c ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Cc: qemu-stable@nongnu.org
Co-developed-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/tcg/sysemu/excp_helper.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
index 11126c860d4..eee1af52710 100644
--- a/target/i386/tcg/sysemu/excp_helper.c
+++ b/target/i386/tcg/sysemu/excp_helper.c
@@ -577,17 +577,14 @@ static bool get_physical_address(CPUX86State *env, vaddr addr,
}
return mmu_translate(env, &in, out, err);
}
+
+ /* No paging implies long mode is disabled. */
+ addr = (uint32_t)addr;
break;
}
- /* Translation disabled. */
+ /* No translation needed. */
out->paddr = addr & x86_get_a20_mask(env);
-#ifdef TARGET_X86_64
- if (!(env->hflags & HF_LMA_MASK)) {
- /* Without long mode we can only address 32bits in real mode */
- out->paddr = (uint32_t)out->paddr;
- }
-#endif
out->prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
out->page_size = TARGET_PAGE_SIZE;
return true;
--
2.43.0
* [PATCH 4/5] target/i386: remove unnecessary/wrong application of the A20 mask
2023-12-22 17:59 [PATCH 0/5] target/i386: Fix physical address masking bugs Paolo Bonzini
` (2 preceding siblings ...)
2023-12-22 17:59 ` [PATCH 3/5] target/i386: Fix physical address truncation Paolo Bonzini
@ 2023-12-22 17:59 ` Paolo Bonzini
2023-12-22 17:59 ` [PATCH 5/5] target/i386: leave the A20 bit set in the final NPT walk Paolo Bonzini
4 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2023-12-22 17:59 UTC (permalink / raw)
To: qemu-devel; +Cc: richard.henderson, mcb30, qemu-stable
If ptw_translate() does a MMU_PHYS_IDX access, the A20 mask is already
applied in get_physical_address(), which is called via probe_access_full()
and x86_cpu_tlb_fill().
If ptw_translate() on the other hand does a MMU_NESTED_IDX access,
the A20 mask must not be applied to the address that is looked up in
the nested page tables; it must be applied only *while* looking up NPT
entries.
Therefore, we can remove A20 masking from the computation of the page
table entry's address, and let get_physical_address() or mmu_translate()
apply it when they know they are returning a host-physical address.
Cc: qemu-stable@nongnu.org
Fixes: 4a1e9d4d11c ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/tcg/sysemu/excp_helper.c | 21 ++++++++-------------
1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
index eee1af52710..ede8ba6b80e 100644
--- a/target/i386/tcg/sysemu/excp_helper.c
+++ b/target/i386/tcg/sysemu/excp_helper.c
@@ -164,8 +164,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in,
/*
* Page table level 5
*/
- pte_addr = ((in->cr3 & ~0xfff) +
- (((addr >> 48) & 0x1ff) << 3)) & a20_mask;
+ pte_addr = (in->cr3 & ~0xfff) + (((addr >> 48) & 0x1ff) << 3);
if (!ptw_translate(&pte_trans, pte_addr)) {
return false;
}
@@ -189,8 +188,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in,
/*
* Page table level 4
*/
- pte_addr = ((pte & PG_ADDRESS_MASK) +
- (((addr >> 39) & 0x1ff) << 3)) & a20_mask;
+ pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 39) & 0x1ff) << 3);
if (!ptw_translate(&pte_trans, pte_addr)) {
return false;
}
@@ -210,8 +208,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in,
/*
* Page table level 3
*/
- pte_addr = ((pte & PG_ADDRESS_MASK) +
- (((addr >> 30) & 0x1ff) << 3)) & a20_mask;
+ pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 30) & 0x1ff) << 3);
if (!ptw_translate(&pte_trans, pte_addr)) {
return false;
}
@@ -238,7 +235,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in,
/*
* Page table level 3
*/
- pte_addr = ((in->cr3 & 0xffffffe0ULL) + ((addr >> 27) & 0x18)) & a20_mask;
+ pte_addr = (in->cr3 & 0xffffffe0ULL) + ((addr >> 27) & 0x18);
if (!ptw_translate(&pte_trans, pte_addr)) {
return false;
}
@@ -260,8 +257,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in,
/*
* Page table level 2
*/
- pte_addr = ((pte & PG_ADDRESS_MASK) +
- (((addr >> 21) & 0x1ff) << 3)) & a20_mask;
+ pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 21) & 0x1ff) << 3);
if (!ptw_translate(&pte_trans, pte_addr)) {
return false;
}
@@ -287,8 +283,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in,
/*
* Page table level 1
*/
- pte_addr = ((pte & PG_ADDRESS_MASK) +
- (((addr >> 12) & 0x1ff) << 3)) & a20_mask;
+ pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 12) & 0x1ff) << 3);
if (!ptw_translate(&pte_trans, pte_addr)) {
return false;
}
@@ -306,7 +301,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in,
/*
* Page table level 2
*/
- pte_addr = ((in->cr3 & 0xfffff000ULL) + ((addr >> 20) & 0xffc)) & a20_mask;
+ pte_addr = (in->cr3 & 0xfffff000ULL) + ((addr >> 20) & 0xffc);
if (!ptw_translate(&pte_trans, pte_addr)) {
return false;
}
@@ -335,7 +330,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in,
/*
* Page table level 1
*/
- pte_addr = ((pte & ~0xfffu) + ((addr >> 10) & 0xffc)) & a20_mask;
+ pte_addr = (pte & ~0xfffu) + ((addr >> 10) & 0xffc);
if (!ptw_translate(&pte_trans, pte_addr)) {
return false;
}
--
2.43.0
* [PATCH 5/5] target/i386: leave the A20 bit set in the final NPT walk
2023-12-22 17:59 [PATCH 0/5] target/i386: Fix physical address masking bugs Paolo Bonzini
` (3 preceding siblings ...)
2023-12-22 17:59 ` [PATCH 4/5] target/i386: remove unnecessary/wrong application of the A20 mask Paolo Bonzini
@ 2023-12-22 17:59 ` Paolo Bonzini
4 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2023-12-22 17:59 UTC (permalink / raw)
To: qemu-devel; +Cc: richard.henderson, mcb30, qemu-stable
The A20 mask is only applied to the final memory access. Nested
page tables are always walked with the raw guest-physical address.
Unlike the previous patch, in this one the masking must be kept, but
it was done too early.
Cc: qemu-stable@nongnu.org
Fixes: 4a1e9d4d11c ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/tcg/sysemu/excp_helper.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
index ede8ba6b80e..37e650c1fcd 100644
--- a/target/i386/tcg/sysemu/excp_helper.c
+++ b/target/i386/tcg/sysemu/excp_helper.c
@@ -134,7 +134,6 @@ static inline bool ptw_setl(const PTETranslate *in, uint32_t old, uint32_t set)
static bool mmu_translate(CPUX86State *env, const TranslateParams *in,
TranslateResult *out, TranslateFault *err)
{
- const int32_t a20_mask = x86_get_a20_mask(env);
const target_ulong addr = in->addr;
const int pg_mode = in->pg_mode;
const bool is_user = (in->mmu_idx == MMU_USER_IDX);
@@ -417,10 +416,13 @@ do_check_protect_pse36:
}
}
- /* align to page_size */
- paddr = (pte & a20_mask & PG_ADDRESS_MASK & ~(page_size - 1))
- | (addr & (page_size - 1));
+ /* merge offset within page */
+ paddr = (pte & PG_ADDRESS_MASK & ~(page_size - 1)) | (addr & (page_size - 1));
+ /*
+ * Note that NPT is walked (for both paging structures and final guest
+ * addresses) using the address with the A20 bit set.
+ */
if (in->ptw_idx == MMU_NESTED_IDX) {
CPUTLBEntryFull *full;
int flags, nested_page_size;
@@ -459,7 +461,7 @@ do_check_protect_pse36:
}
}
- out->paddr = paddr;
+ out->paddr = paddr & x86_get_a20_mask(env);
out->prot = prot;
out->page_size = page_size;
return true;
--
2.43.0
* Re: [PATCH 3/5] target/i386: Fix physical address truncation
2023-12-22 17:59 ` [PATCH 3/5] target/i386: Fix physical address truncation Paolo Bonzini
@ 2023-12-23 10:34 ` Michael Brown
2023-12-23 11:47 ` Paolo Bonzini
0 siblings, 1 reply; 12+ messages in thread
From: Michael Brown @ 2023-12-23 10:34 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: richard.henderson, qemu-stable
On 22/12/2023 17:59, Paolo Bonzini wrote:
> The address translation logic in get_physical_address() will currently
> truncate physical addresses to 32 bits unless long mode is enabled.
> This is incorrect when using physical address extensions (PAE) outside
> of long mode, with the result that a 32-bit operating system using PAE
> to access memory above 4G will experience undefined behaviour.
>
> The truncation code was originally introduced in commit 33dfdb5 ("x86:
> only allow real mode to access 32bit without LMA"), where it applied
> only to translations performed while paging is disabled (and so cannot
> affect guests using PAE).
>
> Commit 9828198 ("target/i386: Add MMU_PHYS_IDX and MMU_NESTED_IDX")
> rearranged the code such that the truncation also applied to the use
> of MMU_PHYS_IDX and MMU_NESTED_IDX. Commit 4a1e9d4 ("target/i386: Use
> atomic operations for pte updates") brought this truncation into scope
> for page table entry accesses, and is the first commit for which a
> Windows 10 32-bit guest will reliably fail to boot if memory above 4G
> is present.
>
> The truncation code however is not completely redundant. Even though the
> maximum address size for any executed instruction is 32 bits, helpers for
> operations such as BOUND, FSAVE or XSAVE may ask get_physical_address()
> to translate an address outside of the 32-bit range, if invoked with an
> argument that is close to the 4G boundary.
Thank you for investigating and updating this patch.
I am confused by how BOUND can result in an access to a linear address
outside of the address-size range. I don't know the internals well
enough, but I'm guessing it might be in the line in helper_boundl():
high = cpu_ldl_data_ra(env, a0 + 4, GETPC());
where an address is calculated as (a0+4) using a 64-bit target_ulong
type with no truncation to 32 bits applied.
If so, then ought the truncation to be applied on this line instead (and
the equivalent in helper_boundw())? My understanding (which may well be
incorrect) is that the linear address gets truncated to the instruction
address size (16 or 32 bits) before any conversion to a physical address
takes place.
Regardless: this updated patch (in isolation) definitely fixes the issue
that I observed, so I'm happy for an added
Tested-by: Michael Brown <mcb30@ipxe.org>
Thanks,
Michael
* Re: [PATCH 3/5] target/i386: Fix physical address truncation
2023-12-23 10:34 ` Michael Brown
@ 2023-12-23 11:47 ` Paolo Bonzini
2023-12-28 16:00 ` Michael Brown
0 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2023-12-23 11:47 UTC (permalink / raw)
To: Michael Brown; +Cc: qemu-devel, Richard Henderson, qemu-stable
Il sab 23 dic 2023, 11:34 Michael Brown <mcb30@ipxe.org> ha scritto:
> I am confused by how BOUND can result in an access to a linear address
> outside of the address-size range. I don't know the internals well
> enough, but I'm guessing it might be in the line in helper_boundl():
>
> high = cpu_ldl_data_ra(env, a0 + 4, GETPC());
>
> where an address is calculated as (a0+4) using a 64-bit target_ulong
> type with no truncation to 32 bits applied.
>
> If so, then ought the truncation to be applied on this line instead (and
> the equivalent in helper_boundw())? My understanding (which may well be
> incorrect) is that the linear address gets truncated to the instruction
> address size (16 or 32 bits) before any conversion to a physical address
> takes place.
>
The linear address is the one that has the segment base added, and it is
not truncated to 16 bits (otherwise the whole A20 thing would not exist).
The same should be true of e.g. an FSAVE instruction; it would allow access
slightly beyond the usual 1M+64K limit that is possible in real mode with
286 and later processors.
In big real mode with 32-bit addresses, it should not be possible to go
beyond 4G physical address by adding the segment base, it should wrap
around and that's what I implemented. However you're probably right that
this patch has a hole for accesses made from 32-bit code segments with
paging enabled. I think LMA was the wrong bit to test all the time, and I
am not even sure if the masking must be applied even before the call to
mmu_translate(). I will ponder it a bit and possibly send a revised version.
Paolo
> Regardless: this updated patch (in isolation) definitely fixes the issue
> that I observed, so I'm happy for an added
>
> Tested-by: Michael Brown <mcb30@ipxe.org>
>
> Thanks,
>
> Michael
>
>
* Re: [PATCH 1/5] target/i386: mask high bits of CR3 in 32-bit mode
2023-12-22 17:59 ` [PATCH 1/5] target/i386: mask high bits of CR3 in 32-bit mode Paolo Bonzini
@ 2023-12-25 20:33 ` Richard Henderson
2024-01-18 8:04 ` Michael Tokarev
1 sibling, 0 replies; 12+ messages in thread
From: Richard Henderson @ 2023-12-25 20:33 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: mcb30, qemu-stable
On 12/23/23 04:59, Paolo Bonzini wrote:
> CR3 bits 63:32 are ignored in 32-bit mode (either legacy 2-level
> paging or PAE paging). Do this in mmu_translate() to remove
> the last case where get_physical_address() meaningfully drops the high
> bits of the address.
>
> Cc: qemu-stable@nongnu.org
> Suggested-by: Richard Henderson <richard.henderson@linaro.org>
> Fixes: 4a1e9d4d11c ("target/i386: Use atomic operations for pte updates", 2022-10-18)
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> target/i386/tcg/sysemu/excp_helper.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
>
> diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
> index 5b86f439add..11126c860d4 100644
> --- a/target/i386/tcg/sysemu/excp_helper.c
> +++ b/target/i386/tcg/sysemu/excp_helper.c
> @@ -238,7 +238,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in,
> /*
> * Page table level 3
> */
> - pte_addr = ((in->cr3 & ~0x1f) + ((addr >> 27) & 0x18)) & a20_mask;
> + pte_addr = ((in->cr3 & 0xffffffe0ULL) + ((addr >> 27) & 0x18)) & a20_mask;
> if (!ptw_translate(&pte_trans, pte_addr)) {
> return false;
> }
> @@ -306,7 +306,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in,
> /*
> * Page table level 2
> */
> - pte_addr = ((in->cr3 & ~0xfff) + ((addr >> 20) & 0xffc)) & a20_mask;
> + pte_addr = ((in->cr3 & 0xfffff000ULL) + ((addr >> 20) & 0xffc)) & a20_mask;
> if (!ptw_translate(&pte_trans, pte_addr)) {
> return false;
> }
* Re: [PATCH 3/5] target/i386: Fix physical address truncation
2023-12-23 11:47 ` Paolo Bonzini
@ 2023-12-28 16:00 ` Michael Brown
0 siblings, 0 replies; 12+ messages in thread
From: Michael Brown @ 2023-12-28 16:00 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, Richard Henderson, qemu-stable
On 23/12/2023 11:47, Paolo Bonzini wrote:
> The linear address is the one that has the segment base added, and it is
> not truncated to 16 bits (otherwise the whole A20 thing would not
> exist). The same should be true of e.g. an FSAVE instruction; it would
> allow access slightly beyond the usual 1M+64K limit that is possible in
> real mode with 286 and later processors.
>
> In big real mode with 32-bit addresses, it should not be possible to go
> beyond 4G physical address by adding the segment base, it should wrap
> around and that's what I implemented. However you're probably right that
> this patch has a hole for accesses made from 32-bit code segments with
> paging enabled. I think LMA was the wrong bit to test all the time, and
> I am not even sure if the masking must be applied even before the call
> to mmu_translate(). I will ponder it a bit and possibly send a revised
> version.
You are of course correct that the linear address is not truncated to 16
bits when the address size is 16 bits - my mistake.
I've been looking through the SDM for any definitive statement on the
topic. The closest I can find is in volume 3 table 4-1, which states
that the linear address width is:
- 32 bits with paging disabled
- 32 bits with 32-bit paging
- 32 bits with PAE paging
- 48 bits with 4-level paging
- 57 bits with 5-level paging
My previous experiment seems to show that the linear address *does* also
get truncated to 32 bits for an instruction with a 32-bit address size
even when running in long mode with 4-level paging (on a Core i7-6600U),
so this table definitely isn't telling the complete story.
My best guess at this point is that the linear address gets truncated to
32 bits when the address size is 32 bits (which will always be the case
when paging is disabled, or when using 32-bit paging or PAE paging).
Thanks,
Michael
* Re: [PATCH 1/5] target/i386: mask high bits of CR3 in 32-bit mode
2023-12-22 17:59 ` [PATCH 1/5] target/i386: mask high bits of CR3 in 32-bit mode Paolo Bonzini
2023-12-25 20:33 ` Richard Henderson
@ 2024-01-18 8:04 ` Michael Tokarev
2024-01-23 11:11 ` Paolo Bonzini
1 sibling, 1 reply; 12+ messages in thread
From: Michael Tokarev @ 2024-01-18 8:04 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: richard.henderson, mcb30, qemu-stable
22.12.2023 20:59, Paolo Bonzini:
> CR3 bits 63:32 are ignored in 32-bit mode (either legacy 2-level
> paging or PAE paging). Do this in mmu_translate() to remove
> the last case where get_physical_address() meaningfully drops the high
> bits of the address.
>
> Cc: qemu-stable@nongnu.org
> Suggested-by: Richard Henderson <richard.henderson@linaro.org>
> Fixes: 4a1e9d4d11c ("target/i386: Use atomic operations for pte updates", 2022-10-18)
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ping?
Can we get this patch set to master before Jan-27?
Thanks,
/mjt
* Re: [PATCH 1/5] target/i386: mask high bits of CR3 in 32-bit mode
2024-01-18 8:04 ` Michael Tokarev
@ 2024-01-23 11:11 ` Paolo Bonzini
0 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2024-01-23 11:11 UTC (permalink / raw)
To: Michael Tokarev; +Cc: qemu-devel, richard.henderson, mcb30, qemu-stable
On Thu, Jan 18, 2024 at 9:04 AM Michael Tokarev <mjt@tls.msk.ru> wrote:
>
> 22.12.2023 20:59, Paolo Bonzini:
> > CR3 bits 63:32 are ignored in 32-bit mode (either legacy 2-level
> > paging or PAE paging). Do this in mmu_translate() to remove
> > the last case where get_physical_address() meaningfully drops the high
> > bits of the address.
> >
> > Cc: qemu-stable@nongnu.org
> > Suggested-by: Richard Henderson <richard.henderson@linaro.org>
> > Fixes: 4a1e9d4d11c ("target/i386: Use atomic operations for pte updates", 2022-10-18)
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>
> Ping?
>
> Can we get this patch set to master before Jan-27?
I have to test and send a new version, and had to spend some of my
precious TCG time on the PCREL regressions. :(
Paolo