* [PATCH] target/i386: Fix physical address truncation when PAE is enabled
From: Michael Brown @ 2023-12-18 12:56 UTC
To: qemu-devel
Cc: qemu-stable, Michael Brown, Paolo Bonzini, Richard Henderson,
Eduardo Habkost
The address translation logic in get_physical_address() will currently
truncate physical addresses to 32 bits unless long mode is enabled.
This is incorrect when using physical address extensions (PAE) outside
of long mode, with the result that a 32-bit operating system using PAE
to access memory above 4G will experience undefined behaviour.
The truncation code was originally introduced in commit 33dfdb5 ("x86:
only allow real mode to access 32bit without LMA"), where it applied
only to translations performed while paging is disabled (and so cannot
affect guests using PAE).
Commit 9828198 ("target/i386: Add MMU_PHYS_IDX and MMU_NESTED_IDX")
rearranged the code such that the truncation also applied to the use
of MMU_PHYS_IDX and MMU_NESTED_IDX. Commit 4a1e9d4 ("target/i386: Use
atomic operations for pte updates") brought this truncation into scope
for page table entry accesses, and is the first commit for which a
Windows 10 32-bit guest will reliably fail to boot if memory above 4G
is present.
Fix by testing for PAE being enabled via the relevant bit in CR4,
instead of testing for long mode being enabled. PAE must be enabled
as a prerequisite of long mode, and so this is a generalisation of the
current test.
Remove the #ifdef TARGET_X86_64 check since PAE exists in both 32-bit
and 64-bit processors, and both should exhibit the same truncation
behaviour when PAE is disabled.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2040
Signed-off-by: Michael Brown <mcb30@ipxe.org>
---
target/i386/tcg/sysemu/excp_helper.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
index 5b86f439ad..3d0d0d78d7 100644
--- a/target/i386/tcg/sysemu/excp_helper.c
+++ b/target/i386/tcg/sysemu/excp_helper.c
@@ -582,12 +582,10 @@ static bool get_physical_address(CPUX86State *env, vaddr addr,
 
     /* Translation disabled. */
     out->paddr = addr & x86_get_a20_mask(env);
-#ifdef TARGET_X86_64
-    if (!(env->hflags & HF_LMA_MASK)) {
-        /* Without long mode we can only address 32bits in real mode */
+    if (!(env->cr[4] & CR4_PAE_MASK)) {
+        /* Without PAE we can address only 32 bits */
         out->paddr = (uint32_t)out->paddr;
     }
-#endif
     out->prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
     out->page_size = TARGET_PAGE_SIZE;
     return true;
--
2.43.0
* Re: [PATCH] target/i386: Fix physical address truncation when PAE is enabled
From: Richard Henderson @ 2023-12-20 4:22 UTC
To: Michael Brown, qemu-devel; +Cc: qemu-stable, Paolo Bonzini, Eduardo Habkost
On 12/18/23 23:56, Michael Brown wrote:
> The address translation logic in get_physical_address() will currently
> truncate physical addresses to 32 bits unless long mode is enabled.
> This is incorrect when using physical address extensions (PAE) outside
> of long mode, with the result that a 32-bit operating system using PAE
> to access memory above 4G will experience undefined behaviour.
>
> The truncation code was originally introduced in commit 33dfdb5 ("x86:
> only allow real mode to access 32bit without LMA"), where it applied
> only to translations performed while paging is disabled (and so cannot
> affect guests using PAE).
>
> Commit 9828198 ("target/i386: Add MMU_PHYS_IDX and MMU_NESTED_IDX")
> rearranged the code such that the truncation also applied to the use
> of MMU_PHYS_IDX and MMU_NESTED_IDX. Commit 4a1e9d4 ("target/i386: Use
> atomic operations for pte updates") brought this truncation into scope
> for page table entry accesses, and is the first commit for which a
> Windows 10 32-bit guest will reliably fail to boot if memory above 4G
> is present.
>
> Fix by testing for PAE being enabled via the relevant bit in CR4,
> instead of testing for long mode being enabled. PAE must be enabled
> as a prerequisite of long mode, and so this is a generalisation of the
> current test.
>
> Remove the #ifdef TARGET_X86_64 check since PAE exists in both 32-bit
> and 64-bit processors, and both should exhibit the same truncation
> behaviour when PAE is disabled.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2040
> Signed-off-by: Michael Brown <mcb30@ipxe.org>
> ---
> target/i386/tcg/sysemu/excp_helper.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
> index 5b86f439ad..3d0d0d78d7 100644
> --- a/target/i386/tcg/sysemu/excp_helper.c
> +++ b/target/i386/tcg/sysemu/excp_helper.c
> @@ -582,12 +582,10 @@ static bool get_physical_address(CPUX86State *env, vaddr addr,
>  
>      /* Translation disabled. */
>      out->paddr = addr & x86_get_a20_mask(env);
> -#ifdef TARGET_X86_64
> -    if (!(env->hflags & HF_LMA_MASK)) {
> -        /* Without long mode we can only address 32bits in real mode */
> +    if (!(env->cr[4] & CR4_PAE_MASK)) {
> +        /* Without PAE we can address only 32 bits */
>          out->paddr = (uint32_t)out->paddr;
>      }
> -#endif
This is not the correct refactoring.
I agree that what we're currently doing is wrong, esp for MMU_PHYS_IDX, but for the
default case, if CR0.PG == 0, then CR4.PAE is ignored (vol 3, section 4.1.1).
I suspect the correct fix is to have MMU_PHYS_IDX pass through the input address
unchanged, and it is the responsibility of the higher level paging mmu_idx to truncate
physical addresses per PG_MODE_* before recursing.
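Something like this (an untested sketch using the names in the current
code, not a finished patch):

    switch (mmu_idx) {
    case MMU_PHYS_IDX:
        /* Physical address: pass through unchanged apart from A20. */
        out->paddr = addr & x86_get_a20_mask(env);
        out->prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
        out->page_size = TARGET_PAGE_SIZE;
        return true;
    default:
        /* Paging mmu_idx: walk the page tables; the walker itself
           truncates recursive physical accesses per PG_MODE_*. */
        break;
    }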
r~
>      out->prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
>      out->page_size = TARGET_PAGE_SIZE;
>      return true;
* Re: [PATCH] target/i386: Fix physical address truncation when PAE is enabled
From: Michael Brown @ 2023-12-20 11:03 UTC
To: Richard Henderson, qemu-devel; +Cc: qemu-stable, Paolo Bonzini, Eduardo Habkost
On 20/12/2023 04:22, Richard Henderson wrote:
> On 12/18/23 23:56, Michael Brown wrote:
>> The address translation logic in get_physical_address() will currently
>> truncate physical addresses to 32 bits unless long mode is enabled.
>> This is incorrect when using physical address extensions (PAE) outside
>> of long mode, with the result that a 32-bit operating system using PAE
>> to access memory above 4G will experience undefined behaviour.
>>
>> <snip>
>>
>> --- a/target/i386/tcg/sysemu/excp_helper.c
>> +++ b/target/i386/tcg/sysemu/excp_helper.c
>> @@ -582,12 +582,10 @@ static bool get_physical_address(CPUX86State *env, vaddr addr,
>>  
>>      /* Translation disabled. */
>>      out->paddr = addr & x86_get_a20_mask(env);
>> -#ifdef TARGET_X86_64
>> -    if (!(env->hflags & HF_LMA_MASK)) {
>> -        /* Without long mode we can only address 32bits in real mode */
>> +    if (!(env->cr[4] & CR4_PAE_MASK)) {
>> +        /* Without PAE we can address only 32 bits */
>>          out->paddr = (uint32_t)out->paddr;
>>      }
>> -#endif
>
> This is not the correct refactoring.
>
> I agree that what we're currently doing is wrong, esp for MMU_PHYS_IDX,
> but for the default case, if CR0.PG == 0, then CR4.PAE is ignored (vol
> 3, section 4.1.1).
>
> I suspect the correct fix is to have MMU_PHYS_IDX pass through the input
> address unchanged, and it is the responsibility of the higher level
> paging mmu_idx to truncate physical addresses per PG_MODE_* before
> recursing.
Thank you for reviewing, and for confirming the bug.
For the MMU_PHYS_IDX case, I agree that it makes sense for the address
to be passed through unchanged.
For the default case, I think it would make sense to unconditionally
truncate the address to 32 bits if paging is disabled. (I am not sure
why the original commit 33dfdb5 included a test for long mode, since I
do not see how it is possible to get the CPU into long mode with paging
disabled.)
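In code terms, something like (untested):

    /* Translation disabled. */
    out->paddr = addr & x86_get_a20_mask(env);
    /* Paging disabled implies no long mode, so unconditionally
       truncate the physical address to 32 bits. */
    out->paddr = (uint32_t)out->paddr;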
I do not know what ought to be done in the MMU_NESTED_IDX case, and
would appreciate your input on this.
Thanks,
Michael
* Re: [PATCH] target/i386: Fix physical address truncation when PAE is enabled
From: Richard Henderson @ 2023-12-20 21:51 UTC
To: Michael Brown, qemu-devel; +Cc: qemu-stable, Paolo Bonzini, Eduardo Habkost
On 12/20/23 22:03, Michael Brown wrote:
> For the default case, I think it would make sense to unconditionally truncate the address
> to 32 bits if paging is disabled. (I am not sure why the original commit 33dfdb5 included
> a test for long mode, since I do not see how it is possible to get the CPU into long mode
> with paging disabled.)
You are correct that paging is mandatory for LMA -- indeed, setting CR0.PG is the final
step in 10.8.5 Initializing IA-32e Mode, which copies EFER.LME to EFER.LMA.
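(QEMU models that transition in cpu_x86_update_cr0(); from memory it is
roughly:

    if (!(env->cr[0] & CR0_PG_MASK) && (new_cr0 & CR0_PG_MASK) &&
        (env->efer & MSR_EFER_LME)) {
        /* Setting CR0.PG with EFER.LME set: enter long mode by
           copying LME into LMA. */
        env->efer |= MSR_EFER_LMA;
        env->hflags |= HF_LMA_MASK;
    }

so LMA can indeed never be set while CR0.PG is clear.)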
The commit 33dfdb5 that you reference is definitely wrong.
> I do not know what ought to be done in the MMU_NESTED_IDX case, and would appreciate your
> input on this.
It would be ideal if MMU_NESTED_IDX were only used when virtualization is enabled and
nested_ctl & SVM_NPT_ENABLED, i.e. when use_stage2 is true.
I can't remember why I added a check for use_stage2 for MMU_NESTED_IDX in 98281984a37. It
is possible that, as that patch introduced the feature, I was being cautious. It's also
possible that I did see something, but then it was cleaned up somewhere in the rest of
that rather large patch series.
In any case, current intended behaviour is that MMU_NESTED_IDX with !use_stage2 equates to
MMU_PHYS_IDX.
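In other words the dispatch ought to reduce to something like (sketch
only, not against any particular tree):

    case MMU_NESTED_IDX:
        if (likely(use_stage2)) {
            /* ... perform the stage-2 (nested) page table walk ... */
        }
        /* !use_stage2: fall through and behave as MMU_PHYS_IDX,
           passing the address through untranslated. */
        break;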
r~
* Re: [PATCH] target/i386: Fix physical address truncation when PAE is enabled
From: Michael Brown @ 2023-12-21 15:20 UTC
To: Richard Henderson, qemu-devel; +Cc: qemu-stable, Paolo Bonzini, Eduardo Habkost
On 20/12/2023 21:51, Richard Henderson wrote:
> On 12/20/23 22:03, Michael Brown wrote:
>> For the default case, I think it would make sense to unconditionally
>> truncate the address to 32 bits if paging is disabled. (I am not sure
>> why the original commit 33dfdb5 included a test for long mode, since I
>> do not see how it is possible to get the CPU into long mode with
>> paging disabled.)
>
> You are correct that paging is mandatory for LMA -- indeed, setting
> CR0.PG is the final step in 10.8.5 Initializing IA-32e Mode, which
> copies EFER.LME to EFER.LMA.
>
> The commit 33dfdb5 that you reference is definitely wrong.
I have done some further investigation, and come to the conclusion that
we should just delete the truncation code entirely.
With paging disabled, there is (as far as I know) no way to execute an
instruction with an address size of 64 bits. With the instruction
address size being 32 bits, the linear address will already be truncated
to 32 bits anyway.
A quick userspace test program confirms that on a physical CPU a 32-bit
address size will result in the linear address being truncated to 32 bits:
#include <stdint.h>
#include <stdio.h>

uint8_t val = 42;

int main ( void ) {
    uint8_t *ptr = ( &val + 0x80000001 );
    uint8_t res;
    printf ( "&val = %p\n", &val ); // will be around 0x400000
    printf ( "ptr = %p\n", ptr );
    printf ( "addr32 read via 0x7fffffff(%p)...\n", ptr );
    __asm__ ( "addr32 movb 0x7fffffff(%k1), %b0\n\t"
              : "=r" ( res ) : "r" ( ptr ) );
    printf ( "...got %d\n", res );
    printf ( "addr64 read via 0x7fffffff(%p)...\n", ptr );
    __asm__ ( "movb 0x7fffffff(%1), %b0\n\t"
              : "=r" ( res ) : "r" ( ptr ) );
    printf ( "...got %d\n", res );
    return 0;
}
produces the expected output:
$ cc -o test test.c && ./test
&val = 0x40400c
ptr = 0x8040400d
addr32 read via 0x7fffffff(0x8040400d)...
...got 42
addr64 read via 0x7fffffff(0x8040400d)...
Segmentation fault (core dumped)
which I believe shows that the addr32 instruction experiences wraparound
of the linear address (and so accesses the correct location), while the
addr64 instruction with the same base and offset does not experience
wraparound of the linear address (and so segfaults).
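(Concretely: 0x8040400d + 0x7fffffff = 0x10040400c, which with a 32-bit
address size wraps to 0x0040400c, i.e. &val, but with a 64-bit address
size remains the unmapped linear address 0x10040400c.)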
The original commit 33dfdb5 describes 32-bit code relying on the linear
address wraparound to access memory below the segment base. This is
what iPXE also does: the iPXE code is physically copied to somewhere
high in 32-bit memory, the segment base is set to point to the location
of iPXE (thereby avoiding the need to apply relocation records), and
wraparound is relied upon to access all memory below this. I have
tested removing the truncation code from get_physical_address() and
verified that iPXE continues to work as expected.
I have also verified that removing the truncation code from
get_physical_address() does seem to fix the problem with PAE mode (i.e.
is able to boot Windows 10 with RAM above 4G).
I will send a v2 patch shortly.
Thanks,
Michael