qemu-devel.nongnu.org archive mirror
* Questions about the real mode in kvm/qemu
@ 2019-09-26  7:52 Li Qiang
  2019-09-26  8:31 ` Maxim Levitsky
  2019-09-26  9:15 ` Paolo Bonzini
  0 siblings, 2 replies; 17+ messages in thread
From: Li Qiang @ 2019-09-26  7:52 UTC (permalink / raw)
  To: Paolo Bonzini, Qemu Developers

Hi Paolo and all,

I have some questions about the emulation of real mode in kvm/qemu. For
all of them, assume that 'unrestricted guest' is not enabled.

1. How does a CPU running in protected mode emulate real mode? It seems
to use vm86, but isn't vm86 unavailable on x86_64 CPUs? What does the
'vm86' in 'to_vmx(vcpu)->rmode.vm86_active' mean, then?

2. Does the guest's real mode code run directly on the native CPU? It
seems 'vmx->emulation_required' can also be false, in which case
vmx_vcpu_run will do a switch to the guest.

3. How does EPT work in guest real mode? EPT is for GVA->GPA->HPA, but
there is no GVA here; the identity mapping seems to play a role. I am
still confused, though. For example, real mode uses CS*16 + IP to
address the code. Who does this calculation? The kernel emulator?

Thanks,
Li Qiang


* Re: Questions about the real mode in kvm/qemu
  2019-09-26  7:52 Questions about the real mode in kvm/qemu Li Qiang
@ 2019-09-26  8:31 ` Maxim Levitsky
  2019-09-26  8:52   ` Li Qiang
  2019-09-26  9:15 ` Paolo Bonzini
  1 sibling, 1 reply; 17+ messages in thread
From: Maxim Levitsky @ 2019-09-26  8:31 UTC (permalink / raw)
  To: Li Qiang, Paolo Bonzini, Qemu Developers

On Thu, 2019-09-26 at 15:52 +0800, Li Qiang wrote:
> Hi Paolo and all,
> 
> There are some question about the emulation for real mode in kvm/qemu. For all the 
> question I suppose the 'unstrict guest' is not enabled. 
> 
> 1. how the protected mode CPU emulate the real mode? It seems it uses vm86, however, vm86 is not available in x86_64 CPU? So what's the 'to_vmx(vcpu)->rmode.vm86_active' here vm86 means?
> 

As far as I know, modern Intel CPUs support the so-called unrestricted guest mode, which allows the guest to be in basically any mode,
as long as EPT paging is used (that is, the guest can be in real mode with
no paging, but EPT has to be enabled).
The 'vm86_active' flag is probably leftover support for CPUs that don't support EPT and/or the unrestricted guest mode,
where KVM tried to use the good old vm86 mode
for real mode virtualization.


> 2. Does the guest's real mode code run directly in native CPU? It seems 'vmx->emulation_required' is also be false, it the vmx_vcpu_run will do a switch to guest.

Same as above

> 
> 3. How the EPT work in guest real mode? The EPT is for GVA->GPA->HPA, however there is no GVA, seems the identity mapping does something. But there also some confusion for me. For example the real
> mode uses CS*4 + IP to address the code.  Who does this calculation? In the kernel emulator? 

EPT sits underneath the guest's paging mode, which in the case of real mode is a 1:1 mapping.
Thus (CS<<4) + IP would be the guest physical address, and it will be looked up in the EPT to translate it to the real physical address.
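
To make the arithmetic concrete, here is a minimal Python sketch of the real-mode address calculation. The function name is made up for illustration, and A20 wrap-around is deliberately ignored:

```python
def real_mode_linear(segment: int, offset: int) -> int:
    """Linear address the CPU forms in real mode: (segment << 4) + offset.

    Both inputs are 16-bit values. With guest paging off (or with an
    identity map in place), this linear address is also the guest
    physical address that EPT then translates.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    return (segment << 4) + offset


# Example: F000:FFF0 lands 16 bytes below the 1 MiB boundary.
print(hex(real_mode_linear(0xF000, 0xFFF0)))  # 0xffff0
```

The CPU does this calculation itself on every memory access; no emulator is involved unless KVM has fallen back to instruction emulation.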



Best regards,
	Maxim Levitsky




* Re: Questions about the real mode in kvm/qemu
  2019-09-26  8:31 ` Maxim Levitsky
@ 2019-09-26  8:52   ` Li Qiang
  2019-09-26  8:59     ` Maxim Levitsky
  0 siblings, 1 reply; 17+ messages in thread
From: Li Qiang @ 2019-09-26  8:52 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: Paolo Bonzini, Qemu Developers

On Thu, Sep 26, 2019 at 4:31 PM Maxim Levitsky <mlevitsk@redhat.com> wrote:

> On Thu, 2019-09-26 at 15:52 +0800, Li Qiang wrote:
> > Hi Paolo and all,
> >
> > There are some question about the emulation for real mode in kvm/qemu.
> For all the
> > question I suppose the 'unstrict guest' is not enabled.
> >
> > 1. how the protected mode CPU emulate the real mode? It seems it uses
> vm86, however, vm86 is not available in x86_64 CPU? So what's the
> 'to_vmx(vcpu)->rmode.vm86_active' here vm86 means?
> >
>
>
Hi Maxim,

Thanks for your kind reply.



> As far as I know it, modern intel's cpus support so called unrestricted
> guest mode, which allows guest to be basically in any mode,
>

Right, but I would also like to understand what happens when
'unrestricted guest' is disabled. So for these questions, please assume
that 'unrestricted guest' is not enabled.


> as long as EPT paging is used (that is guest can be in real mode with
> no paging, but EPT has to be enabled).
> The 'vm86_active' is probably lefover support for cpus that don't support
> EPT and/or the unrestricted guest mode,
> where KVM tried to use the good old vm86 mode to
> for real mode virtualization.
>
>
> > 2. Does the guest's real mode code run directly in native CPU? It seems
> 'vmx->emulation_required' is also be false, it the vmx_vcpu_run will do a
> switch to guest.
>
> Same as above
>
> >
> > 3. How the EPT work in guest real mode? The EPT is for GVA->GPA->HPA,
> however there is no GVA, seems the identity mapping does something. But
> there also some confusion for me. For example the real
> > mode uses CS*4 + IP to address the code.  Who does this calculation? In
> the kernel emulator?
>
> EPT sits underneath the guest's paging mode, which in case of real mode is
> 1:1 mapping.
>

It seems that when 'unrestricted guest' is enabled, there is no identity
mapping table.

Thanks,
Li Qiang



> Thus CS<<4 + IP would be the guest physical address and it will be looked
> up in the EPT to translate to the real physical address.
>
>
>
> Best regards,
>         Maxim Levitsky
>
>


* Re: Questions about the real mode in kvm/qemu
  2019-09-26  8:52   ` Li Qiang
@ 2019-09-26  8:59     ` Maxim Levitsky
  2019-09-26  9:18       ` Paolo Bonzini
  0 siblings, 1 reply; 17+ messages in thread
From: Maxim Levitsky @ 2019-09-26  8:59 UTC (permalink / raw)
  To: Li Qiang; +Cc: Paolo Bonzini, Qemu Developers

On Thu, 2019-09-26 at 16:52 +0800, Li Qiang wrote:
> 
> 
> On Thu, Sep 26, 2019 at 4:31 PM Maxim Levitsky <mlevitsk@redhat.com> wrote:
> > On Thu, 2019-09-26 at 15:52 +0800, Li Qiang wrote:
> > > Hi Paolo and all,
> > > 
> > > There are some question about the emulation for real mode in kvm/qemu. For all the 
> > > question I suppose the 'unstrict guest' is not enabled. 
> > > 
> > > 1. how the protected mode CPU emulate the real mode? It seems it uses vm86, however, vm86 is not available in x86_64 CPU? So what's the 'to_vmx(vcpu)->rmode.vm86_active' here vm86 means?
> > > 
> > 
> 
> Hi Maxim,
> 
> Thanks for your kind reply.
> 
>  
> > As far as I know it, modern intel's cpus support so called unrestricted guest mode, which allows guest to be basically in any mode,
> 
> Right, but I also want to know the secret when the 'unstrict guest' is disabled. So I suppose the 'unstrict guest' is  not enabled for these questions.
>  
> > as long as EPT paging is used (that is guest can be in real mode with
> > no paging, but EPT has to be enabled).
> > The 'vm86_active' is probably lefover support for cpus that don't support EPT and/or the unrestricted guest mode,
> > where KVM tried to use the good old vm86 mode to
> > for real mode virtualization.
> > 
> > 
> > > 2. Does the guest's real mode code run directly in native CPU? It seems 'vmx->emulation_required' is also be false, it the vmx_vcpu_run will do a switch to guest.
> > 
> > Same as above
> > 
> > > 
> > > 3. How the EPT work in guest real mode? The EPT is for GVA->GPA->HPA, however there is no GVA, seems the identity mapping does something. But there also some confusion for me. For example the
> > real
> > > mode uses CS*4 + IP to address the code.  Who does this calculation? In the kernel emulator? 
> > 
> > EPT sits underneath the guest's paging mode, which in case of real mode is 1:1 mapping.
> 
> It seems when the 'unstrict guest' is enabled, there is no identity mapping table.

If you are asking whether there is a way to let the guest run with no paging at all, that is, to access host physical addresses directly,
then indeed there is no way, since regular non-'unrestricted guest' mode requires both protected mode and paging, and 'unrestricted guest' requires
EPT.
Academically speaking it is of course possible to create paging tables that are 1:1...


Best regards,
	Maxim Levitsky




* Re: Questions about the real mode in kvm/qemu
  2019-09-26  7:52 Questions about the real mode in kvm/qemu Li Qiang
  2019-09-26  8:31 ` Maxim Levitsky
@ 2019-09-26  9:15 ` Paolo Bonzini
  2019-09-26  9:35   ` Maxim Levitsky
  2019-09-26  9:35   ` Li Qiang
  1 sibling, 2 replies; 17+ messages in thread
From: Paolo Bonzini @ 2019-09-26  9:15 UTC (permalink / raw)
  To: Li Qiang, Qemu Developers

On 26/09/19 09:52, Li Qiang wrote:
> Hi Paolo and all,
> 
> There are some question about the emulation for real mode in kvm/qemu.
> For all the 
> question I suppose the 'unstrict guest' is not enabled. 
> 
> 1. how the protected mode CPU emulate the real mode? It seems it uses
> vm86, however, vm86 is not available in x86_64 CPU? So what's the
> 'to_vmx(vcpu)->rmode.vm86_active' here vm86 means?

vm86 mode is available on 64-bit CPUs; it is just not available while the
CPU is in long mode (EFER.LMA=1).

Once KVM places the processor in VMX non-root mode (VMLAUNCH/VMRESUME),
the processor can leave 64-bit mode and run in vm86 mode.  And that's
exactly what KVM does on processors without unrestricted guest support.
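
As a rough illustration of what "run in vm86 mode" means for the guest state, here is a small Python sketch. It assumes, from my reading of the KVM source, that the hypervisor forces EFLAGS.VM and IOPL=3 in the RFLAGS value loaded on VM entry; the helper names are hypothetical:

```python
X86_EFLAGS_IOPL = 0x3 << 12  # I/O privilege level field, bits 12-13
X86_EFLAGS_VM = 1 << 17      # virtual-8086 mode flag


def vm86_rflags(guest_rflags: int) -> int:
    """RFLAGS value to load into the guest so that it runs in vm86 mode.

    Assumption: the hypervisor keeps the guest-visible value separately
    and hands it back whenever the guest reads its flags.
    """
    return guest_rflags | X86_EFLAGS_IOPL | X86_EFLAGS_VM


def is_vm86(rflags: int) -> bool:
    """True if the VM flag is set, i.e. the CPU is in virtual-8086 mode."""
    return bool(rflags & X86_EFLAGS_VM)


print(hex(vm86_rflags(0x2)))  # 0x23002
```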

In that mode, KVM will start trapping all exceptions and send them to
handle_rmode_exception.  This in turn will either emulate privileged
instructions (for #GP or #SS) or inject them as realmode exceptions.

However...

> 2. Does the guest's real mode code run directly in native CPU? It seems
> 'vmx->emulation_required' is also be false, it the vmx_vcpu_run will do
> a switch to guest.

... as soon as the guest tries to enter protected mode, it will get into
a situation which is not real mode but doesn't have the segment
registers properly loaded with selectors.  Therefore, KVM will either
hack things together (enter_pmode) or emulate instructions until the
state is accepted even without unrestricted guest support.

The "hacking things together" part however does not work for big real
mode (where you enter 32-bit mode, load segment registers with a flat 4G
segment, and go back to real mode with the 4G segments).  In big real
mode, therefore, KVM cannot use vm86 and will keep emulating (slowly -
up to 1000x slower than unrestricted guest).
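
Why big real mode defeats the vm86 trick can be sketched as a check on the cached segment state. This is a simplified model with hypothetical names; it looks only at base and limit, whereas KVM's own check (rmode_segment_valid, if I recall correctly) also compares access rights:

```python
from dataclasses import dataclass


@dataclass
class Segment:
    selector: int  # 16-bit value visible to the guest
    base: int      # cached segment base
    limit: int     # cached segment limit


def fits_vm86(seg: Segment) -> bool:
    """vm86 mode forces base = selector << 4 and a 64 KiB limit.

    A segment whose cached state differs cannot be represented in vm86,
    so the hypervisor would have to emulate instead.
    """
    return seg.base == (seg.selector << 4) and seg.limit == 0xFFFF


plain_real_mode = Segment(selector=0x1000, base=0x10000, limit=0xFFFF)
big_real_mode = Segment(selector=0x0000, base=0x0, limit=0xFFFFFFFF)

print(fits_vm86(plain_real_mode))  # True
print(fits_vm86(big_real_mode))    # False
```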

> 3. How the EPT work in guest real mode? The EPT is for GVA->GPA->HPA,
> however there is no GVA, seems the identity mapping does something. But
> there also some confusion for me. For example the real mode uses CS*4 +
> IP to address the code.  Who does this calculation? In the kernel emulator? 

Right, and in fact the CR0.PG and CR0.PE bits must be 1 when running in
VMX non-root mode with unrestricted guest disabled.

Therefore, KVM places a 4 KiB identity map at an address provided by
userspace (with KVM_SET_IDENTITY_MAP_ADDR).  When 1) EPT is enabled 2)
unrestricted guest isn't 3) CR0.PG=0, KVM points CR3 to it and runs the
guest with CR0.PG=1, CR4.PAE=0, CR4.PSE=1.  This CR4 setup enables 4 MiB
huge pages so that a 4 GiB identity map fits in 4 KiB.  This is the only
case where EPT is enabled but CR3 loads and stores will trap to the
hypervisor.  This way, the guest CR3 value is faked in the vmexit
handler, but the processor always uses the identity GVA->GPA mapping.
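
The layout of such an identity map can be sketched in Python. This is an illustrative model, not KVM's code: the flag combination is my assumption of a reasonable choice, and the function names are made up. It builds 1024 page-directory entries of 4 bytes each, every one mapping a 4 MiB page onto itself:

```python
import struct

PAGE_PRESENT = 1 << 0
PAGE_RW = 1 << 1
PAGE_USER = 1 << 2
PAGE_ACCESSED = 1 << 5
PAGE_DIRTY = 1 << 6
PAGE_PSE = 1 << 7  # with CR4.PSE=1, the entry maps a 4 MiB page directly

FLAGS = (PAGE_PRESENT | PAGE_RW | PAGE_USER |
         PAGE_ACCESSED | PAGE_DIRTY | PAGE_PSE)


def build_identity_page_directory() -> bytes:
    """One 4 KiB page directory: entry i maps [i*4 MiB, (i+1)*4 MiB) 1:1."""
    entries = [(i << 22) | FLAGS for i in range(1024)]
    return struct.pack("<1024I", *entries)


def translate(page_directory: bytes, gva: int) -> int:
    """Walk the single-level table the way a non-PAE PSE lookup would."""
    (pde,) = struct.unpack_from("<I", page_directory, (gva >> 22) * 4)
    if not (pde & PAGE_PRESENT and pde & PAGE_PSE):
        raise ValueError("not a present 4 MiB mapping")
    return (pde & 0xFFC00000) | (gva & 0x003FFFFF)


pd = build_identity_page_directory()
print(len(pd))                         # 4096
print(hex(translate(pd, 0x000FFFF0)))  # 0xffff0
```

Because 1024 entries of 4 MiB cover exactly 4 GiB, the whole 32-bit address space is mapped by a single page.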

Paolo




* Re: Questions about the real mode in kvm/qemu
  2019-09-26  8:59     ` Maxim Levitsky
@ 2019-09-26  9:18       ` Paolo Bonzini
  2019-09-26  9:24         ` Maxim Levitsky
                           ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Paolo Bonzini @ 2019-09-26  9:18 UTC (permalink / raw)
  To: Maxim Levitsky, Li Qiang; +Cc: Qemu Developers, Avi Kivity

On 26/09/19 10:59, Maxim Levitsky wrote:
> If you mean to ask if there is a way to let guest access use no
> paging at all, that is access host physical addresses directly, then
> indeed there is no way, since regular non 'unrestricted guest' mode
> required both protected mode and paging, and 'unrestricted guest'
> requires EPT. Academically speaking it is of course possible to
> create paging tables that are 1:1...

Not so academically, it's exactly what KVM does.  However, indeed it
would also be possible to switch out of EPT mode when CR0.PG=0.  I'm not
sure why it was done this way, maybe when the code was written it was
simpler to use the identity map.

Let's see if Avi is listening... :)

Paolo



* Re: Questions about the real mode in kvm/qemu
  2019-09-26  9:18       ` Paolo Bonzini
@ 2019-09-26  9:24         ` Maxim Levitsky
  2019-09-26  9:33           ` Paolo Bonzini
  2019-09-28 22:10         ` Avi Kivity
  2019-09-29  7:39         ` Li Qiang
  2 siblings, 1 reply; 17+ messages in thread
From: Maxim Levitsky @ 2019-09-26  9:24 UTC (permalink / raw)
  To: Paolo Bonzini, Li Qiang; +Cc: Qemu Developers, Avi Kivity

On Thu, 2019-09-26 at 11:18 +0200, Paolo Bonzini wrote:
> On 26/09/19 10:59, Maxim Levitsky wrote:
> > If you mean to ask if there is a way to let guest access use no
> > paging at all, that is access host physical addresses directly, then
> > indeed there is no way, since regular non 'unrestricted guest' mode
> > required both protected mode and paging, and 'unrestricted guest'
> > requires EPT. Academically speaking it is of course possible to
> > create paging tables that are 1:1...
> 
> Not so academically, it's exactly what KVM does.
You mean KVM uses 1:1 EPT pages and no guest paging,
to allow guest to access host physical address space?
That would break the security completely, thus I think you
mean something else here.


>   However, indeed it
> would also be possible to switch out of EPT mode when CR0.PG=0.  I'm not
> sure why it was done this way, maybe when the code was written it was
> simpler to use the identity map.
> 
> Let's see if Avi is listening... :)
> 
> Paolo

Here is a quote from the PRM:

"The first processors to support VMX operation require CR0.PE and CR0.PG to be 1 in VMX operation (see Section
23.8). This restriction implies that guest software cannot be run in unpaged protected mode or in real-address
mode. Later processors support a VM-execution control called “unrestricted guest”. If this control is 1, CR0.PE and
CR0.PG may be 0 in VMX non-root operation. Such processors allow guest software to run in unpaged protected
mode or in real-address mode. The following items describe the behavior of such software:"
...

"As noted in Section 26.2.1.1, the “enable EPT” VM-execution control must be 1 if the “unrestricted guest” VM-execution control is 1."


Best regards,
	Maxim Levitsky
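
The two rules quoted from the manual above can be condensed into a small Python predicate. This is only a sketch of those two rules under hypothetical names; real VM-entry checks cover many more fixed CR0 bits and control combinations:

```python
CR0_PE = 1 << 0   # protection enable
CR0_PG = 1 << 31  # paging enable


def guest_cr0_allowed(cr0: int, unrestricted_guest: bool,
                      ept_enabled: bool) -> bool:
    """Model of the quoted constraints on guest CR0.PE and CR0.PG."""
    if unrestricted_guest and not ept_enabled:
        # "enable EPT" must be 1 if "unrestricted guest" is 1.
        return False
    pe = bool(cr0 & CR0_PE)
    pg = bool(cr0 & CR0_PG)
    if pg and not pe:
        # Paging without protected mode is never a valid x86 state.
        return False
    if unrestricted_guest:
        # Real mode and unpaged protected mode are both acceptable.
        return True
    # Without unrestricted guest, both bits must be 1.
    return pe and pg


print(guest_cr0_allowed(0, unrestricted_guest=False, ept_enabled=True))  # False
print(guest_cr0_allowed(0, unrestricted_guest=True, ept_enabled=True))   # True
```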




* Re: Questions about the real mode in kvm/qemu
  2019-09-26  9:24         ` Maxim Levitsky
@ 2019-09-26  9:33           ` Paolo Bonzini
  2019-09-26  9:41             ` Maxim Levitsky
  0 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2019-09-26  9:33 UTC (permalink / raw)
  To: Maxim Levitsky, Li Qiang; +Cc: Qemu Developers, Avi Kivity

On 26/09/19 11:24, Maxim Levitsky wrote:
> On Thu, 2019-09-26 at 11:18 +0200, Paolo Bonzini wrote:
>> On 26/09/19 10:59, Maxim Levitsky wrote:
>>> If you mean to ask if there is a way to let guest access use no
>>> paging at all, that is access host physical addresses directly, then
>>> indeed there is no way, since regular non 'unrestricted guest' mode
>>> required both protected mode and paging, and 'unrestricted guest'
>>> requires EPT. Academically speaking it is of course possible to
>>> create paging tables that are 1:1...
>>
>> Not so academically, it's exactly what KVM does.
> You mean KVM uses 1:1 EPT pages and no guest paging,
> to allow guest to access host physical address space?

No, it uses the usual GPA->HPA EPT pages and 1:1 GVA->GPA pages when EPT
is enabled and guest CR0.PG=0.  This lets KVM work around the CR0.PG=1
requirement when unrestricted guest mode is not available.

Thinking more about it, I suppose that saves memory (the same EPT page
tables can now be used independent of guest CR0.PG), at the cost of
making TLB misses a little slower.

Thanks,

Paolo

>>   However, indeed it
>> would also be possible to switch out of EPT mode when CR0.PG=0.  I'm not
>> sure why it was done this way, maybe when the code was written it was
>> simpler to use the identity map.
>>
>> Let's see if Avi is listening... :)
>>
>> Paolo
> 
> Here a quote from the PRM:
> 
> "The first processors to support VMX operation require CR0.PE and CR0.PG to be 1 in VMX operation (see Section
> 23.8). This restriction implies that guest software cannot be run in unpaged protected mode or in real-address
> mode. Later processors support a VM-execution control called “unrestricted guest”. If this control is 1, CR0.PE and
> CR0.PG may be 0 in VMX non-root operation. Such processors allow guest software to run in unpaged protected
> mode or in real-address mode. The following items describe the behavior of such software:"
> ...
> 
> "As noted in Section 26.2.1.1, the “enable EPT” VM-execution control must be 1 if the “unrestricted guest” VM-execution control is 1."
> 
> 
> Best regards,
> 	Maxim Levitsky
> 




* Re: Questions about the real mode in kvm/qemu
  2019-09-26  9:15 ` Paolo Bonzini
@ 2019-09-26  9:35   ` Maxim Levitsky
  2019-09-26  9:35   ` Li Qiang
  1 sibling, 0 replies; 17+ messages in thread
From: Maxim Levitsky @ 2019-09-26  9:35 UTC (permalink / raw)
  To: Paolo Bonzini, Li Qiang, Qemu Developers

On Thu, 2019-09-26 at 11:15 +0200, Paolo Bonzini wrote:
> On 26/09/19 09:52, Li Qiang wrote:
> > Hi Paolo and all,
> > 
> > There are some question about the emulation for real mode in kvm/qemu.
> > For all the 
> > question I suppose the 'unstrict guest' is not enabled. 
> > 
> > 1. how the protected mode CPU emulate the real mode? It seems it uses
> > vm86, however, vm86 is not available in x86_64 CPU? So what's the
> > 'to_vmx(vcpu)->rmode.vm86_active' here vm86 means?
> 
> vm86 mode is available in 64-bit CPUs; it is not available in 64-bit
> mode (EFER.LMA=1).
> 
> Once KVM places the processor in VMX non-root mode (VMLAUNCH/VMRESUME),
> the processor can leave 64-bit mode and run in vm86 mode.  And that's
> exactly what KVM does on processors without unrestricted guest support.
> 
> In that mode, KVM will start trapping all exceptions and send them to
> handle_rmode_exception.  This in turn will either emulate privileged
> instructions (for #GP or #SS) or inject them as realmode exceptions.
Ah, so you run vm86 inside the guest, to emulate the real mode.
That is clever :-)

> 
> However...
> 
> > 2. Does the guest's real mode code run directly in native CPU? It seems
> > 'vmx->emulation_required' is also be false, it the vmx_vcpu_run will do
> > a switch to guest.
> 
> ... as soon as the guest tries to enter protected mode, it will get into
> a situation which is not real mode but doesn't have the segment
> registers properly loaded with selectors.  Therefore, it will either
> hack things together (enter_pmode) or emulate instructions until the
> state is accepted even without unrestricted guest support.
> 
> The "hacking things together" part however does not work for big real
> mode (where you enter 32-bit mode, load segment registers with a flat 4G
> segment, and go back to real mode with the 4G segments).  In big real
> mode, therefore, KVM cannot use vm86 and will keep emulating (slowly -
> up to 1000x slower than unrestricted guest).
That is also understandable.

> 
> > 3. How the EPT work in guest real mode? The EPT is for GVA->GPA->HPA,
> > however there is no GVA, seems the identity mapping does something. But
> > there also some confusion for me. For example the real mode uses CS*4 +
> > IP to address the code.  Who does this calculation? In the kernel emulator? 
> 
> Right, and in fact the CR0.PG and CR0.PE bits must be 1 when running in
> VMX non-root mode with unrestricted guest disabled.
> 
> Therefore, KVM places a 4 KiB identity map at an address provided by
> userspace (with KVM_SET_IDENTITY_MAP_ADDR).  When 1) EPT is enabled 2)
> unrestricted guest isn't 3) CR0.PG=0, KVM points CR3 to it and runs the
> guest with CR0.PG=1, CR4.PAE=0, CR4.PSE=1.  This CR4 setup enables 4 MiB
> huge pages so that a 4 GiB identity map fits in 4 KiB.  This is the only
> case where EPT is enabled but CR3 loads and stores will trap to the
> hypervisor.  This way, the guest CR3 value is faked in the vmexit
> handler, but the processor always uses the identity GVA->GPA mapping.

Ah, so you fake the guest's paging table, not the EPT table. That makes
sense.

Thankfully all of this is not needed when unrestricted guest is available.

Best regards,
	Maxim Levitsky




* Re: Questions about the real mode in kvm/qemu
  2019-09-26  9:15 ` Paolo Bonzini
  2019-09-26  9:35   ` Maxim Levitsky
@ 2019-09-26  9:35   ` Li Qiang
  2019-09-26  9:53     ` Paolo Bonzini
  1 sibling, 1 reply; 17+ messages in thread
From: Li Qiang @ 2019-09-26  9:35 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Qemu Developers

On Thu, Sep 26, 2019 at 5:15 PM Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 26/09/19 09:52, Li Qiang wrote:
> > Hi Paolo and all,
> >
> > There are some question about the emulation for real mode in kvm/qemu.
> > For all the
> > question I suppose the 'unstrict guest' is not enabled.
> >
> > 1. how the protected mode CPU emulate the real mode? It seems it uses
> > vm86, however, vm86 is not available in x86_64 CPU? So what's the
> > 'to_vmx(vcpu)->rmode.vm86_active' here vm86 means?
>
> vm86 mode is available in 64-bit CPUs; it is not available in 64-bit
> mode (EFER.LMA=1).
>
> Once KVM places the processor in VMX non-root mode (VMLAUNCH/VMRESUME),
> the processor can leave 64-bit mode and run in vm86 mode.  And that's
> exactly what KVM does on processors without unrestricted guest support.
>
> In that mode, KVM will start trapping all exceptions and send them to
> handle_rmode_exception.  This in turn will either emulate privileged
> instructions (for #GP or #SS) or inject them as realmode exceptions.
>
>

So without unrestricted guest, the main flow is this:
KVM sets the X86_EFLAGS_VM bit in the guest's RFLAGS, so when the CPU
enters guest mode, it is in vm86 mode.
In this mode, the CPU forms addresses as in real mode (seg*16+offset),
and the result is a linear address. Since vm86 is in fact still
protected mode, the linear address is translated to a GPA by the
identity mapping table, and then goes through the EPT?

Is this understanding right?



> However...
>
> > 2. Does the guest's real mode code run directly in native CPU? It seems
> > 'vmx->emulation_required' is also be false, it the vmx_vcpu_run will do
> > a switch to guest.
>
> ... as soon as the guest tries to enter protected mode, it will get into
> a situation which is not real mode but doesn't have the segment
> registers properly loaded with selectors.  Therefore, it will either
> hack things together (enter_pmode) or emulate instructions until the
> state is accepted even without unrestricted guest support.
>

Could you please explain this situation in more detail? Why does it happen?

Thanks,
Li Qiang



>
> The "hacking things together" part however does not work for big real
> mode (where you enter 32-bit mode, load segment registers with a flat 4G
> segment, and go back to real mode with the 4G segments).  In big real
> mode, therefore, KVM cannot use vm86 and will keep emulating (slowly -
> up to 1000x slower than unrestricted guest).
>
> > 3. How the EPT work in guest real mode? The EPT is for GVA->GPA->HPA,
> > however there is no GVA, seems the identity mapping does something. But
> > there also some confusion for me. For example the real mode uses CS*4 +
> > IP to address the code.  Who does this calculation? In the kernel
> emulator?
>
> Right, and in fact the CR0.PG and CR0.PE bits must be 1 when running in
> VMX non-root mode with unrestricted guest disabled.
>
> Therefore, KVM places a 4 KiB identity map at an address provided by
> userspace (with KVM_SET_IDENTITY_MAP_ADDR).  When 1) EPT is enabled 2)
> unrestricted guest isn't 3) CR0.PG=0, KVM points CR3 to it and runs the
> guest with CR0.PG=1, CR4.PAE=0, CR4.PSE=1.  This CR4 setup enables 4 MiB
> huge pages so that a 4 GiB identity map fits in 4 KiB.  This is the only
> case where EPT is enabled but CR3 loads and stores will trap to the
> hypervisor.  This way, the guest CR3 value is faked in the vmexit
> handler, but the processor always uses the identity GVA->GPA mapping.
>
> Paolo
>
>


* Re: Questions about the real mode in kvm/qemu
  2019-09-26  9:33           ` Paolo Bonzini
@ 2019-09-26  9:41             ` Maxim Levitsky
  2019-09-26 10:00               ` Paolo Bonzini
  0 siblings, 1 reply; 17+ messages in thread
From: Maxim Levitsky @ 2019-09-26  9:41 UTC (permalink / raw)
  To: Paolo Bonzini, Li Qiang; +Cc: Qemu Developers, Avi Kivity

On Thu, 2019-09-26 at 11:33 +0200, Paolo Bonzini wrote:
> On 26/09/19 11:24, Maxim Levitsky wrote:
> > On Thu, 2019-09-26 at 11:18 +0200, Paolo Bonzini wrote:
> > > On 26/09/19 10:59, Maxim Levitsky wrote:
> > > > If you mean to ask if there is a way to let guest access use no
> > > > paging at all, that is access host physical addresses directly, then
> > > > indeed there is no way, since regular non 'unrestricted guest' mode
> > > > required both protected mode and paging, and 'unrestricted guest'
> > > > requires EPT. Academically speaking it is of course possible to
> > > > create paging tables that are 1:1...
> > > 
> > > Not so academically, it's exactly what KVM does.
> > 
> > You mean KVM uses 1:1 EPT pages and no guest paging,
> > to allow guest to access host physical address space?
> 
> No, it uses the usual GPA->HPA EPT pages and 1:1 GVA->GPA pages when EPT
> is enabled and guest CR0.PG=0.  This lets KVM work around the CR0.PG=1
> requirement when unrestricted guest mode is not available.
I understand now.

> 
> Thinking more about it, I suppose that saves memory (the same EPT page
> tables can now be used independent of guest CR0.PG), at the cost of
> making TLB misses a little slower.
I don't really understand what you mean.
Isn't it always the case that EPT and guest paging
are independent (at least when no nesting is involved)?


Best regards,
	Maxim Levitsky




* Re: Questions about the real mode in kvm/qemu
  2019-09-26  9:35   ` Li Qiang
@ 2019-09-26  9:53     ` Paolo Bonzini
  2019-09-26 11:47       ` Li Qiang
  0 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2019-09-26  9:53 UTC (permalink / raw)
  To: Li Qiang; +Cc: Qemu Developers

On 26/09/19 11:35, Li Qiang wrote:
> So without unrestricted guest, the main flow is this: KVM sets the
> X86_EFLAGS_VM bit in the guest's RFLAGS, so when the CPU enters guest
> mode, it is in vm86 mode. In this mode, the CPU forms addresses as in
> real mode (seg*16+offset), and the result is a linear address. Since
> vm86 is in fact still protected mode, the linear address is translated
> to a GPA by the identity mapping table, and then goes through the EPT?

Yes.
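
The chain confirmed here (segment:offset, then the identity map, then EPT) can be modelled end to end in a few lines of Python. This is a toy model under stated assumptions: the EPT is a flat dictionary from guest frame number to host frame number rather than a multi-level table, and all names are hypothetical:

```python
PAGE_SHIFT = 12
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def seg_to_linear(segment: int, offset: int) -> int:
    """vm86/real-mode segmentation: linear = (segment << 4) + offset."""
    return (segment << 4) + offset


def identity_gva_to_gpa(gva: int) -> int:
    """The guest-side identity map: every linear address maps to itself."""
    return gva


def ept_gpa_to_hpa(ept: dict, gpa: int) -> int:
    """Toy EPT lookup; a missing key stands in for an EPT violation exit."""
    host_frame = ept[gpa >> PAGE_SHIFT]
    return (host_frame << PAGE_SHIFT) | (gpa & PAGE_MASK)


def translate(ept: dict, segment: int, offset: int) -> int:
    gva = seg_to_linear(segment, offset)
    gpa = identity_gva_to_gpa(gva)
    return ept_gpa_to_hpa(ept, gpa)


# Guest frame 7 (which holds 0x7C00) is backed by host frame 0x12345.
toy_ept = {0x7: 0x12345}
print(hex(translate(toy_ept, 0x07C0, 0x0000)))  # 0x12345c00
```

All three steps are performed by the hardware while the guest runs; KVM only sets up the tables.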

>     ... as soon as the guest tries to enter protected mode, it will get into
>     a situation which is not real mode but doesn't have the segment
>     registers properly loaded with selectors.  
> 
>     Therefore, it will either
>     hack things together (enter_pmode) or emulate instructions until the
>     state is accepted even without unrestricted guest support.
> 
> Could you please explain this situation more detailed? Why this happen?

Protected mode entry looks like this:

        .code16
        mov %cr0, %eax
        or $1, %al
        mov %eax, %cr0          # set CR0.PE
        # [1] now in 16-bit protected mode
        lgdtl gdt32
        ljmpl $8, $2f           # far jump reloads CS from the GDT
        # [2] now in 32-bit protected mode
2:
        .code32
        mov $16, %ax            # selector of the flat data segment
        mov %ax, %ds
        mov %ax, %es
        mov %ax, %fs
        mov %ax, %gs
        mov %ax, %ss
        # [3] now everything is okay

Between [1] and [3] the vmentry could fail if unrestricted guest is not
enabled.  For example (see checks on guest segment registers in the SDM):

- "CS. Type must be 9, 11, 13, or 15 (accessed code segment)."  CS in
real-mode is a RW data segment, not a code segment.  This applies
between [1] and [2].

- "SS. If the guest will not be virtual-8086 and the “unrestricted
guest” VM-execution control is 0, the RPL (bits 1:0) must equal the RPL
of the selector field for CS."  This may not be the case if the segment
register still holds real-mode values (which are not selectors, just
base >> 4).  This applies between [1] and [3].

- "DS, ES, FS, GS. The DPL cannot be less than the RPL in the selector
field"   Again, the real-mode DPL is zero but the RPL makes no sense if
the segment registers hold a real-mode value.

You can find more about these checks in guest_state_valid(); look at the
"else" branch of that function, the "then" branch is for pmode->rmode
transitions.  When any of the checks fail, KVM emulates instructions
instead of using VMX non-root mode (usually it's just a handful of them,
as in the case above).
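
The three SDM checks quoted above can be sketched as a Python function. It models only those three rules, with hypothetical names and simplified segment state; guest_state_valid() in KVM checks considerably more:

```python
from dataclasses import dataclass


@dataclass
class Seg:
    selector: int  # raw segment register value
    type: int      # 4-bit descriptor type from the access rights
    dpl: int       # descriptor privilege level


def rpl(seg: Seg) -> int:
    """Requested privilege level: the low two bits of the selector."""
    return seg.selector & 3


def protected_mode_violations(cs: Seg, ss: Seg, ds: Seg) -> list:
    """Return the quoted checks that this segment state would fail."""
    problems = []
    if cs.type not in (9, 11, 13, 15):
        problems.append("CS is not an accessed code segment")
    if rpl(ss) != rpl(cs):
        problems.append("SS RPL differs from CS RPL")
    if ds.dpl < rpl(ds):
        problems.append("DS DPL is less than its RPL")
    return problems


# Just after [1]: registers still hold real-mode values, not selectors.
stale = protected_mode_violations(
    cs=Seg(selector=0xF000, type=3, dpl=0),
    ss=Seg(selector=0x1003, type=3, dpl=0),
    ds=Seg(selector=0x1003, type=3, dpl=0),
)
print(len(stale))  # 3

# After [3]: CS = 8 (code), SS = DS = 16 (flat data), everything passes.
clean = protected_mode_violations(
    cs=Seg(selector=8, type=11, dpl=0),
    ss=Seg(selector=16, type=3, dpl=0),
    ds=Seg(selector=16, type=3, dpl=0),
)
print(clean)  # []
```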

Paolo




* Re: Questions about the real mode in kvm/qemu
  2019-09-26  9:41             ` Maxim Levitsky
@ 2019-09-26 10:00               ` Paolo Bonzini
  2019-09-26 10:03                 ` Maxim Levitsky
  0 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2019-09-26 10:00 UTC (permalink / raw)
  To: Maxim Levitsky, Li Qiang; +Cc: Qemu Developers, Avi Kivity

On 26/09/19 11:41, Maxim Levitsky wrote:
>> Thinking more about it, I suppose that saves memory (the same EPT page
>> tables can now be used independent of guest CR0.PG), at the cost of
>> making TLB misses a little slower.
> Don't really understand what you mean. 
> Isn't this always the case that EPT and guest paging
> are independent (at least when no nesting is involved)?

There are two possibilities:

1) emulate CR0.PG=0 with EPT + identity page

- advantage: the EPT pages will be reused once the guest sets CR0.PG=1

- disadvantage: TLB misses have to walk two levels of page tables

2) emulate CR0.PG=0 with EPT disabled.  Similar to ept=0, CR3 will point
to PAE page tables that do the GPA->HPA translation.

- advantage: faster TLB misses

- disadvantage: need to build separate page tables for CR0.PG=1 (EPT
format) and CR0.PG=0 (PAE format), need to "waste" 4k of GPA space for
the identity map
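
The "slower TLB miss" cost of option 1 can be estimated with the usual worst-case formula for two-dimensional page walks. This is a back-of-the-envelope sketch that ignores paging-structure caches; the level counts in the example are assumptions (one guest level for the 4 MiB identity map, a four-level EPT, a three-level PAE table):

```python
def nested_walk_refs(guest_levels: int, ept_levels: int) -> int:
    """Worst-case memory references for one TLB miss under nested paging.

    Each guest table entry lives at a guest physical address, so reading
    it costs a full EPT walk plus the read itself; the final guest
    physical address then needs one more EPT walk.
    """
    return (guest_levels + 1) * (ept_levels + 1) - 1


def native_walk_refs(levels: int) -> int:
    """Without EPT, a TLB miss reads one entry per page-table level."""
    return levels


print(nested_walk_refs(1, 4))  # option 1: identity map on top of EPT -> 9
print(native_walk_refs(3))     # option 2: PAE tables, EPT disabled -> 3
```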

Paolo



* Re: Questions about the real mode in kvm/qemu
  2019-09-26 10:00               ` Paolo Bonzini
@ 2019-09-26 10:03                 ` Maxim Levitsky
  0 siblings, 0 replies; 17+ messages in thread
From: Maxim Levitsky @ 2019-09-26 10:03 UTC (permalink / raw)
  To: Paolo Bonzini, Li Qiang; +Cc: Qemu Developers, Avi Kivity

On Thu, 2019-09-26 at 12:00 +0200, Paolo Bonzini wrote:
> On 26/09/19 11:41, Maxim Levitsky wrote:
> > > Thinking more about it, I suppose that saves memory (the same EPT page
> > > tables can now be used independent of guest CR0.PG), at the cost of
> > > making TLB misses a little slower.
> > 
> > Don't really understand what you mean. 
> > Isn't this always the case that EPT and guest paging
> > are independent (at least when no nesting is involved)?
> 
> There are two possibilities:
> 
> 1) emulate CR0.PG=0 with EPT + identity page
> 
> - advantage: the EPT pages will be reused once the guest sets CR0.PG=1
> 
> - disadvantage: TLB misses have to walk two levels of page tables
> 
> 2) emulate CR0.PG=0 with EPT disabled.  Similar to ept=0, CR3 will point
> to PAE page tables that do the GPA->HPA translation.
> 
> - advantage: faster TLB misses
> 
> - disadvantage: need to build separate page tables for CR0.PG=1 (EPT
> format) and CR0.PG=0 (PAE format), need to "waste" 4k of GPA space for
> the identity map
Thanks for the explanation!

Best regards,
	Maxim Levitsky







* Re: Questions about the real mode in kvm/qemu
  2019-09-26  9:53     ` Paolo Bonzini
@ 2019-09-26 11:47       ` Li Qiang
  0 siblings, 0 replies; 17+ messages in thread
From: Li Qiang @ 2019-09-26 11:47 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Qemu Developers


On Thu, Sep 26, 2019 at 5:53 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 26/09/19 11:35, Li Qiang wrote:
> > So without unrestricted guest the main flow is this: KVM sets the
> > X86_EFLAGS_VM bit in the guest's RFLAGS, so when the vCPU enters guest
> > mode it runs in vm86 mode. In this mode the CPU forms addresses as in
> > real mode (seg*16+offset), and the result is a linear address. Since
> > vm86 still runs in protected mode, the linear address is translated
> > to a GPA by the identity mapping table, and then goes through the EPT
> > tables?
>
> Yes.
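
The addressing path confirmed above can be sketched as follows. This is
only an illustration with made-up function names, assuming 16-bit segment
and offset values; the identity-map stage is modeled as a translation that
returns its input, and the final GPA->HPA step through EPT is left out.

```c
#include <stdint.h>

/*
 * Real-mode / vm86 address formation: the segment value is shifted left
 * by 4 bits (multiplied by 16) and the offset is added. The CPU does this
 * itself; no emulator is involved.
 */
static uint32_t rm_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* Identity mapping: the linear address is used unchanged as the GPA. */
static uint32_t identity_gpa(uint32_t linear)
{
    return linear;
}
```

For example, seg=0xF000 and offset=0xFFF0 give the linear address 0xFFFF0,
which the identity map passes on as GPA 0xFFFF0.
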
>
> >     ... as soon as the guest tries to enter protected mode, it will get
> >     into a situation which is not real mode but doesn't have the segment
> >     registers properly loaded with selectors.
> >
> >     Therefore, it will either
> >     hack things together (enter_pmode) or emulate instructions until the
> >     state is accepted even without unrestricted guest support.
> >
> > Could you please explain this situation in more detail? Why does it happen?
>
> Protected mode entry looks like this:
>
>         mov %cr0, %eax
>         or $1, %al
>         mov %eax, %cr0
>         # [1] now in 16-bit protected mode
>         lgdtl gdt32
>         ljmpl $8, 2f
>         # [2] now in 32-bit protected mode
> 2:
>         .code32
>         mov $16, %ax
>         mov %ax, %ds
>         mov %ax, %es
>         mov %ax, %fs
>         mov %ax, %gs
>         mov %ax, %ss
>         # [3] now everything is okay
>
> Between [1] and [3] the vmentry could fail if unrestricted guest is not
> enabled.  For example (see checks on guest segment registers in the SDM):
>
> - "CS. Type must be 9, 11, 13, or 15 (accessed code segment)."  CS in
> real-mode is a RW data segment, not a code segment.  This applies
> between [1] and [2].
>
> - "SS. If the guest will not be virtual-8086 and the “unrestricted
> guest” VM-execution control is 0, the RPL (bits 1:0) must equal the RPL
> of the selector field for CS."  This may not be the case if the segment
> register still holds real-mode values (which are not selectors, just
> base >> 4).  This applies between [1] and [3].
>
> - "DS, ES, FS, GS. The DPL cannot be less than the RPL in the selector
> field"   Again, the real-mode DPL is zero but the RPL makes no sense if
> the segment registers hold a real-mode value.
>
> You can find more about these checks in guest_state_valid(); look at the
> "else" branch of that function, the "then" branch is for pmode->rmode
> transitions.  When any of the checks fail, KVM emulates instructions
> instead of using VMX non-root mode (usually it's just a handful of them,
> as in the case above).
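
The three quoted SDM checks can be written down as a small sketch. This is
a simplified illustration, not KVM's guest_state_valid(): the struct and
function names are hypothetical, only the three rules quoted above are
modeled, and the extra conditions the SDM attaches to them (for example
which segment types the DPL rule applies to) are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one guest segment register. */
struct seg_state {
    uint16_t selector; /* in real mode this is just base >> 4 */
    uint8_t  type;     /* 4-bit descriptor type field */
    uint8_t  dpl;      /* descriptor privilege level, 0..3 */
};

/* RPL is bits 1:0 of the selector. */
static unsigned seg_rpl(uint16_t selector)
{
    return selector & 3u;
}

/* "CS. Type must be 9, 11, 13, or 15 (accessed code segment)." */
static bool cs_type_ok(const struct seg_state *cs)
{
    return cs->type == 9 || cs->type == 11 ||
           cs->type == 13 || cs->type == 15;
}

/* "SS. ... the RPL must equal the RPL of the selector field for CS." */
static bool ss_rpl_ok(const struct seg_state *cs, const struct seg_state *ss)
{
    return seg_rpl(ss->selector) == seg_rpl(cs->selector);
}

/* "DS, ES, FS, GS. The DPL cannot be less than the RPL ..." */
static bool data_seg_ok(const struct seg_state *s)
{
    return s->dpl >= seg_rpl(s->selector);
}
```

A CS left over from real mode (type 3, a read/write accessed data segment)
fails cs_type_ok(), and a real-mode segment value whose low two bits happen
to be nonzero fails the RPL checks, which is when emulation takes over.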
>
>
Thanks so much for your explanation. I will read the code more to
strengthen my understanding.

Thanks,
Li Qiang



> Paolo
>
>



* Re: Questions about the real mode in kvm/qemu
  2019-09-26  9:18       ` Paolo Bonzini
  2019-09-26  9:24         ` Maxim Levitsky
@ 2019-09-28 22:10         ` Avi Kivity
  2019-09-29  7:39         ` Li Qiang
  2 siblings, 0 replies; 17+ messages in thread
From: Avi Kivity @ 2019-09-28 22:10 UTC (permalink / raw)
  To: Paolo Bonzini, Maxim Levitsky, Li Qiang; +Cc: Qemu Developers

On 9/26/19 12:18 PM, Paolo Bonzini wrote:
> On 26/09/19 10:59, Maxim Levitsky wrote:
>> If you mean to ask if there is a way to let guest access use no
>> paging at all, that is access host physical addresses directly, then
>> indeed there is no way, since regular non 'unrestricted guest' mode
>> requires both protected mode and paging, and 'unrestricted guest'
>> requires EPT. Academically speaking it is of course possible to
>> create paging tables that are 1:1...
> Not so academically, it's exactly what KVM does.  However, indeed it
> would also be possible to switch out of EPT mode when CR0.PG=0.  I'm not
> sure why it was done this way, maybe when the code was written it was
> simpler to use the identity map.
>
> Let's see if Avi is listening... :)


I think it was just simpler for the people who implemented it at the 
time. Switching out of EPT would have been a better solution as it would 
no longer require allocating guest physical address space with the few 
warts that requires.






* Re: Questions about the real mode in kvm/qemu
  2019-09-26  9:18       ` Paolo Bonzini
  2019-09-26  9:24         ` Maxim Levitsky
  2019-09-28 22:10         ` Avi Kivity
@ 2019-09-29  7:39         ` Li Qiang
  2 siblings, 0 replies; 17+ messages in thread
From: Li Qiang @ 2019-09-29  7:39 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Qemu Developers, Avi Kivity, Maxim Levitsky


On Thu, Sep 26, 2019 at 5:18 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 26/09/19 10:59, Maxim Levitsky wrote:
> > If you mean to ask if there is a way to let guest access use no
> > paging at all, that is access host physical addresses directly, then
> > indeed there is no way, since regular non 'unrestricted guest' mode
> > requires both protected mode and paging, and 'unrestricted guest'
> > requires EPT. Academically speaking it is of course possible to
> > create paging tables that are 1:1...
>
> Not so academically, it's exactly what KVM does.  However, indeed it
> would also be possible to switch out of EPT mode when CR0.PG=0.  I'm not
> sure why it was done this way, maybe when the code was written it was
> simpler to use the identity map.
>


Hi Paolo, what does 'switch out of EPT mode' mean? Do you mean that
something can be done to disable EPT while the guest is in real mode
emulation (vm86)? I can't find such code, so it seems my understanding
is wrong.

Thanks,
Li Qiang



>
> Let's see if Avi is listening... :)
>
> Paolo
>



end of thread, other threads:[~2019-09-29  7:41 UTC | newest]

Thread overview: 17+ messages
2019-09-26  7:52 Questions about the real mode in kvm/qemu Li Qiang
2019-09-26  8:31 ` Maxim Levitsky
2019-09-26  8:52   ` Li Qiang
2019-09-26  8:59     ` Maxim Levitsky
2019-09-26  9:18       ` Paolo Bonzini
2019-09-26  9:24         ` Maxim Levitsky
2019-09-26  9:33           ` Paolo Bonzini
2019-09-26  9:41             ` Maxim Levitsky
2019-09-26 10:00               ` Paolo Bonzini
2019-09-26 10:03                 ` Maxim Levitsky
2019-09-28 22:10         ` Avi Kivity
2019-09-29  7:39         ` Li Qiang
2019-09-26  9:15 ` Paolo Bonzini
2019-09-26  9:35   ` Maxim Levitsky
2019-09-26  9:35   ` Li Qiang
2019-09-26  9:53     ` Paolo Bonzini
2019-09-26 11:47       ` Li Qiang
