kvm.vger.kernel.org archive mirror
From: Paolo Bonzini <pbonzini@redhat.com>
To: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Gleb Natapov <gleb@redhat.com>,
	"Zhang, Yang Z" <yang.z.zhang@intel.com>,
	kvm <kvm@vger.kernel.org>,
	Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>,
	"Nakajima, Jun" <jun.nakajima@intel.com>,
	Arthur Chunqi Li <yzt356@gmail.com>
Subject: Re: [PATCH v2 5/8] KVM: nVMX: Fix guest CR3 read-back on VM-exit
Date: Wed, 07 Aug 2013 15:32:37 +0200
Message-ID: <52024C75.9000304@redhat.com>
In-Reply-To: <520241A7.5060301@siemens.com>

On 08/07/2013 02:46 PM, Jan Kiszka wrote:
> On 2013-08-07 14:39, Gleb Natapov wrote:
>> On Tue, Aug 06, 2013 at 05:57:02PM +0200, Jan Kiszka wrote:
>>> On 2013-08-06 17:53, Gleb Natapov wrote:
>>>> On Tue, Aug 06, 2013 at 05:48:54PM +0200, Jan Kiszka wrote:
>>>>> On 2013-08-06 17:04, Zhang, Yang Z wrote:
>>>>>> Gleb Natapov wrote on 2013-08-06:
>>>>>>> On Tue, Aug 06, 2013 at 02:12:51PM +0000, Zhang, Yang Z wrote:
>>>>>>>> Gleb Natapov wrote on 2013-08-06:
>>>>>>>>> On Tue, Aug 06, 2013 at 11:44:41AM +0000, Zhang, Yang Z wrote:
>>>>>>>>>> Gleb Natapov wrote on 2013-08-06:
>>>>>>>>>>> On Tue, Aug 06, 2013 at 10:39:59AM +0200, Jan Kiszka wrote:
>>>>>>>>>>>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>>>>>>>>
>>>>>>>>>>>> If nested EPT is enabled, the L2 guest may change CR3 without any
>>>>>>>>>>>> exits. We therefore have to read the current value from the VMCS
>>>>>>>>>>>> when switching to L1. However, if paging wasn't enabled, L0 tracks
>>>>>>>>>>>> L2's CR3, and GUEST_CR3 rather contains the real-mode identity map.
>>>>>>>>>>>> So we need to retrieve CR3 from the architectural state after
>>>>>>>>>>>> conditionally updating it - and this is what kvm_read_cr3 does.
>>>>>>>>>>>>
>>>>>>>>>>> I have a headache from trying to think about it already, but
>>>>>>>>>>> shouldn't L1 be the one who sets up the identity map for L2? I traced
>>>>>>>>>>> what vmcs_read64(GUEST_CR3)/kvm_read_cr3(vcpu) return here and do not
>>>>>>>>>>> see
>>>>>>>>>> Here is my understanding:
>>>>>>>>>> In vmx_set_cr3(), if EPT is enabled, it checks whether the target
>>>>>>>>>> vcpu has paging enabled. When L2 is running in real mode, the
>>>>>>>>>> target vcpu does not have paging enabled, so it will use L0's
>>>>>>>>>> identity map for L2. If you read GUEST_CR3 from the VMCS, then you
>>>>>>>>>> may get L2's identity map, not L1's.
>>>>>>>>>>
>>>>>>>>> Yes, but why does it make sense to use L0's identity map for L2? I
>>>>>>>>> didn't see different vmcs_read64(GUEST_CR3)/kvm_read_cr3(vcpu) values
>>>>>>>>> because L0 and L1 use the same identity map address. When I changed
>>>>>>>>> the identity map address that L1 configures,
>>>>>>>>> vmcs_read64(GUEST_CR3)/kvm_read_cr3(vcpu) are indeed different, but
>>>>>>>>> the real CR3 that L2 uses points to L0's identity map. If I zero L1's
>>>>>>>>> identity map page, L2 still works.
>>>>>>>>>
>>>>>>>> If L2 is in real mode, then L2PA == L1PA. So L0's identity map also
>>>>>>>> works if L2 is in real mode.
>>>>>>>>
>>>>>>> That's not the point. It may work accidentally for kvm on kvm, but what
>>>>>>> if another hypervisor plays different tricks and builds a different
>>>>>>> identity map for its guest?
>>>>>> Yes, if another hypervisor doesn't build the 1:1 mapping for its guest,
>>>>>> it will fail to work. But I cannot imagine what kind of hypervisor would
>>>>>> do this or what the purpose would be.
>>>>>> Anyway, the current logic is definitely wrong. It should use L1's
>>>>>> identity map instead of L0's.
>>>>>
>>>>> So something like this is rather needed?
>>>>>
>>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>>> index 44494ed..60a3644 100644
>>>>> --- a/arch/x86/kvm/vmx.c
>>>>> +++ b/arch/x86/kvm/vmx.c
>>>>> @@ -3375,8 +3375,10 @@ static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
>>>>>   	if (enable_ept) {
>>>>>   		eptp = construct_eptp(cr3);
>>>>>   		vmcs_write64(EPT_POINTER, eptp);
>>>>> -		guest_cr3 = is_paging(vcpu) ? kvm_read_cr3(vcpu) :
>>>>> -			vcpu->kvm->arch.ept_identity_map_addr;
>>>>> +		if (is_paging(vcpu) || is_guest_mode(vcpu))
>>>>> +			guest_cr3 = kvm_read_cr3(vcpu);
>>>>> +		else
>>>>> +			guest_cr3 = vcpu->kvm->arch.ept_identity_map_addr;
>>>>>   		ept_load_pdptrs(vcpu);
>>>>>   	}
>>>>>
>>>> That's what I am thinking; I will think about it some more tomorrow.
>>>
>>> OK, and I'll feed it into a local test.
>>>
>> Thought about it some more. So without nested unrestricted guest (nUG)
>> is_paging() will always be true (since without nUG guest entry is not
>> possible otherwise) and guest's cr3 will be used, but with nUG identity
>> map is not used (that is why L2 still works even though wrong identity
>> map pointer is assigned to cr3), so the code here just corrupts nested
>> guest's cr3 for no reason and that is why you had to use kvm_read_cr3()
>> in prepare_vmcs12() to get correct cr3 value. The patch above should be
>> used instead of original one IMO. How is testing going?
>
> Yes, testing worked fine. I've queued above patch and will send it out
> within the next round.

Just reply here with the commit message you desire and Signed-off-by, so 
I can queue it for people who wish to play with nEPT.

Paolo
