public inbox for kvm@vger.kernel.org
From: Chao Gao <chao.gao@intel.com>
To: "Huang, Kai" <kai.huang@intel.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>, "Christopherson,,
	Sean" <seanjc@google.com>,
	"binbin.wu@linux.intel.com" <binbin.wu@linux.intel.com>,
	"Guo, Xuelian" <xuelian.guo@intel.com>,
	"robert.hu@linux.intel.com" <robert.hu@linux.intel.com>
Subject: Re: [PATCH v7 2/5] KVM: x86: Virtualize CR3.LAM_{U48,U57}
Date: Fri, 21 Apr 2023 23:32:17 +0800	[thread overview]
Message-ID: <ZEKsgceQo6MEWbB5@chao-email> (raw)
In-Reply-To: <7895c517a84300f903cb04fbf2f05c4b8e518c91.camel@intel.com>

On Fri, Apr 21, 2023 at 07:43:52PM +0800, Huang, Kai wrote:
>On Fri, 2023-04-21 at 14:35 +0800, Binbin Wu wrote:
>> On 4/13/2023 5:13 PM, Huang, Kai wrote:
>> > > On 4/13/2023 10:27 AM, Huang, Kai wrote:
>> > > > On Thu, 2023-04-13 at 09:36 +0800, Binbin Wu wrote:
>> > > > > On 4/12/2023 7:58 PM, Huang, Kai wrote:
>> > > > > 
>> > > ...
>> > > > > > > +	root_gfn = (root_pgd & __PT_BASE_ADDR_MASK) >> PAGE_SHIFT;
>> > > > > > Or, should we explicitly mask vcpu->arch.cr3_ctrl_bits?  In this
>> > > > > > way, below
>> > > > > > mmu_check_root() may potentially catch other invalid bits, but in
>> > > > > > practice there should be no difference I guess.
>> > > > > In previous version, vcpu->arch.cr3_ctrl_bits was used as the mask.
>> > > > > 
>> > > > > However, Sean pointed out that the return value of
>> > > > > mmu->get_guest_pgd(vcpu) could be an EPTP for the nested case, so it
>> > > > > is not rational to mask off CR3 bit(s) from an EPTP.
>> > > > Yes, although EPTP's high bits don't contain any control bits.
>> > > > 
>> > > > But perhaps we want to make it future-proof in case some more control
>> > > > bits are added to EPTP too.
>> > > > 
>> > > > > Since the guest pgd has been checked for validity, for both CR3 and
>> > > > > EPTP, it is safe to mask off the non-address bits to get the GFN.
>> > > > > 
>> > > > > Maybe I should add this CR3 vs. EPTP part to the changelog to make it
>> > > > > more understandable.
>> > > > This isn't necessary, and can/should be done in comments if needed.
>> > > > 
>> > > > But IMHO you may want to add more material to explain how nested cases
>> > > > are handled.
>> > > Do you mean about CR3 or others?
>> > > 
>> > This patch is about CR3, so CR3.
>> 
>> For the nested case, I plan to add the following in the changelog:
>> 
>>      For nested guest:
>>      - If CR3 is intercepted, after CR3 is handled in L1, CR3 will be
>>        checked in nested_vmx_load_cr3() before returning back to L2.
>>      - For the shadow paging case (SPT02), LAM bits are also handled to
>>        form a new shadow CR3 in vmx_load_mmu_pgd().
>> 
>> 
>
>I don't know a lot of the code details of KVM's nested code, but in general,
>since your code only changes nested_vmx_load_cr3() and
>nested_vmx_check_host_state(), the changelog should focus on explaining why
>modifying these two functions is good enough.

The patch relaxes all existing checks on CR3, allowing the previously reserved
LAM bits to be set. That should be sufficient; otherwise, some checks on CR3
were missing in the first place.

>
>And to explain this, I think we should explain from hardware's perspective
>rather than from shadow paging's perspective.
>
>From L0's perspective, we care about:
>
>	1) L1's CR3 register (VMCS01's GUEST_CR3)
>	2) L1's VMCS to run L2 guest
>		2.1) VMCS12's HOST_CR3 
>		2.2) VMCS12's GUEST_CR3
>
>For 1) the current changelog has explained (that we need to catch _active_
>control bits in guest's CR3 etc).  For nested (case 2)), L1 can choose to
>intercept CR3 or not.  But this isn't the point because from hardware's point of
>view we actually care about below two cases instead:
>
>	1) L0 to emulate a VMExit from L2 to L1 by VMENTER using VMCS01 
>	   to reflect VMCS12
>	2) L0 to VMENTER to L2 using VMCS02 directly
>
>Case 2) can of course fail due to a failed VMENTER to L2, but this should
>result in L0 doing a VMENTER to L1 with an emulated VMEXIT from L2 to L1,
>which is case 1).
>
>For case 1) we need new code to check VMCS12's HOST_CR3 against guest's _active_
>CR3 feature control bits.  The key code path is:
>
>	vmx_handle_exit()
>		-> prepare_vmcs12()
>		-> load_vmcs12_host_state().  
>
>For case 2) I _think_ we need new code to check both VMCS12's HOST_CR3 and
>GUEST_CR3 against active control bits.  The key code path is 

...

>
>	nested_vmx_run()
>		-> nested_vmx_check_host_state()
>		-> nested_vmx_enter_non_root_mode()
>			-> prepare_vmcs02_early()
>			-> prepare_vmcs02()
>
>Since nested_vmx_load_cr3() is used in both VMENTER using VMCS12's HOST_CR3
>(VMEXIT to L1) and GUEST_CR3 (VMENTER to L2), and it currently already checks
>kvm_vcpu_is_illegal_gpa(vcpu, cr3), changing it to additionally check guest CR3
>active control bits seems just enough.

IMO, current KVM relies on hardware to do the consistency checks on GUEST_CR3
during VM entry.

>
>Also, nested_vmx_check_host_state() (i.e. it is called in nested_vmx_run() to
>return early if any HOST state is wrong) currently checks
>kvm_vcpu_is_illegal_gpa(vcpu, cr3) too so we should also change it.
>
>That being said, I do find it's not easy to come up with a "concise" changelog
>to cover both the non-nested and (especially) nested cases, but it seems we can
>distill something from my write-up above.
>
>My suggestion is to focus on the hardware behaviour's perspective as described
>above.  But I believe Sean can easily produce a "concise" changelog if he wants
>to comment here :)


Thread overview: 48+ messages
2023-04-04 13:09 [PATCH v7 0/5] Linear Address Masking (LAM) KVM Enabling Binbin Wu
2023-04-04 13:09 ` [PATCH v7 1/5] KVM: x86: Virtualize CR4.LAM_SUP Binbin Wu
2023-04-04 13:09 ` [PATCH v7 2/5] KVM: x86: Virtualize CR3.LAM_{U48,U57} Binbin Wu
2023-04-06 12:57   ` Huang, Kai
2023-04-09 11:36     ` Binbin Wu
2023-04-11 23:11       ` Huang, Kai
2023-04-12 11:58   ` Huang, Kai
2023-04-13  1:36     ` Binbin Wu
2023-04-13  2:27       ` Huang, Kai
2023-04-13  4:45         ` Binbin Wu
2023-04-13  9:13           ` Huang, Kai
2023-04-21  6:35             ` Binbin Wu
2023-04-21 11:43               ` Huang, Kai
2023-04-21 15:32                 ` Chao Gao [this message]
2023-04-22  4:51                   ` Chao Gao
2023-04-22  8:14                     ` Huang, Kai
2023-04-22  3:32                 ` Binbin Wu
2023-04-22  4:43                   ` Chao Gao
2023-04-27 13:19                     ` Huang, Kai
2023-04-29  4:56                       ` Binbin Wu
2023-04-25 22:48                   ` Huang, Kai
2023-04-26  3:05                     ` Chao Gao
2023-04-26  5:13                       ` Binbin Wu
2023-04-26  8:44                         ` Huang, Kai
2023-04-26  8:50                           ` Binbin Wu
2023-04-26  8:43                       ` Huang, Kai
2023-04-26 10:52                         ` Binbin Wu
2023-04-27 13:23                           ` Huang, Kai
2023-04-17  7:24   ` Chao Gao
2023-04-17  8:02     ` Binbin Wu
2023-04-04 13:09 ` [PATCH v7 3/5] KVM: x86: Introduce untag_addr() in kvm_x86_ops Binbin Wu
2023-04-18  3:08   ` Zeng Guang
2023-04-18  3:34     ` Binbin Wu
2023-04-19  2:30   ` Chao Gao
2023-04-19  3:08     ` Binbin Wu
2023-04-21  7:48       ` Binbin Wu
2023-04-21  8:21         ` Chao Gao
2023-04-04 13:09 ` [PATCH v7 4/5] KVM: x86: Untag address when LAM applicable Binbin Wu
2023-04-06 13:20   ` Huang, Kai
2023-04-10  3:35     ` Binbin Wu
2023-04-18  3:28   ` Zeng Guang
2023-04-18  3:38     ` Binbin Wu
2023-04-19  6:43   ` Chao Gao
2023-04-21  7:57     ` Binbin Wu
2023-04-21  8:36       ` Chao Gao
2023-04-21  9:13         ` Binbin Wu
2023-04-04 13:09 ` [PATCH v7 5/5] KVM: x86: Expose LAM feature to userspace VMM Binbin Wu
2023-04-21  9:40 ` [PATCH v7 0/5] Linear Address Masking (LAM) KVM Enabling Binbin Wu