From: Jason Chen CJ <jason.cj.chen@intel.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Dmytro Maluka <dmy@semihalf.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"android-kvm@google.com" <android-kvm@google.com>,
	Dmitry Torokhov <dtor@chromium.org>,
	Tomasz Nowicki <tn@semihalf.com>,
	Grzegorz Jaszczyk <jaz@semihalf.com>,
	Keir Fraser <keirf@google.com>
Subject: Re: [RFC PATCH part-5 00/22] VMX emulation
Date: Tue, 20 Jun 2023 15:46:21 +0000
Message-ID: <ZJHJzU607QOYeRM3@jiechen-ubuntu-dev>
In-Reply-To: <ZIjInENnK5/L/Jsd@google.com>

On Tue, Jun 13, 2023 at 12:50:52PM -0700, Sean Christopherson wrote:
> On Fri, Jun 09, 2023, Dmytro Maluka wrote:
> > On 6/9/23 04:07, Chen, Jason CJ wrote:
> > > I think with PV design, we can benefit from skip shadowing. For example, a TLB flush
> > > could be done in hypervisor directly, while shadowing EPT need emulate it by destroy
> > > shadow EPT page table entries then do next shadowing upon ept violation.
> 
> This is a bit misleading.  KVM has an effective TLB for nested TDP only for 4KiB
> pages; larger shadow pages are never allowed to go out-of-sync, i.e. KVM doesn't
> wait until L1 does a TLB flush to update SPTEs.  KVM does "unload" roots, e.g. to
> emulate INVEPT, but that usually just ends up being an extra slow TLB flush in L0,
> because nested TDP SPTEs rarely go unsync in practice.  The patterns for hypervisors
> managing VM memory don't typically trigger the types of PTE modifications that
> result in unsync SPTEs.
> 
> I actually have a (very tiny) patch sitting around somewhere to disable unsync support
> when TDP is enabled.  There is a very, very theoretical bug where KVM might fail
> to honor when a guest TDP PTE change is architecturally supposed to be visible,
> and the simplest fix (by far) is to disable unsync support.  Disabling TDP+unsync
> is a viable fix because unsync support is almost never used for nested TDP.  Legacy
> shadow paging on the other hand *significantly* benefits from unsync support, e.g.
> when the guest is managing CoW mappings. I haven't gotten around to posting the
> patch to disable unsync on TDP purely because the flaw is almost comically theoretical.
> 
> Anyways, the point is that the TLB flushing side of nested TDP isn't all that
> interesting.

Agreed, thanks for pointing that out! I was basing my thinking on a
comparison with the current RFC pKVM-on-x86 solution. :-(

To me, the KVM page table shadowing mechanism (e.g., unsync & sync page)
is too heavy & complicated. If we have the KPOP solution, IIUC, we may be
able to remove all of the shadowing code entirely, right? :-)

BTW, KPOP may raise questions about how to support access tracking &
dirty page logging, which may require extending the PV interfaces
further. MMIO faults could be another issue if we want to keep the
optimization based on EPT MISCONFIG for IA platforms.

> 
> > Yeah indeed, good point.
> > 
> > Is my understanding correct: TLB flush is still gonna be requested by
> > the host VM via a hypercall, but the benefit is that the hypervisor
> > merely needs to do INVEPT?
> 
> Maybe?  A paravirt paging scheme could do whatever it wanted.  The APIs could be
> designed in such a way that L1 never needs to explicitly request a TLB flush,
> e.g. if the contract is that changes must always become immediately visible to L2.
> 
> And TLB flushing is but one small aspect of page table shadowing.  With PV paging,
> L1 wouldn't need to manage hardware-defined page tables, i.e. could use any arbitrary
> data type.  E.g. KVM as L1 could use an XArray to track L2 mappings.  And L0 in
> turn wouldn't need to have vendor specific code, i.e. pKVM on x86 (potentially
> *all* architectures) could have a single nested paging scheme for both Intel and
> AMD, as opposed to needing code to deal with the differences between EPT and NPT.
> 
> A few months back, I mentally worked through the flows[*] (I forget why I was
> thinking about PV paging), and I'm pretty sure that adapting x86's TDP MMU to
> support PV paging would be easy-ish, e.g. kvm_tdp_mmu_map() would become an
> XArray insertion (to track the L2 mapping) + hypercall (to inform L1 of the new
> mapping).
> 
> [*] I even thought of a catchy name, KVM Paravirt Only Paging, a.k.a. KPOP ;-)
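
To make the map flow described above a bit more concrete (record the L2
mapping in a software structure, then notify the hypervisor via a
hypercall), here is a rough user-space sketch. Everything in it is an
assumption for illustration only: the kpop_* names, the fixed-size probed
table standing in for an XArray, the permission bits, and the stubbed
hypercall are all hypothetical and are not existing KVM or pKVM
interfaces.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; this is not an existing KVM/pKVM API. */
#define KPOP_SLOTS 64u	/* tiny fixed table standing in for an XArray */

struct kpop_mapping {
	bool     present;
	uint64_t gfn;	/* L2 guest frame number, used as the key */
	uint64_t pfn;	/* backing frame number */
	uint32_t prot;	/* vendor-neutral permission bits (assumed encoding) */
};

struct kpop_mmu {
	struct kpop_mapping slots[KPOP_SLOTS];
	unsigned int hypercalls;	/* number of map notifications issued */
};

/*
 * Stub for the "new mapping" hypercall. A real implementation would trap
 * to the hypervisor; here we only count the call so the flow is testable.
 */
static int kpop_hypercall_map(struct kpop_mmu *mmu, uint64_t gfn,
			      uint64_t pfn, uint32_t prot)
{
	(void)gfn;
	(void)pfn;
	(void)prot;
	mmu->hypercalls++;
	return 0;
}

/*
 * Look up @gfn with linear probing. Returns the matching entry, or the
 * first free slot when @want_free is true, or NULL otherwise.
 * Entries are never deleted in this sketch, so hitting a free slot means
 * the key is absent.
 */
static struct kpop_mapping *kpop_find(struct kpop_mmu *mmu, uint64_t gfn,
				      bool want_free)
{
	for (unsigned int i = 0; i < KPOP_SLOTS; i++) {
		struct kpop_mapping *m = &mmu->slots[(gfn + i) % KPOP_SLOTS];

		if (m->present && m->gfn == gfn)
			return m;
		if (!m->present)
			return want_free ? m : NULL;
	}
	return NULL;	/* table full and key not found */
}

/* Track the L2 mapping locally, then notify the hypervisor. */
static int kpop_map(struct kpop_mmu *mmu, uint64_t gfn, uint64_t pfn,
		    uint32_t prot)
{
	struct kpop_mapping *m = kpop_find(mmu, gfn, true);

	if (!m)
		return -1;	/* no room left in the stand-in table */

	m->present = true;
	m->gfn = gfn;
	m->pfn = pfn;
	m->prot = prot;

	return kpop_hypercall_map(mmu, gfn, pfn, prot);
}
```

A caller would zero-initialize a struct kpop_mmu, call kpop_map() once per
new mapping, and use kpop_find(..., false) for lookups. Note that nothing
here depends on the EPT or NPT entry format, which is the property being
discussed; how unmap, access tracking and dirty logging would fit into
such an interface is left open.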

-- 

Thanks
Jason CJ Chen
