public inbox for linux-kernel@vger.kernel.org
From: Sean Christopherson <seanjc@google.com>
To: Isaku Yamahata <isaku.yamahata@intel.com>
Cc: Rick P Edgecombe <rick.p.edgecombe@intel.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	 Kai Huang <kai.huang@intel.com>,
	 "federico.parola@polito.it" <federico.parola@polito.it>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	 "isaku.yamahata@gmail.com" <isaku.yamahata@gmail.com>,
	"dmatlack@google.com" <dmatlack@google.com>,
	 "michael.roth@amd.com" <michael.roth@amd.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	 isaku.yamahata@linux.intel.com
Subject: Re: [PATCH v2 07/10] KVM: x86: Always populate L1 GPA for KVM_MAP_MEMORY
Date: Tue, 16 Apr 2024 07:22:21 -0700	[thread overview]
Message-ID: <Zh6JndoNGqEhCpCR@google.com> (raw)
In-Reply-To: <20240416014931.GW3039520@ls.amr.corp.intel.com>

On Mon, Apr 15, 2024, Isaku Yamahata wrote:
> On Mon, Apr 15, 2024 at 02:17:02PM -0700,
> Sean Christopherson <seanjc@google.com> wrote:
> 
> > > > - Return error on guest mode or SMM mode:  Without this patch.
> > > >   Pros: No additional patch.
> > > >   Cons: Difficult to use.
> > > 
> > > Hmm... For the non-TDX use cases this is just an optimization, right? For TDX
> > > there shouldn't be an issue. If so, maybe this last one is not so horrible.
> > 
> > And the fact that there are so many variables to control (MAXPHYADDR, SMM, and guest_mode)
> > basically invalidates the argument that returning an error makes the ioctl() hard
> > to use.  I can imagine it might be hard to squeeze this ioctl() into QEMU's
> > existing code, but I don't buy that the ioctl() itself is hard to use.
> > 
> > Literally the only thing userspace needs to do is set CPUID to implicitly select
> > between 4-level and 5-level paging.  If userspace wants to pre-map memory during
> > live migration, or when jump-starting the guest with pre-defined state, simply
> > pre-map memory before stuffing guest state.  In and of itself, that doesn't seem
> > difficult, e.g. at a quick glance, QEMU could add a hook somewhere in
> > kvm_vcpu_thread_fn() without too much trouble (though that comes with a huge
> > disclaimer that I only know enough about how QEMU manages vCPUs to be dangerous).
> > 
> > I would describe the overall cons for this patch versus returning an error
> > differently.  Switching MMU state puts the complexity in the kernel.  Returning
> > an error punts any complexity to userspace.  Specifically, anything that KVM can
> > do regarding vCPU state to get the right MMU, userspace can do too.
> >  
> > Add on that silently doing things that effectively ignore guest state usually
> > ends badly, and I don't see a good argument for this patch (or any variant
> > thereof).
> 
> Ok, here is an experimental patch on top of 7/10 to return an error.  Is this
> the right direction, or do we want to invoke the KVM page fault handler without
> any check?
> 
> I can see the following options.
> 
> - Error if vCPU is in SMM mode or guest mode: This patch
>   Defer the decision until the use cases come up.  We can utilize
>   KVM_CAP_MAP_MEMORY and struct kvm_map_memory.flags for future
>   enhancement.
>   Pro: Keeps room for future enhancement once the currently unclear use
>        cases materialize.
>   Con: The userspace VMM has to check/switch the vCPU mode.
> 
> - No check of vCPU mode and go on
>   Pro: It works.
>   Con: Unclear how the uAPI should be without concrete use cases.
> 
> - Always populate with L1 GPA:
>   This is a bad idea.

Not always.  If L1 is using shadow paging, then L1 and L2 GPAs are in the same
domain from KVM's perspective.

As I said in v1 (I think it was v1), whether or not mapping an L1 GPA is supported
should be a property of the mmu, not an explicit check.  As far as the TDP MMU is
concerned, there's nothing special about SMM nor is there anything special about
guest_mode, except for the fact that they use different roots than "normal" mode.
But that part Just Works.

And if L1 is using TDP, i.e. KVM is shadowing L1's TDP page tables, and L2 is
active, then vcpu->arch.mmu will point at a variation of the shadow MMU, e.g.
the PTTYPE_EPT MMU on Intel CPUs.  The shadow MMU won't support pre-mapping GPAs
because it's nonsensical (legacy shadow paging needs a GVA, nested TDP needs an
L2 GPA), and so the ioctl() fails because mmu->map_gpa or whatever is NULL.

In other words, unless I'm forgetting something, explicit checks for "unsupported"
modes should be unnecessary, because

Thread overview: 42+ messages
2024-04-10 22:07 [PATCH v2 00/10] KVM: Guest Memory Pre-Population API isaku.yamahata
2024-04-10 22:07 ` [PATCH v2 01/10] KVM: Document KVM_MAP_MEMORY ioctl isaku.yamahata
2024-04-15 23:27   ` Edgecombe, Rick P
2024-04-15 23:47     ` Isaku Yamahata
2024-04-17 11:56     ` Paolo Bonzini
2024-04-10 22:07 ` [PATCH v2 02/10] KVM: Add KVM_MAP_MEMORY vcpu ioctl to pre-populate guest memory isaku.yamahata
2024-04-16 14:20   ` Edgecombe, Rick P
2024-04-10 22:07 ` [PATCH v2 03/10] KVM: x86/mmu: Extract __kvm_mmu_do_page_fault() isaku.yamahata
2024-04-16  8:22   ` Chao Gao
2024-04-16 23:43     ` Isaku Yamahata
2024-04-16 14:36   ` Edgecombe, Rick P
2024-04-16 23:52     ` Isaku Yamahata
2024-04-17 15:41       ` Paolo Bonzini
2024-04-10 22:07 ` [PATCH v2 04/10] KVM: x86/mmu: Make __kvm_mmu_do_page_fault() return mapped level isaku.yamahata
2024-04-16 14:40   ` Edgecombe, Rick P
2024-04-16 23:59     ` Isaku Yamahata
2024-04-10 22:07 ` [PATCH v2 05/10] KVM: x86/mmu: Introduce kvm_tdp_map_page() to populate guest memory isaku.yamahata
2024-04-16 14:46   ` Edgecombe, Rick P
2024-04-17 18:39     ` Isaku Yamahata
2024-04-17  7:04   ` Chao Gao
2024-04-17 18:44     ` Isaku Yamahata
2024-04-10 22:07 ` [PATCH v2 06/10] KVM: x86: Implement kvm_arch_vcpu_map_memory() isaku.yamahata
2024-04-16 15:12   ` Edgecombe, Rick P
2024-04-17  7:20   ` Chao Gao
2024-04-17 12:18   ` Paolo Bonzini
2024-04-10 22:07 ` [PATCH v2 07/10] KVM: x86: Always populate L1 GPA for KVM_MAP_MEMORY isaku.yamahata
2024-04-15 19:12   ` Edgecombe, Rick P
2024-04-15 21:17     ` Sean Christopherson
2024-04-15 21:36       ` Edgecombe, Rick P
2024-04-15 22:59         ` Sean Christopherson
2024-04-16  1:49       ` Isaku Yamahata
2024-04-16 14:22         ` Sean Christopherson [this message]
2024-04-16 21:41       ` Paolo Bonzini
2024-04-16 23:00         ` Sean Christopherson
2024-04-17 10:28           ` Paolo Bonzini
2024-04-15 19:37   ` Edgecombe, Rick P
2024-04-16 17:11   ` Edgecombe, Rick P
2024-04-10 22:07 ` [PATCH v2 08/10] KVM: x86: Add a hook in kvm_arch_vcpu_map_memory() isaku.yamahata
2024-04-16 14:57   ` Edgecombe, Rick P
2024-04-17 12:26   ` Paolo Bonzini
2024-04-10 22:07 ` [PATCH v2 09/10] KVM: SVM: Implement pre_mmu_map_page() to refuse KVM_MAP_MEMORY isaku.yamahata
2024-04-10 22:07 ` [PATCH v2 10/10] KVM: selftests: x86: Add test for KVM_MAP_MEMORY isaku.yamahata
