Re: [PATCH v3 2/3] KVM: x86: Use kvzalloc() to allocate VM struct

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: Kai Huang <kai.huang@intel.com>
Cc: "pbonzini@redhat.com" <pbonzini@redhat.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	 "vipinsh@google.com" <vipinsh@google.com>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v3 2/3] KVM: x86: Use kvzalloc() to allocate VM struct
Date: Tue, 20 May 2025 16:11:29 -0700	[thread overview]
Message-ID: <aC0MIUOTQbb9-a7k@google.com> (raw)
In-Reply-To: <dca247173aace1269ce8512ae2d3797289bb1718.camel@intel.com>

On Tue, May 20, 2025, Kai Huang wrote:
> On Mon, 2025-05-19 at 08:39 -0700, Sean Christopherson wrote:
> > +static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
> > +					enum pg_level level, kvm_pfn_t pfn)
> >  {
> >  	struct page *page = pfn_to_page(pfn);
> >  	int ret;
> > @@ -3507,10 +3507,14 @@ int __init tdx_bringup(void)
> >  	r = __tdx_bringup();
> >  	if (r) {
> >  		/*
> > -		 * Disable TDX only but don't fail to load module if
> > -		 * the TDX module could not be loaded.  No need to print
> > -		 * message saying "module is not loaded" because it was
> > -		 * printed when the first SEAMCALL failed.
> > +		 * Disable TDX only but don't fail to load module if the TDX
> > +		 * module could not be loaded.  No need to print message saying
> > +		 * "module is not loaded" because it was printed when the first
> > +		 * SEAMCALL failed.  Don't bother unwinding the S-EPT hooks or
> > +		 * vm_size, as kvm_x86_ops have already been finalized (and are
> > +		 * intentionally not exported).  The S-EPT code is unreachable,
> > +		 * and allocating a few more bytes per VM in a should-be-rare
> > +		 * failure scenario is a non-issue.
> >  		 */
> >  		if (r == -ENODEV)
> >  			goto success_disable_tdx;
> > @@ -3524,3 +3528,19 @@ int __init tdx_bringup(void)
> >  	enable_tdx = 0;
> >  	return 0;
> >  }
> > +
> > +
> > +void __init tdx_hardware_setup(void)
> > +{
> > +	/*
> > +	 * Note, if the TDX module can't be loaded, KVM TDX support will be
> > +	 * disabled but KVM will continue loading (see tdx_bringup()).
> > +	 */
> 
> This comment seems a little bit weird to me.  I think what you meant here is the
> @vm_size and those S-EPT ops are not unwound while TDX cannot be brought up but
> KVM is still loaded.

This comment is weird?  Or the one in tdx_bringup() is weird?  The sole intent
of _this_ comment is to clarify that KVM could still end up running load with TDX
disabled.  The comment about not unwinding S-EPT resides in tdx_bringup(), because
that's where the actual decision to not reject KVM load and to not undo the setup
lives.

> > +
> > +	vt_x86_ops.link_external_spt = tdx_sept_link_private_spt;
> > +	vt_x86_ops.set_external_spte = tdx_sept_set_private_spte;
> > +	vt_x86_ops.free_external_spt = tdx_sept_free_private_spt;
> > +	vt_x86_ops.remove_external_spte = tdx_sept_remove_private_spte;
> > +	vt_x86_ops.protected_apic_has_interrupt = tdx_protected_apic_has_interrupt;
> > +}
> > diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
> > index 51f98443e8a2..ca39a9391db1 100644
> > --- a/arch/x86/kvm/vmx/tdx.h
> > +++ b/arch/x86/kvm/vmx/tdx.h
> > @@ -8,6 +8,7 @@
> >  #ifdef CONFIG_KVM_INTEL_TDX
> >  #include "common.h"
> >  
> > +void tdx_hardware_setup(void);
> >  int tdx_bringup(void);
> >  void tdx_cleanup(void);
> >  
> 
> There's a build error when CONFIG_KVM_INTEL_TDX is off:
> 
> vmx/main.c: In function ‘vt_hardware_setup’:
> vmx/main.c:34:17: error: implicit declaration of function ‘tdx_hardware_setup’;
> did you mean ‘vmx_hardware_setup’? [-Wimplicit-function-declaration]
>    34 |                 tdx_hardware_setup();
>       |                 ^~~~~~~~~~~~~~~~~~
>       |                 vmx_hardware_setup
> 
> .. for which you need a stub for tdx_hardware_setup() when CONFIG_KVM_INTEL_TDX
> is off.

Not in kvm-x86/next, commit 907092bf7cbd ("KVM: VMX: Clean up and macrofy x86_ops")
buried all of vt_hardware_setup() behind CONFIG_KVM_INTEL_TDX=y.

> And one more thing:
> 
> With the above patch, we still have below code in vt_init():
> 
>         /*
>          * TDX and VMX have different vCPU structures.  Calculate the
>          * maximum size/align so that kvm_init() can use the larger
>          * values to create the kmem_vcpu_cache.
>          */
>         vcpu_size = sizeof(struct vcpu_vmx);
>         vcpu_align = __alignof__(struct vcpu_vmx);
>         if (enable_tdx) {
>                 vcpu_size = max_t(unsigned, vcpu_size,
>                                 sizeof(struct vcpu_tdx));
>                 vcpu_align = max_t(unsigned, vcpu_align,
>                                 __alignof__(struct vcpu_tdx));
>                 kvm_caps.supported_vm_types |= BIT(KVM_X86_TDX_VM);
>         }
> 
> It's kinda ugly too IMHO.
> 
> Since we already have @vm_size in kvm_x86_ops, how about also adding vcpu_size
> and vcpu_align to it?  Then they can be treated in the same way as vm_size for
> TDX.
> 
> They are not needed for SVM, but it doesn't hurt that much?

I'd rather not.  vt_init() already needs to be aware of TDX, e.g. to call into
tdx_bringup() in the first place.  Shoving state into kvm_x86_ops that is only
ever used in vt_init() (an __init function at that) isn't a net positive.

Putting the fields in kvm_x86_init_ops would be better, but I still don't think
the complexity and indirection is justified.  Bleeding gory TDX details into the
common code is undesirable, but I don't see the size of vcpu_tdx or the fact that
enable_tdx==true means KVM_X86_TDX_VM is supported as being information with
hiding.

kvm_x86_ops.vm_size is a means to an end.  E.g. if kvm_vcpu_cache didn't exist,
then we'd probably want/need kvm_x86_ops.vcpu_size, but it does exist, so...

next prev parent reply	other threads:[~2025-05-20 23:11 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-16 21:54 [PATCH v3 0/3] KVM: x86: Dynamically allocate hashed page list Sean Christopherson
2025-05-16 21:54 ` [PATCH v3 1/3] KVM: x86/mmu: Dynamically allocate shadow MMU's " Sean Christopherson
2025-05-16 21:54 ` [PATCH v3 2/3] KVM: x86: Use kvzalloc() to allocate VM struct Sean Christopherson
2025-05-17 12:35   ` Paolo Bonzini
2025-05-19 15:39     ` Sean Christopherson
2025-05-20 14:42       ` Paolo Bonzini
2025-05-20 22:49       ` Huang, Kai
2025-05-20 23:11         ` Sean Christopherson [this message]
2025-05-20 23:57           ` Huang, Kai
2025-05-21 17:12             ` Sean Christopherson
2025-05-21 22:43               ` Huang, Kai
2025-05-22 13:40                 ` Sean Christopherson
2025-05-23 11:31                   ` Huang, Kai
2025-05-20 16:15   ` Sean Christopherson
2025-05-16 21:54 ` [PATCH v3 3/3] KVM: x86/mmu: Defer allocation of shadow MMU's hashed page list Sean Christopherson
2025-05-17 12:43   ` Paolo Bonzini
2025-05-19 13:37     ` Sean Christopherson
2025-05-19 15:29       ` James Houghton
2025-05-19 15:51         ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aC0MIUOTQbb9-a7k@google.com \
    --to=seanjc@google.com \
    --cc=kai.huang@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=vipinsh@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.