All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Kai Huang <kai.huang@intel.com>
Cc: Tony Lindgren <tony.lindgren@intel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	 Rick P Edgecombe <rick.p.edgecombe@intel.com>,
	Isaku Yamahata <isaku.yamahata@intel.com>,
	 "binbin.wu@linux.intel.com" <binbin.wu@linux.intel.com>,
	Xiaoyao Li <xiaoyao.li@intel.com>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Yan Y Zhao <yan.y.zhao@intel.com>,
	 Dan J Williams <dan.j.williams@intel.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	 Adrian Hunter <adrian.hunter@intel.com>,
	Reinette Chatre <reinette.chatre@intel.com>,
	 "pbonzini@redhat.com" <pbonzini@redhat.com>,
	"kristen@linux.intel.com" <kristen@linux.intel.com>
Subject: Re: [PATCH 3/3] KVM: VMX: Initialize TDX during KVM module load
Date: Thu, 7 Nov 2024 14:04:25 -0800	[thread overview]
Message-ID: <Zy05af5Qxkc4uRtn@google.com> (raw)
In-Reply-To: <2d69e11d8afc90e16a2bed5769f812663c123c14.camel@intel.com>

On Wed, Nov 06, 2024, Kai Huang wrote:
> On Wed, 2024-11-06 at 07:01 -0800, Sean Christopherson wrote:
> > On Wed, Nov 06, 2024, Kai Huang wrote:
> > > For this we only disable TDX but continue to load module.  The reason is I think
> > > this is similar to enable a specific KVM feature but the hardware doesn't
> > > support it.  We can go further to check the return value of tdx_cpu_enable() to
> > > distinguish cases like "module not loaded" and "unexpected error", but I really
> > > don't want to go that far.
> > 
> > Hrm, tdx_cpu_enable() is a bit of a mess.  Ideally, there would be a separate
> > "probe" API so that KVM could detect if TDX is supported.  Though maybe it's the
> > TDX module itself is flawed, e.g. if TDH_SYS_INIT is literally the only way to
> > detect whether or not a module is loaded.
> 
> We can also use P-SEAMLDR SEAMCALL to query, but I see no difference between
> using TDH_SYS_INIT.  If you are asking whether there's CPUID or MSR to query
> then no.

Doesn't have to be a CPUID or MSR, anything idempotent would work.  Which begs
the question, is that P-SEAMLDR SEAMCALL query you have in mind idempotent? :-)

> > So, absent a way to clean up tdx_cpu_enable(), maybe disable the module param if
> > it returns -ENODEV, otherwise fail the module load?
> 
> We can, but we need to assume cpuhp_setup_state_cpuslocked() itself will not
> return -ENODEV (it is true now), otherwise we won't be able to distinguish
> whether the -ENODEV was from cpuhp_setup_state_cpuslocked() or tdx_cpu_enable().
> 
> Unless we choose to do tdx_cpu_enable() via on_each_cpu() separately.
> 
> Btw tdx_cpu_enable() itself will print "module not loaded" in case of -ENODEV,
> so the user will be aware anyway if we only disable TDX but not fail module
> loading.

That only helps if a human looks at dmesg before attempting to run a TDX VM on
the host, and parsing dmesg to treat that particular scenario as fatal isn't
something I want to recommend to end users.  E.g. if our platform configuration
screwed up and failed to load a TDX module, then I want that to be surfaced as
an alarm of sorts, not a silent "this platform doesn't support TDX" flag.

> My concern is still the whole "different handling of error cases" seems over-
> engineering.

IMO, that's a symptom of the TDX enabling code not cleanly separating "probe"
from "enable", and at a glance, that seems very solvable.  And I suspect that
cleaning things up will allow for additional hardening.  E.g. I assume the lack
of MOVDIR64B should be a WARN, but because KVM checks for MOVDIR64B before
checking for basic TDX support, it's an non-commitalpr_warn().

> > > 4) tdx_enable() fails.
> > > 
> > > Ditto to 3).
> > 
> > No, this should fail the module load.  E.g. most of the error conditions are
> > -ENOMEM, which has nothing to do with host support for TDX.
> > 
> > > 5) tdx_get_sysinfo() fails.
> > > 
> > > This is a kernel bug since tdx_get_sysinfo() should always return valid TDX
> > > sysinfo structure pointer after tdx_enable() is done successfully.  Currently we
> > > just WARN() if the returned pointer is NULL and disable TDX only.  I think it's
> > > also fine.
> > > 
> > > 6) TDX global metadata check fails, e.g., MAX_VCPUS etc.
> > > 
> > > Ditto to 3).  For this we disable TDX only.
> > 
> > Where is this code?
> 
> Please check:
> 
> https://github.com/intel/tdx/blob/tdx_kvm_dev-2024-10-25.1-host-metadata-v6-rebase/arch/x86/kvm/vmx/tdx.c
> 
> .. starting at line 3320.

Before I forget, that code has a bug.  This

	/* Check TDX module and KVM capabilities */
	if (!tdx_get_supported_attrs(&tdx_sysinfo->td_conf) ||
	    !tdx_get_supported_xfam(&tdx_sysinfo->td_conf))
		goto get_sysinfo_err;

will return '0' on error, instead of -EINVAL (or whatever it intends).

Back to the main discussion, these checks are obvious "probe" failures and so
should disable TDX without failing module load:

	if (!tdp_mmu_enabled || !enable_mmio_caching)
		return -EOPNOTSUPP;

	if (!cpu_feature_enabled(X86_FEATURE_MOVDIR64B)) {
		pr_warn("MOVDIR64B is reqiured for TDX\n");
		return -EOPNOTSUPP;
	}

A kvm_find_user_return_msr() error is obviously a KVM bug, i.e. should definitely
WARN and fail module module.  Ditto for kvm_enable_virtualization().

The boot_cpu_has(X86_FEATURE_TDX_HOST_PLATFORM) that's buried in tdx_enable()
really belongs in KVM.  Having it in both is totally fine, but KVM shouldn't do
a bunch of work and _then_ check if all that work was pointless.

I am ok treating everything at or after tdx_get_sysinfo() as fatal to module load,
especially since, IIUC, TD_SYS_INIT can't be undone, i.e. KVM has crossed a point
of no return.

In short, assuming KVM can query if a TDX module is a loaded, I don't think it's
all that much work to do:

  static bool kvm_is_tdx_supported(void)
  {
	if (boot_cpu_has(X86_FEATURE_TDX_HOST_PLATFORM))
		return false;

	if (!<is TDX module loaded>)
		return false;

	if (!tdp_mmu_enabled || !enable_mmio_caching)
		return false;

	if (WARN_ON_ONCE(!cpu_feature_enabled(X86_FEATURE_MOVDIR64B)))
		return false;

	return true;
  }

  int __init tdx_bringup(void)
  {
	enable_tdx = enable_tdx && kvm_is_tdx_supported();
	if (!enable_tdx)
		return 0;

	return __tdx_bringup();
  }

  reply	other threads:[~2024-11-07 22:04 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-28 13:20 [PATCH 0/3] KVM: VMX: Initialize TDX when loading KVM module Kai Huang
2024-10-28 13:20 ` [PATCH 1/3] KVM: VMX: Refactor VMX module init/exit functions Kai Huang
2024-10-28 13:20 ` [PATCH 2/3] KVM: Export hardware virtualization enabling/disabling functions Kai Huang
2024-10-28 13:20 ` [PATCH 3/3] KVM: VMX: Initialize TDX during KVM module load Kai Huang
2024-10-30 15:19   ` Sean Christopherson
2024-10-31 11:17     ` Huang, Kai
2024-10-31 20:22       ` Sean Christopherson
2024-10-31 21:21         ` Huang, Kai
2024-10-31 21:29           ` Edgecombe, Rick P
2024-11-06 14:19             ` Edgecombe, Rick P
2024-11-06 10:49         ` Huang, Kai
2024-11-06 15:01           ` Sean Christopherson
2024-11-06 20:06             ` Huang, Kai
2024-11-07 22:04               ` Sean Christopherson [this message]
2024-11-07 23:25                 ` Huang, Kai
2024-10-31 21:52       ` Dan Williams
2024-10-31 22:37         ` Huang, Kai
2024-10-31 22:56           ` Dan Williams
2024-10-28 17:41 ` [PATCH 0/3] KVM: VMX: Initialize TDX when loading KVM module Paolo Bonzini
  -- strict thread matches above, loose matches on Subject: below --
2024-11-27 20:10 [PATCH v3 0/3] KVM: VMX: Initialize TDX during KVM module load Paolo Bonzini
2024-11-27 20:10 ` [PATCH 3/3] " Paolo Bonzini
2024-11-28  3:00   ` Chao Gao
2024-11-28  3:04     ` Huang, Kai
2024-11-28  3:34   ` Huang, Kai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zy05af5Qxkc4uRtn@google.com \
    --to=seanjc@google.com \
    --cc=adrian.hunter@intel.com \
    --cc=binbin.wu@linux.intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=isaku.yamahata@intel.com \
    --cc=kai.huang@intel.com \
    --cc=kristen@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=reinette.chatre@intel.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=tony.lindgren@intel.com \
    --cc=xiaoyao.li@intel.com \
    --cc=yan.y.zhao@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.