From: Xiaoyao Li <xiaoyao.li@intel.com>
To: "Huang, Kai" <kai.huang@intel.com>,
"seanjc@google.com" <seanjc@google.com>
Cc: "Bae, Chang Seok" <chang.seok.bae@intel.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"kas@kernel.org" <kas@kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-coco@lists.linux.dev" <linux-coco@lists.linux.dev>,
"x86@kernel.org" <x86@kernel.org>
Subject: Re: [PATCH v2 5/6] KVM: x86: Track available/dirty register masks as "unsigned long" values
Date: Tue, 14 Apr 2026 10:12:37 +0800
Message-ID: <95a931f8-42cc-4834-953c-30c9167bfdc1@intel.com>
In-Reply-To: <d6f05e5fa781ccb465d27a4fb1c7c1ac1e9e95ff.camel@intel.com>
On 4/14/2026 7:03 AM, Huang, Kai wrote:
>> Because VMX and SVM make all GPRs available immediately, except
>> for RSP, KVM ignores avail/dirty for GPRs. I.e. "fixing" TDX will just shift the
>> "bugs" elsewhere.
> Just want to understand:
>
> I thought the fix could be we simply remove the wrong GPRs from the list.
> Not sure how fixing TDX will shift bugs elsewhere?
I'm curious too.
>> More importantly, because the TDX-Module*requires* RCX (the GPR that holds the
>> mask of registers to expose to the VMM) to be hidden on TDVMCALL, KVM*can't*
>> do any kind of meaningful "available" tracking.
>>
> Hmm I think RCX conveys the shared GPRs and VMM can read. Per "Table 5.323:
> TDH.VP.ENTER Output Operands Format #5 Definition: On TDCALL(TDG.VP.VMCALL)
> Following a TD Entry":
>
> RCX ...
> Bit(s)  Name         Description
> 31:0    PARAMS_MASK  Value as passed into TDCALL(TDG.VP.VMCALL) by the
>                      guest TD: indicates which part of the guest TD GPR
>                      and XMM state is passed as-is to the VMM and back.
>                      For details, see the description of TDG.VP.VMCALL
>                      in 5.5.26.
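For illustration, bit N of that PARAMS_MASK corresponds to register index N. A
minimal sketch of decoding it could look like the following plain C (this is
not KVM code; the helper name and register-index mapping are my own assumption
based on the table above):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not KVM code): bits 31:0 of the RCX value
 * returned by TDH.VP.ENTER form PARAMS_MASK, where a set bit N means
 * the guest TD's register N was passed through to the VMM as-is.
 * Bits 63:32 of RCX are not part of the mask and are ignored here.
 */
static bool tdvmcall_gpr_passed(uint64_t rcx, unsigned int reg)
{
	uint32_t params_mask = (uint32_t)rcx;	/* bits 31:0 only */

	return reg < 32 && (params_mask & (1u << reg));
}
```

E.g. with a mask of 0x1c, only register indices 2, 3 and 4 are passed through;
any state outside the mask must not be consumed by the VMM.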
>
> I think the problem is, as said previously, currently KVM TDX code uses
> KVM's existing infrastructure to emulate MSR, KVM hypercall etc, but
> TDVMCALL has a different ABI, thus there's a mismatch here.
I once had a patch for it internally.

It restores the availability check for GPRs on access, instead of
assuming they are always available. For normal VMX and SVM, all GPRs
are still always available. For TDX, only EXIT_INFO_1 and EXIT_INFO_2
are always marked available, while the other registers need to be
explicitly marked case by case.

The upside is that it makes TDX safer: KVM won't silently consume
invalid data. The downside is the additional overhead of checking
register availability unnecessarily in the VMX and SVM cases.
-----------------------------&<-------------------------------------
From: Xiaoyao Li <xiaoyao.li@intel.com>
Date: Tue, 11 Mar 2025 07:13:29 -0400
Subject: [PATCH] KVM: x86: Add available check for GPRs
Since commit de3cd117ed2f ("KVM: x86: Omit caching logic for
always-available GPRs"), KVM doesn't check the availability of GPRs
other than RSP and RIP when accessing them, because they are always
available.

However, that doesn't hold for TDX. The GPRs are not actually available
after a TD vCPU exit; instead, KVM must manually set the GPR values
when needed, e.g.

- setting rax, rbx, rcx, rdx and rsi for hypercall emulation in
  tdx_emulate_tdvmcall();
- setting rax, rcx and rdx before MSR write emulation.

Add an availability check on GPR reads, and WARN_ON_ONCE() when a GPR
is unavailable. This helps catch cases where TDX would otherwise
silently consume undesired GPR values.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
arch/x86/kvm/kvm_cache_regs.h | 60 +++++++++++++++++++----------------
arch/x86/kvm/vmx/tdx.c | 25 +++------------
2 files changed, 37 insertions(+), 48 deletions(-)
diff --git a/arch/x86/kvm/kvm_cache_regs.h b/arch/x86/kvm/kvm_cache_regs.h
index 8ddb01191d6f..b2fa01ee2b4b 100644
--- a/arch/x86/kvm/kvm_cache_regs.h
+++ b/arch/x86/kvm/kvm_cache_regs.h
@@ -16,34 +16,6 @@
static_assert(!(KVM_POSSIBLE_CR0_GUEST_BITS & X86_CR0_PDPTR_BITS));
-#define BUILD_KVM_GPR_ACCESSORS(lname, uname) \
-static __always_inline unsigned long kvm_##lname##_read(struct kvm_vcpu *vcpu)\
-{ \
- return vcpu->arch.regs[VCPU_REGS_##uname]; \
-} \
-static __always_inline void kvm_##lname##_write(struct kvm_vcpu *vcpu, \
- unsigned long val) \
-{ \
- vcpu->arch.regs[VCPU_REGS_##uname] = val; \
-}
-BUILD_KVM_GPR_ACCESSORS(rax, RAX)
-BUILD_KVM_GPR_ACCESSORS(rbx, RBX)
-BUILD_KVM_GPR_ACCESSORS(rcx, RCX)
-BUILD_KVM_GPR_ACCESSORS(rdx, RDX)
-BUILD_KVM_GPR_ACCESSORS(rbp, RBP)
-BUILD_KVM_GPR_ACCESSORS(rsi, RSI)
-BUILD_KVM_GPR_ACCESSORS(rdi, RDI)
-#ifdef CONFIG_X86_64
-BUILD_KVM_GPR_ACCESSORS(r8, R8)
-BUILD_KVM_GPR_ACCESSORS(r9, R9)
-BUILD_KVM_GPR_ACCESSORS(r10, R10)
-BUILD_KVM_GPR_ACCESSORS(r11, R11)
-BUILD_KVM_GPR_ACCESSORS(r12, R12)
-BUILD_KVM_GPR_ACCESSORS(r13, R13)
-BUILD_KVM_GPR_ACCESSORS(r14, R14)
-BUILD_KVM_GPR_ACCESSORS(r15, R15)
-#endif
-
/*
 * Using the register cache from interrupt context is generally not allowed, as
 * caching a register and marking it available/dirty can't be done atomically,
@@ -92,6 +64,38 @@ static inline void kvm_register_mark_dirty(struct kvm_vcpu *vcpu,
__set_bit(reg, (unsigned long *)&vcpu->arch.regs_dirty);
}
+#define BUILD_KVM_GPR_ACCESSORS(lname, uname) \
+static __always_inline unsigned long kvm_##lname##_read(struct kvm_vcpu *vcpu)\
+{ \
+ if (WARN_ON_ONCE(!kvm_register_is_available(vcpu, VCPU_REGS_##uname)))\
+ return 0; \
+ \
+ return vcpu->arch.regs[VCPU_REGS_##uname]; \
+} \
+static __always_inline void kvm_##lname##_write(struct kvm_vcpu *vcpu, \
+ unsigned long val) \
+{ \
+ vcpu->arch.regs[VCPU_REGS_##uname] = val; \
+ kvm_register_mark_available(vcpu, VCPU_REGS_##uname); \
+}
+BUILD_KVM_GPR_ACCESSORS(rax, RAX)
+BUILD_KVM_GPR_ACCESSORS(rbx, RBX)
+BUILD_KVM_GPR_ACCESSORS(rcx, RCX)
+BUILD_KVM_GPR_ACCESSORS(rdx, RDX)
+BUILD_KVM_GPR_ACCESSORS(rbp, RBP)
+BUILD_KVM_GPR_ACCESSORS(rsi, RSI)
+BUILD_KVM_GPR_ACCESSORS(rdi, RDI)
+#ifdef CONFIG_X86_64
+BUILD_KVM_GPR_ACCESSORS(r8, R8)
+BUILD_KVM_GPR_ACCESSORS(r9, R9)
+BUILD_KVM_GPR_ACCESSORS(r10, R10)
+BUILD_KVM_GPR_ACCESSORS(r11, R11)
+BUILD_KVM_GPR_ACCESSORS(r12, R12)
+BUILD_KVM_GPR_ACCESSORS(r13, R13)
+BUILD_KVM_GPR_ACCESSORS(r14, R14)
+BUILD_KVM_GPR_ACCESSORS(r15, R15)
+#endif
+
/*
 * kvm_register_test_and_mark_available() is a special snowflake that uses an
 * arch bitop directly to avoid the explicit instrumentation that comes with
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index cefe6cdd60a9..2b90a60e6f64 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -969,6 +969,9 @@ static __always_inline u32 tdx_to_vmx_exit_reason(struct kvm_vcpu *vcpu)
return exit_reason;
}
+#define TDX_REGS_AVAIL_SET (BIT_ULL(VCPU_EXREG_EXIT_INFO_1) | \
+ BIT_ULL(VCPU_EXREG_EXIT_INFO_2))
+
static noinstr void tdx_vcpu_enter_exit(struct kvm_vcpu *vcpu)
{
struct vcpu_tdx *tdx = to_tdx(vcpu);
@@ -985,6 +988,8 @@ static noinstr void tdx_vcpu_enter_exit(struct kvm_vcpu *vcpu)
tdx->exit_gpa = tdx->vp_enter_args.r8;
vt->exit_intr_info = tdx->vp_enter_args.r9;
+ vcpu->arch.regs_avail &= TDX_REGS_AVAIL_SET;
+
vmx_handle_nmi(vcpu);
guest_state_exit_irqoff();
@@ -1017,24 +1022,6 @@ static fastpath_t tdx_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
return EXIT_FASTPATH_NONE;
}
-#define TDX_REGS_AVAIL_SET (BIT_ULL(VCPU_EXREG_EXIT_INFO_1) | \
- BIT_ULL(VCPU_EXREG_EXIT_INFO_2) | \
- BIT_ULL(VCPU_REGS_RAX) | \
- BIT_ULL(VCPU_REGS_RBX) | \
- BIT_ULL(VCPU_REGS_RCX) | \
- BIT_ULL(VCPU_REGS_RDX) | \
- BIT_ULL(VCPU_REGS_RBP) | \
- BIT_ULL(VCPU_REGS_RSI) | \
- BIT_ULL(VCPU_REGS_RDI) | \
- BIT_ULL(VCPU_REGS_R8) | \
- BIT_ULL(VCPU_REGS_R9) | \
- BIT_ULL(VCPU_REGS_R10) | \
- BIT_ULL(VCPU_REGS_R11) | \
- BIT_ULL(VCPU_REGS_R12) | \
- BIT_ULL(VCPU_REGS_R13) | \
- BIT_ULL(VCPU_REGS_R14) | \
- BIT_ULL(VCPU_REGS_R15))
-
static void tdx_load_host_xsave_state(struct kvm_vcpu *vcpu)
{
struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
@@ -1108,8 +1095,6 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
tdx_load_host_xsave_state(vcpu);
- vcpu->arch.regs_avail &= TDX_REGS_AVAIL_SET;
-
if (unlikely(tdx->vp_enter_ret == EXIT_REASON_EPT_MISCONFIG))
return EXIT_FASTPATH_NONE;
--
2.43.0