From: Sean Christopherson <seanjc@google.com>
To: Rick P Edgecombe <rick.p.edgecombe@intel.com>
Cc: Yan Y Zhao <yan.y.zhao@intel.com>,
Kai Huang <kai.huang@intel.com>,
"ackerleytng@google.com" <ackerleytng@google.com>,
Vishal Annapurve <vannapurve@google.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Ira Weiny <ira.weiny@intel.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"michael.roth@amd.com" <michael.roth@amd.com>,
"pbonzini@redhat.com" <pbonzini@redhat.com>
Subject: Re: [RFC PATCH v2 12/18] KVM: TDX: Bug the VM if extending the initial measurement fails
Date: Fri, 29 Aug 2025 13:11:35 -0700 [thread overview]
Message-ID: <aLIJd7xpNfJvdMeT@google.com> (raw)
In-Reply-To: <8445ac8c96706ba1f079f4012584ef7631c60c8b.camel@intel.com>
[-- Attachment #1: Type: text/plain, Size: 5605 bytes --]
On Fri, Aug 29, 2025, Rick P Edgecombe wrote:
> On Fri, 2025-08-29 at 16:18 +0800, Yan Zhao wrote:
> > > + /*
> > > + * Note, MR.EXTEND can fail if the S-EPT mapping is somehow removed
> > > + * between mapping the pfn and now, but slots_lock prevents memslot
> > > + * updates, filemap_invalidate_lock() prevents guest_memfd updates,
> > > + * mmu_notifier events can't reach S-EPT entries, and KVM's internal
> > > + * zapping flows are mutually exclusive with S-EPT mappings.
> > > + */
> > > + for (i = 0; i < PAGE_SIZE; i += TDX_EXTENDMR_CHUNKSIZE) {
> > > + err = tdh_mr_extend(&kvm_tdx->td, gpa + i, &entry, &level_state);
> > > + if (KVM_BUG_ON(err, kvm)) {
> > I suspect tdh_mr_extend() running on one vCPU may contend with
> > tdh_vp_create()/tdh_vp_addcx()/tdh_vp_init*()/tdh_vp_rd()/tdh_vp_wr()/
> > tdh_mng_rd()/tdh_vp_flush() on other vCPUs, if userspace invokes the
> > KVM_TDX_INIT_MEM_REGION ioctl on one vCPU while initializing other vCPUs.
> >
> > It's similar to the analysis of contention of tdh_mem_page_add() [1], as
> > both tdh_mr_extend() and tdh_mem_page_add() acquire an exclusive lock on
> > the TDR resource.
> >
> > I'll try to write a test to verify it and come back to you.
>
> I'm seeing the same thing in the TDX module. It could fail because of
> contention controllable from userspace, so the KVM_BUG_ON() is not appropriate.
>
> Today, though, if tdh_mr_extend() fails because of contention, the TD is
> essentially dead anyway. Trying to redo KVM_TDX_INIT_MEM_REGION will fail. The
> M-EPT fault could be spurious, but the second tdh_mem_page_add() would return
> an error and never get back to the tdh_mr_extend().
>
> The version in this patch can't recover for a different reason:
> kvm_tdp_mmu_map_private_pfn() doesn't handle spurious faults. So I'd say just
> drop the KVM_BUG_ON(), and try to handle the contention in a separate effort.
>
> I guess the two approaches could be to make KVM_TDX_INIT_MEM_REGION more robust,
This. First and foremost, KVM's ordering and locking rules need to be explicit
(ideally documented, but at the very least apparent in the code), *especially*
when the locking (or lack thereof) impacts userspace. Even if effectively relying
on the TDX-Module to provide ordering "works", it's all but impossible to follow.
And it doesn't truly work, as everything in the TDX-Module is a trylock, and that
in turn prevents KVM from asserting success. Sometimes KVM has no better option than
to rely on hardware to detect failure, but it really should be a last resort,
because not being able to expect success makes debugging no fun. Even worse, it
bleeds hard-to-document, specific ordering requirements into userspace, e.g. in
this case, it sounds like userspace can't do _anything_ on vCPUs while doing
KVM_TDX_INIT_MEM_REGION. Which might not be a burden for userspace, but oof is
it nasty from an ABI perspective.
> or prevent the contention. For the latter case:
> tdh_vp_create()/tdh_vp_addcx()/tdh_vp_init*()/tdh_vp_rd()/tdh_vp_wr()
> ...I think we could just take slots_lock during KVM_TDX_INIT_VCPU and
> KVM_TDX_GET_CPUID.
>
> For tdh_vp_flush() the vcpu_load() in kvm_arch_vcpu_ioctl() could be hard to
> handle.
>
> So I'd think maybe to look towards making KVM_TDX_INIT_MEM_REGION more robust,
> which would mean the eventual solution wouldn't have ABI concerns by later
> blocking things that used to be allowed.
>
> Maybe having kvm_tdp_mmu_map_private_pfn() return success for spurious faults is
> enough. But this is all for a case that userspace isn't expected to actually
> hit, so seems like something that could be kicked down the road easily.
You're trying to be too "nice", just smack 'em with a big hammer. For all intents
and purposes, the paths in question are fully serialized, there's no reason to try
and allow anything remotely interesting to happen.
Acquire kvm->lock to prevent VM-wide things from happening, slots_lock to prevent
kvm_mmu_zap_all_fast(), and _all_ vCPU mutexes to prevent vCPUs from interfering.
Doing that for a vCPU ioctl is a bit awkward, but not awful. E.g. we can abuse
kvm_arch_vcpu_async_ioctl(). In hindsight, a more clever approach would have
been to make KVM_TDX_INIT_MEM_REGION a VM-scoped ioctl that takes a vCPU fd. Oh
well.
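(Purely illustrative sketch of that hypothetical uAPI; the struct name and the
vcpu_fd field are invented here, the remaining fields mirror the existing
struct kvm_tdx_init_mem_region:

	/* Hypothetical VM-scoped variant; vcpu_fd identifies the target vCPU. */
	struct kvm_tdx_init_mem_region_vm {
		__u32 vcpu_fd;
		__u32 flags;
		__u64 source_addr;
		__u64 gpa;
		__u64 nr_pages;
	};

KVM would resolve the vCPU from the fd and grab vcpu->mutex itself, instead of
inheriting whatever the generic vCPU ioctl path happens to take.)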
Anyways, I think we need to avoid the "synchronous" ioctl path regardless, because
taking kvm->slots_lock inside vcpu->mutex is gross. AFAICT it's not actively
problematic today, but it feels like a deadlock waiting to happen.
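To illustrate the inversion that's waiting to happen (hypothetical call paths,
neither exists today):

	/*
	 * CPU0 (vCPU ioctl)                 CPU1 (VM-scoped path)
	 * mutex_lock(&vcpu->mutex);         mutex_lock(&kvm->slots_lock);
	 * mutex_lock(&kvm->slots_lock);     mutex_lock(&vcpu->mutex);
	 *     <waits on CPU1>                   <waits on CPU0>
	 */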
The other oddity I see is the handling of kvm_tdx->state. I don't see how this
check in tdx_vcpu_create() is safe:
	if (kvm_tdx->state != TD_STATE_INITIALIZED)
		return -EIO;
kvm_arch_vcpu_create() runs without any locks held, and so TDX effectively has
the same bug that SEV intra-host migration had, where an in-flight vCPU creation
could race with a VM-wide state transition (see commit ecf371f8b02d ("KVM: SVM:
Reject SEV{-ES} intra host migration if vCPU creation is in-flight")). To fix
that, kvm->lock needs to be taken and KVM needs to verify there's no in-flight
vCPU creation, e.g. so that a vCPU doesn't pop up and contend a TDX-Module lock.
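The minimal sketch of that pattern, which is also what tdx_acquire_vm_state_locks()
in the second attached patch does:

	int r = 0;

	mutex_lock(&kvm->lock);
	/*
	 * created_vcpus is incremented under kvm->lock before the vCPU shows
	 * up in online_vcpus, so a mismatch means a creation is in flight.
	 */
	if (kvm->created_vcpus != atomic_read(&kvm->online_vcpus))
		r = -EBUSY;
	mutex_unlock(&kvm->lock);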
We can even define a fancy new CLASS to handle the lock+check => unlock logic
with guard()-like syntax:
	CLASS(tdx_vm_state_guard, guard)(kvm);
	if (IS_ERR(guard))
		return PTR_ERR(guard);
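For reference, DEFINE_CLASS() in <linux/cleanup.h> generates a constructor plus
a scope-exit destructor, so the above is roughly equivalent to (sketch):

	class_tdx_vm_state_guard_t guard
		__attribute__((cleanup(class_tdx_vm_state_guard_destructor))) =
		class_tdx_vm_state_guard_constructor(kvm);

with the destructor calling tdx_release_vm_state_locks(guard) iff !IS_ERR(guard).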
IIUC, with all of those locks, KVM can KVM_BUG_ON() both TDH_MEM_PAGE_ADD and
TDH_MR_EXTEND, with no exceptions given for -EBUSY. Attached patches are very
lightly tested as usual and need to be chunked up, but seem to do what I want.
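FWIW, the userspace flow is unchanged; a minimal sketch of the invocation, with
error handling omitted and vcpu_fd/region set up elsewhere:

	struct kvm_tdx_cmd cmd = {
		.id = KVM_TDX_INIT_MEM_REGION,
		.data = (__u64)(unsigned long)&region,
	};

	/*
	 * Still a vCPU ioctl, but with the patches below it's routed through
	 * kvm_arch_vcpu_async_ioctl(), i.e. KVM takes the VM-state locks
	 * itself instead of relying on the synchronous path's vcpu->mutex.
	 */
	ioctl(vcpu_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);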
[-- Attachment #2: 0001-KVM-Make-support-for-kvm_arch_vcpu_async_ioctl-manda.patch --]
[-- Type: text/x-diff, Size: 4250 bytes --]
From 44a96a0db69d9cd56e77813125aa1e318b11d718 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Fri, 29 Aug 2025 07:28:44 -0700
Subject: [PATCH 1/2] KVM: Make support for kvm_arch_vcpu_async_ioctl()
mandatory
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/loongarch/kvm/Kconfig | 1 -
arch/mips/kvm/Kconfig | 1 -
arch/powerpc/kvm/Kconfig | 1 -
arch/riscv/kvm/Kconfig | 1 -
arch/s390/kvm/Kconfig | 1 -
arch/x86/kvm/x86.c | 6 ++++++
include/linux/kvm_host.h | 10 ----------
virt/kvm/Kconfig | 3 ---
8 files changed, 6 insertions(+), 18 deletions(-)
diff --git a/arch/loongarch/kvm/Kconfig b/arch/loongarch/kvm/Kconfig
index 40eea6da7c25..e53948ec978a 100644
--- a/arch/loongarch/kvm/Kconfig
+++ b/arch/loongarch/kvm/Kconfig
@@ -25,7 +25,6 @@ config KVM
select HAVE_KVM_IRQCHIP
select HAVE_KVM_MSI
select HAVE_KVM_READONLY_MEM
- select HAVE_KVM_VCPU_ASYNC_IOCTL
select KVM_COMMON
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
select KVM_GENERIC_HARDWARE_ENABLING
diff --git a/arch/mips/kvm/Kconfig b/arch/mips/kvm/Kconfig
index ab57221fa4dd..cc13cc35f208 100644
--- a/arch/mips/kvm/Kconfig
+++ b/arch/mips/kvm/Kconfig
@@ -22,7 +22,6 @@ config KVM
select EXPORT_UASM
select KVM_COMMON
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
- select HAVE_KVM_VCPU_ASYNC_IOCTL
select KVM_MMIO
select KVM_GENERIC_MMU_NOTIFIER
select KVM_GENERIC_HARDWARE_ENABLING
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 2f2702c867f7..c9a2d50ff1b0 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -20,7 +20,6 @@ if VIRTUALIZATION
config KVM
bool
select KVM_COMMON
- select HAVE_KVM_VCPU_ASYNC_IOCTL
select KVM_VFIO
select HAVE_KVM_IRQ_BYPASS
diff --git a/arch/riscv/kvm/Kconfig b/arch/riscv/kvm/Kconfig
index 5a62091b0809..de67bfabebc8 100644
--- a/arch/riscv/kvm/Kconfig
+++ b/arch/riscv/kvm/Kconfig
@@ -23,7 +23,6 @@ config KVM
select HAVE_KVM_IRQCHIP
select HAVE_KVM_IRQ_ROUTING
select HAVE_KVM_MSI
- select HAVE_KVM_VCPU_ASYNC_IOCTL
select HAVE_KVM_READONLY_MEM
select HAVE_KVM_DIRTY_RING_ACQ_REL
select KVM_COMMON
diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
index cae908d64550..96d16028e8b7 100644
--- a/arch/s390/kvm/Kconfig
+++ b/arch/s390/kvm/Kconfig
@@ -20,7 +20,6 @@ config KVM
def_tristate y
prompt "Kernel-based Virtual Machine (KVM) support"
select HAVE_KVM_CPU_RELAX_INTERCEPT
- select HAVE_KVM_VCPU_ASYNC_IOCTL
select KVM_ASYNC_PF
select KVM_ASYNC_PF_SYNC
select KVM_COMMON
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7ba2cdfdac44..92e916eba6a9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6943,6 +6943,12 @@ static int kvm_vm_ioctl_set_clock(struct kvm *kvm, void __user *argp)
return 0;
}
+long kvm_arch_vcpu_async_ioctl(struct file *filp, unsigned int ioctl,
+ unsigned long arg)
+{
+ return -ENOIOCTLCMD;
+}
+
int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
{
struct kvm *kvm = filp->private_data;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 15656b7fba6c..a1840aaf80d4 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2421,18 +2421,8 @@ static inline bool kvm_arch_no_poll(struct kvm_vcpu *vcpu)
}
#endif /* CONFIG_HAVE_KVM_NO_POLL */
-#ifdef CONFIG_HAVE_KVM_VCPU_ASYNC_IOCTL
long kvm_arch_vcpu_async_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg);
-#else
-static inline long kvm_arch_vcpu_async_ioctl(struct file *filp,
- unsigned int ioctl,
- unsigned long arg)
-{
- return -ENOIOCTLCMD;
-}
-#endif /* CONFIG_HAVE_KVM_VCPU_ASYNC_IOCTL */
-
void kvm_arch_guest_memory_reclaimed(struct kvm *kvm);
#ifdef CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 727b542074e7..661a4b998875 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -78,9 +78,6 @@ config HAVE_KVM_IRQ_BYPASS
tristate
select IRQ_BYPASS_MANAGER
-config HAVE_KVM_VCPU_ASYNC_IOCTL
- bool
-
config HAVE_KVM_VCPU_RUN_PID_CHANGE
bool
base-commit: f4b88d6c85871847340a86daf838e11986a97348
--
2.51.0.318.gd7df087d1a-goog
[-- Attachment #3: 0002-KVM-TDX-Guard-VM-state-transitions.patch --]
[-- Type: text/x-diff, Size: 9085 bytes --]
From 7277396033c21569dbed0a52fa92804307db111e Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Fri, 29 Aug 2025 09:19:11 -0700
Subject: [PATCH 2/2] KVM: TDX: Guard VM state transitions
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/include/asm/kvm-x86-ops.h | 1 +
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/vmx/main.c | 9 +++
arch/x86/kvm/vmx/tdx.c | 112 ++++++++++++++++++++++-------
arch/x86/kvm/vmx/x86_ops.h | 1 +
arch/x86/kvm/x86.c | 7 ++
6 files changed, 105 insertions(+), 26 deletions(-)
diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 18a5c3119e1a..fe2bb2e2ebc8 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -128,6 +128,7 @@ KVM_X86_OP(enable_smi_window)
KVM_X86_OP_OPTIONAL(dev_get_attr)
KVM_X86_OP_OPTIONAL(mem_enc_ioctl)
KVM_X86_OP_OPTIONAL(vcpu_mem_enc_ioctl)
+KVM_X86_OP_OPTIONAL(vcpu_mem_enc_async_ioctl)
KVM_X86_OP_OPTIONAL(mem_enc_register_region)
KVM_X86_OP_OPTIONAL(mem_enc_unregister_region)
KVM_X86_OP_OPTIONAL(vm_copy_enc_context_from)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d0a8404a6b8f..ac5d3b8fa49f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1911,6 +1911,7 @@ struct kvm_x86_ops {
int (*dev_get_attr)(u32 group, u64 attr, u64 *val);
int (*mem_enc_ioctl)(struct kvm *kvm, void __user *argp);
int (*vcpu_mem_enc_ioctl)(struct kvm_vcpu *vcpu, void __user *argp);
+ int (*vcpu_mem_enc_async_ioctl)(struct kvm_vcpu *vcpu, void __user *argp);
int (*mem_enc_register_region)(struct kvm *kvm, struct kvm_enc_region *argp);
int (*mem_enc_unregister_region)(struct kvm *kvm, struct kvm_enc_region *argp);
int (*vm_copy_enc_context_from)(struct kvm *kvm, unsigned int source_fd);
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index dbab1c15b0cd..e0e35ceec9b1 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -831,6 +831,14 @@ static int vt_vcpu_mem_enc_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
return tdx_vcpu_ioctl(vcpu, argp);
}
+static int vt_vcpu_mem_enc_async_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
+{
+ if (!is_td_vcpu(vcpu))
+ return -EINVAL;
+
+ return tdx_vcpu_async_ioctl(vcpu, argp);
+}
+
static int vt_gmem_private_max_mapping_level(struct kvm *kvm, kvm_pfn_t pfn)
{
if (is_td(kvm))
@@ -1004,6 +1012,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
.mem_enc_ioctl = vt_op_tdx_only(mem_enc_ioctl),
.vcpu_mem_enc_ioctl = vt_op_tdx_only(vcpu_mem_enc_ioctl),
+ .vcpu_mem_enc_async_ioctl = vt_op_tdx_only(vcpu_mem_enc_async_ioctl),
.private_max_mapping_level = vt_op_tdx_only(gmem_private_max_mapping_level)
};
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index d6c9defad9cd..c595d9cb6dcd 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -2624,6 +2624,44 @@ static int tdx_read_cpuid(struct kvm_vcpu *vcpu, u32 leaf, u32 sub_leaf,
return -EIO;
}
+typedef void * tdx_vm_state_guard_t;
+
+static tdx_vm_state_guard_t tdx_acquire_vm_state_locks(struct kvm *kvm)
+{
+ int r;
+
+ mutex_lock(&kvm->lock);
+ mutex_lock(&kvm->slots_lock);
+
+ if (kvm->created_vcpus != atomic_read(&kvm->online_vcpus)) {
+ r = -EBUSY;
+ goto out_err;
+ }
+
+ r = kvm_lock_all_vcpus(kvm);
+ if (r)
+ goto out_err;
+
+ return kvm;
+
+out_err:
+ mutex_unlock(&kvm->slots_lock);
+ mutex_unlock(&kvm->lock);
+
+ return ERR_PTR(r);
+}
+
+static void tdx_release_vm_state_locks(struct kvm *kvm)
+{
+ kvm_unlock_all_vcpus(kvm);
+ mutex_unlock(&kvm->slots_lock);
+ mutex_unlock(&kvm->lock);
+}
+
+DEFINE_CLASS(tdx_vm_state_guard, tdx_vm_state_guard_t,
+ if (!IS_ERR(_T)) tdx_release_vm_state_locks(_T),
+ tdx_acquire_vm_state_locks(kvm), struct kvm *kvm);
+
static int tdx_td_init(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
{
struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
@@ -2634,6 +2672,10 @@ static int tdx_td_init(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
BUILD_BUG_ON(sizeof(*init_vm) != 256 + sizeof_field(struct kvm_tdx_init_vm, cpuid));
BUILD_BUG_ON(sizeof(struct td_params) != 1024);
+ CLASS(tdx_vm_state_guard, guard)(kvm);
+ if (IS_ERR(guard))
+ return PTR_ERR(guard);
+
if (kvm_tdx->state != TD_STATE_UNINITIALIZED)
return -EINVAL;
@@ -2745,7 +2787,9 @@ static int tdx_td_finalize(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
{
struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
- guard(mutex)(&kvm->slots_lock);
+ CLASS(tdx_vm_state_guard, guard)(kvm);
+ if (IS_ERR(guard))
+ return PTR_ERR(guard);
if (!is_hkid_assigned(kvm_tdx) || kvm_tdx->state == TD_STATE_RUNNABLE)
return -EINVAL;
@@ -2763,22 +2807,25 @@ static int tdx_td_finalize(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
return 0;
}
+static int tdx_get_cmd(void __user *argp, struct kvm_tdx_cmd *cmd)
+{
+ if (copy_from_user(cmd, argp, sizeof(*cmd)))
+ return -EFAULT;
+
+ if (cmd->hw_error)
+ return -EINVAL;
+
+ return 0;
+}
+
int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
{
struct kvm_tdx_cmd tdx_cmd;
int r;
- if (copy_from_user(&tdx_cmd, argp, sizeof(struct kvm_tdx_cmd)))
- return -EFAULT;
-
- /*
- * Userspace should never set hw_error. It is used to fill
- * hardware-defined error by the kernel.
- */
- if (tdx_cmd.hw_error)
- return -EINVAL;
-
- mutex_lock(&kvm->lock);
+ r = tdx_get_cmd(argp, &tdx_cmd);
+ if (r)
+ return r;
switch (tdx_cmd.id) {
case KVM_TDX_CAPABILITIES:
@@ -2791,15 +2838,12 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
r = tdx_td_finalize(kvm, &tdx_cmd);
break;
default:
- r = -EINVAL;
- goto out;
+ return -EINVAL;
}
if (copy_to_user(argp, &tdx_cmd, sizeof(struct kvm_tdx_cmd)))
r = -EFAULT;
-out:
- mutex_unlock(&kvm->lock);
return r;
}
@@ -3079,11 +3123,13 @@ static int tdx_vcpu_init_mem_region(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *c
long gmem_ret;
int ret;
+ CLASS(tdx_vm_state_guard, guard)(kvm);
+ if (IS_ERR(guard))
+ return PTR_ERR(guard);
+
if (tdx->state != VCPU_TD_STATE_INITIALIZED)
return -EINVAL;
- guard(mutex)(&kvm->slots_lock);
-
/* Once TD is finalized, the initial guest memory is fixed. */
if (kvm_tdx->state == TD_STATE_RUNNABLE)
return -EINVAL;
@@ -3101,6 +3147,8 @@ static int tdx_vcpu_init_mem_region(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *c
!vt_is_tdx_private_gpa(kvm, region.gpa + (region.nr_pages << PAGE_SHIFT) - 1))
return -EINVAL;
+ vcpu_load(vcpu);
+
ret = 0;
while (region.nr_pages) {
if (signal_pending(current)) {
@@ -3132,11 +3180,28 @@ static int tdx_vcpu_init_mem_region(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *c
cond_resched();
}
+ vcpu_put(vcpu);
+
if (copy_to_user(u64_to_user_ptr(cmd->data), &region, sizeof(region)))
ret = -EFAULT;
return ret;
}
+int tdx_vcpu_async_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
+{
+ struct kvm_tdx_cmd cmd;
+ int r;
+
+ r = tdx_get_cmd(argp, &cmd);
+ if (r)
+ return r;
+
+ if (cmd.id != KVM_TDX_INIT_MEM_REGION)
+ return -ENOIOCTLCMD;
+
+ return tdx_vcpu_init_mem_region(vcpu, &cmd);
+}
+
int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
{
struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
@@ -3146,19 +3211,14 @@ int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
if (!is_hkid_assigned(kvm_tdx) || kvm_tdx->state == TD_STATE_RUNNABLE)
return -EINVAL;
- if (copy_from_user(&cmd, argp, sizeof(cmd)))
- return -EFAULT;
-
- if (cmd.hw_error)
- return -EINVAL;
+ ret = tdx_get_cmd(argp, &cmd);
+ if (ret)
+ return ret;
switch (cmd.id) {
case KVM_TDX_INIT_VCPU:
ret = tdx_vcpu_init(vcpu, &cmd);
break;
- case KVM_TDX_INIT_MEM_REGION:
- ret = tdx_vcpu_init_mem_region(vcpu, &cmd);
- break;
case KVM_TDX_GET_CPUID:
ret = tdx_vcpu_get_cpuid(vcpu, &cmd);
break;
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index 2b3424f638db..a797101a2150 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -149,6 +149,7 @@ int tdx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr);
int tdx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr);
int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
+int tdx_vcpu_async_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
void tdx_flush_tlb_current(struct kvm_vcpu *vcpu);
void tdx_flush_tlb_all(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 92e916eba6a9..281cd0980245 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6946,6 +6946,13 @@ static int kvm_vm_ioctl_set_clock(struct kvm *kvm, void __user *argp)
long kvm_arch_vcpu_async_ioctl(struct file *filp, unsigned int ioctl,
unsigned long arg)
{
+ struct kvm_vcpu *vcpu = filp->private_data;
+ void __user *argp = (void __user *)arg;
+
+ if (ioctl == KVM_MEMORY_ENCRYPT_OP &&
+ kvm_x86_ops.vcpu_mem_enc_async_ioctl)
+ return kvm_x86_call(vcpu_mem_enc_async_ioctl)(vcpu, argp);
+
return -ENOIOCTLCMD;
}
--
2.51.0.318.gd7df087d1a-goog
Thread overview: 63+ messages
2025-08-29 0:06 [RFC PATCH v2 00/18] KVM: x86/mmu: TDX post-populate cleanups Sean Christopherson
2025-08-29 0:06 ` [RFC PATCH v2 01/18] KVM: TDX: Drop PROVE_MMU=y sanity check on to-be-populated mappings Sean Christopherson
2025-08-29 6:20 ` Binbin Wu
2025-08-29 0:06 ` [RFC PATCH v2 02/18] KVM: x86/mmu: Add dedicated API to map guest_memfd pfn into TDP MMU Sean Christopherson
2025-08-29 18:34 ` Edgecombe, Rick P
2025-08-29 20:27 ` Sean Christopherson
2025-08-29 0:06 ` [RFC PATCH v2 03/18] Revert "KVM: x86/tdp_mmu: Add a helper function to walk down the TDP MMU" Sean Christopherson
2025-08-29 19:00 ` Edgecombe, Rick P
2025-08-29 0:06 ` [RFC PATCH v2 04/18] KVM: x86/mmu: Rename kvm_tdp_map_page() to kvm_tdp_page_prefault() Sean Christopherson
2025-08-29 19:03 ` Edgecombe, Rick P
2025-08-29 0:06 ` [RFC PATCH v2 05/18] KVM: TDX: Drop superfluous page pinning in S-EPT management Sean Christopherson
2025-08-29 8:36 ` Binbin Wu
2025-08-29 19:53 ` Edgecombe, Rick P
2025-08-29 20:19 ` Sean Christopherson
2025-08-29 21:54 ` Edgecombe, Rick P
2025-08-29 22:02 ` Sean Christopherson
2025-08-29 22:17 ` Edgecombe, Rick P
2025-08-29 22:58 ` Sean Christopherson
2025-08-29 22:59 ` Edgecombe, Rick P
2025-09-01 1:25 ` Yan Zhao
2025-09-02 17:33 ` Sean Christopherson
2025-09-02 18:55 ` Edgecombe, Rick P
2025-09-04 8:45 ` Sean Christopherson
2025-08-29 0:06 ` [RFC PATCH v2 06/18] KVM: TDX: Return -EIO, not -EINVAL, on a KVM_BUG_ON() condition Sean Christopherson
2025-08-29 9:40 ` Binbin Wu
2025-08-29 16:58 ` Ira Weiny
2025-08-29 19:59 ` Edgecombe, Rick P
2025-08-29 0:06 ` [RFC PATCH v2 07/18] KVM: TDX: Fold tdx_sept_drop_private_spte() into tdx_sept_remove_private_spte() Sean Christopherson
2025-08-29 9:49 ` Binbin Wu
2025-08-29 0:06 ` [RFC PATCH v2 08/18] KVM: x86/mmu: Drop the return code from kvm_x86_ops.remove_external_spte() Sean Christopherson
2025-08-29 9:52 ` Binbin Wu
2025-08-29 0:06 ` [RFC PATCH v2 09/18] KVM: TDX: Avoid a double-KVM_BUG_ON() in tdx_sept_zap_private_spte() Sean Christopherson
2025-08-29 9:52 ` Binbin Wu
2025-08-29 0:06 ` [RFC PATCH v2 10/18] KVM: TDX: Use atomic64_dec_return() instead of a poor equivalent Sean Christopherson
2025-08-29 10:06 ` Binbin Wu
2025-08-29 0:06 ` [RFC PATCH v2 11/18] KVM: TDX: Fold tdx_mem_page_record_premap_cnt() into its sole caller Sean Christopherson
2025-09-02 22:46 ` Edgecombe, Rick P
2025-08-29 0:06 ` [RFC PATCH v2 12/18] KVM: TDX: Bug the VM if extending the initial measurement fails Sean Christopherson
2025-08-29 8:18 ` Yan Zhao
2025-08-29 18:16 ` Edgecombe, Rick P
2025-08-29 20:11 ` Sean Christopherson [this message]
2025-08-29 22:39 ` Edgecombe, Rick P
2025-08-29 23:15 ` Edgecombe, Rick P
2025-08-29 23:18 ` Sean Christopherson
2025-09-02 9:24 ` Yan Zhao
2025-09-02 17:04 ` Sean Christopherson
2025-09-03 0:18 ` Edgecombe, Rick P
2025-09-03 3:34 ` Yan Zhao
2025-09-03 9:19 ` Yan Zhao
2025-08-29 0:06 ` [RFC PATCH v2 13/18] KVM: TDX: ADD pages to the TD image while populating mirror EPT entries Sean Christopherson
2025-08-29 23:42 ` Edgecombe, Rick P
2025-09-02 17:09 ` Sean Christopherson
2025-08-29 0:06 ` [RFC PATCH v2 14/18] KVM: TDX: Fold tdx_sept_zap_private_spte() into tdx_sept_remove_private_spte() Sean Christopherson
2025-09-02 17:31 ` Edgecombe, Rick P
2025-08-29 0:06 ` [RFC PATCH v2 15/18] KVM: TDX: Combine KVM_BUG_ON + pr_tdx_error() into TDX_BUG_ON() Sean Christopherson
2025-08-29 9:03 ` Binbin Wu
2025-08-29 14:19 ` Sean Christopherson
2025-09-01 1:46 ` Binbin Wu
2025-09-02 18:55 ` Edgecombe, Rick P
2025-08-29 0:06 ` [RFC PATCH v2 16/18] KVM: TDX: Derive error argument names from the local variable names Sean Christopherson
2025-08-30 0:00 ` Edgecombe, Rick P
2025-08-29 0:06 ` [RFC PATCH v2 17/18] KVM: TDX: Assert that mmu_lock is held for write when removing S-EPT entries Sean Christopherson
2025-08-29 0:06 ` [RFC PATCH v2 18/18] KVM: TDX: Add macro to retry SEAMCALLs when forcing vCPUs out of guest Sean Christopherson