From: Sean Christopherson <seanjc@google.com>
To: Rick P Edgecombe <rick.p.edgecombe@intel.com>
Cc: Yan Y Zhao <yan.y.zhao@intel.com>,
Kai Huang <kai.huang@intel.com>,
"ackerleytng@google.com" <ackerleytng@google.com>,
Vishal Annapurve <vannapurve@google.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Ira Weiny <ira.weiny@intel.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"michael.roth@amd.com" <michael.roth@amd.com>,
"pbonzini@redhat.com" <pbonzini@redhat.com>
Subject: Re: [RFC PATCH v2 12/18] KVM: TDX: Bug the VM if extended the initial measurement fails
Date: Fri, 29 Aug 2025 13:11:35 -0700
Message-ID: <aLIJd7xpNfJvdMeT@google.com>
In-Reply-To: <8445ac8c96706ba1f079f4012584ef7631c60c8b.camel@intel.com>
[-- Attachment #1: Type: text/plain, Size: 5605 bytes --]
On Fri, Aug 29, 2025, Rick P Edgecombe wrote:
> On Fri, 2025-08-29 at 16:18 +0800, Yan Zhao wrote:
> > > + /*
> > > + * Note, MR.EXTEND can fail if the S-EPT mapping is somehow removed
> > > + * between mapping the pfn and now, but slots_lock prevents memslot
> > > + * updates, filemap_invalidate_lock() prevents guest_memfd updates,
> > > + * mmu_notifier events can't reach S-EPT entries, and KVM's
> > > internal
> > > + * zapping flows are mutually exclusive with S-EPT mappings.
> > > + */
> > > + for (i = 0; i < PAGE_SIZE; i += TDX_EXTENDMR_CHUNKSIZE) {
> > > + err = tdh_mr_extend(&kvm_tdx->td, gpa + i, &entry,
> > > &level_state);
> > > + if (KVM_BUG_ON(err, kvm)) {
> > I suspect tdh_mr_extend() running on one vCPU may contend with
> > tdh_vp_create()/tdh_vp_addcx()/tdh_vp_init*()/tdh_vp_rd()/tdh_vp_wr()/
> > tdh_mng_rd()/tdh_vp_flush() on other vCPUs, if userspace invokes ioctl
> > KVM_TDX_INIT_MEM_REGION on one vCPU while initializing other vCPUs.
> >
> > It's similar to the analysis of contention of tdh_mem_page_add() [1], as
> > both tdh_mr_extend() and tdh_mem_page_add() acquire exclusive lock on
> > resource TDR.
> >
> > I'll try to write a test to verify it and come back to you.
>
> I'm seeing the same thing in the TDX module. It could fail because of contention
> controllable from userspace. So the KVM_BUG_ON() is not appropriate.
>
> Today though if tdh_mr_extend() fails because of contention then the TD is
> essentially dead anyway. Trying to redo KVM_TDX_INIT_MEM_REGION will fail. The
> M-EPT fault could be spurious but the second tdh_mem_page_add() would return an
> error and never get back to the tdh_mr_extend().
>
> The version in this patch can't recover for a different reason. That is
> kvm_tdp_mmu_map_private_pfn() doesn't handle spurious faults, so I'd say just
> drop the KVM_BUG_ON(), and try to handle the contention in a separate effort.
>
> I guess the two approaches could be to make KVM_TDX_INIT_MEM_REGION more robust,
This. First and foremost, KVM's ordering and locking rules need to be explicit
(ideally documented, but at the very least apparent in the code), *especially*
when the locking (or lack thereof) impacts userspace. Even if effectively relying
on the TDX-Module to provide ordering "works", it's all but impossible to follow.
And it doesn't truly work, as everything in the TDX-Module is a trylock, and that
in turn prevents KVM from asserting success. Sometimes KVM has no better option than
to rely on hardware to detect failure, but it really should be a last resort,
because not being able to expect success makes debugging no fun. Even worse, it
bleeds hard-to-document, specific ordering requirements into userspace, e.g. in
this case, it sounds like userspace can't do _anything_ on vCPUs while doing
KVM_TDX_INIT_MEM_REGION. Which might not be a burden for userspace, but oof is
it nasty from an ABI perspective.
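The trylock point can be modelled in userspace. The sketch below is an illustration only, not TDX-Module or KVM code: `module_op()`, `serialized_op()`, `module_lock` and `caller_lock` are hypothetical names, and pthread mutexes stand in for the real locks. It shows that a callee that only ever trylocks can return -EBUSY at any time, so the caller can expect success only if it serializes every path into the callee itself.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical model of a module-internal resource guarded by a trylock. */
static pthread_mutex_t module_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical caller-side lock that serializes every path into the module. */
static pthread_mutex_t caller_lock = PTHREAD_MUTEX_INITIALIZER;

/* The "module" never blocks: contention is reported as -EBUSY. */
static int module_op(int *counter)
{
	if (pthread_mutex_trylock(&module_lock))
		return -EBUSY;

	(*counter)++;
	pthread_mutex_unlock(&module_lock);
	return 0;
}

/*
 * If all callers go through this wrapper, module_lock is never contended,
 * so a non-zero return would indicate a bug rather than a benign race.
 */
static int serialized_op(int *counter)
{
	int err;

	pthread_mutex_lock(&caller_lock);
	err = module_op(counter);
	pthread_mutex_unlock(&caller_lock);
	return err;
}
```

In this model, only callers that always go through `serialized_op()` could reasonably treat a failure as fatal; a direct `module_op()` call can fail purely because of timing.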
> or prevent the contention. For the latter case:
> tdh_vp_create()/tdh_vp_addcx()/tdh_vp_init*()/tdh_vp_rd()/tdh_vp_wr()
> ...I think we could just take slots_lock during KVM_TDX_INIT_VCPU and
> KVM_TDX_GET_CPUID.
>
> For tdh_vp_flush() the vcpu_load() in kvm_arch_vcpu_ioctl() could be hard to
> handle.
>
> So I'd think maybe to look towards making KVM_TDX_INIT_MEM_REGION more robust,
> which would mean the eventual solution wouldn't have ABI concerns by later
> blocking things that used to be allowed.
>
> Maybe having kvm_tdp_mmu_map_private_pfn() return success for spurious faults is
> enough. But this is all for a case that userspace isn't expected to actually
> hit, so seems like something that could be kicked down the road easily.
You're trying to be too "nice", just smack 'em with a big hammer. For all intents
and purposes, the paths in question are fully serialized, there's no reason to try
and allow anything remotely interesting to happen.
Acquire kvm->lock to prevent VM-wide things from happening, slots_lock to prevent
kvm_mmu_zap_all_fast(), and _all_ vCPU mutexes to prevent vCPUs from interfering.
Doing that for a vCPU ioctl is a bit awkward, but not awful. E.g. we can abuse
kvm_arch_vcpu_async_ioctl(). In hindsight, a more clever approach would have
been to make KVM_TDX_INIT_MEM_REGION a VM-scoped ioctl that takes a vCPU fd. Oh
well.
Anyways, I think we need to avoid the "synchronous" ioctl path regardless, because
taking kvm->slots_lock inside vcpu->mutex is gross. AFAICT it's not actively
problematic today, but it feels like a deadlock waiting to happen.
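The lock ordering described above (VM-wide lock, then slots lock, then every vCPU mutex) can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not the kernel implementation: `struct vm`, `struct vcpu`, `vm_lock_everything()` and `vm_unlock_everything()` are hypothetical names, pthread mutexes stand in for kernel mutexes, and the vCPU mutexes are taken with trylock purely so that the unwind path is easy to exercise (the attached patch calls kvm_lock_all_vcpus() instead).

```c
#include <errno.h>
#include <pthread.h>

#define MAX_VCPUS 4

/* Hypothetical stand-ins for struct kvm_vcpu and struct kvm. */
struct vcpu {
	pthread_mutex_t mutex;
};

struct vm {
	pthread_mutex_t lock;		/* models kvm->lock */
	pthread_mutex_t slots_lock;	/* models kvm->slots_lock */
	struct vcpu vcpus[MAX_VCPUS];
	int nr_vcpus;
};

/* Take the outer locks first, then every vCPU mutex; unwind on failure. */
static int vm_lock_everything(struct vm *vm)
{
	int i;

	pthread_mutex_lock(&vm->lock);
	pthread_mutex_lock(&vm->slots_lock);

	for (i = 0; i < vm->nr_vcpus; i++) {
		if (pthread_mutex_trylock(&vm->vcpus[i].mutex))
			goto unwind;
	}
	return 0;

unwind:
	/* Drop only the vCPU mutexes that were actually acquired. */
	while (i--)
		pthread_mutex_unlock(&vm->vcpus[i].mutex);
	pthread_mutex_unlock(&vm->slots_lock);
	pthread_mutex_unlock(&vm->lock);
	return -EBUSY;
}

/* Release in the reverse order of acquisition. */
static void vm_unlock_everything(struct vm *vm)
{
	for (int i = vm->nr_vcpus - 1; i >= 0; i--)
		pthread_mutex_unlock(&vm->vcpus[i].mutex);
	pthread_mutex_unlock(&vm->slots_lock);
	pthread_mutex_unlock(&vm->lock);
}
```

Because the vCPU mutexes are always the innermost locks in this model, a path that already holds a vCPU mutex and then wants slots_lock would invert the order, which is the kind of nesting the paragraph above wants to avoid.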
The other oddity I see is the handling of kvm_tdx->state. I don't see how this
check in tdx_vcpu_create() is safe:
	if (kvm_tdx->state != TD_STATE_INITIALIZED)
		return -EIO;
kvm_arch_vcpu_create() runs without any locks held, and so TDX effectively has
the same bug that SEV intra-host migration had, where an in-flight vCPU creation
could race with a VM-wide state transition (see commit ecf371f8b02d ("KVM: SVM:
Reject SEV{-ES} intra host migration if vCPU creation is in-flight")). To fix
that, kvm->lock needs to be taken and KVM needs to verify there's no in-flight
vCPU creation, e.g. so that a vCPU doesn't pop up and contend a TDX-Module lock.
We can even define a fancy new CLASS to handle the lock+check => unlock logic
with guard()-like syntax:
	CLASS(tdx_vm_state_guard, guard)(kvm);
	if (IS_ERR(guard))
		return PTR_ERR(guard);
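For readers unfamiliar with the pattern, here is a minimal userspace sketch of the same lock+check => unlock idea. It is an illustration under assumptions, not the kernel code: it uses the GCC/Clang `__attribute__((cleanup))` extension directly (the mechanism the kernel's CLASS() and guard() macros build on), a pthread mutex in place of the kernel locks, and hypothetical names throughout (`struct vm`, `vm_state_lock()`, `vm_state_unlock()`, `vm_set_state()`, and the `err_ptr()`/`is_err()`/`ptr_err()` helpers that imitate ERR_PTR(), IS_ERR() and PTR_ERR()).

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for struct kvm; not the kernel structure. */
struct vm {
	pthread_mutex_t lock;
	int created_vcpus;
	int online_vcpus;
	int state;
};

/* Userspace imitations of the kernel's error-pointer helpers. */
static inline void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-4095;
}

static inline long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Constructor: take the lock, then reject in-flight vCPU creation. */
static struct vm *vm_state_lock(struct vm *vm)
{
	pthread_mutex_lock(&vm->lock);
	if (vm->created_vcpus != vm->online_vcpus) {
		pthread_mutex_unlock(&vm->lock);
		return err_ptr(-EBUSY);
	}
	return vm;
}

/* Destructor: runs at scope exit; unlocks only if the constructor succeeded. */
static void vm_state_unlock(struct vm **guard)
{
	if (!is_err(*guard))
		pthread_mutex_unlock(&(*guard)->lock);
}

static int vm_set_state(struct vm *vm, int new_state)
{
	struct vm *guard __attribute__((cleanup(vm_state_unlock))) =
		vm_state_lock(vm);

	if (is_err(guard))
		return (int)ptr_err(guard);

	vm->state = new_state;
	return 0;	/* the lock is dropped automatically on every return */
}
```

The benefit is the same as in the snippet above: every return path after a successful acquisition releases the lock without explicit unlock calls or goto labels.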
IIUC, with all of those locks, KVM can KVM_BUG_ON() both TDH_MEM_PAGE_ADD and
TDH_MR_EXTEND, with no exceptions given for -EBUSY. Attached patches are very
lightly tested as usual and need to be chunked up, but seem to do what I want.
[-- Attachment #2: 0001-KVM-Make-support-for-kvm_arch_vcpu_async_ioctl-manda.patch --]
[-- Type: text/x-diff, Size: 4250 bytes --]
From 44a96a0db69d9cd56e77813125aa1e318b11d718 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Fri, 29 Aug 2025 07:28:44 -0700
Subject: [PATCH 1/2] KVM: Make support for kvm_arch_vcpu_async_ioctl()
mandatory
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/loongarch/kvm/Kconfig | 1 -
arch/mips/kvm/Kconfig | 1 -
arch/powerpc/kvm/Kconfig | 1 -
arch/riscv/kvm/Kconfig | 1 -
arch/s390/kvm/Kconfig | 1 -
arch/x86/kvm/x86.c | 6 ++++++
include/linux/kvm_host.h | 10 ----------
virt/kvm/Kconfig | 3 ---
8 files changed, 6 insertions(+), 18 deletions(-)
diff --git a/arch/loongarch/kvm/Kconfig b/arch/loongarch/kvm/Kconfig
index 40eea6da7c25..e53948ec978a 100644
--- a/arch/loongarch/kvm/Kconfig
+++ b/arch/loongarch/kvm/Kconfig
@@ -25,7 +25,6 @@ config KVM
select HAVE_KVM_IRQCHIP
select HAVE_KVM_MSI
select HAVE_KVM_READONLY_MEM
- select HAVE_KVM_VCPU_ASYNC_IOCTL
select KVM_COMMON
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
select KVM_GENERIC_HARDWARE_ENABLING
diff --git a/arch/mips/kvm/Kconfig b/arch/mips/kvm/Kconfig
index ab57221fa4dd..cc13cc35f208 100644
--- a/arch/mips/kvm/Kconfig
+++ b/arch/mips/kvm/Kconfig
@@ -22,7 +22,6 @@ config KVM
select EXPORT_UASM
select KVM_COMMON
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
- select HAVE_KVM_VCPU_ASYNC_IOCTL
select KVM_MMIO
select KVM_GENERIC_MMU_NOTIFIER
select KVM_GENERIC_HARDWARE_ENABLING
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 2f2702c867f7..c9a2d50ff1b0 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -20,7 +20,6 @@ if VIRTUALIZATION
config KVM
bool
select KVM_COMMON
- select HAVE_KVM_VCPU_ASYNC_IOCTL
select KVM_VFIO
select HAVE_KVM_IRQ_BYPASS
diff --git a/arch/riscv/kvm/Kconfig b/arch/riscv/kvm/Kconfig
index 5a62091b0809..de67bfabebc8 100644
--- a/arch/riscv/kvm/Kconfig
+++ b/arch/riscv/kvm/Kconfig
@@ -23,7 +23,6 @@ config KVM
select HAVE_KVM_IRQCHIP
select HAVE_KVM_IRQ_ROUTING
select HAVE_KVM_MSI
- select HAVE_KVM_VCPU_ASYNC_IOCTL
select HAVE_KVM_READONLY_MEM
select HAVE_KVM_DIRTY_RING_ACQ_REL
select KVM_COMMON
diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
index cae908d64550..96d16028e8b7 100644
--- a/arch/s390/kvm/Kconfig
+++ b/arch/s390/kvm/Kconfig
@@ -20,7 +20,6 @@ config KVM
def_tristate y
prompt "Kernel-based Virtual Machine (KVM) support"
select HAVE_KVM_CPU_RELAX_INTERCEPT
- select HAVE_KVM_VCPU_ASYNC_IOCTL
select KVM_ASYNC_PF
select KVM_ASYNC_PF_SYNC
select KVM_COMMON
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7ba2cdfdac44..92e916eba6a9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6943,6 +6943,12 @@ static int kvm_vm_ioctl_set_clock(struct kvm *kvm, void __user *argp)
return 0;
}
+long kvm_arch_vcpu_async_ioctl(struct file *filp, unsigned int ioctl,
+ unsigned long arg)
+{
+ return -ENOIOCTLCMD;
+}
+
int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
{
struct kvm *kvm = filp->private_data;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 15656b7fba6c..a1840aaf80d4 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2421,18 +2421,8 @@ static inline bool kvm_arch_no_poll(struct kvm_vcpu *vcpu)
}
#endif /* CONFIG_HAVE_KVM_NO_POLL */
-#ifdef CONFIG_HAVE_KVM_VCPU_ASYNC_IOCTL
long kvm_arch_vcpu_async_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg);
-#else
-static inline long kvm_arch_vcpu_async_ioctl(struct file *filp,
- unsigned int ioctl,
- unsigned long arg)
-{
- return -ENOIOCTLCMD;
-}
-#endif /* CONFIG_HAVE_KVM_VCPU_ASYNC_IOCTL */
-
void kvm_arch_guest_memory_reclaimed(struct kvm *kvm);
#ifdef CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 727b542074e7..661a4b998875 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -78,9 +78,6 @@ config HAVE_KVM_IRQ_BYPASS
tristate
select IRQ_BYPASS_MANAGER
-config HAVE_KVM_VCPU_ASYNC_IOCTL
- bool
-
config HAVE_KVM_VCPU_RUN_PID_CHANGE
bool
base-commit: f4b88d6c85871847340a86daf838e11986a97348
--
2.51.0.318.gd7df087d1a-goog
[-- Attachment #3: 0002-KVM-TDX-Guard-VM-state-transitions.patch --]
[-- Type: text/x-diff, Size: 9085 bytes --]
From 7277396033c21569dbed0a52fa92804307db111e Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Fri, 29 Aug 2025 09:19:11 -0700
Subject: [PATCH 2/2] KVM: TDX: Guard VM state transitions
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/include/asm/kvm-x86-ops.h | 1 +
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/vmx/main.c | 9 +++
arch/x86/kvm/vmx/tdx.c | 112 ++++++++++++++++++++++-------
arch/x86/kvm/vmx/x86_ops.h | 1 +
arch/x86/kvm/x86.c | 7 ++
6 files changed, 105 insertions(+), 26 deletions(-)
diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 18a5c3119e1a..fe2bb2e2ebc8 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -128,6 +128,7 @@ KVM_X86_OP(enable_smi_window)
KVM_X86_OP_OPTIONAL(dev_get_attr)
KVM_X86_OP_OPTIONAL(mem_enc_ioctl)
KVM_X86_OP_OPTIONAL(vcpu_mem_enc_ioctl)
+KVM_X86_OP_OPTIONAL(vcpu_mem_enc_async_ioctl)
KVM_X86_OP_OPTIONAL(mem_enc_register_region)
KVM_X86_OP_OPTIONAL(mem_enc_unregister_region)
KVM_X86_OP_OPTIONAL(vm_copy_enc_context_from)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d0a8404a6b8f..ac5d3b8fa49f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1911,6 +1911,7 @@ struct kvm_x86_ops {
int (*dev_get_attr)(u32 group, u64 attr, u64 *val);
int (*mem_enc_ioctl)(struct kvm *kvm, void __user *argp);
int (*vcpu_mem_enc_ioctl)(struct kvm_vcpu *vcpu, void __user *argp);
+ int (*vcpu_mem_enc_async_ioctl)(struct kvm_vcpu *vcpu, void __user *argp);
int (*mem_enc_register_region)(struct kvm *kvm, struct kvm_enc_region *argp);
int (*mem_enc_unregister_region)(struct kvm *kvm, struct kvm_enc_region *argp);
int (*vm_copy_enc_context_from)(struct kvm *kvm, unsigned int source_fd);
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index dbab1c15b0cd..e0e35ceec9b1 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -831,6 +831,14 @@ static int vt_vcpu_mem_enc_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
return tdx_vcpu_ioctl(vcpu, argp);
}
+static int vt_vcpu_mem_enc_async_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
+{
+ if (!is_td_vcpu(vcpu))
+ return -EINVAL;
+
+ return tdx_vcpu_async_ioctl(vcpu, argp);
+}
+
static int vt_gmem_private_max_mapping_level(struct kvm *kvm, kvm_pfn_t pfn)
{
if (is_td(kvm))
@@ -1004,6 +1012,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
.mem_enc_ioctl = vt_op_tdx_only(mem_enc_ioctl),
.vcpu_mem_enc_ioctl = vt_op_tdx_only(vcpu_mem_enc_ioctl),
+ .vcpu_mem_enc_async_ioctl = vt_op_tdx_only(vcpu_mem_enc_async_ioctl),
.private_max_mapping_level = vt_op_tdx_only(gmem_private_max_mapping_level)
};
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index d6c9defad9cd..c595d9cb6dcd 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -2624,6 +2624,44 @@ static int tdx_read_cpuid(struct kvm_vcpu *vcpu, u32 leaf, u32 sub_leaf,
return -EIO;
}
+typedef void * tdx_vm_state_guard_t;
+
+static tdx_vm_state_guard_t tdx_acquire_vm_state_locks(struct kvm *kvm)
+{
+ int r;
+
+ mutex_lock(&kvm->lock);
+ mutex_lock(&kvm->slots_lock);
+
+ if (kvm->created_vcpus != atomic_read(&kvm->online_vcpus)) {
+ r = -EBUSY;
+ goto out_err;
+ }
+
+ r = kvm_lock_all_vcpus(kvm);
+ if (r)
+ goto out_err;
+
+ return kvm;
+
+out_err:
+ mutex_unlock(&kvm->slots_lock);
+ mutex_unlock(&kvm->lock);
+
+ return ERR_PTR(r);
+}
+
+static void tdx_release_vm_state_locks(struct kvm *kvm)
+{
+ kvm_unlock_all_vcpus(kvm);
+ mutex_unlock(&kvm->slots_lock);
+ mutex_unlock(&kvm->lock);
+}
+
+DEFINE_CLASS(tdx_vm_state_guard, tdx_vm_state_guard_t,
+ if (!IS_ERR(_T)) tdx_release_vm_state_locks(_T),
+ tdx_acquire_vm_state_locks(kvm), struct kvm *kvm);
+
static int tdx_td_init(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
{
struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
@@ -2634,6 +2672,10 @@ static int tdx_td_init(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
BUILD_BUG_ON(sizeof(*init_vm) != 256 + sizeof_field(struct kvm_tdx_init_vm, cpuid));
BUILD_BUG_ON(sizeof(struct td_params) != 1024);
+ CLASS(tdx_vm_state_guard, guard)(kvm);
+ if (IS_ERR(guard))
+ return PTR_ERR(guard);
+
if (kvm_tdx->state != TD_STATE_UNINITIALIZED)
return -EINVAL;
@@ -2745,7 +2787,9 @@ static int tdx_td_finalize(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
{
struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
- guard(mutex)(&kvm->slots_lock);
+ CLASS(tdx_vm_state_guard, guard)(kvm);
+ if (IS_ERR(guard))
+ return PTR_ERR(guard);
if (!is_hkid_assigned(kvm_tdx) || kvm_tdx->state == TD_STATE_RUNNABLE)
return -EINVAL;
@@ -2763,22 +2807,25 @@ static int tdx_td_finalize(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
return 0;
}
+static int tdx_get_cmd(void __user *argp, struct kvm_tdx_cmd *cmd)
+{
+ if (copy_from_user(cmd, argp, sizeof(*cmd)))
+ return -EFAULT;
+
+ if (cmd->hw_error)
+ return -EINVAL;
+
+ return 0;
+}
+
int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
{
struct kvm_tdx_cmd tdx_cmd;
int r;
- if (copy_from_user(&tdx_cmd, argp, sizeof(struct kvm_tdx_cmd)))
- return -EFAULT;
-
- /*
- * Userspace should never set hw_error. It is used to fill
- * hardware-defined error by the kernel.
- */
- if (tdx_cmd.hw_error)
- return -EINVAL;
-
- mutex_lock(&kvm->lock);
+ r = tdx_get_cmd(argp, &tdx_cmd);
+ if (r)
+ return r;
switch (tdx_cmd.id) {
case KVM_TDX_CAPABILITIES:
@@ -2791,15 +2838,12 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
r = tdx_td_finalize(kvm, &tdx_cmd);
break;
default:
- r = -EINVAL;
- goto out;
+ return -EINVAL;
}
if (copy_to_user(argp, &tdx_cmd, sizeof(struct kvm_tdx_cmd)))
r = -EFAULT;
-out:
- mutex_unlock(&kvm->lock);
return r;
}
@@ -3079,11 +3123,13 @@ static int tdx_vcpu_init_mem_region(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *c
long gmem_ret;
int ret;
+ CLASS(tdx_vm_state_guard, guard)(kvm);
+ if (IS_ERR(guard))
+ return PTR_ERR(guard);
+
if (tdx->state != VCPU_TD_STATE_INITIALIZED)
return -EINVAL;
- guard(mutex)(&kvm->slots_lock);
-
/* Once TD is finalized, the initial guest memory is fixed. */
if (kvm_tdx->state == TD_STATE_RUNNABLE)
return -EINVAL;
@@ -3101,6 +3147,8 @@ static int tdx_vcpu_init_mem_region(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *c
!vt_is_tdx_private_gpa(kvm, region.gpa + (region.nr_pages << PAGE_SHIFT) - 1))
return -EINVAL;
+ vcpu_load(vcpu);
+
ret = 0;
while (region.nr_pages) {
if (signal_pending(current)) {
@@ -3132,11 +3180,28 @@ static int tdx_vcpu_init_mem_region(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *c
cond_resched();
}
+ vcpu_put(vcpu);
+
if (copy_to_user(u64_to_user_ptr(cmd->data), &region, sizeof(region)))
ret = -EFAULT;
return ret;
}
+int tdx_vcpu_async_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
+{
+ struct kvm_tdx_cmd cmd;
+ int r;
+
+ r = tdx_get_cmd(argp, &cmd);
+ if (r)
+ return r;
+
+ if (cmd.id != KVM_TDX_INIT_MEM_REGION)
+ return -ENOIOCTLCMD;
+
+ return tdx_vcpu_init_mem_region(vcpu, &cmd);
+}
+
int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
{
struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
@@ -3146,19 +3211,14 @@ int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
if (!is_hkid_assigned(kvm_tdx) || kvm_tdx->state == TD_STATE_RUNNABLE)
return -EINVAL;
- if (copy_from_user(&cmd, argp, sizeof(cmd)))
- return -EFAULT;
-
- if (cmd.hw_error)
- return -EINVAL;
+ ret = tdx_get_cmd(argp, &cmd);
+ if (ret)
+ return ret;
switch (cmd.id) {
case KVM_TDX_INIT_VCPU:
ret = tdx_vcpu_init(vcpu, &cmd);
break;
- case KVM_TDX_INIT_MEM_REGION:
- ret = tdx_vcpu_init_mem_region(vcpu, &cmd);
- break;
case KVM_TDX_GET_CPUID:
ret = tdx_vcpu_get_cpuid(vcpu, &cmd);
break;
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index 2b3424f638db..a797101a2150 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -149,6 +149,7 @@ int tdx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr);
int tdx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr);
int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
+int tdx_vcpu_async_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
void tdx_flush_tlb_current(struct kvm_vcpu *vcpu);
void tdx_flush_tlb_all(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 92e916eba6a9..281cd0980245 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6946,6 +6946,13 @@ static int kvm_vm_ioctl_set_clock(struct kvm *kvm, void __user *argp)
long kvm_arch_vcpu_async_ioctl(struct file *filp, unsigned int ioctl,
unsigned long arg)
{
+ struct kvm_vcpu *vcpu = filp->private_data;
+ void __user *argp = (void __user *)arg;
+
+ if (ioctl == KVM_MEMORY_ENCRYPT_OP &&
+ kvm_x86_ops.vcpu_mem_enc_async_ioctl)
+ return kvm_x86_call(vcpu_mem_enc_async_ioctl)(vcpu, argp);
+
return -ENOIOCTLCMD;
}
--
2.51.0.318.gd7df087d1a-goog