From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3E1A8332EB7 for ; Mon, 5 Jan 2026 15:50:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767628218; cv=none; b=EqRDXltkqdPdaoPyz+S/NAZzO1wVVCV8pN6Il2L+n0ivVlDps4suAcbYzR4MZGqREAgbi5eM+0ms6jWxIKhCChrchdbxiqb392YUWG6mmw/7viqn+CqxfXTxaV+Kkxwp8QiDwibuBxA3BzKw4+QtNgvwVDtQfN9pd9mIFTZ71l4= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767628218; c=relaxed/simple; bh=ziEp5BPeQeyMvUL+QTYTgHhJXBoG87Yp09VcSuzkumA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=kHYFDC/SWD6KJG3SzfRRjhNh8ReoWknzePgqiRC4wPKFv2uurUrEIQ0r7SNu+aPUvwhDk7Hh08iIQt5SchGrPzkkpQ/I+Kfj2D8cLLpBKrBP0dOR7cm9kfBgltP2y0f8H8OglSyNkVUeLuaa7aS/gFevnYfLrSuAFWd+hww8Qk8= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=JxFh9nKp; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="JxFh9nKp" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 83920C116D0; Mon, 5 Jan 2026 15:50:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1767628218; bh=ziEp5BPeQeyMvUL+QTYTgHhJXBoG87Yp09VcSuzkumA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=JxFh9nKpUMY5B19wqZ2lNtqqj5dYAsTKLwcq1/Duthl9JuzW0AnKFgN3e4NeMQseY QDKg9NrSug0blIi2q0y7v1Bp2I4gzYXDQMk+aEU+bhnSFb4kCjHjsMD1RzxxG1HtEi +MMzAhI5BJh/8CLjxehB8+IAUNQf78FtlmZauQV6DlajKZKMESryUExfPJr0SO6FJH IRpii9p3HucPDejClwsjbwq3G9ZNK12K8g8ciSImc42POnkOGUQXeGcs2ewAJF9FKk IbVunScaLW8kFG4DxKlJi/7ykrND5OJjgT7XzkHqBuMsPwu2aqC5XLwh9/vh0Pfs4F KW7x/DhR5vtOA== From: Will Deacon To: kvmarm@lists.linux.dev Cc: linux-arm-kernel@lists.infradead.org, Will Deacon , Marc Zyngier , Oliver Upton , Joey Gouly , Suzuki K Poulose , Zenghui Yu , Catalin Marinas , Quentin Perret , Fuad Tabba , Vincent Donnefort , Mostafa Saleh Subject: [PATCH 09/30] KVM: arm64: Split teardown hypercall into two phases Date: Mon, 5 Jan 2026 15:49:17 +0000 Message-ID: <20260105154939.11041-10-will@kernel.org> X-Mailer: git-send-email 2.47.3 In-Reply-To: <20260105154939.11041-1-will@kernel.org> References: <20260105154939.11041-1-will@kernel.org> Precedence: bulk X-Mailing-List: kvmarm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit In preparation for reclaiming protected guest VM pages from the host during teardown, split the current 'pkvm_teardown_vm' hypercall into separate 'start' and 'finalise' calls. The 'pkvm_start_teardown_vm' hypercall puts the VM into a new 'is_dying' state, which is a point of no return past which no vCPU of the pVM is allowed to run any more. Once in this new state, 'pkvm_finalize_teardown_vm' can be used to reclaim meta-data and page-table pages from the VM. A subsequent patch will add support for reclaiming the individual guest memory pages. Co-developed-by: Quentin Perret Signed-off-by: Quentin Perret Signed-off-by: Will Deacon --- arch/arm64/include/asm/kvm_asm.h | 3 ++- arch/arm64/include/asm/kvm_host.h | 7 +++++ arch/arm64/kvm/hyp/include/nvhe/pkvm.h | 4 ++- arch/arm64/kvm/hyp/nvhe/hyp-main.c | 14 +++++++--- arch/arm64/kvm/hyp/nvhe/pkvm.c | 36 ++++++++++++++++++++++---- arch/arm64/kvm/pkvm.c | 7 ++++- 6 files changed, 60 insertions(+), 11 deletions(-) diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index a1ad12c72ebf..d7fb23d39956 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -85,7 +85,8 @@ enum __kvm_host_smccc_func { __KVM_HOST_SMCCC_FUNC___pkvm_unreserve_vm, __KVM_HOST_SMCCC_FUNC___pkvm_init_vm, __KVM_HOST_SMCCC_FUNC___pkvm_init_vcpu, - __KVM_HOST_SMCCC_FUNC___pkvm_teardown_vm, + __KVM_HOST_SMCCC_FUNC___pkvm_start_teardown_vm, + __KVM_HOST_SMCCC_FUNC___pkvm_finalize_teardown_vm, __KVM_HOST_SMCCC_FUNC___pkvm_vcpu_load, __KVM_HOST_SMCCC_FUNC___pkvm_vcpu_put, __KVM_HOST_SMCCC_FUNC___pkvm_tlb_flush_vmid, diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index ac7f970c7883..3191d10a2622 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -255,6 +255,13 @@ struct kvm_protected_vm { struct kvm_hyp_memcache stage2_teardown_mc; bool is_protected; bool is_created; + + /* + * True when the guest is being torn down. When in this state, the + * guest's vCPUs can't be loaded anymore, but its pages can be + * reclaimed by the host. + */ + bool is_dying; }; struct kvm_mpidr_data { diff --git a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h index 184ad7a39950..04c7ca703014 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h +++ b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h @@ -73,7 +73,9 @@ int __pkvm_init_vm(struct kvm *host_kvm, unsigned long vm_hva, unsigned long pgd_hva); int __pkvm_init_vcpu(pkvm_handle_t handle, struct kvm_vcpu *host_vcpu, unsigned long vcpu_hva); -int __pkvm_teardown_vm(pkvm_handle_t handle); + +int __pkvm_start_teardown_vm(pkvm_handle_t handle); +int __pkvm_finalize_teardown_vm(pkvm_handle_t handle); struct pkvm_hyp_vcpu *pkvm_load_hyp_vcpu(pkvm_handle_t handle, unsigned int vcpu_idx); diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c index 62a56c6084ca..e88b57eac25c 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c @@ -550,11 +550,18 @@ static void handle___pkvm_init_vcpu(struct kvm_cpu_context *host_ctxt) cpu_reg(host_ctxt, 1) = __pkvm_init_vcpu(handle, host_vcpu, vcpu_hva); } -static void handle___pkvm_teardown_vm(struct kvm_cpu_context *host_ctxt) +static void handle___pkvm_start_teardown_vm(struct kvm_cpu_context *host_ctxt) { DECLARE_REG(pkvm_handle_t, handle, host_ctxt, 1); - cpu_reg(host_ctxt, 1) = __pkvm_teardown_vm(handle); + cpu_reg(host_ctxt, 1) = __pkvm_start_teardown_vm(handle); +} + +static void handle___pkvm_finalize_teardown_vm(struct kvm_cpu_context *host_ctxt) +{ + DECLARE_REG(pkvm_handle_t, handle, host_ctxt, 1); + + cpu_reg(host_ctxt, 1) = __pkvm_finalize_teardown_vm(handle); } typedef void (*hcall_t)(struct kvm_cpu_context *); @@ -594,7 +601,8 @@ static const hcall_t host_hcall[] = { HANDLE_FUNC(__pkvm_unreserve_vm), HANDLE_FUNC(__pkvm_init_vm), HANDLE_FUNC(__pkvm_init_vcpu), - HANDLE_FUNC(__pkvm_teardown_vm), + HANDLE_FUNC(__pkvm_start_teardown_vm), + HANDLE_FUNC(__pkvm_finalize_teardown_vm), HANDLE_FUNC(__pkvm_vcpu_load), HANDLE_FUNC(__pkvm_vcpu_put), HANDLE_FUNC(__pkvm_tlb_flush_vmid), diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c index 8911338961c5..7f8191f96fc3 100644 --- a/arch/arm64/kvm/hyp/nvhe/pkvm.c +++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c @@ -256,7 +256,10 @@ struct pkvm_hyp_vcpu *pkvm_load_hyp_vcpu(pkvm_handle_t handle, hyp_spin_lock(&vm_table_lock); hyp_vm = get_vm_by_handle(handle); - if (!hyp_vm || hyp_vm->kvm.created_vcpus <= vcpu_idx) + if (!hyp_vm || hyp_vm->kvm.arch.pkvm.is_dying) + goto unlock; + + if (hyp_vm->kvm.created_vcpus <= vcpu_idx) goto unlock; hyp_vcpu = hyp_vm->vcpus[vcpu_idx]; @@ -829,7 +832,32 @@ teardown_donated_memory(struct kvm_hyp_memcache *mc, void *addr, size_t size) unmap_donated_memory_noclear(addr, size); } -int __pkvm_teardown_vm(pkvm_handle_t handle) +int __pkvm_start_teardown_vm(pkvm_handle_t handle) +{ + struct pkvm_hyp_vm *hyp_vm; + int ret = 0; + + hyp_spin_lock(&vm_table_lock); + hyp_vm = get_vm_by_handle(handle); + if (!hyp_vm) { + ret = -ENOENT; + goto unlock; + } else if (WARN_ON(hyp_page_count(hyp_vm))) { + ret = -EBUSY; + goto unlock; + } else if (hyp_vm->kvm.arch.pkvm.is_dying) { + ret = -EINVAL; + goto unlock; + } + + hyp_vm->kvm.arch.pkvm.is_dying = true; +unlock: + hyp_spin_unlock(&vm_table_lock); + + return ret; +} + +int __pkvm_finalize_teardown_vm(pkvm_handle_t handle) { struct kvm_hyp_memcache *mc, *stage2_mc; struct pkvm_hyp_vm *hyp_vm; @@ -843,9 +871,7 @@ int __pkvm_teardown_vm(pkvm_handle_t handle) if (!hyp_vm) { err = -ENOENT; goto err_unlock; - } - - if (WARN_ON(hyp_page_count(hyp_vm))) { + } else if (!hyp_vm->kvm.arch.pkvm.is_dying) { err = -EBUSY; goto err_unlock; } diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c index 20d50abb3b94..a39dacd1d617 100644 --- a/arch/arm64/kvm/pkvm.c +++ b/arch/arm64/kvm/pkvm.c @@ -88,7 +88,7 @@ void __init kvm_hyp_reserve(void) static void __pkvm_destroy_hyp_vm(struct kvm *kvm) { if (pkvm_hyp_vm_is_created(kvm)) { - WARN_ON(kvm_call_hyp_nvhe(__pkvm_teardown_vm, + WARN_ON(kvm_call_hyp_nvhe(__pkvm_finalize_teardown_vm, kvm->arch.pkvm.handle)); } else if (kvm->arch.pkvm.handle) { /* @@ -350,6 +350,11 @@ void pkvm_pgtable_stage2_destroy_range(struct kvm_pgtable *pgt, if (!handle) return; + if (pkvm_hyp_vm_is_created(kvm) && !kvm->arch.pkvm.is_dying) { + WARN_ON(kvm_call_hyp_nvhe(__pkvm_start_teardown_vm, handle)); + kvm->arch.pkvm.is_dying = true; + } + __pkvm_pgtable_stage2_unshare(pgt, addr, addr + size); } -- 2.52.0.351.gbe84eed79e-goog