From: Adrian Hunter <adrian.hunter@intel.com>
To: Sean Christopherson <seanjc@google.com>
Cc: <pbonzini@redhat.com>, <mlevitsk@redhat.com>,
<kvm@vger.kernel.org>, <rick.p.edgecombe@intel.com>,
<kirill.shutemov@linux.intel.com>, <kai.huang@intel.com>,
<reinette.chatre@intel.com>, <xiaoyao.li@intel.com>,
<tony.lindgren@linux.intel.com>, <binbin.wu@linux.intel.com>,
<isaku.yamahata@intel.com>, <linux-kernel@vger.kernel.org>,
<yan.y.zhao@intel.com>, <chao.gao@intel.com>
Subject: Re: [PATCH V2 1/1] KVM: TDX: Add sub-ioctl KVM_TDX_TERMINATE_VM
Date: Tue, 22 Apr 2025 11:13:33 +0300
Message-ID: <910152f4-22b4-4b8d-b3e4-8e044a4d73c9@intel.com>
In-Reply-To: <aAL4dT1pWG5dDDeo@google.com>

On 19/04/25 04:12, Sean Christopherson wrote:
> On Thu, Apr 17, 2025, Adrian Hunter wrote:
>> From: Sean Christopherson <seanjc@google.com>
>>
>> Add sub-ioctl KVM_TDX_TERMINATE_VM to release the HKID prior to shutdown,
>> which enables more efficient reclaim of private memory.
>>
>> Private memory is removed from MMU/TDP when guest_memfds are closed. If
>> the HKID has not been released, the TDX VM is still in RUNNABLE state,
>> so pages must be removed using the "Dynamic Page Removal" procedure
>> (refer to the TDX Module Base spec), which involves a number of steps:
>>   - Block further address translation
>>   - Exit each VCPU
>>   - Clear Secure EPT entry
>>   - Flush/write-back/invalidate relevant caches
>>
>> However, when the HKID is released, the TDX VM moves to TD_TEARDOWN state
>> where all TDX VM pages are effectively unmapped, so pages can be reclaimed
>> directly.
>>
>> Reclaiming TD Pages in TD_TEARDOWN State was seen to decrease the total
>> reclaim time. For example:
>>
>>   VCPUs    Size (GB)    Before (secs)    After (secs)
>>       4           18               72              24
>>      32          107              517             134
>>      64          400             5539             467
>>
>> [Adrian: wrote commit message, added KVM_TDX_TERMINATE_VM documentation,
>> and moved cpus_read_lock() inside kvm->lock for consistency as reported
>> by lockdep]
>
> /facepalm
>
> I over-thought this. We've had a long-standing battle with kvm_lock vs.
> cpus_read_lock(), but this is kvm->lock, not kvm_lock. /sigh
>
>> +static int tdx_terminate_vm(struct kvm *kvm)
>> +{
>> +	int r = 0;
>> +
>> +	guard(mutex)(&kvm->lock);
>
> With kvm->lock taken outside cpus_read_lock(), just handle KVM_TDX_TERMINATE_VM
> in the switch statement, i.e. let tdx_vm_ioctl() deal with kvm->lock.

Ok, and cpus_read_lock() can then go back to where it was in
__tdx_release_hkid().

However, __tdx_release_hkid() also has:

	if (KVM_BUG_ON(refcount_read(&kvm->users_count) && !terminate, kvm))
		return;

That check is not correct, because __tdx_td_init() calls
tdx_mmu_release_hkid() on its error path, where the VM still has users.
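
To illustrate the problem, here is a minimal userspace model, not kernel
code. All model_* names are hypothetical, and it assumes that
tdx_mmu_release_hkid() ends up in __tdx_release_hkid() with terminate == false.
It only mirrors the predicate in the check above:

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the few bits of VM state that matter. */
struct model_kvm {
	int users_count;	/* models refcount_read(&kvm->users_count) */
	bool hkid_assigned;	/* true until the HKID is released */
	bool bugged;		/* set where KVM_BUG_ON() would fire */
};

/* Models the check quoted above from __tdx_release_hkid(). */
static void model_release_hkid(struct model_kvm *kvm, bool terminate)
{
	if (kvm->users_count && !terminate) {
		kvm->bugged = true;
		return;
	}
	kvm->hkid_assigned = false;
}

/* VM destruction: the last reference is gone, so the check passes. */
static void model_vm_destroy(struct model_kvm *kvm)
{
	kvm->users_count = 0;
	model_release_hkid(kvm, false);
}

/* KVM_TDX_TERMINATE_VM: references remain, but terminate == true. */
static void model_terminate_vm(struct model_kvm *kvm)
{
	model_release_hkid(kvm, true);
}

/*
 * __tdx_td_init() error path: runs inside an ioctl, so the VM fd still
 * holds a reference. Assumption: terminate is false on this path.
 */
static void model_td_init_error(struct model_kvm *kvm)
{
	model_release_hkid(kvm, false);
}
```

In this model, calling model_td_init_error() on a VM with users_count == 1
marks the VM as bugged and leaves the HKID assigned, whereas the destroy and
terminate paths release it cleanly.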
>
>> +	cpus_read_lock();
>> +
>> +	if (!kvm_trylock_all_vcpus(kvm)) {
>> +		r = -EBUSY;
>> +		goto out;
>> +	}
>> +
>> +	kvm_vm_dead(kvm);
>> +	kvm_unlock_all_vcpus(kvm);
>> +
>> +	__tdx_release_hkid(kvm, true);
>> +out:
>> +	cpus_read_unlock();
>> +	return r;
>> +}
>> +
>> int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
>> {
>> 	struct kvm_tdx_cmd tdx_cmd;
>> @@ -2805,6 +2827,9 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
>> 	if (tdx_cmd.hw_error)
>> 		return -EINVAL;
>>
>> +	if (tdx_cmd.id == KVM_TDX_TERMINATE_VM)
>> +		return tdx_terminate_vm(kvm);
>> +
>> 	mutex_lock(&kvm->lock);
>>
>> 	switch (tdx_cmd.id) {
>> --
>> 2.43.0
>>