From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [GIT PULL] KVM: x86: VMX changes for 6.17
Date: Tue, 29 Jul 2025 12:44:17 -0700
Message-ID: <aIkkkaqTbc9vG_x3@google.com>
In-Reply-To: <CABgObfZWvtskg-m94LRHqN=_FtJpFtTzOi3sEhiAKZx1rzr=ng@mail.gmail.com>
On Mon, Jul 28, 2025, Paolo Bonzini wrote:
> On Sat, Jul 26, 2025 at 12:07 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > Add a sub-ioctl to allow getting TDX VMs into TEARDOWN before the last reference
> > to the VM is put, so that reclaiming the VM's memory doesn't have to jump
> > through all the hoops needed to reclaim memory from a live TD, which are quite
> > costly, especially for large VMs.
> >
> > The following changes since commit 347e9f5043c89695b01e66b3ed111755afcf1911:
> >
> > Linux 6.16-rc6 (2025-07-13 14:25:58 -0700)
> >
> > are available in the Git repository at:
> >
> > https://github.com/kvm-x86/linux.git tags/kvm-x86-vmx-6.17
> >
> > for you to fetch changes up to dcab95e533642d8f733e2562b8bfa5715541e0cf:
> >
> > KVM: TDX: Add sub-ioctl KVM_TDX_TERMINATE_VM (2025-07-21 16:23:02 -0700)
>
> I haven't pulled this for now because I wonder if it's better to make
> this a general-purpose ioctl and cap (plus a kvm_x86_ops hook). The
> faster teardown is a TDX module quirk, but for example would it be
> useful if you could trigger kvm_vm_dead() in the selftests?
I'm leaning "no" (leaning is probably an understatement).
Mainly because I think the current behavior of vm_dead is a mistake. Rejecting
all ioctls if kvm->vm_dead is true sounds nice on paper, but in practice it gives
us a false sense of security due to the check happening before acquiring kvm->lock,
e.g. see the SEV-ES migration bug found by syzbot.
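To illustrate the ordering problem described above, here is a minimal userspace sketch in C. It is not KVM code: struct toy_vm, toy_ioctl_check_then_lock(), toy_ioctl_lock_then_check() and the "window" hook are hypothetical names, a pthread mutex stands in for kvm->lock, and the hook deterministically simulates another task marking the VM dead between the check and the lock. The -EIO return value is only meant to mirror the general idea of rejecting the call.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a VM object; field names are illustrative only. */
struct toy_vm {
	pthread_mutex_t lock;	/* plays the role of kvm->lock */
	bool dead;		/* plays the role of kvm->vm_dead */
	int ops_on_dead_vm;	/* how often an ioctl body ran on a dead VM */
};

/* Hook that simulates what another task does inside the race window. */
typedef void (*race_window_fn)(struct toy_vm *vm);

static void toy_vm_mark_dead(struct toy_vm *vm)
{
	pthread_mutex_lock(&vm->lock);
	vm->dead = true;
	pthread_mutex_unlock(&vm->lock);
}

/* Racy ordering: the dead check happens before the lock is taken. */
static int toy_ioctl_check_then_lock(struct toy_vm *vm, race_window_fn window)
{
	assert(vm != NULL);

	if (vm->dead)			/* unlocked check */
		return -EIO;

	if (window)
		window(vm);		/* VM may be killed right here */

	pthread_mutex_lock(&vm->lock);
	if (vm->dead)
		vm->ops_on_dead_vm++;	/* the body still runs on a dead VM */
	pthread_mutex_unlock(&vm->lock);
	return 0;
}

/* Accurate ordering: the dead check happens under the lock. */
static int toy_ioctl_lock_then_check(struct toy_vm *vm, race_window_fn window)
{
	assert(vm != NULL);

	if (window)
		window(vm);		/* same interleaving as above */

	pthread_mutex_lock(&vm->lock);
	if (vm->dead) {
		pthread_mutex_unlock(&vm->lock);
		return -EIO;		/* the body never runs */
	}
	/* ... the ioctl body would run here, with the VM known to be alive ... */
	pthread_mutex_unlock(&vm->lock);
	return 0;
}
```

With the same interleaving, the first variant reports success even though the VM died before the body ran, while the second rejects the call. The sketch only covers paths that take the lock; it says nothing about ioctls that deliberately avoid it.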
Enforcing vm_dead with 100% accuracy would be painful given that there are ioctls
that deliberately avoid kvm->lock (vCPU ioctls could simply check KVM_REQ_VM_DEAD),
and I'm not at all convinced that truly making the VM off-limits is actually
desirable. E.g. it prevents quickly freeing resources by nuking memslots.
I do think it makes sense to reject ioctls if vm_bugged is set, because vm_bugged
is all about limiting the damage when something has already gone wrong, i.e.
providing any kind of ABI is very much a non-goal.
And if the vm_dead behavior is gone, I don't think a generic KVM_TERMINATE_VM
adds much, if any, value. Blocking KVM_RUN isn't terribly interesting, because
VMMs can already accomplish that with signals+immediate_exit, and AFAIK, real-world
use cases don't have problems with KVM_RUN being called at unexpected times.
One thing that we've discussed internally (though not in much depth) is a way to
block accesses to guest memory, e.g. to guard against accesses to guest memory
while saving vCPU state during live migration, when the VMM might expect that
guest memory is frozen, i.e. can't be dirtied. But we wouldn't want to terminate
the VM in that case, e.g. so that the VM could be resumed if the migration is
aborted at the last minute.
So I think we want something more along the lines of KVM_PAUSE_VM, with specific
semantics and guarantees.
As for this pull request, I vote to drop it for 6.17 and give ourselves time to
figure out what we want to do with vm_dead. I want to land "terminate VM" in
some form by 6.18 (as the next LTS), but AFAIK there's no rush to get it into
6.17.
I posted a series with a slightly modified version of the KVM_TDX_TERMINATE_VM
patch[1] to show where I think we should go. We discussed the topic in v4 of the
KVM_TDX_TERMINATE_VM patch[2], but I opted to post it separately (at the time)
because there wasn't a strict dependency.
[1] https://lore.kernel.org/all/20250729193341.621487-1-seanjc@google.com
[2] https://lore.kernel.org/all/aFNa7L74tjztduT-@google.com
> As a side effect it would remove the supported_caps field and separate
> namespace for KVM_TDX_CAP_* capabilities, at least for now.