From: Sean Christopherson <seanjc@google.com>
To: Alexander Graf <graf@amazon.com>
Cc: "Alex Williamson" <alex.williamson@redhat.com>,
"Radim Krčmář" <rkrcmar@redhat.com>,
kvm@vger.kernel.org, "Xiao Guangrong" <guangrong.xiao@gmail.com>,
"Chandrasekaran, Siddharth" <sidcha@amazon.de>,
"Paolo Bonzini" <pbonzini@redhat.com>
Subject: Re: [PATCH v2 11/27] KVM: x86/mmu: Zap only the relevant pages when removing a memslot
Date: Mon, 24 Oct 2022 15:55:55 +0000
Message-ID: <Y1a1i9vbJ/pVmV9r@google.com>
In-Reply-To: <490509f6-ae1a-4fc8-42a1-b037d6bffada@amazon.com>
On Mon, Oct 24, 2022, Alexander Graf wrote:
> Hey Sean,
>
> On 21.10.22 21:40, Sean Christopherson wrote:
> >
> > On Thu, Oct 20, 2022, Alexander Graf wrote:
> > > On 20.10.22 22:37, Sean Christopherson wrote:
> > > > On Thu, Oct 20, 2022, Alexander Graf wrote:
> > > > > On 26.06.20 19:32, Sean Christopherson wrote:
> > > > > > /cast <thread necromancy>
> > > > > >
> > > > > > On Tue, Aug 20, 2019 at 01:03:19PM -0700, Sean Christopherson wrote:
> > > > > [...]
> > > > >
> > > > > > I don't think any of this explains the pass-through GPU issue. But, we
> > > > > > have a few use cases where zapping the entire MMU is undesirable, so I'm
> > > > > > going to retry upstreaming this patch as with per-VM opt-in. I wanted to
> > > > > > set the record straight for posterity before doing so.
> > > > > Hey Sean,
> > > > >
> > > > > Did you ever get around to upstream or rework the zap optimization? The way
> > > > > I read current upstream, a memslot change still always wipes all SPTEs, not
> > > > > only the ones that were changed.
> > > > Nope, I've more or less given up hope on zapping only the deleted/moved memslot.
> > > > TDX (and SNP?) will preserve SPTEs for guest private memory, but they're very
> > > > much a special case.
> > > >
> > > > Do you have use case and/or issue that doesn't play nice with the "zap all" behavior?
> > >
> > > Yeah, we're looking at adding support for the Hyper-V VSM extensions which
> > > Windows uses to implement Credential Guard. With that, the guest gets access
> > > to hypercalls that allow it to set reduced permissions for arbitrary gfns.
> > > To ensure that user space has full visibility into those for live migration,
> > > memory slots to model access would be a great fit. But it means we'd do
> > > ~100k memslot modifications on boot.
> > Oof. 100k memslot updates is going to be painful irrespective of flushing. And
> > memslots (in their current form) won't work if the guest can drop executable
> > permissions.
> >
> > Assuming KVM needs to support a KVM_MEM_NO_EXEC flag, rather than trying to solve
> > the "KVM flushes everything on memslot deletion", I think we should instead
> > properly support toggling KVM_MEM_READONLY (and KVM_MEM_NO_EXEC) without forcing
> > userspace to delete the memslot. Commit 75d61fbcf563 ("KVM: set_memory_region:
>
>
> That would be a cute acceleration for the case where we have to change
> permissions for a full slot. Unfortunately, the bulk of the changes are slot
> splits.
Ah, right, the guest will be operating on per-page granularity.
> We already built a prototype implementation of an atomic memslot update
> ioctl that allows us to keep other vCPUs running while we do the
> delete/create/create/create operation.
Please weigh in with your use case on a relevant upstream discussion regarding
"atomic" memslot updates[*]. I suspect we'll end up with a different solution
for this use case (see below), but we should at least capture all potential use
cases and ideas for modifying memslots without pausing vCPUs.
[*] https://lore.kernel.org/all/20220909104506.738478-1-eesposit@redhat.com
> But even with that, we see up to 30 min boot times for larger guests that
> most of the time are stuck in zapping pages.
Out of curiosity, did you measure runtime performance? I would expect some amount
of runtime overhead as well due to fragmenting memslots to that degree.
> I guess we have 2 options to make this viable:
>
> 1) Optimize memslot splits + modifications to a point where they're fast
> enough
> 2) Add a different, faster mechanism on top of memslots for page granular
> permission bits
#2 crossed my mind as well. This is actually nearly identical to the confidential
VM use case, where KVM needs to handle guest-initiated conversions of memory between
"private" and "shared" on a per-page granularity. The proposed solution for that
is indeed a layer on top of memslots[*], which we arrived at in no small part because
splitting memslots was going to be a bottleneck.
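
To make the splitting cost concrete, here is a toy userspace model (not KVM
code; struct slot and slot_split() are hypothetical names for illustration
only) of what a per-page permission change does to a single slot: the old slot
goes away and up to three new ones replace it, which is the
delete/create/create/create pattern mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a memslot. */
struct slot {
	uint64_t base_gfn;
	uint64_t npages;
	uint32_t flags;
};

/*
 * Split *old around [gfn, gfn + npages) and give the middle piece new_flags.
 * Writes the resulting pieces to out[] and returns how many there are (1..3).
 */
static int slot_split(const struct slot *old, uint64_t gfn, uint64_t npages,
		      uint32_t new_flags, struct slot out[3])
{
	uint64_t end = old->base_gfn + old->npages;
	int n = 0;

	/* The range must be non-empty and lie entirely inside the old slot. */
	assert(npages > 0 && gfn >= old->base_gfn && gfn <= end &&
	       npages <= end - gfn);

	/* Head: pages before the range keep the old flags. */
	if (gfn > old->base_gfn)
		out[n++] = (struct slot){ old->base_gfn, gfn - old->base_gfn,
					  old->flags };

	/* Middle: the pages whose permissions changed. */
	out[n++] = (struct slot){ gfn, npages, new_flags };

	/* Tail: pages after the range keep the old flags. */
	if (gfn + npages < end)
		out[n++] = (struct slot){ gfn + npages, end - (gfn + npages),
					  old->flags };

	return n;
}
```

Changing one page in the middle of a slot therefore costs one deletion plus
three creations in this model, and each later change fragments the layout further.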

Extending the proposed mem_attr_array to support additional state should be quite
easy. The framework is all there; KVM just needs a few extra flag values, e.g.

	KVM_MEM_ATTR_SHARED	BIT(0)
	KVM_MEM_ATTR_READONLY	BIT(1)
	KVM_MEM_ATTR_NOEXEC	BIT(2)

and then new ioctls to expose the functionality to userspace. Actually, if we
want to go this route, it might even make sense to define a new generic MEM_ATTR
ioctl() right away instead of repurposing KVM_MEMORY_ENCRYPT_(UN)REG_REGION for
the private vs. shared use case.

[*] https://lore.kernel.org/all/20220915142913.2213336-6-chao.p.peng@linux.intel.com
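
As a rough sketch of the idea, and only that: the userspace toy below keeps one
attribute byte per gfn, so a per-page permission change is an array update
rather than a memslot split. The three flag names are the ones floated above;
everything else (struct mem_attr_table, the mem_attr_*() helpers, the flat
array layout) is an assumption made for illustration and is not the proposed
mem_attr_array implementation or any existing KVM interface.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define BIT(n)	(1u << (n))

/* Flag values as suggested in this mail; not an existing uAPI. */
#define KVM_MEM_ATTR_SHARED	BIT(0)
#define KVM_MEM_ATTR_READONLY	BIT(1)
#define KVM_MEM_ATTR_NOEXEC	BIT(2)

/* Hypothetical per-VM table: one attribute byte per guest frame number. */
struct mem_attr_table {
	uint64_t nr_gfns;
	uint8_t *attrs;
};

/* Allocate a zeroed table (no restrictions set); returns 0 on success. */
static int mem_attr_init(struct mem_attr_table *t, uint64_t nr_gfns)
{
	t->attrs = calloc(nr_gfns, sizeof(*t->attrs));
	if (!t->attrs)
		return -1;
	t->nr_gfns = nr_gfns;
	return 0;
}

static void mem_attr_free(struct mem_attr_table *t)
{
	free(t->attrs);
	t->attrs = NULL;
	t->nr_gfns = 0;
}

/* Overwrite the attributes of [start, start + npages). */
static void mem_attr_set_range(struct mem_attr_table *t, uint64_t start,
			       uint64_t npages, uint8_t attrs)
{
	assert(start <= t->nr_gfns && npages <= t->nr_gfns - start);

	for (uint64_t gfn = start; gfn < start + npages; gfn++)
		t->attrs[gfn] = attrs;
}

/* A fault handler would consult these in addition to the memslot flags. */
static int mem_attr_allows_write(const struct mem_attr_table *t, uint64_t gfn)
{
	assert(gfn < t->nr_gfns);
	return !(t->attrs[gfn] & KVM_MEM_ATTR_READONLY);
}

static int mem_attr_allows_exec(const struct mem_attr_table *t, uint64_t gfn)
{
	assert(gfn < t->nr_gfns);
	return !(t->attrs[gfn] & KVM_MEM_ATTR_NOEXEC);
}
```

In this model, ~100k guest-initiated permission changes are ~100k range updates
on the table while the memslot layout stays untouched. A real implementation
would still need to zap or update the affected SPTEs for each changed range,
which the sketch leaves out.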