From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>,
Emanuele Giuseppe Esposito <eesposit@redhat.com>,
kvm@vger.kernel.org, Jonathan Corbet <corbet@lwn.net>,
Maxim Levitsky <mlevitsk@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
David Hildenbrand <david@redhat.com>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Hyper-V VTLs, permission bitmaps and userspace exits (was Re: [PATCH 0/4] KVM: API to block and resume all running vcpus in a vm)
Date: Wed, 26 Oct 2022 19:33:17 +0000 [thread overview]
Message-ID: <Y1mLfRRKcF5OWZTG@google.com> (raw)
In-Reply-To: <d3e2dd2b-9520-32ef-6785-94164a834adf@redhat.com>
On Wed, Oct 26, 2022, Paolo Bonzini wrote:
> On 10/26/22 01:07, Sean Christopherson wrote:
> > > > - to stop anything else in the system that consumes KVM memslots, e.g. KVM GT
> > >
> > > Is this true if you only look at the KVM_GET_DIRTY_LOG case and consider it
> > > a guest bug to access the memory (i.e. ignore the strange read-only changes
> > > which only happen at boot, and which I agree are QEMU-specific)?
> >
> > Yes? I don't know exactly what "the KVM_GET_DIRTY_LOG case" is.
>
> It is not possible to atomically read the dirty bitmap and delete a memslot.
> When you delete a memslot, the bitmap is gone. In this case however memory
> accesses to the deleted memslot are a guest bug, so stopping KVM-GT would
> not be necessary.
If accesses to the deleted memslot are a guest bug, why do you care about pausing
vCPUs? I don't mean to be beligerent, I'm genuinely confused.
> So while I'm being slowly convinced that QEMU should find a way to pause its
> vCPUs around memslot changes, I'm not sure that pausing everything is needed
> in general.
>
> > > > And because of the nature of KVM, to support this API on all architectures, KVM
> > > > needs to make change on all architectures, whereas userspace should be able to
> > > > implement a generic solution.
> > >
> > > Yes, I agree that this is essentially just a more efficient kill().
> > > Emanuele, perhaps you can put together a patch to x86/vmexit.c in
> > > kvm-unit-tests, where CPU0 keeps changing memslots and the other CPUs are in
> > > a for(;;) busy wait, to measure the various ways to do it?
> >
> > I'm a bit confused. Is the goal of this to simplify QEMU, dedup VMM code, provide
> > a more performant solution, something else entirely?
>
> Well, a bit of all of them and perhaps that's the problem. And while the
> issues at hand *are* self-inflicted wounds on part of QEMU, it seems to me
> that the underlying issues are general.
>
> For example, Alex Graf and I looked back at your proposal of a userspace
> exit for "bad" accesses to memory, wondering if it could help with Hyper-V
> VTLs too. To recap, the "higher privileged" code at VTL1 can set up VM-wide
> restrictions on access to some pages through a hypercall
> (HvModifyVtlProtectionMask). After the hypercall, VTL0 would not be able to
> access those pages. The hypercall would be handled in userspace and would
> invoke a KVM_SET_MEMORY_REGION_PERM ioctl to restrict the RWX permissions,
> and this ioctl would set up a VM-wide permission bitmap that would be used
> when building page tables.
>
> Using such a bitmap instead of memslots makes it possible to cause userspace
> vmexits on VTL mapping violations with efficient data structures. And it
> would also be possible to use this mechanism around KVM_GET_DIRTY_LOG, to
> read the KVM dirty bitmap just before removing a memslot.
What exactly is the behavior you're trying to achieve for KVM_GET_DIRTY_LOG => delete?
If KVM provides KVM_EXIT_MEMORY_FAULT, can you not achieve the desired behavior by
doing mprotect(PROT_NONE) => KVM_GET_DIRTY_LOG => delete? If PROT_NONE causes the
memory to be freed, won't mprotect(PROT_READ) do what you want even without
KVM_EXIT_MEMORY_FAULT?
> However, external accesses to the regions (ITS, Xen, KVM-GT, non KVM_RUN
> ioctls) would not be blocked, due to the lack of a way to report the exit.
Aren't all of those out of scope? E.g. in a very hypothetical world where XEN's
event channel is being used with VTLs, if VTL1 makes the event channel inaccessible,
that's a guest and/or userspace configuration issue and the guest is hosed no matter
what KVM does. Ditto for these case where KVM-GT's buffer is blocked. I'm guessing
the ITS is similar?
> The intersection of these features with VTLs should be very small (sometimes
> zero since VTLs are x86 only), but the ioctls would be a problem so I'm
> wondering what your thoughts are on this.
How do the ioctls() map to VTLs? I.e. are they considered VTL0, VTL1, out-of-band?
> Also, while the exit API could be the same, it is not clear to me that the
> permission bitmap would be a good match for entirely "void" memslots used to
> work around non-atomic memslot changes. So for now let's leave this aside
> and only consider the KVM_GET_DIRTY_LOG case.
As above, can't userspace just mprotect() the entire memslot to prevent writes
between getting the dirty log and deleting the memslot?
prev parent reply other threads:[~2022-10-26 19:33 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-22 15:48 [PATCH 0/4] KVM: API to block and resume all running vcpus in a vm Emanuele Giuseppe Esposito
2022-10-22 15:48 ` [PATCH 1/4] linux-headers/linux/kvm.h: introduce kvm_userspace_memory_region_list ioctl Emanuele Giuseppe Esposito
2022-10-22 15:48 ` [PATCH 2/4] KVM: introduce kvm_clear_all_cpus_request Emanuele Giuseppe Esposito
2022-10-22 15:48 ` [PATCH 3/4] KVM: introduce memory transaction semaphore Emanuele Giuseppe Esposito
2022-10-23 17:50 ` Paolo Bonzini
2022-10-24 12:57 ` Emanuele Giuseppe Esposito
2022-10-25 10:01 ` Paolo Bonzini
2022-10-22 15:48 ` [PATCH 4/4] KVM: use signals to abort enter_guest/blocking and retry Emanuele Giuseppe Esposito
2022-10-23 17:48 ` Paolo Bonzini
2022-10-24 7:43 ` Emanuele Giuseppe Esposito
2022-10-24 7:49 ` Emanuele Giuseppe Esposito
2022-10-25 10:05 ` Paolo Bonzini
2022-10-24 7:56 ` [PATCH 0/4] KVM: API to block and resume all running vcpus in a vm Christian Borntraeger
2022-10-24 8:33 ` Emanuele Giuseppe Esposito
2022-10-24 9:09 ` Christian Borntraeger
2022-10-24 22:45 ` Sean Christopherson
2022-10-25 9:33 ` Paolo Bonzini
2022-10-25 15:55 ` Sean Christopherson
2022-10-25 21:34 ` Paolo Bonzini
2022-10-25 23:07 ` Sean Christopherson
2022-10-26 17:52 ` Hyper-V VTLs, permission bitmaps and userspace exits (was Re: [PATCH 0/4] KVM: API to block and resume all running vcpus in a vm) Paolo Bonzini
2022-10-26 19:33 ` Sean Christopherson [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y1mLfRRKcF5OWZTG@google.com \
--to=seanjc@google.com \
--cc=borntraeger@linux.ibm.com \
--cc=bp@alien8.de \
--cc=corbet@lwn.net \
--cc=dave.hansen@linux.intel.com \
--cc=david@redhat.com \
--cc=eesposit@redhat.com \
--cc=hpa@zytor.com \
--cc=kvm@vger.kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=mlevitsk@redhat.com \
--cc=pbonzini@redhat.com \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.