From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Emanuele Giuseppe Esposito <eesposit@redhat.com>,
David Hildenbrand <david@redhat.com>,
Maxim Levitsky <mlevitsk@redhat.com>,
kvm@vger.kernel.org, Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
linux-kernel@vger.kernel.org, Like Xu <like.xu.linux@gmail.com>
Subject: Re: [RFC PATCH 0/9] kvm: implement atomic memslot updates
Date: Thu, 29 Sep 2022 21:39:07 +0000 [thread overview]
Message-ID: <YzYQe2Lc+l2KpLBl@google.com> (raw)
In-Reply-To: <d8d2bd39-cbb3-010d-266a-4e967765a382@redhat.com>
On Thu, Sep 29, 2022, Paolo Bonzini wrote:
>
> On 9/28/22 22:41, Sean Christopherson wrote:
> If you think we should overhaul it even more than just providing atomic
> batched updates, that's fine.
Or maybe solve the problem(s) with more precision? What I don't like about the
batching is that it's not "atomic", and AFAICT doesn't have line of sight to truly
becoming atomic. When I hear "atomic" updates, I think "all transactions are
commited at once". That is not what this proposed API does, and due to the TLB
flushing requirements, I don't see how it can be implemented.
> But still, the impossibility to perform
> atomic updates in batches *is* a suboptimal part of the KVM API.
I don't disagree, but I honestly don't see the difference between "batching" and
providing a KVM_MEM_USER_EXIT flag. The batch API will provide better performance,
but I would be surprised if it's significant enough to matter. I'm not saying
that KVM_MEM_USER_EXIT provides novel functionality, e.g. I think we're in
agreement that handling memslot metadata will suck regardless. My point is that
it's simpler to implement in KVM, much easier to document, and for all intents and
purposes provides the same desired functionality: vCPUs that access in-flux memslots
are stalled and so userspace doesn't have to pause and rendezvous all vCPUs to
provide "atomic" updates.
> The main cases are:
>
> - for the boot case, splitting and merging existing memslots. QEMU likes to
> merge adjacent memory regions into a single memslot, so if something goes
> from read-write to read-only it has to be split and vice versa. I guess a
> "don't merge this memory region" flag would be the less hideous way to solve
> it in userspace.
>
> - however, there is also the case of resizing an existing memslot which is
> what David would like to have for virtio-mem. This is not really fixable
> because part of the appeal of virtio-mem is to have a single huge memslot
> instead of many smaller ones, in order to reduce the granularity of
> add/remove (David, correct me if I'm wrong).
>
> (The latter _would_ be needed by other VMMs).
If we really want to provide a better experience for userspace, why not provide
more primitives to address those specific use cases? E.g. fix KVM's historic wart
of disallowing toggling of KVM_MEM_READONLY, and then add one or more ioctls to:
- Merge 2+ memory regions that are contiguous in GPA and HVA
- Split a memory region into 2+ memory regions
- Truncate a memory region
- Grow a memory region
That would probably require more KVM code overall, but each operation would be more
tightly bounded and thus simpler to define. And I think more precise APIs would
provide other benefits, e.g. growing a region wouldn't require first deleting the
current region, and so could avoid zapping SPTEs and destroying metadata. Merge,
split, and truncate would have similar benefits, though they might be more
difficult to realize in practice.
What I really don't like about the "batch" API, aside from the "atomic" name, is
that it's too open ended and creates conflicts. E.g. to allow "atomic" merging
after converting from RO=>RW, KVM needs to allow DELETE=>DELETE=>CREATE, and that
implies that each operation in the batch operates on the working state created by
the previous operations.
But if userspace provides a batch that does DELETE=>CREATE=>DELETE, naively hoisting
all "invalidations" to the front half of the "atomic" operations breaks the
ordering. And there's a ridiculous amount of potential weirdness with MOVE=>MOVE
operations.
KVM could solve those issues by disallowing MOVE and disallowing DELETE after
CREATE or FLAGS_ONLY, but then userspace is effectively restricted to operating
on a single contiguous chunk of memslots. At that point, it seems that providing
primitives to do each macro operation is an even better experience for userspace.
In other words, make memslots a bit more CISC ;-)
next prev parent reply other threads:[~2022-09-29 21:39 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-09-09 10:44 [RFC PATCH 0/9] kvm: implement atomic memslot updates Emanuele Giuseppe Esposito
2022-09-09 10:44 ` [RFC PATCH 1/9] kvm_main.c: move slot check in kvm_set_memory_region Emanuele Giuseppe Esposito
2022-09-28 16:41 ` Paolo Bonzini
2022-09-09 10:44 ` [RFC PATCH 2/9] kvm.h: introduce KVM_SET_USER_MEMORY_REGION_LIST ioctl Emanuele Giuseppe Esposito
2022-09-28 16:42 ` Paolo Bonzini
2022-09-09 10:45 ` [RFC PATCH 3/9] kvm_main.c: introduce kvm_internal_memory_region_list Emanuele Giuseppe Esposito
2022-09-28 16:48 ` Paolo Bonzini
2022-09-09 10:45 ` [RFC PATCH 4/9] kvm_main.c: split logic in kvm_set_memslots Emanuele Giuseppe Esposito
2022-09-28 17:04 ` Paolo Bonzini
2022-09-09 10:45 ` [RFC PATCH 5/9] kvm_main.c: split __kvm_set_memory_region logic in kvm_check_mem and kvm_prepare_batch Emanuele Giuseppe Esposito
2022-09-13 2:56 ` Yang, Weijiang
2022-09-18 16:22 ` Emanuele Giuseppe Esposito
2022-09-28 17:11 ` Paolo Bonzini
2022-09-09 10:45 ` [RFC PATCH 6/9] kvm_main.c: simplify change-specific callbacks Emanuele Giuseppe Esposito
2022-09-09 10:45 ` [RFC PATCH 7/9] kvm_main.c: duplicate invalid memslot also in inactive list Emanuele Giuseppe Esposito
2022-09-28 17:18 ` Paolo Bonzini
2022-09-09 10:45 ` [RFC PATCH 8/9] kvm_main.c: find memslots from the inactive memslot list Emanuele Giuseppe Esposito
2022-09-09 10:45 ` [RFC PATCH 9/9] kvm_main.c: handle atomic memslot update Emanuele Giuseppe Esposito
2022-09-13 2:30 ` Yang, Weijiang
2022-09-18 16:18 ` Emanuele Giuseppe Esposito
2022-09-27 7:46 ` David Hildenbrand
2022-09-27 8:35 ` Emanuele Giuseppe Esposito
2022-09-27 9:22 ` David Hildenbrand
2022-09-27 9:32 ` Emanuele Giuseppe Esposito
2022-09-27 14:52 ` David Hildenbrand
2022-09-28 17:29 ` Paolo Bonzini
2022-09-09 14:30 ` [RFC PATCH 0/9] kvm: implement atomic memslot updates Sean Christopherson
2022-09-18 16:13 ` Emanuele Giuseppe Esposito
2022-09-19 7:38 ` Like Xu
2022-09-19 7:53 ` David Hildenbrand
2022-09-19 17:30 ` David Hildenbrand
2022-09-23 13:10 ` Emanuele Giuseppe Esposito
2022-09-23 13:21 ` David Hildenbrand
2022-09-23 13:38 ` Emanuele Giuseppe Esposito
2022-09-26 9:03 ` David Hildenbrand
2022-09-26 21:28 ` Sean Christopherson
2022-09-27 7:38 ` Emanuele Giuseppe Esposito
2022-09-27 15:58 ` Sean Christopherson
2022-09-28 9:11 ` Emanuele Giuseppe Esposito
2022-09-28 11:14 ` Maxim Levitsky
2022-09-28 12:52 ` David Hildenbrand
2022-09-28 15:07 ` Paolo Bonzini
2022-09-28 15:33 ` David Hildenbrand
2022-09-28 15:58 ` Sean Christopherson
2022-09-28 16:38 ` Paolo Bonzini
2022-09-28 20:41 ` Sean Christopherson
2022-09-29 8:05 ` Emanuele Giuseppe Esposito
2022-09-29 8:24 ` David Hildenbrand
2022-09-29 15:18 ` Sean Christopherson
2022-09-29 15:41 ` Paolo Bonzini
2022-09-29 15:28 ` Paolo Bonzini
2022-09-29 15:40 ` Maxim Levitsky
2022-09-29 16:00 ` David Hildenbrand
2022-09-29 21:39 ` Sean Christopherson [this message]
2022-10-13 7:43 ` Emanuele Giuseppe Esposito
2022-10-13 8:44 ` David Hildenbrand
2022-10-13 11:12 ` Emanuele Giuseppe Esposito
2022-10-13 14:45 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YzYQe2Lc+l2KpLBl@google.com \
--to=seanjc@google.com \
--cc=bp@alien8.de \
--cc=dave.hansen@linux.intel.com \
--cc=david@redhat.com \
--cc=eesposit@redhat.com \
--cc=hpa@zytor.com \
--cc=jmattson@google.com \
--cc=joro@8bytes.org \
--cc=kvm@vger.kernel.org \
--cc=like.xu.linux@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=mlevitsk@redhat.com \
--cc=pbonzini@redhat.com \
--cc=tglx@linutronix.de \
--cc=vkuznets@redhat.com \
--cc=wanpengli@tencent.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox