From: Anish Moorthy <amoorthy@google.com>
To: seanjc@google.com, oliver.upton@linux.dev, kvm@vger.kernel.org,
kvmarm@lists.linux.dev
Cc: pbonzini@redhat.com, maz@kernel.org, robert.hoo.linux@gmail.com,
jthoughton@google.com, amoorthy@google.com, bgardon@google.com,
dmatlack@google.com, ricarkol@google.com,
axelrasmussen@google.com, peterx@redhat.com,
nadav.amit@gmail.com, isaku.yamahata@gmail.com
Subject: [PATCH v4 09/16] KVM: Introduce KVM_CAP_NOWAIT_ON_FAULT without implementation
Date: Fri, 2 Jun 2023 16:19:14 +0000 [thread overview]
Message-ID: <20230602161921.208564-10-amoorthy@google.com> (raw)
In-Reply-To: <20230602161921.208564-1-amoorthy@google.com>
Add documentation, memslot flags, useful helper functions, and the
actual new capability itself.
Memory fault exits on absent mappings are particularly useful for
userfaultfd-based postcopy live migration. When many vCPUs fault on a
single userfaultfd the faults can take a while to surface to userspace
due to having to contend for uffd wait queue locks. Bypassing the uffd
entirely by returning information directly to the vCPU exit avoids this
contention and improves the fault rate.
Suggested-by: James Houghton <jthoughton@google.com>
Signed-off-by: Anish Moorthy <amoorthy@google.com>
---
Documentation/virt/kvm/api.rst | 32 +++++++++++++++++++++++++++++---
include/linux/kvm_host.h | 6 ++++++
include/uapi/linux/kvm.h | 2 ++
tools/include/uapi/linux/kvm.h | 1 +
virt/kvm/kvm_main.c | 3 +++
5 files changed, 41 insertions(+), 3 deletions(-)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 5b24059143b3..9daadbe2c7ed 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1312,6 +1312,7 @@ yet and must be cleared on entry.
/* for kvm_userspace_memory_region::flags */
#define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
#define KVM_MEM_READONLY (1UL << 1)
+ #define KVM_MEM_NOWAIT_ON_FAULT (1UL << 2)
This ioctl allows the user to create, modify or delete a guest physical
memory slot. Bits 0-15 of "slot" specify the slot id and this value
@@ -1342,12 +1343,15 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
be identical. This allows large pages in the guest to be backed by large
pages in the host.
-The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
-KVM_MEM_READONLY. The former can be set to instruct KVM to keep track of
+The flags field supports three flags
+
+1. KVM_MEM_LOG_DIRTY_PAGES: can be set to instruct KVM to keep track of
writes to memory within the slot. See KVM_GET_DIRTY_LOG ioctl to know how to
-use it. The latter can be set, if KVM_CAP_READONLY_MEM capability allows it,
+use it.
+2. KVM_MEM_READONLY: can be set, if KVM_CAP_READONLY_MEM capability allows it,
to make a new slot read-only. In this case, writes to this memory will be
posted to userspace as KVM_EXIT_MMIO exits.
+3. KVM_MEM_NOWAIT_ON_FAULT: see KVM_CAP_NOWAIT_ON_FAULT for details.
When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
the memory region are automatically reflected into the guest. For example, an
@@ -7776,6 +7780,28 @@ userspace may receive "bare" EFAULTs (i.e. exit reason != KVM_EXIT_MEMORY_FAULT)
from KVM_RUN for failures which may be resolvable. These should be considered
bugs and reported to the maintainers so that annotations can be added.
+7.35 KVM_CAP_NOWAIT_ON_FAULT
+----------------------------
+
+:Architectures: None
+:Returns: -EINVAL.
+
+The presence of this capability indicates that userspace may pass the
+KVM_MEM_NOWAIT_ON_FAULT flag to KVM_SET_USER_MEMORY_REGION to cause KVM_RUN
+to fail (-EFAULT) in response to page faults for which resolution would require
+the faulting thread to sleep.
+
+The range of guest physical memory causing the fault is advertised to userspace
+through KVM_CAP_MEMORY_FAULT_INFO.
+
+Userspace should determine how best to make the mapping present, then take
+appropriate action. For instance establishing the mapping could involve a
+MADV_POPULATE_READ|WRITE, in the context of userfaultfd a UFFDIO_COPY|CONTINUE
+could be appropriate, etc. After establishing the mapping, userspace can return
+to KVM to retry the memory access.
+
+Attempts to enable this capability directly will fail.
+
8. Other capabilities.
======================
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 69a221f71914..abbc5dd72292 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2297,4 +2297,10 @@ static inline void kvm_account_pgtable_pages(void *virt, int nr)
*/
inline void kvm_populate_efault_info(struct kvm_vcpu *vcpu,
uint64_t gpa, uint64_t len, uint64_t flags);
+
+static inline bool kvm_slot_nowait_on_fault(
+ const struct kvm_memory_slot *slot)
+{
+ return slot->flags & KVM_MEM_NOWAIT_ON_FAULT;
+}
#endif
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 143abb334f56..595c3d7d36aa 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -102,6 +102,7 @@ struct kvm_userspace_memory_region {
*/
#define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
#define KVM_MEM_READONLY (1UL << 1)
+#define KVM_MEM_NOWAIT_ON_FAULT (1UL << 2)
/* for KVM_IRQ_LINE */
struct kvm_irq_level {
@@ -1198,6 +1199,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_PMU_EVENT_MASKED_EVENTS 226
#define KVM_CAP_COUNTER_OFFSET 227
#define KVM_CAP_MEMORY_FAULT_INFO 228
+#define KVM_CAP_NOWAIT_ON_FAULT 229
#ifdef KVM_CAP_IRQ_ROUTING
diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
index 5476fe169921..f64845cd599f 100644
--- a/tools/include/uapi/linux/kvm.h
+++ b/tools/include/uapi/linux/kvm.h
@@ -102,6 +102,7 @@ struct kvm_userspace_memory_region {
*/
#define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
#define KVM_MEM_READONLY (1UL << 1)
+#define KVM_MEM_NOWAIT_ON_FAULT (1UL << 2)
/* for KVM_IRQ_LINE */
struct kvm_irq_level {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 05d6e7e3994d..2c276d4d0821 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1527,6 +1527,9 @@ static int check_memory_region_flags(const struct kvm_userspace_memory_region *m
valid_flags |= KVM_MEM_READONLY;
#endif
+ if (kvm_vm_ioctl_check_extension(NULL, KVM_CAP_NOWAIT_ON_FAULT))
+ valid_flags |= KVM_MEM_NOWAIT_ON_FAULT;
+
if (mem->flags & ~valid_flags)
return -EINVAL;
--
2.41.0.rc0.172.g3f132b7071-goog
next prev parent reply other threads:[~2023-06-02 16:20 UTC|newest]
Thread overview: 79+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-02 16:19 [PATCH v4 00/16] Improve scalability of KVM + userfaultfd live migration via annotated memory faults Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 01/16] KVM: Allow hva_pfn_fast() to resolve read-only faults Anish Moorthy
2023-06-14 14:39 ` Sean Christopherson
2023-06-14 16:57 ` Anish Moorthy
2023-08-10 19:54 ` Anish Moorthy
2023-08-10 23:48 ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 02/16] KVM: x86: Set vCPU exit reason to KVM_EXIT_UNKNOWN at the start of KVM_RUN Anish Moorthy
2023-06-02 20:30 ` Isaku Yamahata
2023-06-05 16:41 ` Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 03/16] KVM: Add KVM_CAP_MEMORY_FAULT_INFO Anish Moorthy
2023-06-03 16:58 ` Isaku Yamahata
2023-06-05 16:37 ` Anish Moorthy
2023-06-14 14:55 ` Sean Christopherson
2023-06-05 17:46 ` Anish Moorthy
2023-06-14 17:35 ` Sean Christopherson
2023-06-20 21:13 ` Anish Moorthy
2023-07-07 11:50 ` Kautuk Consul
2023-07-10 15:00 ` Anish Moorthy
2023-07-11 3:54 ` Kautuk Consul
2023-07-11 14:25 ` Sean Christopherson
2023-08-11 22:12 ` Anish Moorthy
2023-08-14 18:01 ` Sean Christopherson
2023-08-15 0:06 ` Anish Moorthy
2023-08-15 0:43 ` Sean Christopherson
2023-08-15 17:01 ` Anish Moorthy
2023-08-16 15:58 ` Sean Christopherson
2023-08-16 21:28 ` Anish Moorthy
2023-08-17 23:58 ` Sean Christopherson
2023-08-18 17:32 ` Anish Moorthy
2023-08-23 22:20 ` Sean Christopherson
2023-08-23 23:38 ` Anish Moorthy
2023-08-24 17:24 ` Sean Christopherson
2023-08-17 22:55 ` Anish Moorthy
2023-07-05 8:21 ` Kautuk Consul
2023-06-02 16:19 ` [PATCH v4 04/16] KVM: Add docstrings to __kvm_write_guest_page() and __kvm_read_guest_page() Anish Moorthy
2023-06-15 2:41 ` Robert Hoo
2023-08-14 22:51 ` Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 05/16] KVM: Annotate -EFAULTs from kvm_vcpu_write_guest_page() Anish Moorthy
2023-06-14 19:10 ` Sean Christopherson
2023-07-06 22:51 ` Anish Moorthy
2023-07-12 14:08 ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 06/16] KVM: Annotate -EFAULTs from kvm_vcpu_read_guest_page() Anish Moorthy
2023-06-14 19:22 ` Sean Christopherson
2023-07-07 17:35 ` Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 07/16] KVM: Simplify error handling in __gfn_to_pfn_memslot() Anish Moorthy
2023-06-14 19:26 ` Sean Christopherson
2023-07-07 17:33 ` Anish Moorthy
2023-07-10 17:40 ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 08/16] KVM: x86: Annotate -EFAULTs from kvm_handle_error_pfn() Anish Moorthy
2023-06-14 20:03 ` Sean Christopherson
2023-07-07 18:05 ` Anish Moorthy
2023-06-15 2:43 ` Robert Hoo
2023-06-15 14:40 ` Sean Christopherson
2023-06-02 16:19 ` Anish Moorthy [this message]
2023-06-14 20:11 ` [PATCH v4 09/16] KVM: Introduce KVM_CAP_NOWAIT_ON_FAULT without implementation Sean Christopherson
2023-07-06 19:04 ` Anish Moorthy
2023-06-14 21:20 ` Sean Christopherson
2023-06-14 21:23 ` Sean Christopherson
2023-08-23 21:17 ` Anish Moorthy
2023-06-15 3:55 ` Wang, Wei W
2023-06-15 14:56 ` Sean Christopherson
2023-06-16 12:08 ` Wang, Wei W
2023-07-07 18:13 ` Anish Moorthy
2023-07-07 20:07 ` Anish Moorthy
2023-07-11 15:29 ` Sean Christopherson
2023-08-25 0:15 ` Anish Moorthy
2023-08-29 22:41 ` Sean Christopherson
2023-08-30 16:21 ` Anish Moorthy
2023-09-07 21:17 ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 10/16] KVM: x86: Implement KVM_CAP_NOWAIT_ON_FAULT Anish Moorthy
2023-06-14 20:25 ` Sean Christopherson
2023-07-07 17:41 ` Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 11/16] KVM: arm64: " Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 12/16] KVM: selftests: Report per-vcpu demand paging rate from demand paging test Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 13/16] KVM: selftests: Allow many vCPUs and reader threads per UFFD in " Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 14/16] KVM: selftests: Use EPOLL in userfaultfd_util reader threads and signal errors via TEST_ASSERT Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 15/16] KVM: selftests: Add memslot_flags parameter to memstress_create_vm() Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 16/16] KVM: selftests: Handle memory fault exits in demand_paging_test Anish Moorthy
2023-06-20 2:44 ` Robert Hoo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230602161921.208564-10-amoorthy@google.com \
--to=amoorthy@google.com \
--cc=axelrasmussen@google.com \
--cc=bgardon@google.com \
--cc=dmatlack@google.com \
--cc=isaku.yamahata@gmail.com \
--cc=jthoughton@google.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=maz@kernel.org \
--cc=nadav.amit@gmail.com \
--cc=oliver.upton@linux.dev \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=ricarkol@google.com \
--cc=robert.hoo.linux@gmail.com \
--cc=seanjc@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.