kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Anish Moorthy <amoorthy@google.com>
To: seanjc@google.com, oliver.upton@linux.dev, kvm@vger.kernel.org,
	kvmarm@lists.linux.dev
Cc: pbonzini@redhat.com, maz@kernel.org, robert.hoo.linux@gmail.com,
	jthoughton@google.com, amoorthy@google.com, bgardon@google.com,
	dmatlack@google.com, ricarkol@google.com,
	axelrasmussen@google.com, peterx@redhat.com,
	nadav.amit@gmail.com, isaku.yamahata@gmail.com
Subject: [PATCH v4 03/16] KVM: Add KVM_CAP_MEMORY_FAULT_INFO
Date: Fri,  2 Jun 2023 16:19:08 +0000	[thread overview]
Message-ID: <20230602161921.208564-4-amoorthy@google.com> (raw)
In-Reply-To: <20230602161921.208564-1-amoorthy@google.com>

KVM_CAP_MEMORY_FAULT_INFO allows kvm_run to return useful information
besides a return value of -1 and errno of EFAULT when a vCPU fails an
access to guest memory which may be resolvable by userspace.

Add documentation, updates to the KVM headers, and a helper function
(kvm_populate_efault_info()) for implementing the capability.

Besides simply filling the run struct, kvm_populate_efault_info() takes
two safety measures

  a. It tries to prevent concurrent fills on a single vCPU run struct
     by checking that the run struct being modified corresponds to the
     currently loaded vCPU.
  b. It tries to avoid filling an already-populated run struct by
     checking whether the exit reason has been modified since entry
     into KVM_RUN.

Finally, mark KVM_CAP_MEMORY_FAULT_INFO as available on arm64 and x86,
even though EFAULT annotation are currently totally absent. Picking a
point to declare the implementation "done" is difficult because

  1. Annotations will be performed incrementally in subsequent commits
     across both core and arch-specific KVM.
  2. The initial series will very likely miss some cases which need
     annotation. Although these omissions are to be fixed in the future,
     userspace thus still needs to expect and be able to handle
     unannotated EFAULTs.

Given these qualifications, just marking it available here seems the
least arbitrary thing to do.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Anish Moorthy <amoorthy@google.com>
---
 Documentation/virt/kvm/api.rst             | 42 ++++++++++++++++++++++
 arch/arm64/kvm/arm.c                       |  1 +
 arch/x86/kvm/x86.c                         |  1 +
 include/linux/kvm_host.h                   |  9 +++++
 include/uapi/linux/kvm.h                   | 13 +++++++
 tools/include/uapi/linux/kvm.h             |  7 ++++
 tools/testing/selftests/kvm/lib/kvm_util.c |  1 +
 virt/kvm/kvm_main.c                        | 35 ++++++++++++++++++
 8 files changed, 109 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index add067793b90..5b24059143b3 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6700,6 +6700,18 @@ array field represents return values. The userspace should update the return
 values of SBI call before resuming the VCPU. For more details on RISC-V SBI
 spec refer, https://github.com/riscv/riscv-sbi-doc.
 
+::
+
+		/* KVM_EXIT_MEMORY_FAULT */
+		struct {
+			__u64 flags;
+			__u64 gpa;
+			__u64 len; /* in bytes */
+		} memory_fault;
+
+Indicates a vCPU memory fault on the guest physical address range
+[gpa, gpa + len). See KVM_CAP_MEMORY_FAULT_INFO for more details.
+
 ::
 
     /* KVM_EXIT_NOTIFY */
@@ -7734,6 +7746,36 @@ This capability is aimed to mitigate the threat that malicious VMs can
 cause CPU stuck (due to event windows don't open up) and make the CPU
 unavailable to host or other VMs.
 
+7.34 KVM_CAP_MEMORY_FAULT_INFO
+------------------------------
+
+:Architectures: x86, arm64
+:Returns: -EINVAL.
+
+The presence of this capability indicates that KVM_RUN may annotate EFAULTs
+returned by KVM_RUN in response to failed vCPU guest memory accesses which
+userspace may be able to resolve.
+
+The annotation is returned via the run struct. When KVM_RUN returns an error
+with errno=EFAULT, userspace may check the exit reason: if it is
+KVM_EXIT_MEMORY_FAULT, userspace is then permitted to read the run struct's
+'memory_fault' field.
+
+This capability is informational only: attempts to KVM_ENABLE_CAP it directly
+will fail.
+
+The 'gpa' and 'len' (in bytes) fields describe the range of guest
+physical memory to which access failed, i.e. [gpa, gpa + len). 'flags' is a
+bitfield indicating the nature of the access: valid masks are
+
+  - KVM_MEMORY_FAULT_FLAG_WRITE:     The failed access was a write.
+  - KVM_MEMORY_FAULT_FLAG_EXEC:      The failed access was an exec.
+
+NOTE: The implementation of this capability is incomplete. Even with it enabled,
+userspace may receive "bare" EFAULTs (i.e. exit reason != KVM_EXIT_MEMORY_FAULT)
+from KVM_RUN for failures which may be resolvable. These should be considered
+bugs and reported to the maintainers so that annotations can be added.
+
 8. Other capabilities.
 ======================
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 14391826241c..b34cf0cedffa 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -234,6 +234,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_ARM_SYSTEM_SUSPEND:
 	case KVM_CAP_IRQFD_RESAMPLE:
 	case KVM_CAP_COUNTER_OFFSET:
+	case KVM_CAP_MEMORY_FAULT_INFO:
 		r = 1;
 		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a7725d41570a..d15bacb3f634 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4497,6 +4497,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_ENABLE_CAP:
 	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
 	case KVM_CAP_IRQFD_RESAMPLE:
+	case KVM_CAP_MEMORY_FAULT_INFO:
 		r = 1;
 		break;
 	case KVM_CAP_EXIT_HYPERCALL:
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 0e571e973bc2..69a221f71914 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2288,4 +2288,13 @@ static inline void kvm_account_pgtable_pages(void *virt, int nr)
 /* Max number of entries allowed for each kvm dirty ring */
 #define  KVM_DIRTY_RING_MAX_ENTRIES  65536
 
+/*
+ * Attempts to set the run struct's exit reason to KVM_EXIT_MEMORY_FAULT and
+ * populate the memory_fault field with the given information.
+ *
+ * WARNs and does nothing if the exit reason is not KVM_EXIT_UNKNOWN, or if
+ * 'vcpu' is not the current running vcpu.
+ */
+inline void kvm_populate_efault_info(struct kvm_vcpu *vcpu,
+				     uint64_t gpa, uint64_t len, uint64_t flags);
 #endif
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 737318b1c1d9..143abb334f56 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -264,6 +264,7 @@ struct kvm_xen_exit {
 #define KVM_EXIT_RISCV_SBI        35
 #define KVM_EXIT_RISCV_CSR        36
 #define KVM_EXIT_NOTIFY           37
+#define KVM_EXIT_MEMORY_FAULT     38
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -510,6 +511,12 @@ struct kvm_run {
 #define KVM_NOTIFY_CONTEXT_INVALID	(1 << 0)
 			__u32 flags;
 		} notify;
+		/* KVM_EXIT_MEMORY_FAULT */
+		struct {
+			__u64 flags;
+			__u64 gpa;
+			__u64 len;
+		} memory_fault;
 		/* Fix the size of the union. */
 		char padding[256];
 	};
@@ -1190,6 +1197,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225
 #define KVM_CAP_PMU_EVENT_MASKED_EVENTS 226
 #define KVM_CAP_COUNTER_OFFSET 227
+#define KVM_CAP_MEMORY_FAULT_INFO 228
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -2245,4 +2253,9 @@ struct kvm_s390_zpci_op {
 /* flags for kvm_s390_zpci_op->u.reg_aen.flags */
 #define KVM_S390_ZPCIOP_REGAEN_HOST    (1 << 0)
 
+/* flags for KVM_CAP_MEMORY_FAULT_INFO */
+
+#define KVM_MEMORY_FAULT_FLAG_WRITE    (1 << 0)
+#define KVM_MEMORY_FAULT_FLAG_EXEC     (1 << 1)
+
 #endif /* __LINUX_KVM_H */
diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
index 4003a166328c..5476fe169921 100644
--- a/tools/include/uapi/linux/kvm.h
+++ b/tools/include/uapi/linux/kvm.h
@@ -264,6 +264,7 @@ struct kvm_xen_exit {
 #define KVM_EXIT_RISCV_SBI        35
 #define KVM_EXIT_RISCV_CSR        36
 #define KVM_EXIT_NOTIFY           37
+#define KVM_EXIT_MEMORY_FAULT     38
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -505,6 +506,12 @@ struct kvm_run {
 #define KVM_NOTIFY_CONTEXT_INVALID	(1 << 0)
 			__u32 flags;
 		} notify;
+		/* KVM_EXIT_MEMORY_FAULT */
+		struct {
+			__u64 flags;
+			__u64 gpa;
+			__u64 len;
+		} memory_fault;
 		/* Fix the size of the union. */
 		char padding[256];
 	};
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 298c4372fb1a..7d7e9f893fd5 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1868,6 +1868,7 @@ static struct exit_reason {
 #ifdef KVM_EXIT_MEMORY_NOT_PRESENT
 	KVM_EXIT_STRING(MEMORY_NOT_PRESENT),
 #endif
+	KVM_EXIT_STRING(MEMORY_FAULT),
 };
 
 /*
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index fd80a560378c..09d4d85691e1 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4674,6 +4674,9 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
 
 		return r;
 	}
+	case KVM_CAP_MEMORY_FAULT_INFO: {
+		return -EINVAL;
+	}
 	default:
 		return kvm_vm_ioctl_enable_cap(kvm, cap);
 	}
@@ -6173,3 +6176,35 @@ int kvm_vm_create_worker_thread(struct kvm *kvm, kvm_vm_thread_fn_t thread_fn,
 
 	return init_context.err;
 }
+
+inline void kvm_populate_efault_info(struct kvm_vcpu *vcpu,
+				     uint64_t gpa, uint64_t len, uint64_t flags)
+{
+	if (WARN_ON_ONCE(!vcpu))
+		return;
+
+	preempt_disable();
+	/*
+	 * Ensure the this vCPU isn't modifying another vCPU's run struct, which
+	 * would open the door for races between concurrent calls to this
+	 * function.
+	 */
+	if (WARN_ON_ONCE(vcpu != __this_cpu_read(kvm_running_vcpu)))
+		goto out;
+	/*
+	 * Try not to overwrite an already-populated run struct.
+	 * This isn't a perfect solution, as there's no guarantee that the exit
+	 * reason is set before the run struct is populated, but it should prevent
+	 * at least some bugs.
+	 */
+	else if (WARN_ON_ONCE(vcpu->run->exit_reason != KVM_EXIT_UNKNOWN))
+		goto out;
+
+	vcpu->run->exit_reason = KVM_EXIT_MEMORY_FAULT;
+	vcpu->run->memory_fault.gpa = gpa;
+	vcpu->run->memory_fault.len = len;
+	vcpu->run->memory_fault.flags = flags;
+
+out:
+	preempt_enable();
+}
-- 
2.41.0.rc0.172.g3f132b7071-goog


  parent reply	other threads:[~2023-06-02 16:20 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-02 16:19 [PATCH v4 00/16] Improve scalability of KVM + userfaultfd live migration via annotated memory faults Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 01/16] KVM: Allow hva_pfn_fast() to resolve read-only faults Anish Moorthy
2023-06-14 14:39   ` Sean Christopherson
2023-06-14 16:57     ` Anish Moorthy
2023-08-10 19:54       ` Anish Moorthy
2023-08-10 23:48         ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 02/16] KVM: x86: Set vCPU exit reason to KVM_EXIT_UNKNOWN at the start of KVM_RUN Anish Moorthy
2023-06-02 20:30   ` Isaku Yamahata
2023-06-05 16:41     ` Anish Moorthy
2023-06-02 16:19 ` Anish Moorthy [this message]
2023-06-03 16:58   ` [PATCH v4 03/16] KVM: Add KVM_CAP_MEMORY_FAULT_INFO Isaku Yamahata
2023-06-05 16:37     ` Anish Moorthy
2023-06-14 14:55       ` Sean Christopherson
2023-06-05 17:46   ` Anish Moorthy
2023-06-14 17:35   ` Sean Christopherson
2023-06-20 21:13     ` Anish Moorthy
2023-07-07 11:50     ` Kautuk Consul
2023-07-10 15:00       ` Anish Moorthy
2023-07-11  3:54         ` Kautuk Consul
2023-07-11 14:25           ` Sean Christopherson
2023-08-11 22:12     ` Anish Moorthy
2023-08-14 18:01       ` Sean Christopherson
2023-08-15  0:06         ` Anish Moorthy
2023-08-15  0:43           ` Sean Christopherson
2023-08-15 17:01             ` Anish Moorthy
2023-08-16 15:58               ` Sean Christopherson
2023-08-16 21:28                 ` Anish Moorthy
2023-08-17 23:58                   ` Sean Christopherson
2023-08-18 17:32                     ` Anish Moorthy
2023-08-23 22:20                       ` Sean Christopherson
2023-08-23 23:38                         ` Anish Moorthy
2023-08-24 17:24                           ` Sean Christopherson
2023-08-17 22:55     ` Anish Moorthy
2023-07-05  8:21   ` Kautuk Consul
2023-06-02 16:19 ` [PATCH v4 04/16] KVM: Add docstrings to __kvm_write_guest_page() and __kvm_read_guest_page() Anish Moorthy
2023-06-15  2:41   ` Robert Hoo
2023-08-14 22:51     ` Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 05/16] KVM: Annotate -EFAULTs from kvm_vcpu_write_guest_page() Anish Moorthy
2023-06-14 19:10   ` Sean Christopherson
2023-07-06 22:51     ` Anish Moorthy
2023-07-12 14:08       ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 06/16] KVM: Annotate -EFAULTs from kvm_vcpu_read_guest_page() Anish Moorthy
2023-06-14 19:22   ` Sean Christopherson
2023-07-07 17:35     ` Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 07/16] KVM: Simplify error handling in __gfn_to_pfn_memslot() Anish Moorthy
2023-06-14 19:26   ` Sean Christopherson
2023-07-07 17:33     ` Anish Moorthy
2023-07-10 17:40       ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 08/16] KVM: x86: Annotate -EFAULTs from kvm_handle_error_pfn() Anish Moorthy
2023-06-14 20:03   ` Sean Christopherson
2023-07-07 18:05     ` Anish Moorthy
2023-06-15  2:43   ` Robert Hoo
2023-06-15 14:40     ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 09/16] KVM: Introduce KVM_CAP_NOWAIT_ON_FAULT without implementation Anish Moorthy
2023-06-14 20:11   ` Sean Christopherson
2023-07-06 19:04     ` Anish Moorthy
2023-06-14 21:20   ` Sean Christopherson
2023-06-14 21:23     ` Sean Christopherson
2023-08-23 21:17       ` Anish Moorthy
2023-06-15  3:55     ` Wang, Wei W
2023-06-15 14:56       ` Sean Christopherson
2023-06-16 12:08         ` Wang, Wei W
2023-07-07 18:13     ` Anish Moorthy
2023-07-07 20:07       ` Anish Moorthy
2023-07-11 15:29         ` Sean Christopherson
2023-08-25  0:15           ` Anish Moorthy
2023-08-29 22:41             ` Sean Christopherson
2023-08-30 16:21               ` Anish Moorthy
2023-09-07 21:17                 ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 10/16] KVM: x86: Implement KVM_CAP_NOWAIT_ON_FAULT Anish Moorthy
2023-06-14 20:25   ` Sean Christopherson
2023-07-07 17:41     ` Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 11/16] KVM: arm64: " Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 12/16] KVM: selftests: Report per-vcpu demand paging rate from demand paging test Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 13/16] KVM: selftests: Allow many vCPUs and reader threads per UFFD in " Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 14/16] KVM: selftests: Use EPOLL in userfaultfd_util reader threads and signal errors via TEST_ASSERT Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 15/16] KVM: selftests: Add memslot_flags parameter to memstress_create_vm() Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 16/16] KVM: selftests: Handle memory fault exits in demand_paging_test Anish Moorthy
2023-06-20  2:44   ` Robert Hoo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230602161921.208564-4-amoorthy@google.com \
    --to=amoorthy@google.com \
    --cc=axelrasmussen@google.com \
    --cc=bgardon@google.com \
    --cc=dmatlack@google.com \
    --cc=isaku.yamahata@gmail.com \
    --cc=jthoughton@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=maz@kernel.org \
    --cc=nadav.amit@gmail.com \
    --cc=oliver.upton@linux.dev \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=ricarkol@google.com \
    --cc=robert.hoo.linux@gmail.com \
    --cc=seanjc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).