From: Paolo Bonzini <pbonzini@redhat.com>
To: qemu-devel@nongnu.org
Cc: Chenyi Qiang <chenyi.qiang@intel.com>, Peter Xu <peterx@redhat.com>
Subject: [PULL 05/37] i386: add notify VM exit support
Date: Tue, 11 Oct 2022 12:26:28 +0200 [thread overview]
Message-ID: <20221011102700.319178-6-pbonzini@redhat.com> (raw)
In-Reply-To: <20221011102700.319178-1-pbonzini@redhat.com>
From: Chenyi Qiang <chenyi.qiang@intel.com>
There are cases that malicious virtual machine can cause CPU stuck (due
to event windows don't open up), e.g., infinite loop in microcode when
nested #AC (CVE-2015-5307). No event window means no event (NMI, SMI and
IRQ) can be delivered. It leads the CPU to be unavailable to host or
other VMs. Notify VM exit is introduced to mitigate such kind of
attacks, which will generate a VM exit if no event window occurs in VM
non-root mode for a specified amount of time (notify window).
A new KVM capability KVM_CAP_X86_NOTIFY_VMEXIT is exposed to user space
so that the user can query the capability and set the expected notify
window when creating VMs. The format of the argument when enabling this
capability is as follows:
Bit 63:32 - notify window specified in qemu command
Bit 31:0 - some flags (e.g. KVM_X86_NOTIFY_VMEXIT_ENABLED is set to
enable the feature.)
Users can configure the feature by a new (x86 only) accel property:
qemu -accel kvm,notify-vmexit=run|internal-error|disable,notify-window=n
The default option of notify-vmexit is run, which will enable the
capability and do nothing if the exit happens. The internal-error option
raises a KVM internal error if it happens. The disable option does not
enable the capability. The default value of notify-window is 0. It is valid
only when notify-vmexit is not disabled. The valid range of notify-window
is non-negative. It is even safe to set it to zero since there's an
internal hardware threshold to be added to ensure no false positive.
Because a notify VM exit may happen with VM_CONTEXT_INVALID set in exit
qualification (no cases are anticipated that would set this bit), which
means VM context is corrupted. It would be reflected in the flags of
KVM_EXIT_NOTIFY exit. If KVM_NOTIFY_CONTEXT_INVALID bit is set, raise a KVM
internal error unconditionally.
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20220929072014.20705-5-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
accel/kvm/kvm-all.c | 2 +
qapi/run-state.json | 17 ++++++++
qemu-options.hx | 11 +++++
target/i386/kvm/kvm.c | 98 +++++++++++++++++++++++++++++++++++++++++++
4 files changed, 128 insertions(+)
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index fbfe948398..f99b0becd8 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -3618,6 +3618,8 @@ static void kvm_accel_instance_init(Object *obj)
s->kernel_irqchip_split = ON_OFF_AUTO_AUTO;
/* KVM dirty ring is by default off */
s->kvm_dirty_ring_size = 0;
+ s->notify_vmexit = NOTIFY_VMEXIT_OPTION_RUN;
+ s->notify_window = 0;
}
/**
diff --git a/qapi/run-state.json b/qapi/run-state.json
index 9273ea6516..49989d30e6 100644
--- a/qapi/run-state.json
+++ b/qapi/run-state.json
@@ -643,3 +643,20 @@
{ 'struct': 'MemoryFailureFlags',
'data': { 'action-required': 'bool',
'recursive': 'bool'} }
+
+##
+# @NotifyVmexitOption:
+#
+# An enumeration of the options specified when enabling notify VM exit
+#
+# @run: enable the feature, do nothing and continue if the notify VM exit happens.
+#
+# @internal-error: enable the feature, raise a internal error if the notify
+# VM exit happens.
+#
+# @disable: disable the feature.
+#
+# Since: 7.2
+##
+{ 'enum': 'NotifyVmexitOption',
+ 'data': [ 'run', 'internal-error', 'disable' ] }
\ No newline at end of file
diff --git a/qemu-options.hx b/qemu-options.hx
index 95b998a13b..606d379186 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -191,6 +191,7 @@ DEF("accel", HAS_ARG, QEMU_OPTION_accel,
" split-wx=on|off (enable TCG split w^x mapping)\n"
" tb-size=n (TCG translation block cache size)\n"
" dirty-ring-size=n (KVM dirty ring GFN count, default 0)\n"
+ " notify-vmexit=run|internal-error|disable,notify-window=n (enable notify VM exit and set notify window, x86 only)\n"
" thread=single|multi (enable multi-threaded TCG)\n", QEMU_ARCH_ALL)
SRST
``-accel name[,prop=value[,...]]``
@@ -242,6 +243,16 @@ SRST
is disabled (dirty-ring-size=0). When enabled, KVM will instead
record dirty pages in a bitmap.
+ ``notify-vmexit=run|internal-error|disable,notify-window=n``
+ Enables or disables notify VM exit support on x86 host and specify
+ the corresponding notify window to trigger the VM exit if enabled.
+ ``run`` option enables the feature. It does nothing and continue
+ if the exit happens. ``internal-error`` option enables the feature.
+ It raises a internal error. ``disable`` option doesn't enable the feature.
+ This feature can mitigate the CPU stuck issue due to event windows don't
+ open up for a specified of time (i.e. notify-window).
+ Default: notify-vmexit=run,notify-window=0.
+
ERST
DEF("smp", HAS_ARG, QEMU_OPTION_smp,
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index f18d21413c..ec63b5eb10 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -15,6 +15,7 @@
#include "qemu/osdep.h"
#include "qapi/qapi-events-run-state.h"
#include "qapi/error.h"
+#include "qapi/visitor.h"
#include <sys/ioctl.h>
#include <sys/utsname.h>
#include <sys/syscall.h>
@@ -2595,6 +2596,21 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
}
}
+ if (s->notify_vmexit != NOTIFY_VMEXIT_OPTION_DISABLE &&
+ kvm_check_extension(s, KVM_CAP_X86_NOTIFY_VMEXIT)) {
+ uint64_t notify_window_flags =
+ ((uint64_t)s->notify_window << 32) |
+ KVM_X86_NOTIFY_VMEXIT_ENABLED |
+ KVM_X86_NOTIFY_VMEXIT_USER;
+ ret = kvm_vm_enable_cap(s, KVM_CAP_X86_NOTIFY_VMEXIT, 0,
+ notify_window_flags);
+ if (ret < 0) {
+ error_report("kvm: Failed to enable notify vmexit cap: %s",
+ strerror(-ret));
+ return ret;
+ }
+ }
+
return 0;
}
@@ -5137,6 +5153,9 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
X86CPU *cpu = X86_CPU(cs);
uint64_t code;
int ret;
+ bool ctx_invalid;
+ char str[256];
+ KVMState *state;
switch (run->exit_reason) {
case KVM_EXIT_HLT:
@@ -5192,6 +5211,21 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
/* already handled in kvm_arch_post_run */
ret = 0;
break;
+ case KVM_EXIT_NOTIFY:
+ ctx_invalid = !!(run->notify.flags & KVM_NOTIFY_CONTEXT_INVALID);
+ state = KVM_STATE(current_accel());
+ sprintf(str, "Encounter a notify exit with %svalid context in"
+ " guest. There can be possible misbehaves in guest."
+ " Please have a look.", ctx_invalid ? "in" : "");
+ if (ctx_invalid ||
+ state->notify_vmexit == NOTIFY_VMEXIT_OPTION_INTERNAL_ERROR) {
+ warn_report("KVM internal error: %s", str);
+ ret = -1;
+ } else {
+ warn_report_once("KVM: %s", str);
+ ret = 0;
+ }
+ break;
default:
fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
ret = -1;
@@ -5469,6 +5503,70 @@ void kvm_request_xsave_components(X86CPU *cpu, uint64_t mask)
}
}
+static int kvm_arch_get_notify_vmexit(Object *obj, Error **errp)
+{
+ KVMState *s = KVM_STATE(obj);
+ return s->notify_vmexit;
+}
+
+static void kvm_arch_set_notify_vmexit(Object *obj, int value, Error **errp)
+{
+ KVMState *s = KVM_STATE(obj);
+
+ if (s->fd != -1) {
+ error_setg(errp, "Cannot set properties after the accelerator has been initialized");
+ return;
+ }
+
+ s->notify_vmexit = value;
+}
+
+static void kvm_arch_get_notify_window(Object *obj, Visitor *v,
+ const char *name, void *opaque,
+ Error **errp)
+{
+ KVMState *s = KVM_STATE(obj);
+ uint32_t value = s->notify_window;
+
+ visit_type_uint32(v, name, &value, errp);
+}
+
+static void kvm_arch_set_notify_window(Object *obj, Visitor *v,
+ const char *name, void *opaque,
+ Error **errp)
+{
+ KVMState *s = KVM_STATE(obj);
+ Error *error = NULL;
+ uint32_t value;
+
+ if (s->fd != -1) {
+ error_setg(errp, "Cannot set properties after the accelerator has been initialized");
+ return;
+ }
+
+ visit_type_uint32(v, name, &value, &error);
+ if (error) {
+ error_propagate(errp, error);
+ return;
+ }
+
+ s->notify_window = value;
+}
+
void kvm_arch_accel_class_init(ObjectClass *oc)
{
+ object_class_property_add_enum(oc, "notify-vmexit", "NotifyVMexitOption",
+ &NotifyVmexitOption_lookup,
+ kvm_arch_get_notify_vmexit,
+ kvm_arch_set_notify_vmexit);
+ object_class_property_set_description(oc, "notify-vmexit",
+ "Enable notify VM exit");
+
+ object_class_property_add(oc, "notify-window", "uint32",
+ kvm_arch_get_notify_window,
+ kvm_arch_set_notify_window,
+ NULL, NULL);
+ object_class_property_set_description(oc, "notify-window",
+ "Clock cycles without an event window "
+ "after which a notification VM exit occurs");
}
--
2.37.3
next prev parent reply other threads:[~2022-10-11 10:46 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-11 10:26 [PULL 00/37] SCSI, i386 patches for 2022-10-11 Paolo Bonzini
2022-10-11 10:26 ` [PULL 01/37] scsi-disk: support setting CD-ROM block size via device options Paolo Bonzini
2022-10-11 10:26 ` [PULL 02/37] i386: kvm: extend kvm_{get, put}_vcpu_events to support pending triple fault Paolo Bonzini
2022-10-11 10:26 ` [PULL 03/37] kvm: allow target-specific accelerator properties Paolo Bonzini
2022-10-11 10:26 ` [PULL 04/37] kvm: expose struct KVMState Paolo Bonzini
2022-10-11 10:26 ` Paolo Bonzini [this message]
2022-10-11 10:26 ` [PULL 06/37] target/i386: Remove pc_start Paolo Bonzini
2022-10-11 10:26 ` [PULL 07/37] target/i386: Return bool from disas_insn Paolo Bonzini
2022-10-11 10:26 ` [PULL 08/37] target/i386: Remove cur_eip argument to gen_exception Paolo Bonzini
2022-10-11 10:26 ` [PULL 09/37] target/i386: Remove cur_eip, next_eip arguments to gen_interrupt Paolo Bonzini
2022-10-11 10:26 ` [PULL 10/37] target/i386: Create gen_update_eip_cur Paolo Bonzini
2022-10-11 10:26 ` [PULL 11/37] target/i386: Create gen_update_eip_next Paolo Bonzini
2022-10-11 10:26 ` [PULL 12/37] target/i386: Introduce DISAS_EOB* Paolo Bonzini
2022-10-11 10:26 ` [PULL 13/37] target/i386: Use DISAS_EOB* in gen_movl_seg_T0 Paolo Bonzini
2022-10-11 10:26 ` [PULL 14/37] target/i386: Use DISAS_EOB_NEXT Paolo Bonzini
2022-10-11 10:26 ` [PULL 15/37] target/i386: USe DISAS_EOB_ONLY Paolo Bonzini
2022-10-11 10:26 ` [PULL 16/37] target/i386: Create cur_insn_len, cur_insn_len_i32 Paolo Bonzini
2022-10-11 10:26 ` [PULL 17/37] target/i386: Remove cur_eip, next_eip arguments to gen_repz* Paolo Bonzini
2022-10-11 10:26 ` [PULL 18/37] target/i386: Introduce DISAS_JUMP Paolo Bonzini
2022-10-11 10:26 ` [PULL 19/37] target/i386: Truncate values for lcall_real to i32 Paolo Bonzini
2022-10-11 10:26 ` [PULL 20/37] target/i386: Create eip_next_* Paolo Bonzini
2022-10-11 10:26 ` [PULL 21/37] target/i386: Use DISAS_TOO_MANY to exit after gen_io_start Paolo Bonzini
2022-10-11 10:26 ` [PULL 22/37] target/i386: Create gen_jmp_rel Paolo Bonzini
2022-10-11 10:26 ` [PULL 23/37] target/i386: Use gen_jmp_rel for loop, repz, jecxz insns Paolo Bonzini
2022-10-11 10:26 ` [PULL 24/37] target/i386: Use gen_jmp_rel for gen_jcc Paolo Bonzini
2022-10-11 10:26 ` [PULL 25/37] target/i386: Use gen_jmp_rel for DISAS_TOO_MANY Paolo Bonzini
2022-10-11 10:26 ` [PULL 26/37] target/i386: Remove MemOp argument to gen_op_j*_ecx Paolo Bonzini
2022-10-11 10:26 ` [PULL 27/37] target/i386: Merge gen_jmp_tb and gen_goto_tb into gen_jmp_rel Paolo Bonzini
2022-10-11 10:26 ` [PULL 28/37] target/i386: Create eip_cur_tl Paolo Bonzini
2022-10-11 10:26 ` [PULL 29/37] target/i386: Add cpu_eip Paolo Bonzini
2022-10-11 10:26 ` [PULL 30/37] target/i386: Inline gen_jmp_im Paolo Bonzini
2022-10-11 10:26 ` [PULL 31/37] target/i386: Enable TARGET_TB_PCREL Paolo Bonzini
2022-10-11 10:26 ` [PULL 32/37] x86: Implement MSR_CORE_THREAD_COUNT MSR Paolo Bonzini
2022-10-11 10:26 ` [PULL 33/37] i386: kvm: Add support for MSR filtering Paolo Bonzini
2022-10-11 10:26 ` [PULL 34/37] KVM: x86: Implement MSR_CORE_THREAD_COUNT MSR Paolo Bonzini
2022-10-11 10:26 ` [PULL 35/37] linux-user: i386/signal: move fpstate at the end of the 32-bit frames Paolo Bonzini
2022-10-11 10:26 ` [PULL 36/37] linux-user: i386/signal: support FXSAVE fpstate on 32-bit emulation Paolo Bonzini
2022-10-11 10:27 ` [PULL 37/37] linux-user: i386/signal: support XSAVE/XRSTOR for signal frame fpstate Paolo Bonzini
2022-10-13 20:29 ` [PULL 00/37] SCSI, i386 patches for 2022-10-11 Stefan Hajnoczi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20221011102700.319178-6-pbonzini@redhat.com \
--to=pbonzini@redhat.com \
--cc=chenyi.qiang@intel.com \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).