* [Bug 217304] New: KVM does not handle NMI blocking correctly in nested virtualization
@ 2023-04-06 4:09 bugzilla-daemon
2023-04-06 19:14 ` Sean Christopherson
` (4 more replies)
0 siblings, 5 replies; 7+ messages in thread
From: bugzilla-daemon @ 2023-04-06 4:09 UTC (permalink / raw)
To: kvm
https://bugzilla.kernel.org/show_bug.cgi?id=217304
Bug ID: 217304
Summary: KVM does not handle NMI blocking correctly in nested
virtualization
Product: Virtualization
Version: unspecified
Hardware: Intel
OS: Linux
Status: NEW
Severity: normal
Priority: P1
Component: kvm
Assignee: virtualization_kvm@kernel-bugs.osdl.org
Reporter: lixiaoyi13691419520@gmail.com
Regression: No
Created attachment 304088
--> https://bugzilla.kernel.org/attachment.cgi?id=304088&action=edit
LHV image to reproduce this bug (c.img), compressed with xz
CPU model: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
Host kernel version: 6.2.8-200.fc37.x86_64
Host kernel arch: x86_64
Guest: a micro-hypervisor (called LHV, 32-bit), which runs a 32-bit guest
(called the "nested guest").
QEMU command line: qemu-system-x86_64 -m 192M -smp 2 -cpu Haswell,vmx=yes
-enable-kvm -serial stdio -drive media=disk,file=c.img,index=1
This bug still exists when using -machine kernel_irqchip=off.
This problem cannot be tested with -accel tcg, because the guest requires
nested virtualization.
To reproduce this bug:
1. Download c.img.xz (attached to this bug) and decompress it to get c.img. The
related source code of this LHV image is at
https://github.com/lxylxy123456/uberxmhf/blob/a12638ef90dac430dd18d62cd29aa967826fecc9/xmhf/src/xmhf-core/xmhf-runtime/xmhf-startup/lhv-guest.c#L871
2. Run the QEMU command line above
3. See the following output in serial port (should be within 10 seconds):
Detecting environment
QEMU / KVM detected
End detecting environment
Experiment: 13
Enter host, exp=13, state=0
hlt_wait() begin, source = EXIT_NMI_H (5)
Inject NMI
Interrupt recorded: EXIT_NMI_H (5)
hlt_wait() end
hlt_wait() begin, source = EXIT_TIMER_H (6)
Inject interrupt
Interrupt recorded: EXIT_TIMER_H (6)
hlt_wait() end
Leave host
Enter host, exp=13, state=1
hlt_wait() begin, source = EXIT_NMI_H (5)
Inject NMI
Strange wakeup from HLT
Inject interrupt
Interrupt recorded: EXIT_TIMER_H (6)
(empty line)
source: EXIT_NMI_H (5)
exit_source: EXIT_TIMER_H (6)
TEST_ASSERT '0 && (exit_source == source)' failed, line 365, file lhv-guest.c
qemu: terminating on signal 2
The expected output is (reproducible on real Intel CPUs with >= 2 CPUs):
Detecting environment
End detecting environment
Experiment: 13
Enter host, exp=13, state=0
hlt_wait() begin, source = EXIT_NMI_H (5)
Inject NMI
Interrupt recorded: EXIT_NMI_H (5)
hlt_wait() end
hlt_wait() begin, source = EXIT_TIMER_H (6)
Inject interrupt
Interrupt recorded: EXIT_TIMER_H (6)
hlt_wait() end
Leave host
Enter host, exp=13, state=1
hlt_wait() begin, source = EXIT_NMI_H (5)
Inject NMI
Interrupt recorded: EXIT_NMI_H (5)
hlt_wait() end
iret_wait() begin, source = EXIT_MEASURE (1)
iret_wait() end
hlt_wait() begin, source = EXIT_TIMER_H (6)
Inject interrupt
Interrupt recorded: EXIT_TIMER_H (6)
hlt_wait() end
Leave host
Experiment: 1
... (endless)
Explanation:
Assume KVM runs in L0, LHV runs in L1, the nested guest runs in L2.
The code in LHV performs an experiment (called "Experiment 13" in serial
output) on CPU 0 to test the behavior of NMI blocking. The experiment steps
are:
1. Prepare state such that the CPU is currently in L1 (LHV), and NMI is blocked
2. Modify VMCS12 to make sure that L2 has virtual NMIs enabled (NMI exiting =
1, Virtual NMIs = 1), and L2 does not block NMI (Blocking by NMI = 0)
3. VM entry to L2
4. L2 performs VMCALL, which causes a VM exit to L1
5. L1 checks whether NMI is blocked.
The expected behavior is that NMIs should not be blocked at step 5. According
to the Intel SDM, NMIs become unblocked on VM entry to L2 (step 3), and the VM
exit to L1 (step 4) does not change NMI blocking, so NMIs remain unblocked.
This behavior is reproducible on real hardware.
However, when running on KVM, the experiment shows that at step 5, NMIs are
blocked in L1. Thus, I think NMI blocking is not implemented correctly in KVM's
nested virtualization.
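For reference, one way L1 can observe whether NMIs are blocked at step 5 is
sketched below. This is illustrative only and not the actual LHV code; the
helper names (lapic_send_nmi_to_self, lapic_arm_oneshot_timer, the handler
wiring) are hypothetical. The idea is: send an NMI to the current CPU, arm a
timer, and HLT; if the timer interrupt wins the race, NMIs are still blocked.
    /* Illustrative sketch only; helper names are hypothetical, not the LHV code. */
    static volatile int nmi_seen, timer_fired;

    void nmi_handler(void)              /* installed at IDT vector 2 in L1 */
    {
        nmi_seen = 1;
    }

    void timer_handler(void)            /* LAPIC timer interrupt handler */
    {
        timer_fired = 1;
    }

    int l1_nmis_are_blocked(void)
    {
        nmi_seen = 0;
        timer_fired = 0;
        lapic_send_nmi_to_self();       /* ICR: delivery mode NMI, no shorthand */
        lapic_arm_oneshot_timer();      /* guarantees HLT eventually wakes up */
        while (!nmi_seen && !timer_fired)
            asm volatile("sti; hlt");   /* wake on NMI or timer interrupt */
        return !nmi_seen;               /* timer won the race => NMIs are blocked */
    }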
I am happy to explain how the experiment code works in detail. c.img also
reveals other NMI-related bugs in KVM. I am also happy to explain the other
bugs.
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are watching the assignee of the bug.
* Re: [Bug 217304] New: KVM does not handle NMI blocking correctly in nested virtualization
2023-04-06 4:09 [Bug 217304] New: KVM does not handle NMI blocking correctly in nested virtualization bugzilla-daemon
@ 2023-04-06 19:14 ` Sean Christopherson
2023-04-06 19:14 ` [Bug 217304] " bugzilla-daemon
` (3 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Sean Christopherson @ 2023-04-06 19:14 UTC (permalink / raw)
To: bugzilla-daemon; +Cc: kvm
On Thu, Apr 06, 2023, bugzilla-daemon@kernel.org wrote:
> Assume KVM runs in L0, LHV runs in L1, the nested guest runs in L2.
>
> The code in LHV performs an experiment (called "Experiment 13" in serial
> output) on CPU 0 to test the behavior of NMI blocking. The experiment steps
> are:
> 1. Prepare state such that the CPU is currently in L1 (LHV), and NMI is blocked
> 2. Modify VMCS12 to make sure that L2 has virtual NMIs enabled (NMI exiting =
> 1, Virtual NMIs = 1), and L2 does not block NMI (Blocking by NMI = 0)
> 3. VM entry to L2
> 4. L2 performs VMCALL, get VM exit to L1
> 5. L1 checks whether NMI is blocked.
>
> The expected behavior is that NMIs should not be blocked at step 5. According
> to the Intel SDM, NMIs become unblocked on VM entry to L2 (step 3), and the VM
> exit to L1 (step 4) does not change NMI blocking, so NMIs remain unblocked.
> This behavior is reproducible on real hardware.
>
> However, when running on KVM, the experiment shows that at step 5, NMIs are
> blocked in L1. Thus, I think NMI blocking is not implemented correctly in KVM's
> nested virtualization.
Ya, KVM blocks NMIs on nested NMI VM-Exits, but doesn't unblock NMIs for all other
exit types. I believe this is the fix (untested):
---
arch/x86/kvm/vmx/nested.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 96ede74a6067..4240a052628a 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4164,12 +4164,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
NMI_VECTOR | INTR_TYPE_NMI_INTR |
INTR_INFO_VALID_MASK, 0);
- /*
- * The NMI-triggered VM exit counts as injection:
- * clear this one and block further NMIs.
- */
vcpu->arch.nmi_pending = 0;
- vmx_set_nmi_mask(vcpu, true);
return 0;
}
@@ -4865,6 +4860,13 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
}
+ /*
+ * NMIs are blocked on VM-Exit due to NMI, and unblocked by all
+ * other VM-Exit types.
+ */
+ vmx_set_nmi_mask(vcpu, (u16)vm_exit_reason == EXIT_REASON_EXCEPTION_NMI &&
+ !is_nmi(vmcs12->vm_exit_intr_info));
+
if (vm_exit_reason != -1)
trace_kvm_nested_vmexit_inject(vmcs12->vm_exit_reason,
vmcs12->exit_qualification,
base-commit: 0b87a6bfd1bdb47b766aa0641b7cf93f3d3227e9
--
> I am happy to explain how the experiment code works in detail. c.img also
> reveals other NMI-related bugs in KVM. I am also happy to explain the other
> bugs.
I'm not sure I want to know ;-) If you can give a quick rundown of each bug, it
would be quite helpful.
Thanks!
* [Bug 217304] KVM does not handle NMI blocking correctly in nested virtualization
2023-04-06 4:09 [Bug 217304] New: KVM does not handle NMI blocking correctly in nested virtualization bugzilla-daemon
2023-04-06 19:14 ` Sean Christopherson
@ 2023-04-06 19:14 ` bugzilla-daemon
2023-04-12 17:00 ` Sean Christopherson
2023-04-07 20:14 ` bugzilla-daemon
` (2 subsequent siblings)
4 siblings, 1 reply; 7+ messages in thread
From: bugzilla-daemon @ 2023-04-06 19:14 UTC (permalink / raw)
To: kvm
https://bugzilla.kernel.org/show_bug.cgi?id=217304
--- Comment #1 from Sean Christopherson (seanjc@google.com) ---
On Thu, Apr 06, 2023, bugzilla-daemon@kernel.org wrote:
> Assume KVM runs in L0, LHV runs in L1, the nested guest runs in L2.
>
> The code in LHV performs an experiment (called "Experiment 13" in serial
> output) on CPU 0 to test the behavior of NMI blocking. The experiment steps
> are:
> 1. Prepare state such that the CPU is currently in L1 (LHV), and NMI is blocked
> 2. Modify VMCS12 to make sure that L2 has virtual NMIs enabled (NMI exiting =
> 1, Virtual NMIs = 1), and L2 does not block NMI (Blocking by NMI = 0)
> 3. VM entry to L2
> 4. L2 performs VMCALL, get VM exit to L1
> 5. L1 checks whether NMI is blocked.
>
> The expected behavior is that NMIs should not be blocked at step 5. According
> to the Intel SDM, NMIs become unblocked on VM entry to L2 (step 3), and the VM
> exit to L1 (step 4) does not change NMI blocking, so NMIs remain unblocked.
> This behavior is reproducible on real hardware.
>
> However, when running on KVM, the experiment shows that at step 5, NMIs are
> blocked in L1. Thus, I think NMI blocking is not implemented correctly in
> KVM's nested virtualization.
Ya, KVM blocks NMIs on nested NMI VM-Exits, but doesn't unblock NMIs for all
other exit types. I believe this is the fix (untested):
---
arch/x86/kvm/vmx/nested.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 96ede74a6067..4240a052628a 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4164,12 +4164,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
NMI_VECTOR | INTR_TYPE_NMI_INTR |
INTR_INFO_VALID_MASK, 0);
- /*
- * The NMI-triggered VM exit counts as injection:
- * clear this one and block further NMIs.
- */
vcpu->arch.nmi_pending = 0;
- vmx_set_nmi_mask(vcpu, true);
return 0;
}
@@ -4865,6 +4860,13 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
}
+ /*
+ * NMIs are blocked on VM-Exit due to NMI, and unblocked by all
+ * other VM-Exit types.
+ */
+ vmx_set_nmi_mask(vcpu, (u16)vm_exit_reason == EXIT_REASON_EXCEPTION_NMI &&
+ !is_nmi(vmcs12->vm_exit_intr_info));
+
if (vm_exit_reason != -1)
trace_kvm_nested_vmexit_inject(vmcs12->vm_exit_reason,
vmcs12->exit_qualification,
base-commit: 0b87a6bfd1bdb47b766aa0641b7cf93f3d3227e9
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are watching the assignee of the bug.
* [Bug 217304] KVM does not handle NMI blocking correctly in nested virtualization
2023-04-06 4:09 [Bug 217304] New: KVM does not handle NMI blocking correctly in nested virtualization bugzilla-daemon
2023-04-06 19:14 ` Sean Christopherson
2023-04-06 19:14 ` [Bug 217304] " bugzilla-daemon
@ 2023-04-07 20:14 ` bugzilla-daemon
2023-04-12 17:00 ` bugzilla-daemon
2023-04-12 20:50 ` bugzilla-daemon
4 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2023-04-07 20:14 UTC (permalink / raw)
To: kvm
https://bugzilla.kernel.org/show_bug.cgi?id=217304
--- Comment #2 from Eric Li (lixiaoyi13691419520@gmail.com) ---
On Thu, Apr 6, 2023 at 15:14, <bugzilla-daemon@kernel.org> wrote:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=217304
>
> --- Comment #1 from Sean Christopherson (seanjc@google.com) ---
> On Thu, Apr 06, 2023, bugzilla-daemon@kernel.org wrote:
> Ya, KVM blocks NMIs on nested NMI VM-Exits, but doesn't unblock NMIs for all
> other exit types. I believe this is the fix (untested):
>
Thanks for the fix. I tested it on Linux 5.19.14, and it passes my experiments.
Detail: I wrote 30 experiments (i.e., tests), numbered from 1 to 30.
Before this bug fix, KVM passes 21 experiments. After this bug fix,
KVM passes 3 more experiments (3, 13, and 14) without introducing any
regression in my experiments. There are 6 more experiments that KVM
still fails (2, 4, 6, 18, 19, and 24). I think we can address them in
another bug on Bugzilla.
> --
> You may reply to this email to add a comment.
>
> You are receiving this mail because:
> You are on the CC list for the bug.
> You reported the bug.
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are watching the assignee of the bug.
* Re: [Bug 217304] KVM does not handle NMI blocking correctly in nested virtualization
2023-04-06 19:14 ` [Bug 217304] " bugzilla-daemon
@ 2023-04-12 17:00 ` Sean Christopherson
0 siblings, 0 replies; 7+ messages in thread
From: Sean Christopherson @ 2023-04-12 17:00 UTC (permalink / raw)
To: bugzilla-daemon; +Cc: kvm
On Thu, Apr 06, 2023, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=217304
>
> --- Comment #1 from Sean Christopherson (seanjc@google.com) ---
> On Thu, Apr 06, 2023, bugzilla-daemon@kernel.org wrote:
> > Assume KVM runs in L0, LHV runs in L1, the nested guest runs in L2.
> >
> > The code in LHV performs an experiment (called "Experiment 13" in serial
> > output) on CPU 0 to test the behavior of NMI blocking. The experiment steps
> > are:
> > 1. Prepare state such that the CPU is currently in L1 (LHV), and NMI is blocked
> > 2. Modify VMCS12 to make sure that L2 has virtual NMIs enabled (NMI exiting =
> > 1, Virtual NMIs = 1), and L2 does not block NMI (Blocking by NMI = 0)
> > 3. VM entry to L2
> > 4. L2 performs VMCALL, get VM exit to L1
> > 5. L1 checks whether NMI is blocked.
> >
> > The expected behavior is that NMIs should not be blocked at step 5. According
> > to the Intel SDM, NMIs become unblocked on VM entry to L2 (step 3), and the
> > VM exit to L1 (step 4) does not change NMI blocking, so NMIs remain
> > unblocked. This behavior is reproducible on real hardware.
> >
> > However, when running on KVM, the experiment shows that at step 5, NMIs are
> > blocked in L1. Thus, I think NMI blocking is not implemented correctly in
> > KVM's nested virtualization.
>
> Ya, KVM blocks NMIs on nested NMI VM-Exits, but doesn't unblock NMIs for all
> other exit types. I believe this is the fix (untested):
>
> ---
> arch/x86/kvm/vmx/nested.c | 12 +++++++-----
> 1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 96ede74a6067..4240a052628a 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -4164,12 +4164,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
> nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
> NMI_VECTOR | INTR_TYPE_NMI_INTR |
> INTR_INFO_VALID_MASK, 0);
> - /*
> - * The NMI-triggered VM exit counts as injection:
> - * clear this one and block further NMIs.
> - */
> vcpu->arch.nmi_pending = 0;
> - vmx_set_nmi_mask(vcpu, true);
> return 0;
> }
>
> @@ -4865,6 +4860,13 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
> INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
> }
>
> + /*
> + * NMIs are blocked on VM-Exit due to NMI, and unblocked by all
> + * other VM-Exit types.
> + */
> + vmx_set_nmi_mask(vcpu, (u16)vm_exit_reason == EXIT_REASON_EXCEPTION_NMI &&
> + !is_nmi(vmcs12->vm_exit_intr_info));
Ugh, this is wrong. As Eric stated in the bug report, and per section "27.5.5
Updating Non-Register State", VM-Exit does *not* affect NMI blocking except if
the VM-Exit is directly due to an NMI:
Event blocking is affected as follows:
* There is no blocking by STI or by MOV SS after a VM exit.
* VM exits caused directly by non-maskable interrupts (NMIs) cause blocking by
NMI (see Table 24-3). Other VM exits do not affect blocking by NMI. (See
Section 27.1 for the case in which an NMI causes a VM exit indirectly.)
The scenario here is that virtual NMIs are enabled, in which case VM-Enter,
not VM-Exit, effectively clears NMI blocking. From "26.7.1 Interruptibility State":
The blocking of non-maskable interrupts (NMIs) is determined as follows:
* If the "virtual NMIs" VM-execution control is 0, NMIs are blocked if and
only if bit 3 (blocking by NMI) in the interruptibility-state field is 1.
If the "NMI exiting" VM-execution control is 0, execution of the IRET
instruction removes this blocking (even if the instruction generates a fault).
If the "NMI exiting" control is 1, IRET does not affect this blocking.
* The following items describe the use of bit 3 (blocking by NMI) in the
interruptibility-state field if the "virtual NMIs" VM-execution control is 1:
* The bit’s value does not affect the blocking of NMIs after VM entry. NMIs
are not blocked in VMX non-root operation (except for ordinary blocking
for other reasons, such as by the MOV SS instruction, the wait-for-SIPI
state, etc.)
* The bit’s value determines whether there is virtual-NMI blocking after VM
entry. If the bit is 1, virtual-NMI blocking is in effect after VM entry.
If the bit is 0, there is no virtual-NMI blocking after VM entry unless
the VM entry is injecting an NMI (see Section 26.6.1.1). Execution of IRET
removes virtual-NMI blocking (even if the instruction generates a fault).
I.e. forcing NMIs to be unblocked is wrong when virtual NMIs are disabled.
Unfortunately, that means fixing this will require a much more involved patch
(series?), e.g. KVM can't modify NMI blocking until the VM-Enter is successful,
at which point vmcs02, not vmcs01, is loaded, and so KVM will likely need to
track NMI blocking in a software variable. That in turn gets complicated by
the !vNMI case, because then KVM needs to propagate NMI blocking between vmcs01,
vmcs12, and vmcs02. Blech.
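To restate the architectural rules quoted above as rough pseudologic (this is
illustrative only, not a proposed patch; the hw_nmi_blocked field and the
function names below are made up, though the VMX constants and is_nmi() exist
in KVM):
	/* Illustrative pseudologic only; hw_nmi_blocked and these helpers are made up. */
	void nested_vmentry_update_nmi_blocking(struct vcpu *vcpu, struct vmcs12 *vmcs12)
	{
		if (vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS) {
			/*
			 * With virtual NMIs, hardware NMIs are never blocked in VMX
			 * non-root operation; bit 3 of the interruptibility state only
			 * controls *virtual*-NMI blocking for L2.
			 */
			vcpu->hw_nmi_blocked = false;
		} else {
			/* Without vNMIs, bit 3 directly determines hardware NMI blocking. */
			vcpu->hw_nmi_blocked =
				!!(vmcs12->guest_interruptibility_info & GUEST_INTR_STATE_NMI);
		}
	}

	void nested_vmexit_update_nmi_blocking(struct vcpu *vcpu, u32 exit_reason, u32 intr_info)
	{
		/* Only a VM exit caused directly by an NMI sets NMI blocking ... */
		if (exit_reason == EXIT_REASON_EXCEPTION_NMI && is_nmi(intr_info))
			vcpu->hw_nmi_blocked = true;
		/* ... all other VM exits leave NMI blocking unchanged. */
	}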
I'm going to punt fixing this due to lack of bandwidth, and AFAIK lack of a use
case beyond testing. Hopefully I'll be able to revisit this in a few weeks, but
that might be wishful thinking.
* [Bug 217304] KVM does not handle NMI blocking correctly in nested virtualization
2023-04-06 4:09 [Bug 217304] New: KVM does not handle NMI blocking correctly in nested virtualization bugzilla-daemon
` (2 preceding siblings ...)
2023-04-07 20:14 ` bugzilla-daemon
@ 2023-04-12 17:00 ` bugzilla-daemon
2023-04-12 20:50 ` bugzilla-daemon
4 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2023-04-12 17:00 UTC (permalink / raw)
To: kvm
https://bugzilla.kernel.org/show_bug.cgi?id=217304
--- Comment #3 from Sean Christopherson (seanjc@google.com) ---
On Thu, Apr 06, 2023, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=217304
>
> --- Comment #1 from Sean Christopherson (seanjc@google.com) ---
> On Thu, Apr 06, 2023, bugzilla-daemon@kernel.org wrote:
> > Assume KVM runs in L0, LHV runs in L1, the nested guest runs in L2.
> >
> > The code in LHV performs an experiment (called "Experiment 13" in serial
> > output) on CPU 0 to test the behavior of NMI blocking. The experiment steps
> > are:
> > 1. Prepare state such that the CPU is currently in L1 (LHV), and NMI is blocked
> > 2. Modify VMCS12 to make sure that L2 has virtual NMIs enabled (NMI exiting =
> > 1, Virtual NMIs = 1), and L2 does not block NMI (Blocking by NMI = 0)
> > 3. VM entry to L2
> > 4. L2 performs VMCALL, get VM exit to L1
> > 5. L1 checks whether NMI is blocked.
> >
> > The expected behavior is that NMIs should not be blocked at step 5. According
> > to the Intel SDM, NMIs become unblocked on VM entry to L2 (step 3), and the
> > VM exit to L1 (step 4) does not change NMI blocking, so NMIs remain
> > unblocked. This behavior is reproducible on real hardware.
> >
> > However, when running on KVM, the experiment shows that at step 5, NMIs are
> > blocked in L1. Thus, I think NMI blocking is not implemented correctly in
> > KVM's nested virtualization.
>
> Ya, KVM blocks NMIs on nested NMI VM-Exits, but doesn't unblock NMIs for all
> other exit types. I believe this is the fix (untested):
>
> ---
> arch/x86/kvm/vmx/nested.c | 12 +++++++-----
> 1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 96ede74a6067..4240a052628a 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -4164,12 +4164,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
> nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
> NMI_VECTOR | INTR_TYPE_NMI_INTR |
> INTR_INFO_VALID_MASK, 0);
> - /*
> - * The NMI-triggered VM exit counts as injection:
> - * clear this one and block further NMIs.
> - */
> vcpu->arch.nmi_pending = 0;
> - vmx_set_nmi_mask(vcpu, true);
> return 0;
> }
>
> @@ -4865,6 +4860,13 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
> INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
> }
>
> + /*
> + * NMIs are blocked on VM-Exit due to NMI, and unblocked by all
> + * other VM-Exit types.
> + */
> + vmx_set_nmi_mask(vcpu, (u16)vm_exit_reason == EXIT_REASON_EXCEPTION_NMI &&
> + !is_nmi(vmcs12->vm_exit_intr_info));
Ugh, this is wrong. As Eric stated in the bug report, and per section "27.5.5
Updating Non-Register State", VM-Exit does *not* affect NMI blocking except if
the VM-Exit is directly due to an NMI:
Event blocking is affected as follows:
* There is no blocking by STI or by MOV SS after a VM exit.
* VM exits caused directly by non-maskable interrupts (NMIs) cause blocking by
NMI (see Table 24-3). Other VM exits do not affect blocking by NMI. (See
Section 27.1 for the case in which an NMI causes a VM exit indirectly.)
The scenario here is that virtual NMIs are enabled, in which case VM-Enter,
not VM-Exit, effectively clears NMI blocking. From "26.7.1 Interruptibility
State":
The blocking of non-maskable interrupts (NMIs) is determined as follows:
* If the "virtual NMIs" VM-execution control is 0, NMIs are blocked if and
only if bit 3 (blocking by NMI) in the interruptibility-state field is 1.
If the "NMI exiting" VM-execution control is 0, execution of the IRET
instruction removes this blocking (even if the instruction generates a fault).
If the "NMI exiting" control is 1, IRET does not affect this blocking.
* The following items describe the use of bit 3 (blocking by NMI) in the
interruptibility-state field if the "virtual NMIs" VM-execution control is 1:
* The bit’s value does not affect the blocking of NMIs after VM entry. NMIs
are not blocked in VMX non-root operation (except for ordinary blocking
for other reasons, such as by the MOV SS instruction, the wait-for-SIPI
state, etc.)
* The bit’s value determines whether there is virtual-NMI blocking after VM
entry. If the bit is 1, virtual-NMI blocking is in effect after VM entry.
If the bit is 0, there is no virtual-NMI blocking after VM entry unless
the VM entry is injecting an NMI (see Section 26.6.1.1). Execution of IRET
removes virtual-NMI blocking (even if the instruction generates a fault).
I.e. forcing NMIs to be unblocked is wrong when virtual NMIs are disabled.
Unfortunately, that means fixing this will require a much more involved patch
(series?), e.g. KVM can't modify NMI blocking until the VM-Enter is successful,
at which point vmcs02, not vmcs01, is loaded, and so KVM will likely need to
track NMI blocking in a software variable. That in turn gets complicated by
the !vNMI case, because then KVM needs to propagate NMI blocking between vmcs01,
vmcs12, and vmcs02. Blech.
I'm going to punt fixing this due to lack of bandwidth, and AFAIK lack of a use
case beyond testing. Hopefully I'll be able to revisit this in a few weeks, but
that might be wishful thinking.
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are watching the assignee of the bug.
* [Bug 217304] KVM does not handle NMI blocking correctly in nested virtualization
2023-04-06 4:09 [Bug 217304] New: KVM does not handle NMI blocking correctly in nested virtualization bugzilla-daemon
` (3 preceding siblings ...)
2023-04-12 17:00 ` bugzilla-daemon
@ 2023-04-12 20:50 ` bugzilla-daemon
4 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2023-04-12 20:50 UTC (permalink / raw)
To: kvm
https://bugzilla.kernel.org/show_bug.cgi?id=217304
--- Comment #4 from Eric Li (lixiaoyi13691419520@gmail.com) ---
On Wed, 2023-04-12 at 17:00 +0000, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=217304
>
> --- Comment #3 from Sean Christopherson (seanjc@google.com) ---
> On Thu, Apr 06, 2023, bugzilla-daemon@kernel.org wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=217304
> >
> > --- Comment #1 from Sean Christopherson (seanjc@google.com) ---
> > On Thu, Apr 06, 2023, bugzilla-daemon@kernel.org wrote:
> > > Assume KVM runs in L0, LHV runs in L1, the nested guest runs in L2.
> > >
> > > The code in LHV performs an experiment (called "Experiment 13" in serial
> > > output) on CPU 0 to test the behavior of NMI blocking. The experiment steps
> > > are:
> > > 1. Prepare state such that the CPU is currently in L1 (LHV), and NMI is blocked
> > > 2. Modify VMCS12 to make sure that L2 has virtual NMIs enabled (NMI exiting =
> > > 1, Virtual NMIs = 1), and L2 does not block NMI (Blocking by NMI = 0)
> > > 3. VM entry to L2
> > > 4. L2 performs VMCALL, get VM exit to L1
> > > 5. L1 checks whether NMI is blocked.
> > >
> > > The expected behavior is that NMIs should not be blocked at step 5. According
> > > to the Intel SDM, NMIs become unblocked on VM entry to L2 (step 3), and the
> > > VM exit to L1 (step 4) does not change NMI blocking, so NMIs remain
> > > unblocked. This behavior is reproducible on real hardware.
> > >
> > > However, when running on KVM, the experiment shows that at step 5, NMIs are
> > > blocked in L1. Thus, I think NMI blocking is not implemented correctly in
> > > KVM's nested virtualization.
> >
> > Ya, KVM blocks NMIs on nested NMI VM-Exits, but doesn't unblock NMIs for all
> > other exit types. I believe this is the fix (untested):
> >
> > ---
> > arch/x86/kvm/vmx/nested.c | 12 +++++++-----
> > 1 file changed, 7 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index 96ede74a6067..4240a052628a 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -4164,12 +4164,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
> > nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
> > NMI_VECTOR | INTR_TYPE_NMI_INTR |
> > INTR_INFO_VALID_MASK, 0);
> > - /*
> > - * The NMI-triggered VM exit counts as injection:
> > - * clear this one and block further NMIs.
> > - */
> > vcpu->arch.nmi_pending = 0;
> > - vmx_set_nmi_mask(vcpu, true);
> > return 0;
> > }
> >
> > @@ -4865,6 +4860,13 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
> > INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
> > }
> >
> > + /*
> > + * NMIs are blocked on VM-Exit due to NMI, and unblocked by all
> > + * other VM-Exit types.
> > + */
> > + vmx_set_nmi_mask(vcpu, (u16)vm_exit_reason == EXIT_REASON_EXCEPTION_NMI &&
> > + !is_nmi(vmcs12->vm_exit_intr_info));
>
> Ugh, this is wrong. As Eric stated in the bug report, and per section "27.5.5
> Updating Non-Register State", VM-Exit does *not* affect NMI blocking except if
> the VM-Exit is directly due to an NMI:
>
> Event blocking is affected as follows:
> * There is no blocking by STI or by MOV SS after a VM exit.
> * VM exits caused directly by non-maskable interrupts (NMIs) cause blocking by
> NMI (see Table 24-3). Other VM exits do not affect blocking by NMI. (See
> Section 27.1 for the case in which an NMI causes a VM exit indirectly.)
>
Correct. In my experiment, NMIs are unblocked at VM entry, and VM exit does not
change NMI blocking (i.e., they remain unblocked).
> The scenario here is that virtual NMIs are enabled, in which case VM-Enter,
> not VM-Exit, effectively clears NMI blocking. From "26.7.1 Interruptibility
> State":
>
> The blocking of non-maskable interrupts (NMIs) is determined as follows:
> * If the "virtual NMIs" VM-execution control is 0, NMIs are blocked if and
> only if bit 3 (blocking by NMI) in the interruptibility-state field is 1.
> If the "NMI exiting" VM-execution control is 0, execution of the IRET
> instruction removes this blocking (even if the instruction generates a fault).
> If the "NMI exiting" control is 1, IRET does not affect this blocking.
> * The following items describe the use of bit 3 (blocking by NMI) in the
> interruptibility-state field if the "virtual NMIs" VM-execution control is 1:
> * The bit’s value does not affect the blocking of NMIs after VM entry. NMIs
> are not blocked in VMX non-root operation (except for ordinary blocking
> for other reasons, such as by the MOV SS instruction, the wait-for-SIPI
> state, etc.)
> * The bit’s value determines whether there is virtual-NMI blocking after VM
> entry. If the bit is 1, virtual-NMI blocking is in effect after VM entry.
> If the bit is 0, there is no virtual-NMI blocking after VM entry unless
> the VM entry is injecting an NMI (see Section 26.6.1.1). Execution of IRET
> removes virtual-NMI blocking (even if the instruction generates a fault).
>
> I.e. forcing NMIs to be unblocked is wrong when virtual NMIs are disabled.
>
> Unfortunately, that means fixing this will require a much more involved patch
> (series?), e.g. KVM can't modify NMI blocking until the VM-Enter is successful,
> at which point vmcs02, not vmcs01, is loaded, and so KVM will likely need to
> track NMI blocking in a software variable. That in turn gets complicated by
> the !vNMI case, because then KVM needs to propagate NMI blocking between
> vmcs01, vmcs12, and vmcs02. Blech.
>
Yes, handling NMIs perfectly in nested virtualization may require a complicated
implementation. There are many strange cases to think about (e.g. the priority
between NMI-window VM exits and NMI injection).
> I'm going to punt fixing this due to lack of bandwidth, and AFAIK lack of a use
> case beyond testing. Hopefully I'll be able to revisit this in a few weeks, but
> that might be wishful thinking.
>
I agree. This case probably only appears in testing; I can't think of a good
reason for a hypervisor to perform VM entry with NMIs blocked.
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are watching the assignee of the bug.
Thread overview: 7+ messages
2023-04-06 4:09 [Bug 217304] New: KVM does not handle NMI blocking correctly in nested virtualization bugzilla-daemon
2023-04-06 19:14 ` Sean Christopherson
2023-04-06 19:14 ` [Bug 217304] " bugzilla-daemon
2023-04-12 17:00 ` Sean Christopherson
2023-04-07 20:14 ` bugzilla-daemon
2023-04-12 17:00 ` bugzilla-daemon
2023-04-12 20:50 ` bugzilla-daemon