From: bugzilla-daemon@kernel.org
To: kvm@vger.kernel.org
Subject: [Bug 217304] KVM does not handle NMI blocking correctly in nested virtualization
Date: Wed, 12 Apr 2023 20:50:24 +0000 [thread overview]
Message-ID: <bug-217304-28872-Omt4TdrpiW@https.bugzilla.kernel.org/> (raw)
In-Reply-To: <bug-217304-28872@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=217304
--- Comment #4 from Eric Li (lixiaoyi13691419520@gmail.com) ---
On Wed, 2023-04-12 at 17:00 +0000, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=217304
>
> --- Comment #3 from Sean Christopherson (seanjc@google.com) ---
> On Thu, Apr 06, 2023, bugzilla-daemon@kernel.org wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=217304
> >
> > --- Comment #1 from Sean Christopherson (seanjc@google.com) ---
> > On Thu, Apr 06, 2023, bugzilla-daemon@kernel.org wrote:
> > > Assume KVM runs in L0, LHV runs in L1, and the nested guest runs in L2.
> > >
> > > The code in LHV performs an experiment (called "Experiment 13" in serial
> > > output) on CPU 0 to test the behavior of NMI blocking. The experiment
> > > steps are:
> > > 1. Prepare state such that the CPU is currently in L1 (LHV) and NMIs
> > >    are blocked.
> > > 2. Modify VMCS12 to make sure that L2 has virtual NMIs enabled
> > >    (NMI exiting = 1, Virtual NMIs = 1) and that L2 does not block NMIs
> > >    (Blocking by NMI = 0).
> > > 3. VM entry to L2.
> > > 4. L2 performs VMCALL, causing a VM exit to L1.
> > > 5. L1 checks whether NMIs are blocked.
> > >
> > > The expected behavior is that NMIs should be unblocked, which is
> > > reproduced on real hardware. According to the Intel SDM, NMIs should
> > > be unblocked after VM entry to L2 (step 3). After the VM exit to L1
> > > (step 4), NMI blocking does not change, so NMIs are still unblocked.
> > > This behavior is reproducible on real hardware.
> > >
> > > However, when running on KVM, the experiment shows that at step 5,
> > > NMIs are blocked in L1. Thus, I think NMI blocking is not implemented
> > > correctly in KVM's nested virtualization.
> >
> > Ya, KVM blocks NMIs on nested NMI VM-Exits, but doesn't unblock NMIs
> > for all other exit types. I believe this is the fix (untested):
> >
> > ---
> >  arch/x86/kvm/vmx/nested.c | 12 +++++++-----
> >  1 file changed, 7 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index 96ede74a6067..4240a052628a 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -4164,12 +4164,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
> >  		nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
> >  				  NMI_VECTOR | INTR_TYPE_NMI_INTR |
> >  				  INTR_INFO_VALID_MASK, 0);
> > -		/*
> > -		 * The NMI-triggered VM exit counts as injection:
> > -		 * clear this one and block further NMIs.
> > -		 */
> >  		vcpu->arch.nmi_pending = 0;
> > -		vmx_set_nmi_mask(vcpu, true);
> >  		return 0;
> >  	}
> >
> > @@ -4865,6 +4860,13 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
> >  			INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
> >  	}
> >
> > +	/*
> > +	 * NMIs are blocked on VM-Exit due to NMI, and unblocked by all
> > +	 * other VM-Exit types.
> > +	 */
> > +	vmx_set_nmi_mask(vcpu,
> > +			 (u16)vm_exit_reason == EXIT_REASON_EXCEPTION_NMI &&
> > +			 !is_nmi(vmcs12->vm_exit_intr_info));
>
> Ugh, this is wrong. As Eric stated in the bug report, and per section
> "27.5.5 Updating Non-Register State", VM-Exit does *not* affect NMI
> blocking except if the VM-Exit is directly due to an NMI:
>
>   Event blocking is affected as follows:
>   * There is no blocking by STI or by MOV SS after a VM exit.
>   * VM exits caused directly by non-maskable interrupts (NMIs) cause
>     blocking by NMI (see Table 24-3). Other VM exits do not affect
>     blocking by NMI. (See Section 27.1 for the case in which an NMI
>     causes a VM exit indirectly.)
>
Correct. In my experiment, NMIs are unblocked after VM entry, and VM exit
does not change NMI blocking (i.e. they remain unblocked).
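
The expected sequence can be modeled as two tiny transition functions (an
illustration of the SDM rules for the "virtual NMIs" = 1 case used in the
experiment, not KVM code):

```c
#include <stdbool.h>

/* Step 3: with virtual NMIs enabled, VM entry leaves NMIs unblocked. */
bool nmi_blocked_after_vmentry(bool blocked, bool virtual_nmis)
{
	return virtual_nmis ? false : blocked;
}

/* Step 4: only a VM exit caused directly by an NMI sets blocking by NMI. */
bool nmi_blocked_after_vmexit(bool blocked, bool exit_due_to_nmi)
{
	return exit_due_to_nmi ? true : blocked;
}
```

Chaining the steps: NMIs start blocked in L1 (step 1), the VM entry
(step 3) unblocks them, the VMCALL exit (step 4) leaves blocking
unchanged, so step 5 should observe NMIs unblocked.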
> The scenario here is that virtual NMIs are enabled, in which case
> VM-Enter, not VM-Exit, effectively clears NMI blocking. From "26.7.1
> Interruptibility State":
>
>   The blocking of non-maskable interrupts (NMIs) is determined as
>   follows:
>   * If the "virtual NMIs" VM-execution control is 0, NMIs are blocked if
>     and only if bit 3 (blocking by NMI) in the interruptibility-state
>     field is 1. If the "NMI exiting" VM-execution control is 0,
>     execution of the IRET instruction removes this blocking (even if the
>     instruction generates a fault). If the "NMI exiting" control is 1,
>     IRET does not affect this blocking.
>   * The following items describe the use of bit 3 (blocking by NMI) in
>     the interruptibility-state field if the "virtual NMIs" VM-execution
>     control is 1:
>     * The bit's value does not affect the blocking of NMIs after VM
>       entry. NMIs are not blocked in VMX non-root operation (except for
>       ordinary blocking for other reasons, such as by the MOV SS
>       instruction, the wait-for-SIPI state, etc.)
>     * The bit's value determines whether there is virtual-NMI blocking
>       after VM entry. If the bit is 1, virtual-NMI blocking is in effect
>       after VM entry. If the bit is 0, there is no virtual-NMI blocking
>       after VM entry unless the VM entry is injecting an NMI (see
>       Section 26.6.1.1). Execution of IRET removes virtual-NMI blocking
>       (even if the instruction generates a fault).
>
> I.e. forcing NMIs to be unblocked is wrong when virtual NMIs are
> disabled.
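
For what it's worth, the two bullets can be condensed into a single
predicate for real-NMI blocking after VM entry (a sketch with
illustrative parameter names, not KVM code):

```c
#include <stdbool.h>

bool nmis_blocked_after_vmentry(bool virtual_nmis, bool bit3_blocking_by_nmi)
{
	/*
	 * !vNMI: bit 3 (blocking by NMI) of the guest interruptibility
	 * state directly controls real NMI blocking, so VM entry must
	 * preserve it.
	 */
	if (!virtual_nmis)
		return bit3_blocking_by_nmi;

	/*
	 * vNMI: bit 3 only controls virtual-NMI blocking; real NMIs are
	 * not blocked in VMX non-root operation.
	 */
	return false;
}
```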
>
> Unfortunately, that means fixing this will require a much more involved
> patch (series?), e.g. KVM can't modify NMI blocking until the VM-Enter
> is successful, at which point vmcs02, not vmcs01, is loaded, and so KVM
> will likely need to track NMI blocking in a software variable. That in
> turn gets complicated by the !vNMI case, because then KVM needs to
> propagate NMI blocking between vmcs01, vmcs12, and vmcs02. Blech.
>
Yes, handling NMIs perfectly in nested virtualization may be complicated.
There are many strange cases to think about (e.g. the priority between
NMI-window VM exits and NMI interrupts).
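
To make the "software variable" idea above concrete, a rough sketch of
the two update points (a hypothetical per-vCPU field and helpers, nothing
that exists in KVM today):

```c
#include <stdbool.h>

/* Hypothetical per-vCPU state; KVM has no such field today. */
struct sw_nmi_state {
	bool nmi_blocked;	/* architectural NMI blocking, tracked in software */
};

/*
 * Applied only once the nested VM entry has succeeded: with "virtual
 * NMIs" = 1, NMIs become unblocked in non-root operation; with vNMI
 * disabled, blocking follows bit 3 of the vmcs12 guest
 * interruptibility state.
 */
void sw_nmi_on_nested_vmentry(struct sw_nmi_state *s, bool virtual_nmis,
			      bool vmcs12_blocking_by_nmi)
{
	s->nmi_blocked = virtual_nmis ? false : vmcs12_blocking_by_nmi;
}

/* On nested VM exit, only an exit caused directly by an NMI sets blocking. */
void sw_nmi_on_nested_vmexit(struct sw_nmi_state *s, bool exit_due_to_nmi)
{
	if (exit_due_to_nmi)
		s->nmi_blocked = true;
}
```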
> I'm going to punt fixing this due to lack of bandwidth, and AFAIK lack
> of a use case beyond testing. Hopefully I'll be able to revisit this in
> a few weeks, but that might be wishful thinking.
>
I agree. This case probably only appears in testing. I can't think of a
realistic reason for a hypervisor to perform a VM entry with NMIs
blocked.
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are watching the assignee of the bug.