From: bugzilla-daemon@kernel.org
To: kvm@vger.kernel.org
Subject: [Bug 217304] KVM does not handle NMI blocking correctly in nested virtualization
Date: Wed, 12 Apr 2023 20:50:24 +0000
Message-ID: <bug-217304-28872-Omt4TdrpiW@https.bugzilla.kernel.org/>
In-Reply-To: <bug-217304-28872@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=217304
--- Comment #4 from Eric Li (lixiaoyi13691419520@gmail.com) ---
On Wed, 2023-04-12 at 17:00 +0000, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=217304
>
> --- Comment #3 from Sean Christopherson (seanjc@google.com) ---
> On Thu, Apr 06, 2023, bugzilla-daemon@kernel.org wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=217304
> >
> > --- Comment #1 from Sean Christopherson (seanjc@google.com) ---
> > On Thu, Apr 06, 2023, bugzilla-daemon@kernel.org wrote:
> > > Assume KVM runs in L0, LHV runs in L1, and the nested guest runs in L2.
> > >
> > > The code in LHV performs an experiment (called "Experiment 13" in the
> > > serial output) on CPU 0 to test the behavior of NMI blocking. The
> > > experiment steps are:
> > > 1. Prepare state such that the CPU is currently in L1 (LHV), and NMIs
> > >    are blocked
> > > 2. Modify VMCS12 to make sure that L2 has virtual NMIs enabled
> > >    (NMI exiting = 1, Virtual NMIs = 1), and L2 does not block NMIs
> > >    (Blocking by NMI = 0)
> > > 3. VM entry to L2
> > > 4. L2 executes VMCALL, causing a VM exit to L1
> > > 5. L1 checks whether NMIs are blocked.
> > >
> > > The expected behavior is that NMIs should be unblocked, which is
> > > reproduced on real hardware. According to the Intel SDM, NMIs should be
> > > unblocked after VM entry to L2 (step 3). After VM exit to L1 (step 4),
> > > NMI blocking does not change, so NMIs are still unblocked.
> > >
> > > However, when running on KVM, the experiment shows that at step 5, NMIs
> > > are blocked in L1. Thus, I think NMI blocking is not implemented
> > > correctly in KVM's nested virtualization.
> >
> > Ya, KVM blocks NMIs on nested NMI VM-Exits, but doesn't unblock NMIs for
> > all other exit types. I believe this is the fix (untested):
> >
> > ---
> > arch/x86/kvm/vmx/nested.c | 12 +++++++-----
> > 1 file changed, 7 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index 96ede74a6067..4240a052628a 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -4164,12 +4164,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
> > nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
> > NMI_VECTOR | INTR_TYPE_NMI_INTR |
> > INTR_INFO_VALID_MASK, 0);
> > - /*
> > - * The NMI-triggered VM exit counts as injection:
> > - * clear this one and block further NMIs.
> > - */
> > vcpu->arch.nmi_pending = 0;
> > - vmx_set_nmi_mask(vcpu, true);
> > return 0;
> > }
> >
> > @@ -4865,6 +4860,13 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
> > INTR_INFO_VALID_MASK |
> > INTR_TYPE_EXT_INTR;
> > }
> >
> > + /*
> > + * NMIs are blocked on VM-Exit due to NMI, and unblocked by all
> > + * other VM-Exit types.
> > + */
> > + vmx_set_nmi_mask(vcpu, (u16)vm_exit_reason == EXIT_REASON_EXCEPTION_NMI &&
> > + !is_nmi(vmcs12->vm_exit_intr_info));
>
> Ugh, this is wrong. As Eric stated in the bug report, and per section
> "27.5.5 Updating Non-Register State", VM-Exit does *not* affect NMI
> blocking except if the VM-Exit is directly due to an NMI:
>
>   Event blocking is affected as follows:
>   * There is no blocking by STI or by MOV SS after a VM exit.
>   * VM exits caused directly by non-maskable interrupts (NMIs) cause
>     blocking by NMI (see Table 24-3). Other VM exits do not affect
>     blocking by NMI. (See Section 27.1 for the case in which an NMI
>     causes a VM exit indirectly.)
>
Correct. In my experiment, NMIs are unblocked at VM entry. The VM exit does
not change NMI blocking (i.e. NMIs remain unblocked).
> The scenario here is that virtual NMIs are enabled, in which case
> VM-Enter, not VM-Exit, effectively clears NMI blocking. From "26.7.1
> Interruptibility State":
>
>   The blocking of non-maskable interrupts (NMIs) is determined as follows:
>   * If the "virtual NMIs" VM-execution control is 0, NMIs are blocked if
>     and only if bit 3 (blocking by NMI) in the interruptibility-state
>     field is 1. If the "NMI exiting" VM-execution control is 0, execution
>     of the IRET instruction removes this blocking (even if the
>     instruction generates a fault). If the "NMI exiting" control is 1,
>     IRET does not affect this blocking.
>   * The following items describe the use of bit 3 (blocking by NMI) in
>     the interruptibility-state field if the "virtual NMIs" VM-execution
>     control is 1:
>     * The bit’s value does not affect the blocking of NMIs after VM
>       entry. NMIs are not blocked in VMX non-root operation (except for
>       ordinary blocking for other reasons, such as by the MOV SS
>       instruction, the wait-for-SIPI state, etc.)
>     * The bit’s value determines whether there is virtual-NMI blocking
>       after VM entry. If the bit is 1, virtual-NMI blocking is in effect
>       after VM entry. If the bit is 0, there is no virtual-NMI blocking
>       after VM entry unless the VM entry is injecting an NMI (see Section
>       26.6.1.1). Execution of IRET removes virtual-NMI blocking (even if
>       the instruction generates a fault).
>
> I.e. forcing NMIs to be unblocked is wrong when virtual NMIs are disabled.
>
> Unfortunately, that means fixing this will require a much more involved
> patch (series?), e.g. KVM can't modify NMI blocking until the VM-Enter is
> successful, at which point vmcs02, not vmcs01, is loaded, and so KVM will
> likely need to track NMI blocking in a software variable. That in turn
> gets complicated by the !vNMI case, because then KVM needs to propagate
> NMI blocking between vmcs01, vmcs12, and vmcs02. Blech.
>
Yes, handling NMIs perfectly in nested virtualization may be complicated.
There are many corner cases to think about (e.g. the priority between an
NMI-window VM exit and an NMI).
> I'm going to punt fixing this due to lack of bandwidth, and AFAIK lack of
> a use case beyond testing. Hopefully I'll be able to revisit this in a few
> weeks, but that might be wishful thinking.
>
I agree. This case probably only appears in testing. I can't think of a
good reason for a hypervisor to perform a VM entry with NMIs blocked.