From: Sean Christopherson <seanjc@google.com>
To: "Li, Xin3" <xin3.li@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Paolo Bonzini <pbonzini@redhat.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"x86@kernel.org" <x86@kernel.org>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"tglx@linutronix.de" <tglx@linutronix.de>,
"mingo@redhat.com" <mingo@redhat.com>,
"bp@alien8.de" <bp@alien8.de>,
"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
"hpa@zytor.com" <hpa@zytor.com>,
"Tian, Kevin" <kevin.tian@intel.com>
Subject: Re: [RESEND PATCH 5/6] KVM: x86/VMX: add kvm_vmx_reinject_nmi_irq() for NMI/IRQ reinjection
Date: Thu, 17 Nov 2022 15:51:36 +0000 [thread overview]
Message-ID: <Y3ZYiKbJacmejY3K@google.com> (raw)
In-Reply-To: <DM5PR1101MB2172D7D7BC49255DB3752802A8069@DM5PR1101MB2172.namprd11.prod.outlook.com>
On Thu, Nov 17, 2022, Li, Xin3 wrote:
>
> > > > > > But what about NMIs, afaict this is all horribly broken for NMIs.
> > > > > > So the whole VMX thing latches the NMI (which stops NMI
> > > > > > recursion), right?
> > > > > >
> > > > > > But then you drop out of noinstr code, which means any random
> > > > > > exception can happen (kprobes #BP, hw_breakpoint #DB, or even
> > > > > > #PF due to random nonsense like *SAN). This exception will do
> > > > > > IRET and clear the NMI latch, all before you get to run any of the
> > > > > > NMI code.
> > > > >
> > > > > What you said here implies that we have this problem in the existing code.
> > > > > Because a fake iret stack is created to call the NMI handler in
> > > > > the IDT NMI descriptor, which lastly executes the IRET instruction.
> > > >
> > > > I can't follow; of course the IDT handler terminates with IRET, it has to no?
> > > >
> > > > And yes, the current code appears to suffer the same defect.
That defect isn't going to be fixed simply by changing how KVM forwards NMIs
though. IIUC, _everything_ between VM-Exit and the invocation of the NMI handler
needs to be noinstr. On VM-Exit due to NMI, NMIs are blocked. If a #BP/#DB/#PF
occurs before KVM gets to kvm_x86_handle_exit_irqoff(), the subsequent IRET will
unblock NMIs before the original NMI is serviced, i.e. a second NMI could come in
at anytime regardless of how KVM forwards the NMI to the kernel.
Is there any way to solve this without tagging everything noinstr? There is a
metric shit ton of code between VM-Exit and the handling of NMIs, and much of that
code is common helpers. It might be possible to hoist NMI handler much earlier,
though we'd need to do a super thorough audit to ensure all necessary host state
is restored.
> > > With FRED, ERETS/ERETU replace IRET, and use bit 28 of the popped CS
> > > field to control whether to unblock NMI. If bit 28 of the field (above
> > > the selector) is 1, ERETS/ERETU unblocks NMIs.
Side topic, there's a bug in the ISE docs. Section "9.4.2 NMI Blocking" states
that bit 16 holds the "unblock NMI" magic, which I'm guessing is a holdover from
an earlier revision of FRED.
As specified in Section 6.1.3 and Section 6.2.3, ERETS and ERETU each unblocks NMIs
if bit 16 of the popped CS field is 1. The following items detail how this behavior may be
changed in VMX non-root operation, depending on the settings of certain VM-execution
controls:
> > Yes, I know that. It is one of the many improvements FRED brings.
> > Ideally the IBT WAIT-FOR-ENDBR state also gets squirreled away in the
> > hardware exception frame, but that's still up in the air I believe :/
> >
> > Anyway.. given there is interrupt priority and NMI is pretty much on top of
> > everything else the reinject crap *should* run NMI first. That way NMI runs
> > with the latch disabled and whatever other pending interrupts will run later.
> >
> > But that all is still broken because afaict the current code also leaves noinstr --
> > and once you leave noinstr (or use a static_key, static_call or anything else that
> > *can* change at runtime) you can't guarantee nothing.
>
> For NMI, HPA asked me to use "int $2", as it switches to the NMI IST stack to
> execute the NMI handler, essentially like how HW deals with a NMI in host. and
> I tested it with NMI watchdog, it looks working fine.
Heh, well yeah, because that's how KVM used to handle NMIs back before I reworked
NMI handling to use the direct call method. Ironically, that original change was
done in part to try and make it _easier_ to deal with FRED (back before FRED was
publicly disclosed).
If KVM reverts to INTn, the fix to route KVM=>NMI through the non-IST entry can
be reverted too.
a217a6593cec ("KVM/VMX: Invoke NMI non-IST entry instead of IST entry")
1a5488ef0dcf ("KVM: VMX: Invoke NMI handler via indirect call instead of INTn")
next prev parent reply other threads:[~2022-11-17 15:51 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-10 6:15 [RESEND PATCH 0/6] x86/traps,VMX: implement software based NMI/IRQ dispatch for VMX NMI/IRQ reinjection Xin Li
2022-11-10 6:15 ` [RESEND PATCH 1/6] x86/traps: let common_interrupt() handle IRQ_MOVE_CLEANUP_VECTOR Xin Li
2022-11-10 18:59 ` Ashok Raj
2022-11-10 22:09 ` Li, Xin3
2022-11-10 6:15 ` [RESEND PATCH 2/6] x86/traps: add a system interrupt table for system interrupt dispatch Xin Li
2022-11-10 8:56 ` Peter Zijlstra
2022-11-10 19:55 ` Li, Xin3
2022-11-10 20:36 ` Li, Xin3
2022-11-10 21:12 ` Nathan Chancellor
2022-11-10 23:00 ` Li, Xin3
2022-11-11 0:08 ` Nathan Chancellor
2022-11-11 3:03 ` Li, Xin3
2022-11-11 8:58 ` Peter Zijlstra
2022-11-11 1:12 ` Tian, Kevin
2022-11-11 3:54 ` Li, Xin3
2022-11-11 8:55 ` Peter Zijlstra
2022-11-11 22:07 ` H. Peter Anvin
2022-11-12 9:47 ` Peter Zijlstra
2022-11-10 6:15 ` [RESEND PATCH 3/6] x86/traps: add install_system_interrupt_handler() Xin Li
2022-11-10 6:15 ` [RESEND PATCH 4/6] x86/traps: add external_interrupt() to dispatch external interrupts Xin Li
2022-11-10 16:24 ` Sean Christopherson
2022-11-10 18:02 ` Li, Xin3
2022-11-10 20:10 ` Sean Christopherson
2022-11-10 6:15 ` [RESEND PATCH 5/6] KVM: x86/VMX: add kvm_vmx_reinject_nmi_irq() for NMI/IRQ reinjection Xin Li
2022-11-10 9:03 ` Peter Zijlstra
2022-11-10 20:53 ` Li, Xin3
2022-11-11 9:15 ` Peter Zijlstra
2022-11-11 12:04 ` Paolo Bonzini
2022-11-11 12:19 ` Peter Zijlstra
2022-11-11 12:48 ` Paolo Bonzini
2022-11-11 14:23 ` Peter Zijlstra
2022-11-11 16:35 ` Andrew Cooper
2022-11-11 22:22 ` H. Peter Anvin
2022-11-12 0:08 ` Andrew Cooper
2022-11-11 18:06 ` Li, Xin3
2022-11-11 19:33 ` Peter Zijlstra
2022-11-12 6:35 ` Li, Xin3
2022-11-14 4:39 ` Li, Xin3
2022-11-14 9:08 ` Peter Zijlstra
2022-11-15 7:50 ` Li, Xin3
2022-11-15 9:17 ` Peter Zijlstra
2022-11-17 3:37 ` Li, Xin3
2022-11-17 15:51 ` Sean Christopherson [this message]
2022-11-18 0:05 ` Li, Xin3
2022-11-22 13:00 ` Li, Xin3
2022-11-22 20:52 ` Sean Christopherson
2022-11-23 8:31 ` Li, Xin3
2022-11-23 20:42 ` Sean Christopherson
2022-11-24 3:40 ` Li, Xin3
2022-11-28 16:26 ` Sean Christopherson
2022-11-24 9:46 ` Peter Zijlstra
2022-11-28 19:05 ` Sean Christopherson
2022-11-23 9:16 ` Peter Zijlstra
2022-11-23 19:18 ` Sean Christopherson
2022-11-11 22:15 ` H. Peter Anvin
2022-11-10 6:15 ` [RESEND PATCH 6/6] x86/traps: remove unused NMI entry exc_nmi_noist() Xin Li
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y3ZYiKbJacmejY3K@google.com \
--to=seanjc@google.com \
--cc=bp@alien8.de \
--cc=dave.hansen@linux.intel.com \
--cc=hpa@zytor.com \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=pbonzini@redhat.com \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
--cc=xin3.li@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).