From: Sean Christopherson <seanjc@google.com>
To: Thomas Gleixner <tglx@kernel.org>
Cc: Jim Mattson <jmattson@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	 Binbin Wu <binbin.wu@linux.intel.com>,
	Vishal L Verma <vishal.l.verma@intel.com>,
	 "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Rick P Edgecombe <rick.p.edgecombe@intel.com>,
	 Binbin Wu <binbin.wu@intel.com>,
	"x86@kernel.org" <x86@kernel.org>,
	Paolo Bonzini <bonzini@redhat.com>
Subject: Re: CPU Lockups in KVM with deferred hrtimer rearming
Date: Tue, 21 Apr 2026 15:07:30 -0700
Message-ID: <aef1ItiR3bEYDkWH@google.com>
In-Reply-To: <87zf2w9e78.ffs@tglx>

On Tue, Apr 21, 2026, Thomas Gleixner wrote:
> On Tue, Apr 21 2026 at 11:55, Sean Christopherson wrote:
> > On Tue, Apr 21, 2026, Thomas Gleixner wrote:
> >> >> Looks like. It will take the interrupt after local_irq_enable().
> > VM_EXIT_ACK_INTR_ON_EXIT also provides symmetry with Intel's handling of NMIs, as
> > NMIs are unconditionally "acked" on VM-Exit.
> 
> What's the exact point you are trying to make?

That no matter what we do for IRQs, KVM needs a direct call into the kernel to
handle an asynchronous event that arrived in the past.

> The symmetry is a cosmetic nice to have bullet point, but neither a
> functional nor a correctness requirement. The fact that hardware people
> provided something which looks "useful" at the first glance does not
> make it so.
> 
> > Even if performance is "fine", changing decades of fundamental KVM behavior is
> > terrifying.
> 
> It worked perfectly fine before this was introduced in commit
> a547c6db4d2f ("KVM: VMX: Enable acknowledge interupt on vmexit") in 2013.

Yes, but that configuration hasn't been tested (by KVM) on any CPU released in
the last decade+.  That's what scares me.  Do I think it's at all likely that
there's a lurking ucode bug?  No.  But the risk vs. reward isn't there for me.

But as Paolo pointed out, the "killer" feature gated by ACK-on-exit is posted
interrupts, and _that_ provides a massive performance win.

> > IMO, that's the way to go.  But instead of exporting __hrtimer_rearm_deferred(),
> > move vmx_do_nmi_irqoff() and vmx_do_interrupt_irqoff() into core kernel entry code
> 
> Surely not into core kernel entry code as this is x86 specific hackery.

Oh come on.  I have a hard time believing that you really truly thought that's
what I was suggesting.

> > (along with the assembly glue), and then EXPORT_SYMBOL_FOR_KVM those.  It'd mean
> > some extra surgery, e.g. to provide an equivalent to KVM's IDT lookup:
> >
> > 	gate_offset((gate_desc *)host_idt_base + vector)
> >
> > But I suspect it would be a big net positive in the end.  E.g. the entry code
> > would *know* it's dealing with a direct call from KVM, and thus shouldn't need
> > to play pt_regs games.
> 
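For reference, the IDT lookup quoted above amounts to reassembling a handler
address from the three offset fields of a gate descriptor.  Below is a minimal
user-space sketch, assuming the standard 16-byte x86-64 gate layout.  Only
gate_offset(), gate_desc and host_idt_base come from the quoted expression;
struct idt_gate64, idt_gate64_offset() and idt_handler_for_vector() are
hypothetical names used for illustration and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical user-space model of a 64-bit IDT gate descriptor.
 * The field names are illustrative, not the kernel's gate_desc.
 */
struct idt_gate64 {
	uint16_t offset_low;    /* handler address bits 15:0 */
	uint16_t segment;       /* code segment selector */
	uint16_t bits;          /* IST index, gate type, DPL, present bit */
	uint16_t offset_middle; /* handler address bits 31:16 */
	uint32_t offset_high;   /* handler address bits 63:32 */
	uint32_t reserved;
};

static_assert(sizeof(struct idt_gate64) == 16,
	      "64-bit IDT gates are 16 bytes");

/* Reassemble the handler address from the three offset fields. */
uint64_t idt_gate64_offset(const struct idt_gate64 *g)
{
	return (uint64_t)g->offset_low |
	       ((uint64_t)g->offset_middle << 16) |
	       ((uint64_t)g->offset_high << 32);
}

/* Model of "gate_offset((gate_desc *)host_idt_base + vector)". */
uint64_t idt_handler_for_vector(const struct idt_gate64 *idt_base,
				uint8_t vector)
{
	return idt_gate64_offset(idt_base + vector);
}
```

An entry-code equivalent would presumably not need KVM to cache an IDT base
pointer at all, since the kernel already owns the table being indexed.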
> As this is x86 specific the generic entry code knows absolutely nothing
> unless there is a magic indicator like PeterZ's hack or yet another
> duplicated version of the irqentry_exit() code just to accommodate KVM
> for handwaving reasons.
> 
> As Peter and myself pointed out before this will also not solve the
> problem that due to that KVM won't be able to benefit from the recent
> hrtimer/hrtick improvements on VMX(TDX) hosts.

Sorry, you lost me here.  What's the TDX angle?  Or are you just saying that VMX
is currently hosed with the deferred rearm?

> To be entirely clear: We are not going to disable HRTICK for the benefit
> of this dubious "decades old performance" hack.

No one suggested that.

> > Actually, even better would be to bury the FRED vs. not-FRED details in entry
> > code.  E.g. on the KVM invocation side, we could get to something like the below,
> > and I'm pretty sure _reduce_ the number of for-KVM exports in the
> > process.
> 
> That's an orthogonal issue. The problem at hand is independent of FRED
> or not-FRED as both end up providing a pt_regs frame with eflags.IF = 0.

Eh, not if it gives us a clean, maintainable solution for the problem.
