From: Sean Christopherson <seanjc@google.com>
To: Thomas Gleixner <tglx@kernel.org>
Cc: Jim Mattson <jmattson@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	 Binbin Wu <binbin.wu@linux.intel.com>,
	Vishal L Verma <vishal.l.verma@intel.com>,
	 "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Rick P Edgecombe <rick.p.edgecombe@intel.com>,
	 Binbin Wu <binbin.wu@intel.com>,
	"x86@kernel.org" <x86@kernel.org>,
	Paolo Bonzini <bonzini@redhat.com>
Subject: Re: CPU Lockups in KVM with deferred hrtimer rearming
Date: Tue, 21 Apr 2026 15:07:30 -0700
Message-ID: <aef1ItiR3bEYDkWH@google.com>
In-Reply-To: <87zf2w9e78.ffs@tglx>

On Tue, Apr 21, 2026, Thomas Gleixner wrote:
> On Tue, Apr 21 2026 at 11:55, Sean Christopherson wrote:
> > On Tue, Apr 21, 2026, Thomas Gleixner wrote:
> >> >> Looks like. It will take the interrupt after local_irq_enable().
> > VM_EXIT_ACK_INTR_ON_EXIT also provides symmetry with Intel's handling of NMIs, as
> > NMIs are unconditionally "acked" on VM-Exit.
> 
> What's the exact point you are trying to make?

That no matter what we do for IRQs, KVM needs a direct call into the kernel to
handle an asynchronous event that arrived in the past.

> The symmetry is a cosmetic nice to have bullet point, but neither a
> functional nor a correctness requirement. The fact that hardware people
> provided something which looks "useful" at the first glance does not
> make it so.
> 
> > Even if performance is "fine", changing decades of fundamental KVM behavior is
> > terrifying.
> 
> It worked perfectly fine before this was introduced in commit
> a547c6db4d2f ("KVM: VMX: Enable acknowledge interupt on vmexit") in 2013.

Yes, but that configuration hasn't been tested (by KVM) on any CPU released in
the last decade+.  That's what scares me.  Do I think it's at all likely that
there's a lurking ucode bug?  No.  But the risk vs. reward isn't there for me.

But as Paolo pointed out, the "killer" feature gated by ACK-on-exit is posted
interrupts, and _that_ provides a massive performance win.

> > IMO, that's the way to go.  But instead of exporting __hrtimer_rearm_deferred(),
> > move vmx_do_nmi_irqoff() and vmx_do_interrupt_irqoff() into core kernel entry code
> 
> Surely not into core kernel entry code as this is x86 specific hackery.

Oh come on.  I have a hard time believing that you really truly thought that's
what I was suggesting.

> > (along with the assembly glue), and then EXPORT_SYMBOL_FOR_KVM those.  It'd mean
> > some extra surgery, e.g. to provide an equivalent to KVM's IDT lookup:
> >
> > 	gate_offset((gate_desc *)host_idt_base + vector)
> >
> > But I suspect it would be a big net positive in the end.  E.g. the entry code
> > would *know* it's dealing with a direct call from KVM, and thus shouldn't need
> > to play pt_regs games.
> 
> As this is x86 specific the generic entry code knows absolutely nothing
> unless there is a magic indicator like PeterZ's hack or yet another
> duplicated version of the irqentry_exit() code just to accommodate KVM
> for handwaving reasons.
> 
> As Peter and myself pointed out before this will also not solve the
> problem that due to that KVM won't be able to benefit from the recent
> hrtimer/hrtick improvements on VMX(TDX) hosts.

Sorry, you lost me here.  What's the TDX angle?  Or are you just saying that VMX
is currently hosed with the deferred rearm?

> To be entirely clear: We are not going to disable HRTICK for the benefit
> of this dubious "decades old performance" hack.

No one suggested that.

> > Actually, even better would be to bury the FRED vs. not-FRED details in entry
> > code.  E.g. on the KVM invocation side, we could get to something like the below,
> > and I'm pretty sure _reduce_ the number of for-KVM exports in the
> > process.
> 
> That's an orthogonal issue. The problem at hand is independent of FRED
> or not-FRED as both end up providing a pt_regs frame with eflags.IF = 0.

Eh, not if it gives us a clean, maintainable solution for the problem.
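
For readers unfamiliar with the IDT lookup quoted earlier in this message
(gate_offset((gate_desc *)host_idt_base + vector)), the following is a minimal
userspace sketch of what such a lookup computes. It assumes the architectural
layout of a 64-bit mode IDT gate (16 bytes, handler address split into three
fields). All names ending in _sketch or starting with sketch_ are hypothetical
and exist only for illustration; the kernel's own gate_desc type and
gate_offset() helper may differ in field naming and details.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative stand-in for a 64-bit mode IDT gate descriptor.
 * The handler address is scattered across three fields.
 */
struct gate_desc_sketch {
	uint16_t offset_low;	/* handler address bits 15:0  */
	uint16_t segment;	/* code segment selector      */
	uint16_t bits;		/* IST, type, DPL, present    */
	uint16_t offset_middle;	/* handler address bits 31:16 */
	uint32_t offset_high;	/* handler address bits 63:32 */
	uint32_t reserved;
};

/* Reassemble the 64-bit handler address from the three offset fields. */
uint64_t sketch_gate_offset(const struct gate_desc_sketch *g)
{
	return (uint64_t)g->offset_low |
	       ((uint64_t)g->offset_middle << 16) |
	       ((uint64_t)g->offset_high << 32);
}

/* Hypothetical helper: store a handler address into a gate (test setup only). */
void sketch_set_gate(struct gate_desc_sketch *g, uint64_t handler)
{
	g->offset_low    = (uint16_t)(handler & 0xffff);
	g->offset_middle = (uint16_t)((handler >> 16) & 0xffff);
	g->offset_high   = (uint32_t)(handler >> 32);
}

/*
 * Equivalent of the quoted expression: index the table by vector,
 * then extract the handler address from that gate.
 */
uint64_t sketch_handler_for_vector(const struct gate_desc_sketch *idt_base,
				   unsigned int vector)
{
	assert(vector < 256);	/* x86 has 256 interrupt vectors */
	return sketch_gate_offset(idt_base + vector);
}
```

The pointer arithmetic idt_base + vector advances in units of one whole gate,
which is why the quoted expression casts host_idt_base to a gate pointer
before adding the vector.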


Thread overview: 48+ messages
2026-04-16 20:50 CPU Lockups in KVM with deferred hrtimer rearming Verma, Vishal L
2026-04-20 15:00 ` Thomas Gleixner
2026-04-20 15:22   ` Thomas Gleixner
2026-04-20 20:57   ` Verma, Vishal L
2026-04-20 22:19     ` Thomas Gleixner
2026-04-20 22:24       ` Verma, Vishal L
2026-04-21  6:29         ` Thomas Gleixner
2026-04-21  4:51   ` Binbin Wu
2026-04-21  7:39     ` Thomas Gleixner
2026-04-21 11:18       ` Peter Zijlstra
2026-04-21 11:32         ` Peter Zijlstra
2026-04-21 11:34           ` Peter Zijlstra
2026-04-21 11:49             ` Peter Zijlstra
2026-04-21 12:05               ` Peter Zijlstra
2026-04-21 13:19                 ` Peter Zijlstra
2026-04-21 13:29                   ` Peter Zijlstra
2026-04-21 16:36                     ` Thomas Gleixner
2026-04-21 18:11                     ` Verma, Vishal L
2026-04-21 17:11               ` Thomas Gleixner
2026-04-21 17:20                 ` Jim Mattson
2026-04-21 18:29                   ` Thomas Gleixner
2026-04-21 18:55                     ` Sean Christopherson
2026-04-21 20:06                       ` Peter Zijlstra
2026-04-21 20:46                         ` Peter Zijlstra
2026-04-21 20:57                         ` Sean Christopherson
2026-04-21 21:02                           ` Peter Zijlstra
2026-04-21 21:42                             ` Sean Christopherson
2026-04-22  6:55                               ` Peter Zijlstra
2026-04-22  7:46                                 ` Peter Zijlstra
2026-04-22 14:08                                   ` Peter Zijlstra
2026-04-22 15:26                                     ` Sean Christopherson
2026-04-22 19:13                                   ` Verma, Vishal L
2026-04-22 22:57                                   ` Thomas Gleixner
2026-04-23 15:23                                     ` Peter Zijlstra
2026-04-22 13:47                                 ` Sean Christopherson
2026-04-21 20:39                       ` Paolo Bonzini
2026-04-21 21:02                         ` Sean Christopherson
2026-04-21 22:48                         ` Thomas Gleixner
2026-04-21 23:15                           ` Paolo Bonzini
2026-04-21 23:34                             ` Jim Mattson
2026-04-21 23:37                               ` Paolo Bonzini
2026-04-22  2:10                             ` Thomas Gleixner
2026-04-21 21:49                       ` Thomas Gleixner
2026-04-21 22:07                         ` Sean Christopherson [this message]
2026-04-21 22:24                         ` Paolo Bonzini
2026-04-21 19:18                 ` Verma, Vishal L
2026-04-21 16:30           ` Thomas Gleixner
2026-04-21 16:11       ` Verma, Vishal L
