Kernel KVM virtualization development
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Yosry Ahmed <yosry@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	kvm@vger.kernel.org,  linux-kernel@vger.kernel.org,
	Kai Huang <kai.huang@intel.com>
Subject: Re: [PATCH v4 26/30] KVM: x86: Don't treat interrupts as allowed just because a nested run is pending
Date: Tue, 16 Jun 2026 10:46:31 -0700	[thread overview]
Message-ID: <ajGL95j_Z3ynCUAy@google.com> (raw)
In-Reply-To: <CAO9r8zNFk287zeB+0nrUvcCqpn-wi3CZo6t-9COQGkdd3POzMg@mail.gmail.com>

On Mon, Jun 15, 2026, Yosry Ahmed wrote:
> > > The code makes sense to me but I am trying to make sense of the changelog.
> >
> > What part (parts?) is confusing?  Honest question.  I'm trying to reword the
> > changelog to make it "better", but I'm failing miserable because I don't know
> > what's wrong :-)
> 
> 1. For kvm_vcpu_has_events() being unaffected, the explanation in
> paragraph #3 is focused on the code path from nested_vmx_run() ->
> kvm_emulate_halt_noskip(). I don't immediately see how
> kvm_arch_vcpu_runnable() is unaffected.

To reach kvm_vcpu_has_events(), kvm_vcpu_running() needs to return false.  For
that to happen, vcpu->arch.mp_state needs to be something other than RUNNABLE.

If nested_run_pending is true, then mp_state *must* be RUNNABLE (barring bugs or
stupid userspace), because KVM shouldn't emulate VMRUN/VMLAUNCH/VMRESUME while
the vCPU is !RUNNABLE.  

I didn't include that in the changelog because I thought it was obvious, but
obviously (LOL) not :-D

I called out the GUEST_ACTIVITY_HLT case because (to me) that is less obvious.

> 2. More importantly, paragraphs #3 and #4 read like this patch would
> regress kvm_vcpu_ready_for_interrupt_injection() and
> kvm_vcpu_has_events() if it affected them. Maybe clearly state that
> this patch is the right thing to do for these 2 functions as well, but
> they are more-or-less unaffected by the bug anyway? For
> kvm_vcpu_ready_for_interrupt_injection(), maybe just make it more
> clear in paragraph #4 that it currently incorrectly treats interrupts
> as allowed in the problematic scenario, but it is not a problem
> because ..., and it only results in a spurious exit to userspace (or
> not even that?).

Is this better?

  When querying whether or not interrupts (IRQs) are allowed, check for a
  pending nested run _after_ checking whether or not interrupts are blocked.
  If L1 is running L2 _without_ nested_exit_on_intr(), i.e. if L1 IRQs can
  be blocked while running L2, and interrupts will indeed be blocked once the
  nested VM-Enter to L2 is completed, then KVM should treat interrupts as not
  being allowed.
  
  For injection, this avoids an unnecessary (forced) VM-Exit, as KVM can
  immediately request an IRQ window, instead of forcing an exit and _then_
  requesting an IRQ window (because after the forced exit, KVM will see that
  interrupts are blocked).
  
  For non-injection usage, only kvm_vcpu_ready_for_interrupt_injection() is
  affected in practice.  Barring KVM bugs or misbehaving userspace (at which
  point all architectural guarantees are off), kvm_vcpu_has_events() is
  unreachable when a nested run is pending.  To reach kvm_vcpu_has_events(),
  kvm_vcpu_running() needs to return false, i.e. vcpu->arch.mp_state needs
  to be something other than RUNNABLE.  If nested_run_pending is true, then
  mp_state *must* be RUNNABLE (again barring bugs or stupid userspace),
  because KVM shouldn't emulate VMRUN/VMLAUNCH/VMRESUME while the vCPU is
  !RUNNABLE.
  
  The one "near miss" is VMX's GUEST_ACTIVITY_STATE field, which allows L1 to
  put the vCPU into HLT or WFS as part of nested VMLAUNCH/VMRESUME.  However,
  KVM clears nested_run_pending prior to calling kvm_emulate_halt_noskip()
  when putting L2 into HLT via GUEST_ACTIVITY_HLT, and also clears the flag
  before setting mp_state to INIT_RECEIVED.  SVM has no equivalent to
  GUEST_ACTIVITY_STATE.
  
  I.e. the vCPU will always be runnable if a nested run is pending, and thus
  kvm_arch_vcpu_runnable() => kvm_vcpu_has_events() is effectively dead code,
  as is __kvm_emulate_halt() => kvm_vcpu_has_events().  Oh, and TDX doesn't
  support nested VMX.  Similarly, kvm_can_do_async_pf() is unreachable as
  KVM shouldn't be faulting in memory with a pending nested VM-Enter.
  
  As for kvm_vcpu_ready_for_interrupt_injection(), KVM's current behavior of
  incorrectly treating interrupts as being allowed could result in KVM
  prematurely exiting to userspace to accept an ExtINT.  But, KVM will still
  hold/block the ExtINT and request its own IRQ window.  I.e. the net effect
  is more or less the same as the for-injection case, the unnecessary exit
  just happens at a different boundary.

  reply	other threads:[~2026-06-16 17:46 UTC|newest]

Thread overview: 86+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-13  0:02 [PATCH v4 00/30] KVM: x86: x86.{c,h} spring cleaning Sean Christopherson
2026-06-13  0:03 ` [PATCH v4 01/30] KVM: x86: Extract REGS and SREGS runtime sync code to helpers Sean Christopherson
2026-06-15  2:16   ` Huang, Kai
2026-06-15  5:02   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 02/30] KVM: x86: Move get_segment_base() to regs.h, as kvm_get_segment_base() Sean Christopherson
2026-06-15  2:43   ` Huang, Kai
2026-06-15  5:03   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 03/30] KVM: x86: Rename __{g,s}et_sregs2() => kvm_x86_vcpu_ioctl_{g,s}et_sregs2() Sean Christopherson
2026-06-15  2:46   ` Huang, Kai
2026-06-15  5:13   ` Binbin Wu
2026-06-15 15:58     ` Sean Christopherson
2026-06-13  0:03 ` [PATCH v4 04/30] KVM: x86: Move the bulk of register specific code from x86.c to regs.c Sean Christopherson
2026-06-15  5:25   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 05/30] KVM: x86: Move local APIC specific helpers out of asm/kvm_host.h Sean Christopherson
2026-06-15  5:47   ` Binbin Wu
2026-06-15 16:06     ` Sean Christopherson
2026-06-13  0:03 ` [PATCH v4 06/30] KVM: x86: Move kvm_caps and kvm_host_values to asm/kvm_host.h Sean Christopherson
2026-06-13  9:01   ` Xiaoyao Li
2026-06-15  6:49     ` Binbin Wu
2026-06-15 16:24       ` Sean Christopherson
2026-06-16  8:18         ` Xiaoyao Li
2026-06-13  0:03 ` [PATCH v4 07/30] KVM: x86: Swap the include order between x86.h and mmu.h Sean Christopherson
2026-06-15  7:26   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 08/30] KVM: x86: Move tdp_enabled from kvm_host.h to mmu.h Sean Christopherson
2026-06-15  7:33   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 09/30] KVM: x86: Move eager_page_split to mmu.{c,h} Sean Christopherson
2026-06-15  7:49   ` Binbin Wu
2026-06-16 17:19     ` Sean Christopherson
2026-06-13  0:03 ` [PATCH v4 10/30] KVM: x86/hyperv: Eliminate an unnecessary include of x86.h in hyperv.h Sean Christopherson
2026-06-15  7:52   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 11/30] KVM: x86: Move kvm_{load,put}_guest_fpu() to fpu.h Sean Christopherson
2026-06-15  8:13   ` Binbin Wu
2026-06-15 16:31     ` Sean Christopherson
2026-06-13  0:03 ` [PATCH v4 12/30] KVM: x86: Extract get/set MSR (list) ioctl logic to helpers Sean Christopherson
2026-06-15  8:30   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 13/30] KVM: x86: Expose several TSC helpers via x86.h for use by MSR code Sean Christopherson
2026-06-13  0:16   ` sashiko-bot
2026-06-13  0:03 ` [PATCH v4 14/30] KVM: x86: Move the bulk of MSR specific code from x86.c to msrs.{c,h} Sean Christopherson
2026-06-15  9:30   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 15/30] KVM: x86: Move register helper declarations from kvm_host.h => regs.h Sean Christopherson
2026-06-16  2:15   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 16/30] KVM: x86: Move kvm_{g,s}et_segment() to inline helpers in regs.h Sean Christopherson
2026-06-16  2:19   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 17/30] KVM: x86: Move MSR helper declarations from kvm_host.h => msrs.h Sean Christopherson
2026-06-16  2:25   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 18/30] KVM: x86: Move "struct kvm_x86_msr_filter" definition to msrs.c Sean Christopherson
2026-06-15  2:47   ` Huang, Kai
2026-06-16  3:12   ` Binbin Wu
2026-06-16  6:29   ` Binbin Wu
2026-06-16  7:29     ` Huang, Kai
2026-06-16  7:43       ` Binbin Wu
2026-06-16  7:46         ` Huang, Kai
2026-06-16 16:19           ` Sean Christopherson
2026-06-13  0:03 ` [PATCH v4 19/30] KVM: x86/pmu: Move "struct kvm_x86_pmu_event_filter" definition to pmu.c Sean Christopherson
2026-06-15  2:48   ` Huang, Kai
2026-06-16  3:18   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 20/30] KVM: x86: Move MMU helper declarations from kvm_host.h => mmu.h Sean Christopherson
2026-06-16  5:04   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 21/30] KVM: x86: Move LLDT assembly wrappers into VMX Sean Christopherson
2026-06-16  6:40   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 22/30] KVM: x86: Move misc "VALID MASK" defines from kvm_host.h => x86.c Sean Christopherson
2026-06-16  6:46   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 23/30] KVM: x86: Move __kvm_irq_line_state() from kvm_host.h => ioapic.h Sean Christopherson
2026-06-16  6:50   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 24/30] KVM: x86: Move IRQ-related helper declarations from kvm_host.h => irq.h Sean Christopherson
2026-06-15 11:55   ` Huang, Kai
2026-06-16  6:55   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 25/30] KVM: x86: Move kvm_pv_send_ipi() declaration from kvm_host.h => lapic.h Sean Christopherson
2026-06-16  6:58   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 26/30] KVM: x86: Don't treat interrupts as allowed just because a nested run is pending Sean Christopherson
2026-06-15 16:40   ` Yosry Ahmed
2026-06-15 16:43     ` Yosry Ahmed
2026-06-15 17:03       ` Sean Christopherson
2026-06-15 19:37         ` Yosry Ahmed
2026-06-15 17:26     ` Sean Christopherson
2026-06-15 19:48       ` Yosry Ahmed
2026-06-16 17:46         ` Sean Christopherson [this message]
2026-06-16 18:08           ` Yosry Ahmed
2026-06-13  0:03 ` [PATCH v4 27/30] KVM: x86: Rework kvm_arch_interrupt_allowed() into kvm_is_interrupt_allowed() Sean Christopherson
2026-06-13  0:03 ` [PATCH v4 28/30] KVM: x86/mmu: Move kvm_arch_async_page_ready() below kvm_tdp_page_fault() Sean Christopherson
2026-06-16  8:40   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 29/30] KVM: x86/mmu: Move kvm_mmu_do_page_fault() from mmu_internal.h => mmu.c Sean Christopherson
2026-06-16  8:48   ` Binbin Wu
2026-06-13  0:03 ` [PATCH v4 30/30] KVM: x86: Move a pile of stuff from kvm_host.h => x86.h Sean Christopherson
2026-06-15 13:01   ` Huang, Kai
2026-06-15 14:23     ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ajGL95j_Z3ynCUAy@google.com \
    --to=seanjc@google.com \
    --cc=kai.huang@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=vkuznets@redhat.com \
    --cc=yosry@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox