All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Maxim Levitsky <mlevitsk@redhat.com>,
	Yong He <zhuangel570@gmail.com>,
	Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: [PATCH 6.6 37/82] KVM: x86: Unconditionally set irr_pending when updating APICv state
Date: Wed, 20 Nov 2024 13:56:47 +0100	[thread overview]
Message-ID: <20241120125630.448025207@linuxfoundation.org> (raw)
In-Reply-To: <20241120125629.623666563@linuxfoundation.org>

6.6-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sean Christopherson <seanjc@google.com>

commit d3ddef46f22e8c3124e0df1f325bc6a18dadff39 upstream.

Always set irr_pending (to true) when updating APICv status to fix a bug
where KVM fails to set irr_pending when userspace sets APIC state and
APICv is disabled, which ultimate results in KVM failing to inject the
pending interrupt(s) that userspace stuffed into the vIRR, until another
interrupt happens to be emulated by KVM.

Only the APICv-disabled case is flawed, as KVM forces apic->irr_pending to
be true if APICv is enabled, because not all vIRR updates will be visible
to KVM.

Hit the bug with a big hammer, even though strictly speaking KVM can scan
the vIRR and set/clear irr_pending as appropriate for this specific case.
The bug was introduced by commit 755c2bf87860 ("KVM: x86: lapic: don't
touch irr_pending in kvm_apic_update_apicv when inhibiting it"), which as
the shortlog suggests, deleted code that updated irr_pending.

Before that commit, kvm_apic_update_apicv() did indeed scan the vIRR, with
with the crucial difference that kvm_apic_update_apicv() did the scan even
when APICv was being *disabled*, e.g. due to an AVIC inhibition.

        struct kvm_lapic *apic = vcpu->arch.apic;

        if (vcpu->arch.apicv_active) {
                /* irr_pending is always true when apicv is activated. */
                apic->irr_pending = true;
                apic->isr_count = 1;
        } else {
                apic->irr_pending = (apic_search_irr(apic) != -1);
                apic->isr_count = count_vectors(apic->regs + APIC_ISR);
        }

And _that_ bug (clearing irr_pending) was introduced by commit b26a695a1d78
("kvm: lapic: Introduce APICv update helper function"), prior to which KVM
unconditionally set irr_pending to true in kvm_apic_set_state(), i.e.
assumed that the new virtual APIC state could have a pending IRQ.

Furthermore, in addition to introducing this issue, commit 755c2bf87860
also papered over the underlying bug: KVM doesn't ensure CPUs and devices
see APICv as disabled prior to searching the IRR.  Waiting until KVM
emulates an EOI to update irr_pending "works", but only because KVM won't
emulate EOI until after refresh_apicv_exec_ctrl(), and there are plenty of
memory barriers in between.  I.e. leaving irr_pending set is basically
hacking around bad ordering.

So, effectively revert to the pre-b26a695a1d78 behavior for state restore,
even though it's sub-optimal if no IRQs are pending, in order to provide a
minimal fix, but leave behind a FIXME to document the ugliness.  With luck,
the ordering issue will be fixed and the mess will be cleaned up in the
not-too-distant future.

Fixes: 755c2bf87860 ("KVM: x86: lapic: don't touch irr_pending in kvm_apic_update_apicv when inhibiting it")
Cc: stable@vger.kernel.org
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Reported-by: Yong He <zhuangel570@gmail.com>
Closes: https://lkml.kernel.org/r/20241023124527.1092810-1-alexyonghe%40tencent.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20241106015135.2462147-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kvm/lapic.c |   29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)

--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2603,19 +2603,26 @@ void kvm_apic_update_apicv(struct kvm_vc
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
 
-	if (apic->apicv_active) {
-		/* irr_pending is always true when apicv is activated. */
-		apic->irr_pending = true;
+	/*
+	 * When APICv is enabled, KVM must always search the IRR for a pending
+	 * IRQ, as other vCPUs and devices can set IRR bits even if the vCPU
+	 * isn't running.  If APICv is disabled, KVM _should_ search the IRR
+	 * for a pending IRQ.  But KVM currently doesn't ensure *all* hardware,
+	 * e.g. CPUs and IOMMUs, has seen the change in state, i.e. searching
+	 * the IRR at this time could race with IRQ delivery from hardware that
+	 * still sees APICv as being enabled.
+	 *
+	 * FIXME: Ensure other vCPUs and devices observe the change in APICv
+	 *        state prior to updating KVM's metadata caches, so that KVM
+	 *        can safely search the IRR and set irr_pending accordingly.
+	 */
+	apic->irr_pending = true;
+
+	if (apic->apicv_active)
 		apic->isr_count = 1;
-	} else {
-		/*
-		 * Don't clear irr_pending, searching the IRR can race with
-		 * updates from the CPU as APICv is still active from hardware's
-		 * perspective.  The flag will be cleared as appropriate when
-		 * KVM injects the interrupt.
-		 */
+	else
 		apic->isr_count = count_vectors(apic->regs + APIC_ISR);
-	}
+
 	apic->highest_isr_cache = -1;
 }
 



  parent reply	other threads:[~2024-11-20 12:59 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-20 12:56 [PATCH 6.6 00/82] 6.6.63-rc1 review Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 01/82] netlink: terminate outstanding dump on socket close Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 02/82] sctp: fix possible UAF in sctp_v6_available() Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 03/82] net: vertexcom: mse102x: Fix tx_bytes calculation Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 04/82] drm/rockchip: vop: Fix a dereferenced before check warning Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 05/82] mptcp: error out earlier on disconnect Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 06/82] mptcp: cope racing subflow creation in mptcp_rcv_space_adjust Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 07/82] net/mlx5: fs, lock FTE when checking if active Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 08/82] net/mlx5e: kTLS, Fix incorrect page refcounting Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 09/82] net/mlx5e: clear xdp features on non-uplink representors Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 10/82] net/mlx5e: CT: Fix null-ptr-deref in add rule err flow Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 11/82] virtio/vsock: Fix accept_queue memory leak Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 12/82] Revert "RDMA/core: Fix ENODEV error for iWARP test over vlan" Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 13/82] Bluetooth: hci_core: Fix calling mgmt_device_connected Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 14/82] Bluetooth: btintel: Direct exception event to bluetooth stack Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 15/82] net/sched: cls_u32: replace int refcounts with proper refcounts Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 16/82] net: sched: cls_u32: Fix u32s systematic failure to free IDR entries for hnodes Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 17/82] samples: pktgen: correct dev to DEV Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 18/82] net: stmmac: dwmac-mediatek: Fix inverted handling of mediatek,mac-wol Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 19/82] net: Make copy_safe_from_sockptr() match documentation Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 20/82] net: stmmac: dwmac-intel-plat: use devm_stmmac_probe_config_dt() Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 21/82] net: stmmac: dwmac-visconti: " Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 22/82] net: stmmac: rename stmmac_pltfr_remove_no_dt to stmmac_pltfr_remove Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 23/82] stmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 24/82] net: ti: icssg-prueth: Fix 1 PPS sync Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 25/82] bonding: add ns target multicast address to slave device Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 26/82] ARM: 9419/1: mm: Fix kernel memory mapping for xip kernels Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 27/82] tools/mm: fix compile error Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 28/82] x86/mm: Fix a kdump kernel failure on SME system when CONFIG_IMA_KEXEC=y Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 29/82] mm: fix NULL pointer dereference in alloc_pages_bulk_noprof Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 30/82] ocfs2: uncache inode which has failed entering the group Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 31/82] mm: revert "mm: shmem: fix data-race in shmem_getattr()" Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 32/82] vdpa: solidrun: Fix UB bug with devres Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 33/82] vdpa/mlx5: Fix PA offset with unaligned starting iotlb map Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 34/82] vp_vdpa: fix id_table array not null terminated error Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 35/82] ima: fix buffer overrun in ima_eventdigest_init_common Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 36/82] KVM: nVMX: Treat vpid01 as current if L2 is active, but with VPID disabled Greg Kroah-Hartman
2024-11-20 12:56 ` Greg Kroah-Hartman [this message]
2024-11-20 12:56 ` [PATCH 6.6 38/82] KVM: VMX: Bury Intel PT virtualization (guest/host mode) behind CONFIG_BROKEN Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 39/82] nilfs2: fix null-ptr-deref in block_touch_buffer tracepoint Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 40/82] nommu: pass NULL argument to vma_iter_prealloc() Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 41/82] ALSA: hda/realtek - Fixed Clevo platform headset Mic issue Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 42/82] ALSA: hda/realtek: fix mute/micmute LEDs for a HP EliteBook 645 G10 Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 43/82] ocfs2: fix UBSAN warning in ocfs2_verify_volume() Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 44/82] nilfs2: fix null-ptr-deref in block_dirty_buffer tracepoint Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 45/82] LoongArch: Fix early_numa_add_cpu() usage for FDT systems Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 46/82] LoongArch: Disable KASAN if PGDIR_SIZE is too large for cpu_vabits Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 47/82] LoongArch: Make KASAN work with 5-level page-tables Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 48/82] Revert "mmc: dw_mmc: Fix IDMAC operation with pages bigger than 4K" Greg Kroah-Hartman
2024-11-20 12:56 ` [PATCH 6.6 49/82] mmc: sunxi-mmc: Fix A100 compatible description Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 50/82] drm/bridge: tc358768: Fix DSI command tx Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 51/82] pmdomain: imx93-blk-ctrl: correct remove path Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 52/82] nouveau: fw: sync dma after setup is called Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 53/82] drm/amd: Fix initialization mistake for NBIO 7.7.0 Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 54/82] drm/amd/display: Adjust VSDB parser for replay feature Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 55/82] mm/damon/core: implement scheme-specific apply interval Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 56/82] mm/damon/core: handle zero {aggregation,ops_update} intervals Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 57/82] staging: vchiq_arm: Get the rid off struct vchiq_2835_state Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 58/82] staging: vchiq_arm: Use devm_kzalloc() for vchiq_arm_state allocation Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 59/82] lib/buildid: Fix build ID parsing logic Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 60/82] media: dvbdev: fix the logic when DVB_DYNAMIC_MINORS is not set Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 61/82] NFSD: initialize copy->cp_clp early in nfsd4_copy for use by trace point Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 62/82] NFSD: Async COPY result needs to return a write verifier Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 63/82] NFSD: Limit the number of concurrent async COPY operations Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 64/82] NFSD: Initialize struct nfsd4_copy earlier Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 65/82] NFSD: Never decrement pending_async_copies on error Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 66/82] mptcp: define more local variables sk Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 67/82] mptcp: add userspace_pm_lookup_addr_by_id helper Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 68/82] mptcp: update local address flags when setting it Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 69/82] mptcp: hold pm lock when deleting entry Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 70/82] mptcp: drop lookup_by_id in lookup_addr Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 71/82] mptcp: pm: use _rcu variant under rcu_read_lock Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 72/82] drm/amd/pm: Vangogh: Fix kernel memory out of bounds write Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 73/82] fs/9p: fix uninitialized values during inode evict Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 74/82] leds: mlxreg: Use devm_mutex_init() for mutex initialization Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 75/82] mm: avoid unsafe VMA hook invocation when error arises on mmap hook Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 76/82] mm: unconditionally close VMAs on error Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 77/82] mm: refactor map_deny_write_exec() Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 78/82] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 79/82] mm: resolve faulty mmap_region() error path behaviour Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 80/82] mm/damon/core: check apply interval in damon_do_apply_schemes() Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 81/82] mm/damon/core: handle zero schemes apply interval Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.6 82/82] mm/damon/core: copy nr_accesses when splitting region Greg Kroah-Hartman
2024-11-20 16:44 ` [PATCH 6.6 00/82] 6.6.63-rc1 review Mark Brown
2024-11-20 17:02 ` SeongJae Park
2024-11-20 19:05 ` Florian Fainelli
2024-11-20 23:20 ` Shuah Khan
2024-11-21  4:20 ` Ron Economos
2024-11-21  7:37 ` Naresh Kamboju
2024-11-21 16:57 ` Hardik Garg
2024-11-21 19:39 ` Jon Hunter
2024-11-22  6:55 ` Muhammad Usama Anjum

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241120125630.448025207@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=mlevitsk@redhat.com \
    --cc=patches@lists.linux.dev \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=stable@vger.kernel.org \
    --cc=zhuangel570@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.