From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Shuai Hu <hshuai@redhat.com>,
Zhenyu Zhang <zhenyzha@redhat.com>, Gavin Shan <gshan@redhat.com>,
David Hildenbrand <david@redhat.com>,
Oliver Upton <oliver.upton@linux.dev>,
Peter Xu <peterx@redhat.com>,
Sean Christopherson <seanjc@google.com>,
Shaoqin Huang <shahuang@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>
Subject: [PATCH 5.15 16/96] KVM: Avoid illegal stage2 mapping on invalid memory slot
Date: Mon, 26 Jun 2023 20:11:31 +0200
Message-ID: <20230626180747.600081893@linuxfoundation.org>
In-Reply-To: <20230626180746.943455203@linuxfoundation.org>
From: Gavin Shan <gshan@redhat.com>
commit 2230f9e1171a2e9731422a14d1bbc313c0b719d1 upstream.

We run into a guest hang in edk2 firmware when KSM is kept running on
the host. The edk2 firmware is waiting for status 0x80 from QEMU's pflash
device (TYPE_PFLASH_CFI01) during a sector erase or buffered write
operation. The status is returned by reading the memory region of the
pflash device, and the read request should have been forwarded to QEMU
and emulated by it. Unfortunately, the read request is covered by an
illegal stage2 mapping when the guest hang occurs. The read request is
completed with QEMU bypassed and the wrong status is fetched. The edk2
firmware runs into an infinite loop with the wrong status.
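
For illustration only, the polling behaviour described above can be
modelled in userspace C roughly as follows. This is a hedged sketch, not
edk2 or QEMU code: model_wait_ready(), model_read_status_fn, the two
reader callbacks, and the poll budget are hypothetical names introduced
here. Only the 0x80 status value comes from the description above, and
testing it as a bit mask rather than as a whole-byte compare is an
assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_STATUS_READY 0x80u	/* status value the firmware waits for */

/* Hypothetical callback standing in for a read of the pflash region. */
typedef uint8_t (*model_read_status_fn)(void *ctx);

/*
 * Poll until the ready status shows up or the budget runs out. The budget
 * exists only so this model terminates; a loop without one spins forever
 * if the read never reports MODEL_STATUS_READY.
 */
static bool model_wait_ready(model_read_status_fn read_status, void *ctx,
			     unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if (read_status(ctx) & MODEL_STATUS_READY)
			return true;
	}
	return false;
}

/* Emulated path: the "device" reports ready on the third poll. */
static uint8_t model_emulated_read(void *ctx)
{
	unsigned int *polls = ctx;

	return ++(*polls) >= 3 ? MODEL_STATUS_READY : 0x00;
}

/* Stale-mapping path: the read bypasses emulation and never sees 0x80. */
static uint8_t model_stale_read(void *ctx)
{
	(void)ctx;
	return 0x00;
}
```

With the emulated reader the wait succeeds after three polls; with the
stale reader it only ends because the model has a poll budget.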

The illegal stage2 mapping is populated due to same page sharing by KSM
at (C), even though the associated memory slot has been marked as invalid
at (B) when the memory slot is requested to be deleted. Note that the
active and inactive memory slots can't be swapped when we're in the middle
of kvm_mmu_notifier_change_pte() because kvm->mn_active_invalidate_count
is elevated, and kvm_swap_active_memslots() will busy loop until it drops
to zero again. Besides, swapping the active and inactive memory slots is
also prevented by holding &kvm->srcu in __kvm_handle_hva_range(), which
pairs with synchronize_srcu_expedited() in kvm_swap_active_memslots().

  CPU-A                                    CPU-B
  -----                                    -----
  ioctl(kvm_fd, KVM_SET_USER_MEMORY_REGION)
  kvm_vm_ioctl_set_memory_region
  kvm_set_memory_region
  __kvm_set_memory_region
  kvm_set_memslot(kvm, old, NULL, KVM_MR_DELETE)
    kvm_invalidate_memslot
      kvm_copy_memslot
      kvm_replace_memslot
      kvm_swap_active_memslots        (A)
      kvm_arch_flush_shadow_memslot   (B)
                                           same page sharing by KSM
                                           kvm_mmu_notifier_invalidate_range_start
                                                 :
                                           kvm_mmu_notifier_change_pte
                                             kvm_handle_hva_range
                                             __kvm_handle_hva_range
                                             kvm_set_spte_gfn            (C)
                                                 :
                                           kvm_mmu_notifier_invalidate_range_end

Fix the issue by skipping the invalid memory slot at (C) to avoid the
illegal stage2 mapping, so that the read request for the pflash's status
is forwarded to QEMU and emulated by it. In this way, the correct pflash
status is returned from QEMU and breaks the infinite loop in the edk2
firmware.

We tried a git-bisect and the first problematic commit is cd4c71835228
("KVM: arm64: Convert to the gfn-based MMU notifier callbacks"). With this
commit, clean_dcache_guest_page() is called after the memory slots are
iterated in kvm_mmu_notifier_change_pte(); before it,
clean_dcache_guest_page() was called before the iteration over the memory
slots. This change enlarges the race window between
kvm_mmu_notifier_change_pte() and memory slot removal, so that we're able
to reproduce the issue in a practical test case. However, the issue has
existed since commit d5d8184d35c9 ("KVM: ARM: Memory virtualization
setup").
Cc: stable@vger.kernel.org # v3.9+
Fixes: d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
Reported-by: Shuai Hu <hshuai@redhat.com>
Reported-by: Zhenyu Zhang <zhenyzha@redhat.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Message-Id: <20230615054259.14911-1-gshan@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
virt/kvm/kvm_main.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
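
As a rough illustration of the check this patch adds at (C), the
following userspace C model mirrors the shape of kvm_change_spte_gfn().
It is a sketch under stated assumptions, not kernel code: every model_*
name, the MODEL_MEMSLOT_INVALID value, and the mapping counter are
hypothetical stand-ins introduced here, and the model leaves out the
WARN_ON_ONCE() check and all locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for KVM_MEMSLOT_INVALID; the value is illustrative. */
#define MODEL_MEMSLOT_INVALID (1u << 16)

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct model_memslot {
	unsigned int flags;
};

struct model_gfn_range {
	const struct model_memslot *slot;
	int *mappings;	/* counts "installed" stage2 mappings */
};

/* Stand-in for kvm_set_spte_gfn(): pretend to install a mapping. */
static bool model_set_spte_gfn(struct model_gfn_range *range)
{
	(*range->mappings)++;
	return true;
}

/* Same shape as the new helper: skip slots that are marked invalid. */
static bool model_change_spte_gfn(struct model_gfn_range *range)
{
	if (range->slot->flags & MODEL_MEMSLOT_INVALID)
		return false;

	return model_set_spte_gfn(range);
}

/* Small self-check covering a slot under deletion and a live slot. */
static void model_demo(void)
{
	int mappings = 0;
	struct model_memslot live = { .flags = 0 };
	struct model_memslot dying = { .flags = MODEL_MEMSLOT_INVALID };
	struct model_gfn_range r = { .slot = &dying, .mappings = &mappings };

	assert(!model_change_spte_gfn(&r));	/* invalid slot: nothing mapped */
	assert(mappings == 0);

	r.slot = &live;
	assert(model_change_spte_gfn(&r));	/* live slot: mapping installed */
	assert(mappings == 1);
}
```

Calling model_demo() exercises both cases: no mapping is installed for
the slot marked invalid, and one is installed for the live slot.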
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -636,6 +636,24 @@ static __always_inline int kvm_handle_hv
 
 	return __kvm_handle_hva_range(kvm, &range);
 }
+
+static bool kvm_change_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
+{
+	/*
+	 * Skipping invalid memslots is correct if and only change_pte() is
+	 * surrounded by invalidate_range_{start,end}(), which is currently
+	 * guaranteed by the primary MMU.  If that ever changes, KVM needs to
+	 * unmap the memslot instead of skipping the memslot to ensure that KVM
+	 * doesn't hold references to the old PFN.
+	 */
+	WARN_ON_ONCE(!READ_ONCE(kvm->mn_active_invalidate_count));
+
+	if (range->slot->flags & KVM_MEMSLOT_INVALID)
+		return false;
+
+	return kvm_set_spte_gfn(kvm, range);
+}
+
 static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn,
 					struct mm_struct *mm,
 					unsigned long address,
@@ -656,7 +674,7 @@ static void kvm_mmu_notifier_change_pte(
 	if (!READ_ONCE(kvm->mmu_notifier_count))
 		return;
 
-	kvm_handle_hva_range(mn, address, address + 1, pte, kvm_set_spte_gfn);
+	kvm_handle_hva_range(mn, address, address + 1, pte, kvm_change_spte_gfn);
 }
 
 void kvm_inc_notifier_count(struct kvm *kvm, unsigned long start,