All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Sean Christopherson <seanjc@google.com>,
	David Matlack <dmatlack@google.com>,
	Marc Zyngier <maz@kernel.org>,
	Oliver Upton <oliver.upton@linux.dev>,
	Will Deacon <will@kernel.org>
Subject: [PATCH 5.15 03/13] KVM: arm64: Retry fault if vma_lookup() results become invalid
Date: Fri, 28 Apr 2023 13:28:07 +0200	[thread overview]
Message-ID: <20230428112039.259362711@linuxfoundation.org> (raw)
In-Reply-To: <20230428112039.133978540@linuxfoundation.org>

From: David Matlack <dmatlack@google.com>

commit 13ec9308a85702af7c31f3638a2720863848a7f2 upstream.

Read mmu_invalidate_seq before dropping the mmap_lock so that KVM can
detect if the results of vma_lookup() (e.g. vma_shift) become stale
before it acquires kvm->mmu_lock. This fixes a theoretical bug where a
VMA could be changed by userspace after vma_lookup() and before KVM
reads the mmu_invalidate_seq, causing KVM to install page table entries
based on a (possibly) no-longer-valid vma_shift.

Re-order the MMU cache top-up to earlier in user_mem_abort() so that it
is not done after KVM has read mmu_invalidate_seq (i.e. so as to avoid
inducing spurious fault retries).

This bug has existed since KVM/ARM's inception. It's unlikely that any
sane userspace currently modifies VMAs in such a way as to trigger this
race. And even with directed testing I was unable to reproduce it. But a
sufficiently motivated host userspace might be able to exploit this
race.

Fixes: 94f8e6418d39 ("KVM: ARM: Handle guest faults in KVM")
Cc: stable@vger.kernel.org
Reported-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230313235454.2964067-1-dmatlack@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
[will: Use FSC_PERM instead of ESR_ELx_FSC_PERM. Read 'mmu_notifier_seq'
 instead of 'mmu_invalidate_seq'. Fix up function references in comment.]
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/kvm/mmu.c |   47 +++++++++++++++++++++--------------------------
 1 file changed, 21 insertions(+), 26 deletions(-)

--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -998,6 +998,20 @@ static int user_mem_abort(struct kvm_vcp
 	}
 
 	/*
+	 * Permission faults just need to update the existing leaf entry,
+	 * and so normally don't require allocations from the memcache. The
+	 * only exception to this is when dirty logging is enabled at runtime
+	 * and a write fault needs to collapse a block entry into a table.
+	 */
+	if (fault_status != FSC_PERM ||
+	    (logging_active && write_fault)) {
+		ret = kvm_mmu_topup_memory_cache(memcache,
+						 kvm_mmu_cache_min_pages(kvm));
+		if (ret)
+			return ret;
+	}
+
+	/*
 	 * Let's check if we will get back a huge page backed by hugetlbfs, or
 	 * get block mapping for device MMIO region.
 	 */
@@ -1051,36 +1065,17 @@ static int user_mem_abort(struct kvm_vcp
 		fault_ipa &= ~(vma_pagesize - 1);
 
 	gfn = fault_ipa >> PAGE_SHIFT;
-	mmap_read_unlock(current->mm);
-
-	/*
-	 * Permission faults just need to update the existing leaf entry,
-	 * and so normally don't require allocations from the memcache. The
-	 * only exception to this is when dirty logging is enabled at runtime
-	 * and a write fault needs to collapse a block entry into a table.
-	 */
-	if (fault_status != FSC_PERM || (logging_active && write_fault)) {
-		ret = kvm_mmu_topup_memory_cache(memcache,
-						 kvm_mmu_cache_min_pages(kvm));
-		if (ret)
-			return ret;
-	}
 
-	mmu_seq = vcpu->kvm->mmu_notifier_seq;
 	/*
-	 * Ensure the read of mmu_notifier_seq happens before we call
-	 * gfn_to_pfn_prot (which calls get_user_pages), so that we don't risk
-	 * the page we just got a reference to gets unmapped before we have a
-	 * chance to grab the mmu_lock, which ensure that if the page gets
-	 * unmapped afterwards, the call to kvm_unmap_gfn will take it away
-	 * from us again properly. This smp_rmb() interacts with the smp_wmb()
-	 * in kvm_mmu_notifier_invalidate_<page|range_end>.
+	 * Read mmu_notifier_seq so that KVM can detect if the results of
+	 * vma_lookup() or __gfn_to_pfn_memslot() become stale prior to
+	 * acquiring kvm->mmu_lock.
 	 *
-	 * Besides, __gfn_to_pfn_memslot() instead of gfn_to_pfn_prot() is
-	 * used to avoid unnecessary overhead introduced to locate the memory
-	 * slot because it's always fixed even @gfn is adjusted for huge pages.
+	 * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs
+	 * with the smp_wmb() in kvm_dec_notifier_count().
 	 */
-	smp_rmb();
+	mmu_seq = vcpu->kvm->mmu_notifier_seq;
+	mmap_read_unlock(current->mm);
 
 	pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL,
 				   write_fault, &writable, NULL);



  parent reply	other threads:[~2023-04-28 11:29 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-28 11:28 [PATCH 5.15 00/13] 5.15.110-rc1 review Greg Kroah-Hartman
2023-04-28 11:28 ` [PATCH 5.15 01/13] PCI/ASPM: Remove pcie_aspm_pm_state_change() Greg Kroah-Hartman
2023-04-28 11:28 ` [PATCH 5.15 02/13] selftests/kselftest/runner/run_one(): allow running non-executable files Greg Kroah-Hartman
2023-04-28 11:28 ` Greg Kroah-Hartman [this message]
2023-04-28 11:28 ` [PATCH 5.15 04/13] KVM: arm64: Fix buffer overflow in kvm_arm_set_fw_reg() Greg Kroah-Hartman
2023-04-28 11:28 ` [PATCH 5.15 05/13] wifi: brcmfmac: slab-out-of-bounds read in brcmf_get_assoc_ies() Greg Kroah-Hartman
2023-04-28 11:28 ` [PATCH 5.15 06/13] drm/fb-helper: set x/yres_virtual in drm_fb_helper_check_var Greg Kroah-Hartman
2023-04-28 11:28 ` [PATCH 5.15 07/13] bluetooth: Perform careful capability checks in hci_sock_ioctl() Greg Kroah-Hartman
2023-04-28 11:28 ` [PATCH 5.15 08/13] USB: serial: option: add UNISOC vendor and TOZED LT70C product Greg Kroah-Hartman
2023-04-28 11:28 ` [PATCH 5.15 09/13] driver core: Dont require dynamic_debug for initcall_debug probe timing Greg Kroah-Hartman
2023-04-28 11:28 ` [PATCH 5.15 10/13] selftests: mptcp: join: fix "invalid address, ADD_ADDR timeout" Greg Kroah-Hartman
2023-04-28 11:28 ` [PATCH 5.15 11/13] riscv: Move early dtb mapping into the fixmap region Greg Kroah-Hartman
2023-04-28 11:28 ` [PATCH 5.15 12/13] riscv: Do not set initial_boot_params to the linear address of the dtb Greg Kroah-Hartman
2023-04-28 11:28 ` [PATCH 5.15 13/13] riscv: No need to relocate the dtb as it lies in the fixmap region Greg Kroah-Hartman
2023-04-28 22:33 ` [PATCH 5.15 00/13] 5.15.110-rc1 review Shuah Khan
2023-04-28 22:45 ` Naresh Kamboju
2023-04-29  2:57 ` Florian Fainelli
2023-04-29  4:09 ` Guenter Roeck
2023-04-29  6:42 ` Ron Economos
2023-04-29  7:42 ` Bagas Sanjaya
2023-05-02  5:40 ` Chris Paterson
2023-05-02 16:17 ` Jon Hunter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230428112039.259362711@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=dmatlack@google.com \
    --cc=maz@kernel.org \
    --cc=oliver.upton@linux.dev \
    --cc=patches@lists.linux.dev \
    --cc=seanjc@google.com \
    --cc=stable@vger.kernel.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.