From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3058C290A
	for ; Fri, 28 Apr 2023 11:29:24 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id AB001C433EF;
	Fri, 28 Apr 2023 11:29:23 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org;
	s=korg; t=1682681364;
	bh=iHZwOF0KDqQ5mOUaDe0VinwfF9hu7mLA39+jUjCqmSI=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=HYIVarNfV6nVNu1o3HbHN6un9cnhmbmNNNEY/HGMS8rBB4Ans+Tm8L+1uZ7I+yrNk
	 80TavayTy9W/6J149YddgeOkWZapt4Y2siPmpYT2Cnj3sNRy16Llt7Nrq5Sn9XNHxe
	 mEk6ANMglbCpppyJ1igDioRAiHRzjH8vnds+e7ks=
From: Greg Kroah-Hartman
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman,
	patches@lists.linux.dev,
	Sean Christopherson,
	David Matlack,
	Marc Zyngier,
	Oliver Upton,
	Will Deacon
Subject: [PATCH 6.1 03/16] KVM: arm64: Retry fault if vma_lookup() results become invalid
Date: Fri, 28 Apr 2023 13:27:55 +0200
Message-Id: <20230428112040.172369153@linuxfoundation.org>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230428112040.063291126@linuxfoundation.org>
References: <20230428112040.063291126@linuxfoundation.org>
User-Agent: quilt/0.67
Precedence: bulk
X-Mailing-List: patches@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

From: David Matlack

commit 13ec9308a85702af7c31f3638a2720863848a7f2 upstream.

Read mmu_invalidate_seq before dropping the mmap_lock so that KVM can
detect if the results of vma_lookup() (e.g. vma_shift) become stale
before it acquires kvm->mmu_lock. This fixes a theoretical bug where a
VMA could be changed by userspace after vma_lookup() and before KVM
reads the mmu_invalidate_seq, causing KVM to install page table entries
based on a (possibly) no-longer-valid vma_shift.

Re-order the MMU cache top-up to earlier in user_mem_abort() so that it
is not done after KVM has read mmu_invalidate_seq (i.e. so as to avoid
inducing spurious fault retries).

This bug has existed since KVM/ARM's inception. It's unlikely that any
sane userspace currently modifies VMAs in such a way as to trigger this
race. And even with directed testing I was unable to reproduce it. But
a sufficiently motivated host userspace might be able to exploit this
race.

Fixes: 94f8e6418d39 ("KVM: ARM: Handle guest faults in KVM")
Cc: stable@vger.kernel.org
Reported-by: Sean Christopherson
Signed-off-by: David Matlack
Reviewed-by: Marc Zyngier
Link: https://lore.kernel.org/r/20230313235454.2964067-1-dmatlack@google.com
Signed-off-by: Oliver Upton
[will: Use FSC_PERM instead of ESR_ELx_FSC_PERM]
Signed-off-by: Will Deacon
Signed-off-by: Greg Kroah-Hartman
---
 arch/arm64/kvm/mmu.c |   47 +++++++++++++++++++++--------------------------
 1 file changed, 21 insertions(+), 26 deletions(-)

--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1179,6 +1179,20 @@ static int user_mem_abort(struct kvm_vcp
 	}
 
 	/*
+	 * Permission faults just need to update the existing leaf entry,
+	 * and so normally don't require allocations from the memcache. The
+	 * only exception to this is when dirty logging is enabled at runtime
+	 * and a write fault needs to collapse a block entry into a table.
+	 */
+	if (fault_status != FSC_PERM ||
+	    (logging_active && write_fault)) {
+		ret = kvm_mmu_topup_memory_cache(memcache,
+						 kvm_mmu_cache_min_pages(kvm));
+		if (ret)
+			return ret;
+	}
+
+	/*
 	 * Let's check if we will get back a huge page backed by hugetlbfs, or
 	 * get block mapping for device MMIO region.
 	 */
@@ -1234,36 +1248,17 @@ static int user_mem_abort(struct kvm_vcp
 		fault_ipa &= ~(vma_pagesize - 1);
 
 	gfn = fault_ipa >> PAGE_SHIFT;
-	mmap_read_unlock(current->mm);
 
 	/*
-	 * Permission faults just need to update the existing leaf entry,
-	 * and so normally don't require allocations from the memcache. The
-	 * only exception to this is when dirty logging is enabled at runtime
-	 * and a write fault needs to collapse a block entry into a table.
-	 */
-	if (fault_status != FSC_PERM || (logging_active && write_fault)) {
-		ret = kvm_mmu_topup_memory_cache(memcache,
-						 kvm_mmu_cache_min_pages(kvm));
-		if (ret)
-			return ret;
-	}
-
-	mmu_seq = vcpu->kvm->mmu_invalidate_seq;
-	/*
-	 * Ensure the read of mmu_invalidate_seq happens before we call
-	 * gfn_to_pfn_prot (which calls get_user_pages), so that we don't risk
-	 * the page we just got a reference to gets unmapped before we have a
-	 * chance to grab the mmu_lock, which ensure that if the page gets
-	 * unmapped afterwards, the call to kvm_unmap_gfn will take it away
-	 * from us again properly. This smp_rmb() interacts with the smp_wmb()
-	 * in kvm_mmu_notifier_invalidate_<page|range_end>.
+	 * Read mmu_invalidate_seq so that KVM can detect if the results of
+	 * vma_lookup() or __gfn_to_pfn_memslot() become stale prior to
+	 * acquiring kvm->mmu_lock.
 	 *
-	 * Besides, __gfn_to_pfn_memslot() instead of gfn_to_pfn_prot() is
-	 * used to avoid unnecessary overhead introduced to locate the memory
-	 * slot because it's always fixed even @gfn is adjusted for huge pages.
+	 * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs
+	 * with the smp_wmb() in kvm_mmu_invalidate_end().
 	 */
-	smp_rmb();
+	mmu_seq = vcpu->kvm->mmu_invalidate_seq;
+	mmap_read_unlock(current->mm);
 
 	pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL,
 				   write_fault, &writable, NULL);