From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754190AbdJCMc7 (ORCPT ); Tue, 3 Oct 2017 08:32:59 -0400 Received: from mail.linuxfoundation.org ([140.211.169.12]:34688 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754177AbdJCMc4 (ORCPT ); Tue, 3 Oct 2017 08:32:56 -0400 From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Jon Masters , Timur Tabi , Richard Ruigrok , Will Deacon , Catalin Marinas Subject: [PATCH 4.13 077/110] arm64: mm: Use READ_ONCE when dereferencing pointer to pte table Date: Tue, 3 Oct 2017 14:29:39 +0200 Message-Id: <20171003114244.430374928@linuxfoundation.org> X-Mailer: git-send-email 2.14.2 In-Reply-To: <20171003114241.408583531@linuxfoundation.org> References: <20171003114241.408583531@linuxfoundation.org> User-Agent: quilt/0.65 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 4.13-stable review patch. If anyone has any objections, please let me know. ------------------ From: Will Deacon commit f069faba688701c4d56b6c3452a130f97bf02e95 upstream. On kernels built with support for transparent huge pages, different CPUs can access the PMD concurrently due to e.g. fast GUP or page_vma_mapped_walk and they must take care to use READ_ONCE to avoid value tearing or caching of stale values by the compiler. Unfortunately, these functions call into our pgtable macros, which don't use READ_ONCE, and compiler caching has been observed to cause the following crash during ext4 writeback: PC is at check_pte+0x20/0x170 LR is at page_vma_mapped_walk+0x2e0/0x540 [...] Process doio (pid: 2463, stack limit = 0xffff00000f2e8000) Call trace: [] check_pte+0x20/0x170 [] page_vma_mapped_walk+0x2e0/0x540 [] page_mkclean_one+0xac/0x278 [] rmap_walk_file+0xf0/0x238 [] rmap_walk+0x64/0xa0 [] page_mkclean+0x90/0xa8 [] clear_page_dirty_for_io+0x84/0x2a8 [] mpage_submit_page+0x34/0x98 [] mpage_process_page_bufs+0x164/0x170 [] mpage_prepare_extent_to_map+0x134/0x2b8 [] ext4_writepages+0x484/0xe30 [] do_writepages+0x44/0xe8 [] __filemap_fdatawrite_range+0xbc/0x110 [] file_write_and_wait_range+0x48/0xd8 [] ext4_sync_file+0x80/0x4b8 [] vfs_fsync_range+0x64/0xc0 [] SyS_msync+0x194/0x1e8 This is because page_vma_mapped_walk loads the PMD twice before calling pte_offset_map: the first time without READ_ONCE (where it gets all zeroes due to a concurrent pmdp_invalidate) and the second time with READ_ONCE (where it sees a valid table pointer due to a concurrent pmd_populate). However, the compiler inlines everything and caches the first value in a register, which is subsequently used in pte_offset_phys which returns a junk pointer that is later dereferenced when attempting to access the relevant pte. This patch fixes the issue by using READ_ONCE in pte_offset_phys to ensure that a stale value is not used. Whilst this is a point fix for a known failure (and simple to backport), a full fix moving all of our page table accessors over to {READ,WRITE}_ONCE and consistently using READ_ONCE in page_vma_mapped_walk is in the works for a future kernel release. Cc: Jon Masters Cc: Timur Tabi Fixes: f27176cfc363 ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()") Tested-by: Richard Ruigrok Signed-off-by: Will Deacon Signed-off-by: Catalin Marinas Signed-off-by: Greg Kroah-Hartman --- arch/arm64/include/asm/pgtable.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -412,7 +412,7 @@ static inline phys_addr_t pmd_page_paddr /* Find an entry in the third-level page table. */ #define pte_index(addr) (((addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) -#define pte_offset_phys(dir,addr) (pmd_page_paddr(*(dir)) + pte_index(addr) * sizeof(pte_t)) +#define pte_offset_phys(dir,addr) (pmd_page_paddr(READ_ONCE(*(dir))) + pte_index(addr) * sizeof(pte_t)) #define pte_offset_kernel(dir,addr) ((pte_t *)__va(pte_offset_phys((dir), (addr)))) #define pte_offset_map(dir,addr) pte_offset_kernel((dir), (addr))