All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Will Deacon <will.deacon@arm.com>,
	Yury Norov <ynorov@caviumnetworks.com>,
	Richard Ruigrok <rruigrok@codeaurora.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 4.13 04/11] mm: page_vma_mapped: ensure pmd is loaded with READ_ONCE outside of lock
Date: Thu, 19 Oct 2017 15:38:59 +0200	[thread overview]
Message-ID: <20171019131700.010878957@linuxfoundation.org> (raw)
In-Reply-To: <20171019131659.388456140@linuxfoundation.org>

4.13-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Will Deacon <will.deacon@arm.com>

commit a7b100953aa33a5bbdc3e5e7f2241b9c0704606e upstream.

Loading the pmd without holding the pmd_lock exposes us to races with
concurrent updaters of the page tables but, worse still, it also allows
the compiler to cache the pmd value in a register and reuse it later on,
even if we've performed a READ_ONCE in between and seen a more recent
value.

In the case of page_vma_mapped_walk, this leads to the following crash
when the pmd loaded for the initial pmd_trans_huge check is all zeroes
and a subsequent valid table entry is loaded by check_pmd.  We then
proceed into map_pte, but the compiler re-uses the zero entry inside
pte_offset_map, resulting in a junk pointer being installed in
pvmw->pte:

  PC is at check_pte+0x20/0x170
  LR is at page_vma_mapped_walk+0x2e0/0x540
  [...]
  Process doio (pid: 2463, stack limit = 0xffff00000f2e8000)
  Call trace:
    check_pte+0x20/0x170
    page_vma_mapped_walk+0x2e0/0x540
    page_mkclean_one+0xac/0x278
    rmap_walk_file+0xf0/0x238
    rmap_walk+0x64/0xa0
    page_mkclean+0x90/0xa8
    clear_page_dirty_for_io+0x84/0x2a8
    mpage_submit_page+0x34/0x98
    mpage_process_page_bufs+0x164/0x170
    mpage_prepare_extent_to_map+0x134/0x2b8
    ext4_writepages+0x484/0xe30
    do_writepages+0x44/0xe8
    __filemap_fdatawrite_range+0xbc/0x110
    file_write_and_wait_range+0x48/0xd8
    ext4_sync_file+0x80/0x4b8
    vfs_fsync_range+0x64/0xc0
    SyS_msync+0x194/0x1e8

This patch fixes the problem by ensuring that READ_ONCE is used before
the initial checks on the pmd, and this value is subsequently used when
checking whether or not the pmd is present.  pmd_check is removed and
the pmd_present check is inlined directly.

Link: http://lkml.kernel.org/r/1507222630-5839-1-git-send-email-will.deacon@arm.com
Fixes: f27176cfc363 ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()")
Signed-off-by: Will Deacon <will.deacon@arm.com>
Tested-by: Yury Norov <ynorov@caviumnetworks.com>
Tested-by: Richard Ruigrok <rruigrok@codeaurora.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[will: backport to 4.13.y]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/page_vma_mapped.c |   25 ++++++++++---------------
 1 file changed, 10 insertions(+), 15 deletions(-)

--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -6,17 +6,6 @@
 
 #include "internal.h"
 
-static inline bool check_pmd(struct page_vma_mapped_walk *pvmw)
-{
-	pmd_t pmde;
-	/*
-	 * Make sure we don't re-load pmd between present and !trans_huge check.
-	 * We need a consistent view.
-	 */
-	pmde = READ_ONCE(*pvmw->pmd);
-	return pmd_present(pmde) && !pmd_trans_huge(pmde);
-}
-
 static inline bool not_found(struct page_vma_mapped_walk *pvmw)
 {
 	page_vma_mapped_walk_done(pvmw);
@@ -106,6 +95,7 @@ bool page_vma_mapped_walk(struct page_vm
 	pgd_t *pgd;
 	p4d_t *p4d;
 	pud_t *pud;
+	pmd_t pmde;
 
 	/* The only possible pmd mapping has been handled on last iteration */
 	if (pvmw->pmd && !pvmw->pte)
@@ -138,7 +128,13 @@ restart:
 	if (!pud_present(*pud))
 		return false;
 	pvmw->pmd = pmd_offset(pud, pvmw->address);
-	if (pmd_trans_huge(*pvmw->pmd)) {
+	/*
+	 * Make sure the pmd value isn't cached in a register by the
+	 * compiler and used as a stale value after we've observed a
+	 * subsequent update.
+	 */
+	pmde = READ_ONCE(*pvmw->pmd);
+	if (pmd_trans_huge(pmde)) {
 		pvmw->ptl = pmd_lock(mm, pvmw->pmd);
 		if (!pmd_present(*pvmw->pmd))
 			return not_found(pvmw);
@@ -153,9 +149,8 @@ restart:
 			spin_unlock(pvmw->ptl);
 			pvmw->ptl = NULL;
 		}
-	} else {
-		if (!check_pmd(pvmw))
-			return false;
+	} else if (!pmd_present(pmde)) {
+		return false;
 	}
 	if (!map_pte(pvmw))
 		goto next_pte;

  parent reply	other threads:[~2017-10-19 13:39 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-19 13:38 [PATCH 4.13 00/11] 4.13.9-stable review Greg Kroah-Hartman
2017-10-19 13:38 ` [PATCH 4.13 01/11] x86/apic: Silence "FW_BUG TSC_DEADLINE disabled due to Errata" on CPUs without the feature Greg Kroah-Hartman
2017-10-19 13:38 ` [PATCH 4.13 02/11] x86/apic: Silence "FW_BUG TSC_DEADLINE disabled due to Errata" on hypervisors Greg Kroah-Hartman
2017-10-19 13:38 ` [PATCH 4.13 03/11] perf pmu: Unbreak perf record for arm/arm64 with events with explicit PMU Greg Kroah-Hartman
2017-10-19 13:38 ` Greg Kroah-Hartman [this message]
2017-10-19 13:39 ` [PATCH 4.13 05/11] HID: hid-elecom: extend to fix descriptor for HUGE trackball Greg Kroah-Hartman
2017-10-19 13:39 ` [PATCH 4.13 06/11] Drivers: hv: vmbus: Fix rescind handling issues Greg Kroah-Hartman
2017-10-19 13:39 ` [PATCH 4.13 07/11] Drivers: hv: vmbus: Fix bugs in rescind handling Greg Kroah-Hartman
2017-10-19 13:39 ` [PATCH 4.13 08/11] vmbus: simplify hv_ringbuffer_read Greg Kroah-Hartman
2017-10-19 13:39 ` [PATCH 4.13 09/11] vmbus: refactor hv_signal_on_read Greg Kroah-Hartman
2017-10-19 13:39 ` [PATCH 4.13 10/11] vmbus: eliminate duplicate cached index Greg Kroah-Hartman
2017-10-19 13:39 ` [PATCH 4.13 11/11] vmbus: more host signalling avoidance Greg Kroah-Hartman
2017-10-20 13:06 ` [PATCH 4.13 00/11] 4.13.9-stable review Guenter Roeck
2017-10-20 13:49   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171019131700.010878957@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=rruigrok@codeaurora.org \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=will.deacon@arm.com \
    --cc=ynorov@caviumnetworks.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.