From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Will Deacon <will.deacon@arm.com>,
Yury Norov <ynorov@caviumnetworks.com>,
Richard Ruigrok <rruigrok@codeaurora.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Peter Zijlstra <peterz@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 4.13 04/11] mm: page_vma_mapped: ensure pmd is loaded with READ_ONCE outside of lock
Date: Thu, 19 Oct 2017 15:38:59 +0200
Message-ID: <20171019131700.010878957@linuxfoundation.org>
In-Reply-To: <20171019131659.388456140@linuxfoundation.org>
4.13-stable review patch. If anyone has any objections, please let me know.
------------------
From: Will Deacon <will.deacon@arm.com>
commit a7b100953aa33a5bbdc3e5e7f2241b9c0704606e upstream.
Loading the pmd without holding the pmd_lock exposes us to races with
concurrent updaters of the page tables but, worse still, it also allows
the compiler to cache the pmd value in a register and reuse it later on,
even if we've performed a READ_ONCE in between and seen a more recent
value.
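As an illustration only (this is not code from the patch), the hazard can be sketched in userspace C. READ_ONCE_SKETCH, entry_sketch_t, the SKETCH_* bits and mixed_loads are hypothetical names, and the macro is a simplified stand-in for the kernel's READ_ONCE that relies on the GCC/Clang __typeof__ extension. The point is that a volatile load orders nothing with respect to the plain loads around it, so the compiler may still merge those plain loads into one.

```c
#include <stdint.h>

/* Simplified, hypothetical stand-in for the kernel's READ_ONCE():
 * a volatile access forces a real load at this point in the code. */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical page-table entry type and flag bits, for illustration. */
typedef uint64_t entry_sketch_t;
#define SKETCH_PRESENT 0x1ULL
#define SKETCH_HUGE    0x2ULL

/*
 * Hazardous pattern: two plain loads with a volatile load in between.
 * The compiler is allowed to keep the value of plain load #1 in a
 * register and reuse it for plain load #2, even though the volatile
 * load may have observed a newer value written by another CPU.
 */
static entry_sketch_t mixed_loads(entry_sketch_t *slot)
{
	if (*slot & SKETCH_HUGE)                           /* plain load #1 */
		return 0;
	if (!(READ_ONCE_SKETCH(*slot) & SKETCH_PRESENT))   /* fresh load */
		return 0;
	return *slot;                                      /* plain load #2 */
}
```

Single-threaded, mixed_loads behaves as written; the stale value only shows up when another CPU updates the slot between the loads.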
In the case of page_vma_mapped_walk, this leads to the following crash
when the pmd loaded for the initial pmd_trans_huge check is all zeroes
and a subsequent valid table entry is loaded by check_pmd. We then
proceed into map_pte, but the compiler re-uses the zero entry inside
pte_offset_map, resulting in a junk pointer being installed in
pvmw->pte:
PC is at check_pte+0x20/0x170
LR is at page_vma_mapped_walk+0x2e0/0x540
[...]
Process doio (pid: 2463, stack limit = 0xffff00000f2e8000)
Call trace:
check_pte+0x20/0x170
page_vma_mapped_walk+0x2e0/0x540
page_mkclean_one+0xac/0x278
rmap_walk_file+0xf0/0x238
rmap_walk+0x64/0xa0
page_mkclean+0x90/0xa8
clear_page_dirty_for_io+0x84/0x2a8
mpage_submit_page+0x34/0x98
mpage_process_page_bufs+0x164/0x170
mpage_prepare_extent_to_map+0x134/0x2b8
ext4_writepages+0x484/0xe30
do_writepages+0x44/0xe8
__filemap_fdatawrite_range+0xbc/0x110
file_write_and_wait_range+0x48/0xd8
ext4_sync_file+0x80/0x4b8
vfs_fsync_range+0x64/0xc0
SyS_msync+0x194/0x1e8
This patch fixes the problem by ensuring that READ_ONCE is used before
the initial checks on the pmd, and this value is subsequently used when
checking whether or not the pmd is present. check_pmd is removed and
the pmd_present check is inlined directly.
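The shape of the fix can be modelled roughly as follows. This is a userspace sketch under stated assumptions, not the kernel code: pmd_sketch_t, the SKETCH_PMD_* bits, enum walk_step and classify_pmd are hypothetical, and READ_ONCE_SKETCH is a simplified stand-in for the kernel's READ_ONCE that relies on the GCC/Clang __typeof__ extension. What it shows is a single snapshot of the entry feeding both the huge-page check and the present check.

```c
#include <stdint.h>

/* Simplified, hypothetical stand-in for the kernel's READ_ONCE(). */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical pmd value type and flag bits, for illustration only. */
typedef uint64_t pmd_sketch_t;
#define SKETCH_PMD_PRESENT 0x1ULL
#define SKETCH_PMD_HUGE    0x2ULL

/* Hypothetical outcomes of the pmd-level decision in the walk. */
enum walk_step {
	STEP_HUGE_PMD,    /* take the pmd lock and re-check under it */
	STEP_NOT_MAPPED,  /* nothing mapped here, stop the walk */
	STEP_MAP_PTE      /* descend to the pte level */
};

static enum walk_step classify_pmd(pmd_sketch_t *pmdp)
{
	/* Load the entry exactly once; every check below uses this copy. */
	pmd_sketch_t pmde = READ_ONCE_SKETCH(*pmdp);

	if (pmde & SKETCH_PMD_HUGE)
		return STEP_HUGE_PMD;
	if (!(pmde & SKETCH_PMD_PRESENT))
		return STEP_NOT_MAPPED;
	return STEP_MAP_PTE;
}
```

Because both branches test the same local copy, they cannot disagree about which value of the entry they saw.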
Link: http://lkml.kernel.org/r/1507222630-5839-1-git-send-email-will.deacon@arm.com
Fixes: f27176cfc363 ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()")
Signed-off-by: Will Deacon <will.deacon@arm.com>
Tested-by: Yury Norov <ynorov@caviumnetworks.com>
Tested-by: Richard Ruigrok <rruigrok@codeaurora.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[will: backport to 4.13.y]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/page_vma_mapped.c | 25 ++++++++++---------------
1 file changed, 10 insertions(+), 15 deletions(-)
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -6,17 +6,6 @@
#include "internal.h"
-static inline bool check_pmd(struct page_vma_mapped_walk *pvmw)
-{
- pmd_t pmde;
- /*
- * Make sure we don't re-load pmd between present and !trans_huge check.
- * We need a consistent view.
- */
- pmde = READ_ONCE(*pvmw->pmd);
- return pmd_present(pmde) && !pmd_trans_huge(pmde);
-}
-
static inline bool not_found(struct page_vma_mapped_walk *pvmw)
{
page_vma_mapped_walk_done(pvmw);
@@ -106,6 +95,7 @@ bool page_vma_mapped_walk(struct page_vm
pgd_t *pgd;
p4d_t *p4d;
pud_t *pud;
+ pmd_t pmde;
/* The only possible pmd mapping has been handled on last iteration */
if (pvmw->pmd && !pvmw->pte)
@@ -138,7 +128,13 @@ restart:
if (!pud_present(*pud))
return false;
pvmw->pmd = pmd_offset(pud, pvmw->address);
- if (pmd_trans_huge(*pvmw->pmd)) {
+ /*
+ * Make sure the pmd value isn't cached in a register by the
+ * compiler and used as a stale value after we've observed a
+ * subsequent update.
+ */
+ pmde = READ_ONCE(*pvmw->pmd);
+ if (pmd_trans_huge(pmde)) {
pvmw->ptl = pmd_lock(mm, pvmw->pmd);
if (!pmd_present(*pvmw->pmd))
return not_found(pvmw);
@@ -153,9 +149,8 @@ restart:
spin_unlock(pvmw->ptl);
pvmw->ptl = NULL;
}
- } else {
- if (!check_pmd(pvmw))
- return false;
+ } else if (!pmd_present(pmde)) {
+ return false;
}
if (!map_pte(pvmw))
goto next_pte;