From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Trevor Saunders <tbsaunde@tbsaunde.org>,
Lorenzo Stoakes <lstoakes@gmail.com>,
Rik van Riel <riel@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Mel Gorman <mgorman@techsingularity.net>,
Linus Torvalds <torvalds@linux-foundation.org>,
Tim Gardner <tim.gardner@canonical.com>,
Marcelo Henrique Cerri <marcelo.cerri@canonical.com>,
Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Subject: [PATCH 4.4 12/18] mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing
Date: Thu, 14 Oct 2021 16:53:44 +0200 [thread overview]
Message-ID: <20211014145206.721654546@linuxfoundation.org> (raw)
In-Reply-To: <20211014145206.330102860@linuxfoundation.org>
From: Lorenzo Stoakes <lstoakes@gmail.com>
commit 38e088546522e1e86d2b8f401a1354ad3a9b3303 upstream.
The NUMA balancing logic uses an arch-specific PROT_NONE page table flag
defined by pte_protnone() or pmd_protnone() to mark PTEs or huge page
PMDs respectively as requiring balancing upon a subsequent page fault.
User-defined PROT_NONE memory regions which also have this flag set will
not normally invoke the NUMA balancing code as do_page_fault() will send
a segfault to the process before handle_mm_fault() is even called.
However if access_remote_vm() is invoked to access a PROT_NONE region of
memory, handle_mm_fault() is called via faultin_page() and
__get_user_pages() without any access checks being performed, meaning
the NUMA balancing logic is incorrectly invoked on a non-NUMA memory
region.
A simple means of triggering this problem is to access PROT_NONE mmap'd
memory using /proc/self/mem which reliably results in the NUMA handling
functions being invoked when CONFIG_NUMA_BALANCING is set.
This issue was reported in bugzilla (issue 99101) which includes some
simple repro code.
There are BUG_ON() checks in do_numa_page() and do_huge_pmd_numa_page()
added at commit c0e7cad to avoid accidentally provoking strange
behaviour by attempting to apply NUMA balancing to pages that are in
fact PROT_NONE. The BUG_ON()'s are consistently triggered by the repro.
This patch moves the PROT_NONE check into mm/memory.c rather than
invoking BUG_ON() as faulting in these pages via faultin_page() is a
valid reason for reaching the NUMA check with the PROT_NONE page table
flag set and is therefore not always a bug.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=99101
Reported-by: Trevor Saunders <tbsaunde@tbsaunde.org>
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
[cascardo: context adjustments were necessary]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/huge_memory.c | 3 ---
mm/memory.c | 12 +++++++-----
2 files changed, 7 insertions(+), 8 deletions(-)
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1340,9 +1340,6 @@ int do_huge_pmd_numa_page(struct mm_stru
bool was_writable;
int flags = 0;
- /* A PROT_NONE fault should not end up here */
- BUG_ON(!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)));
-
ptl = pmd_lock(mm, pmdp);
if (unlikely(!pmd_same(pmd, *pmdp)))
goto out_unlock;
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3209,9 +3209,6 @@ static int do_numa_page(struct mm_struct
bool was_writable = pte_write(pte);
int flags = 0;
- /* A PROT_NONE fault should not end up here */
- BUG_ON(!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)));
-
/*
* The "pte" at this point cannot be used safely without
* validation through pte_unmap_same(). It's of NUMA type but
@@ -3304,6 +3301,11 @@ static int wp_huge_pmd(struct mm_struct
return VM_FAULT_FALLBACK;
}
+static inline bool vma_is_accessible(struct vm_area_struct *vma)
+{
+ return vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE);
+}
+
/*
* These routines also need to handle stuff like marking pages dirty
* and/or accessed for architectures that don't do it in hardware (most
@@ -3350,7 +3352,7 @@ static int handle_pte_fault(struct mm_st
pte, pmd, flags, entry);
}
- if (pte_protnone(entry))
+ if (pte_protnone(entry) && vma_is_accessible(vma))
return do_numa_page(mm, vma, address, entry, pte, pmd);
ptl = pte_lockptr(mm, pmd);
@@ -3425,7 +3427,7 @@ static int __handle_mm_fault(struct mm_s
if (pmd_trans_splitting(orig_pmd))
return 0;
- if (pmd_protnone(orig_pmd))
+ if (pmd_protnone(orig_pmd) && vma_is_accessible(vma))
return do_huge_pmd_numa_page(mm, vma, address,
orig_pmd, pmd);
next prev parent reply other threads:[~2021-10-14 14:55 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-14 14:53 [PATCH 4.4 00/18] 4.4.289-rc1 review Greg Kroah-Hartman
2021-10-14 14:53 ` [PATCH 4.4 01/18] USB: cdc-acm: fix racy tty buffer accesses Greg Kroah-Hartman
2021-10-14 14:53 ` [PATCH 4.4 02/18] USB: cdc-acm: fix break reporting Greg Kroah-Hartman
2021-10-14 14:53 ` [PATCH 4.4 03/18] nfsd4: Handle the NFSv4 READDIR dircount hint being zero Greg Kroah-Hartman
2021-10-14 14:53 ` [PATCH 4.4 04/18] xtensa: call irqchip_init only when CONFIG_USE_OF is selected Greg Kroah-Hartman
2021-10-14 14:53 ` [PATCH 4.4 05/18] phy: mdio: fix memory leak Greg Kroah-Hartman
2021-10-14 14:53 ` [PATCH 4.4 06/18] net_sched: fix NULL deref in fifo_set_limit() Greg Kroah-Hartman
2021-10-14 14:53 ` [PATCH 4.4 07/18] ptp_pch: Load module automatically if ID matches Greg Kroah-Hartman
2021-10-14 14:53 ` [PATCH 4.4 08/18] ARM: imx6: disable the GIC CPU interface before calling stby-poweroff sequence Greg Kroah-Hartman
2021-10-14 14:53 ` [PATCH 4.4 09/18] netlink: annotate data races around nlk->bound Greg Kroah-Hartman
2021-10-14 14:53 ` [PATCH 4.4 10/18] i40e: fix endless loop under rtnl Greg Kroah-Hartman
2021-10-14 14:53 ` [PATCH 4.4 11/18] gup: document and work around "COW can break either way" issue Greg Kroah-Hartman
2021-10-14 14:53 ` Greg Kroah-Hartman [this message]
2021-10-14 14:53 ` [PATCH 4.4 13/18] HID: apple: Fix logical maximum and usage maximum of Magic Keyboard JIS Greg Kroah-Hartman
2021-10-14 14:53 ` [PATCH 4.4 14/18] netfilter: ip6_tables: zero-initialize fragment offset Greg Kroah-Hartman
2021-10-14 14:53 ` [PATCH 4.4 15/18] mac80211: Drop frames from invalid MAC address in ad-hoc mode Greg Kroah-Hartman
2021-10-14 14:53 ` [PATCH 4.4 16/18] scsi: ses: Fix unsigned comparison with less than zero Greg Kroah-Hartman
2021-10-14 14:53 ` [PATCH 4.4 17/18] scsi: virtio_scsi: Fix spelling mistake "Unsupport" -> "Unsupported" Greg Kroah-Hartman
2021-10-14 14:53 ` [PATCH 4.4 18/18] perf/x86: Reset destroy callback on event init failure Greg Kroah-Hartman
2021-10-14 21:06 ` [PATCH 4.4 00/18] 4.4.289-rc1 review Pavel Machek
2021-10-14 22:40 ` Shuah Khan
2021-10-15 22:05 ` Guenter Roeck
2021-10-15 23:11 ` Daniel Díaz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20211014145206.721654546@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=akpm@linux-foundation.org \
--cc=cascardo@canonical.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lstoakes@gmail.com \
--cc=marcelo.cerri@canonical.com \
--cc=mgorman@techsingularity.net \
--cc=riel@redhat.com \
--cc=stable@vger.kernel.org \
--cc=tbsaunde@tbsaunde.org \
--cc=tim.gardner@canonical.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox