From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, David Rientjes <rientjes@google.com>,
Alex Thorlton <athorlton@sgi.com>, Bob Liu <lliubbo@gmail.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Hedi Berriche <hedi@sgi.com>, Hugh Dickins <hughd@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Mel Gorman <mgorman@suse.de>, Rik van Riel <riel@redhat.com>,
"Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 3.15 14/42] mm, thp: do not allow thp faults to avoid cpuset restrictions
Date: Tue, 5 Aug 2014 11:13:28 -0700 [thread overview]
Message-ID: <20140805181321.451781632@linuxfoundation.org> (raw)
In-Reply-To: <20140805181321.018370988@linuxfoundation.org>
3.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Rientjes <rientjes@google.com>
commit b104a35d32025ca740539db2808aa3385d0f30eb upstream.
The page allocator relies on __GFP_WAIT to determine if ALLOC_CPUSET
should be set in allocflags. ALLOC_CPUSET controls if a page allocation
should be restricted only to the set of allowed cpuset mems.
Transparent hugepages clears __GFP_WAIT when defrag is disabled to prevent
the fault path from using memory compaction or direct reclaim. Thus, it
is unfairly able to allocate outside of its cpuset mems restriction as a
side-effect.
This patch ensures that ALLOC_CPUSET is only cleared when the gfp mask is
truly GFP_ATOMIC by verifying it is also not a thp allocation.
Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Alex Thorlton <athorlton@sgi.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Bob Liu <lliubbo@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Hedi Berriche <hedi@sgi.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/page_alloc.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2435,7 +2435,7 @@ static inline int
gfp_to_alloc_flags(gfp_t gfp_mask)
{
int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
- const gfp_t wait = gfp_mask & __GFP_WAIT;
+ const bool atomic = !(gfp_mask & (__GFP_WAIT | __GFP_NO_KSWAPD));
/* __GFP_HIGH is assumed to be the same as ALLOC_HIGH to save a branch. */
BUILD_BUG_ON(__GFP_HIGH != (__force gfp_t) ALLOC_HIGH);
@@ -2444,20 +2444,20 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
* The caller may dip into page reserves a bit more if the caller
* cannot run direct reclaim, or if the caller has realtime scheduling
* policy or is asking for __GFP_HIGH memory. GFP_ATOMIC requests will
- * set both ALLOC_HARDER (!wait) and ALLOC_HIGH (__GFP_HIGH).
+ * set both ALLOC_HARDER (atomic == true) and ALLOC_HIGH (__GFP_HIGH).
*/
alloc_flags |= (__force int) (gfp_mask & __GFP_HIGH);
- if (!wait) {
+ if (atomic) {
/*
- * Not worth trying to allocate harder for
- * __GFP_NOMEMALLOC even if it can't schedule.
+ * Not worth trying to allocate harder for __GFP_NOMEMALLOC even
+ * if it can't schedule.
*/
- if (!(gfp_mask & __GFP_NOMEMALLOC))
+ if (!(gfp_mask & __GFP_NOMEMALLOC))
alloc_flags |= ALLOC_HARDER;
/*
- * Ignore cpuset if GFP_ATOMIC (!wait) rather than fail alloc.
- * See also cpuset_zone_allowed() comment in kernel/cpuset.c.
+ * Ignore cpuset mems for GFP_ATOMIC rather than fail, see the
+ * comment for __cpuset_node_allowed_softwall().
*/
alloc_flags &= ~ALLOC_CPUSET;
} else if (unlikely(rt_task(current)) && !in_interrupt())
next prev parent reply other threads:[~2014-08-05 18:53 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-08-05 18:13 [PATCH 3.15 00/42] 3.15.9-stable review Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 01/42] powerpc/perf: Fix MMCR2 handling for EBB Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 02/42] crypto: arm-aes - fix encryption of unaligned data Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 03/42] crypto: af_alg - properly label AF_ALG socket Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 04/42] ARM: dts: fix L2 address in Hi3620 Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 05/42] ARM: OMAP2+: gpmc: fix gpmc_hwecc_bch_capable() Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 06/42] ARM: fix alignment of keystone page table fixup Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 07/42] ARM: 8115/1: LPAE: reduce damage caused by idmap to virtual memory layout Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 08/42] ath9k: fix aggregation session lockup Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 09/42] cfg80211: fix mic_failure tracing Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 10/42] Revert "mac80211: move "bufferable MMPDU" check to fix AP mode scan" Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 11/42] rapidio/tsi721_dma: fix failure to obtain transaction descriptor Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 12/42] scsi: handle flush errors properly Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 13/42] mm/page-writeback.c: fix divide by zero in bdi_dirty_limits() Greg Kroah-Hartman
2014-08-05 18:13 ` Greg Kroah-Hartman [this message]
2014-08-05 18:13 ` [PATCH 3.15 15/42] memcg: oom_notify use-after-free fix Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 16/42] staging: vt6655: Fix disassociated messages every 10 seconds Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 17/42] ACPI / PNP: Fix acpi_pnp_match() Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 18/42] iio:bma180: Fix scale factors to report correct acceleration units Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 19/42] iio:bma180: Missing check for frequency fractional part Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 20/42] iio: buffer: Fix demux table creation Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 21/42] dm bufio: fully initialize shrinker Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 22/42] dm cache: fix race affecting dirty block count Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 23/42] printk: rename printk_sched to printk_deferred Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 24/42] sched_clock: Avoid corrupting hrtimer tree during suspend Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 25/42] timer: Fix lock inversion between hrtimer_bases.lock and scheduler locks Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 26/42] Revert "x86-64, modify_ldt: Make support for 16-bit segments a runtime option" Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 27/42] x86-64, espfix: Dont leak bits 31:16 of %esp returning to 16-bit stack Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 28/42] x86, espfix: Move espfix definitions into a separate header file Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 29/42] x86, espfix: Fix broken header guard Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 30/42] x86, espfix: Make espfix64 a Kconfig option, fix UML Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 31/42] x86, espfix: Make it possible to disable 16-bit support Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 32/42] x86_64/entry/xen: Do not invoke espfix64 on Xen Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 33/42] pinctrl: dra: dt-bindings: Fix pull enable/disable Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 34/42] vfs: fix check for fallocate on active swapfile Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 35/42] ARM: dts: dra7-evm: Make VDDA_1V8_PHY supply always on Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 36/42] staging: vt6655: Fix Warning on boot handle_irq_event_percpu Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 37/42] drm/i915: Ignore VBT backlight presence check on HP Chromebook 14 Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 38/42] x86/xen: no need to explicitly register an NMI callback Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 39/42] xtensa: add fixup for double exception raised in window overflow Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 40/42] net/l2tp: dont fall back on UDP [get|set]sockopt Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 41/42] lib/btree.c: fix leak of whole btree nodes Greg Kroah-Hartman
2014-08-05 18:13 ` [PATCH 3.15 42/42] x86/espfix/xen: Fix allocation of pages for paravirt page tables Greg Kroah-Hartman
2014-08-05 23:07 ` [PATCH 3.15 00/42] 3.15.9-stable review Shuah Khan
2014-08-05 23:27 ` Greg Kroah-Hartman
2014-08-06 11:44 ` Satoru Takeuchi
2014-08-06 14:20 ` Greg Kroah-Hartman
2014-08-06 2:35 ` Guenter Roeck
2014-08-06 2:38 ` Greg Kroah-Hartman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140805181321.451781632@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=akpm@linux-foundation.org \
--cc=athorlton@sgi.com \
--cc=dave.hansen@linux.intel.com \
--cc=hannes@cmpxchg.org \
--cc=hedi@sgi.com \
--cc=hughd@google.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lliubbo@gmail.com \
--cc=mgorman@suse.de \
--cc=riel@redhat.com \
--cc=rientjes@google.com \
--cc=srivatsa.bhat@linux.vnet.ibm.com \
--cc=stable@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).