stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, David Rientjes <rientjes@google.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Vlastimil Babka <vbabka@suse.cz>, Mel Gorman <mgorman@suse.de>,
	Rik van Riel <riel@redhat.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Bob Liu <bob.liu@oracle.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 3.14 66/77] mm, thp: only collapse hugepages to nodes with affinity for zone_reclaim_mode
Date: Tue, 27 Jan 2015 17:27:44 -0800	[thread overview]
Message-ID: <20150128012747.947520308@linuxfoundation.org> (raw)
In-Reply-To: <20150128012745.971137091@linuxfoundation.org>

3.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: David Rientjes <rientjes@google.com>

commit 14a4e2141e24304fff2c697be6382ffb83888185 upstream.

Commit 9f1b868a13ac ("mm: thp: khugepaged: add policy for finding target
node") improved the previous khugepaged logic which allocated a
transparent hugepages from the node of the first page being collapsed.

However, it is still possible to collapse pages to remote memory which
may suffer from additional access latency.  With the current policy, it
is possible that 255 pages (with PAGE_SHIFT == 12) will be collapsed
remotely if the majority are allocated from that node.

When zone_reclaim_mode is enabled, it means the VM should make every
attempt to allocate locally to prevent NUMA performance degradation.  In
this case, we do not want to collapse hugepages to remote nodes that
would suffer from increased access latency.  Thus, when
zone_reclaim_mode is enabled, only allow collapsing to nodes with
RECLAIM_DISTANCE or less.

There is no functional change for systems that disable
zone_reclaim_mode.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/huge_memory.c |   26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2273,6 +2273,30 @@ static void khugepaged_alloc_sleep(void)
 
 static int khugepaged_node_load[MAX_NUMNODES];
 
+static bool khugepaged_scan_abort(int nid)
+{
+	int i;
+
+	/*
+	 * If zone_reclaim_mode is disabled, then no extra effort is made to
+	 * allocate memory locally.
+	 */
+	if (!zone_reclaim_mode)
+		return false;
+
+	/* If there is a count for this node already, it must be acceptable */
+	if (khugepaged_node_load[nid])
+		return false;
+
+	for (i = 0; i < MAX_NUMNODES; i++) {
+		if (!khugepaged_node_load[i])
+			continue;
+		if (node_distance(nid, i) > RECLAIM_DISTANCE)
+			return true;
+	}
+	return false;
+}
+
 #ifdef CONFIG_NUMA
 static int khugepaged_find_target_node(void)
 {
@@ -2589,6 +2613,8 @@ static int khugepaged_scan_pmd(struct mm
 		 * hit record.
 		 */
 		node = page_to_nid(page);
+		if (khugepaged_scan_abort(node))
+			goto out_unmap;
 		khugepaged_node_load[node]++;
 		VM_BUG_ON_PAGE(PageCompound(page), page);
 		if (!PageLRU(page) || PageLocked(page) || !PageAnon(page))



  parent reply	other threads:[~2015-01-28  1:27 UTC|newest]

Thread overview: 85+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-28  1:26 [PATCH 3.14 00/77] 3.14.31-stable review Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 01/77] gpio: sysfs: fix gpio-chip device-attribute leak Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 02/77] gpio: sysfs: fix gpio " Greg Kroah-Hartman
2015-01-28 14:30   ` Luis Henriques
2015-01-28 15:24     ` Johan Hovold
2015-01-28 16:02       ` [PATCH v2] " Johan Hovold
2015-01-28 17:52         ` Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 03/77] pinctrl: Fix two deadlocks Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 04/77] libata: prevent HSM state change race between ISR and PIO Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 05/77] ALSA: usb-audio: Add mic volume fix quirk for Logitech Webcam C210 Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 06/77] scripts/recordmcount.pl: There is no -m32 gcc option on Super-H anymore Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 07/77] drm/i915: Fix mutex->owner inspection race under DEBUG_MUTEXES Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 08/77] drm/radeon: add a dpm quirk list Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 09/77] drm/radeon: add si " Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 10/77] drm/radeon: use rv515_ring_start on r5xx Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 11/77] PCI: Add flag for devices where we cant use bus reset Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 12/77] PCI: Mark Atheros AR93xx to avoid " Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 13/77] ipr: wait for aborted command responses Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 14/77] dm cache: share cache-metadata object across inactive and active DM tables Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 15/77] dm cache: fix problematic dual use of a single migration count variable Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 16/77] time: settimeofday: Validate the values of tv from user Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 17/77] time: adjtimex: Validate the ADJ_FREQUENCY values Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 18/77] ARM: dts: imx25: Fix PWM "per" clocks Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 19/77] bus: mvebu-mbus: fix support of MBus window 13 Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 20/77] fix deadlock in cifs_ioctl_clone() Greg Kroah-Hartman
2015-01-28  1:26 ` [PATCH 3.14 21/77] can: dev: fix crtlmode_supported check Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 22/77] clocksource: exynos_mct: Fix bitmask regression for exynos4_mct_write Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 23/77] x86, hyperv: Mark the Hyper-V clocksource as being continuous Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 24/77] x86/tsc: Change Fast TSC calibration failed from error to info Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 25/77] x86, boot: Skip relocs when load address unchanged Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 26/77] KVM: x86: Fix of previously incomplete fix for CVE-2014-8480 Greg Kroah-Hartman
2015-01-28  8:51   ` Nadav Amit
2015-01-28  1:27 ` [PATCH 3.14 27/77] x86, tls, ldt: Stop checking lm in LDT_empty Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 28/77] x86, tls: Interpret an all-zero struct user_desc as "no segment" Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 29/77] x86/apic: Re-enable PCI_MSI support for non-SMP X86_32 Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 30/77] x86/asm/traps: Disable tracing and kprobes in fixup_bad_iret and sync_regs Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 31/77] sata_dwc_460ex: fix resource leak on error path Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 32/77] KEYS: close race between key lookup and freeing Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 33/77] netfilter: nfnetlink: validate nfnetlink header from batch Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 34/77] ipvs: uninitialized data with IP_VS_IPV6 Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 35/77] Revert "swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single" Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 36/77] drbd: merge_bvec_fn: properly remap bvm->bi_bdev Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 37/77] crypto: prefix module autoloading with "crypto-" Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 38/77] crypto: include crypto- module prefix in template Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 39/77] crypto: add missing crypto module aliases Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 40/77] ARC: Delete stale barrier.h Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 41/77] ARC: Fix build breakage for !CONFIG_ARC_DW2_UNWIND Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 42/77] Input: evdev - fix EVIOCG{type} ioctl Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 43/77] tty: Fix pty master poll() after slave closes v2 Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 44/77] mmc: sdhci: Dont signal the sdio irq if its not setup Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 45/77] mm/swap.c: clean up *lru_cache_add* functions Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 46/77] mm: page_alloc: do not update zlc unless the zlc is active Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 47/77] mm: page_alloc: do not treat a zone that cannot be used for dirty pages as "full" Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 48/77] include/linux/jump_label.h: expose the reference count Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 49/77] mm: page_alloc: use jump labels to avoid checking number_of_cpusets Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 50/77] mm: page_alloc: calculate classzone_idx once from the zonelist ref Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 51/77] mm: page_alloc: only check the zone id check if pages are buddies Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 52/77] mm: page_alloc: only check the alloc flags and gfp_mask for dirty once Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 53/77] mm: page_alloc: take the ALLOC_NO_WATERMARK check out of the fast path Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 54/77] mm: page_alloc: use unsigned int for order in more places Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 55/77] mm: page_alloc: reduce number of times page_to_pfn is called Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 56/77] mm: page_alloc: convert hot/cold parameter and immediate callers to bool Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 57/77] mm: page_alloc: lookup pageblock migratetype with IRQs enabled during free Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 58/77] mm: shmem: avoid atomic operation during shmem_getpage_gfp Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 59/77] mm: do not use atomic operations when releasing pages Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 60/77] mm: do not use unnecessary atomic operations when adding pages to the LRU Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 61/77] fs: buffer: do not use unnecessary atomic operations when discarding buffers Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 62/77] mm: non-atomically mark page accessed during page cache allocation where possible Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 63/77] mm: avoid unnecessary atomic operations during end_page_writeback() Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 64/77] shmem: fix init_page_accessed use to stop !PageLRU bug Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 65/77] mm/memory.c: use entry = ACCESS_ONCE(*pte) in handle_pte_fault() Greg Kroah-Hartman
2015-01-28  1:27 ` Greg Kroah-Hartman [this message]
2015-01-28  1:27 ` [PATCH 3.14 67/77] mm: make copy_pte_range static again Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 68/77] vmalloc: use rcu list iterator to reduce vmap_area_lock contention Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 69/77] memcg, vmscan: Fix forced scan of anonymous pages Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 70/77] mm: pagemap: avoid unnecessary overhead when tracepoints are deactivated Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 71/77] mm: rearrange zone fields into read-only, page alloc, statistics and page reclaim lines Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 72/77] mm: move zone->pages_scanned into a vmstat counter Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 73/77] mm: vmscan: only update per-cpu thresholds for online CPU Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 74/77] mm: page_alloc: abort fair zone allocation policy when remotes nodes are encountered Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 75/77] mm: page_alloc: reduce cost of the fair zone allocation policy Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 76/77] mm: get rid of radix tree gfp mask for pagecache_get_page Greg Kroah-Hartman
2015-01-28  1:27 ` [PATCH 3.14 77/77] md/raid5: fetch_block must fetch all the blocks handle_stripe_dirtying wants Greg Kroah-Hartman
2015-01-28 14:15 ` [PATCH 3.14 00/77] 3.14.31-stable review Guenter Roeck
2015-01-28 16:51 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150128012747.947520308@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=bob.liu@oracle.com \
    --cc=dave.hansen@intel.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=riel@redhat.com \
    --cc=rientjes@google.com \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).