From: aarcange@redhat.com
To: linux-mm@kvack.org
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
Adam Litke <agl@us.ibm.com>, Avi Kivity <avi@redhat.com>,
Izik Eidus <ieidus@redhat.com>,
Hugh Dickins <hugh.dickins@tiscali.co.uk>,
Nick Piggin <npiggin@suse.de>, Rik van Riel <riel@redhat.com>,
Mel Gorman <mel@csn.ul.ie>, Dave Hansen <dave@linux.vnet.ibm.com>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Ingo Molnar <mingo@elte.hu>, Mike Travis <travis@sgi.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Christoph Lameter <cl@linux-foundation.org>,
Chris Wright <chrisw@sous-sol.org>,
Andrew Morton <akpm@linux-foundation.org>,
bpicco@redhat.com,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Balbir Singh <balbir@linux.vnet.ibm.com>,
Arnd Bergmann <arnd@arndb.de>,
Andrea Arcangeli <aarcange@redhat.com>
Subject: [patch 25/36] _GFP_NO_KSWAPD
Date: Sun, 21 Feb 2010 15:10:34 +0100
Message-ID: <20100221141756.772875923@redhat.com>
In-Reply-To: 20100221141009.581909647@redhat.com
From: Andrea Arcangeli <aarcange@redhat.com>
Transparent hugepage allocations must be allowed to skip kswapd and any other
kind of indirect reclaim (especially when the defrag sysfs control is
disabled). It's unacceptable to swap out anonymous pages (potentially
anonymous transparent hugepages) in order to create new transparent hugepages.
This is true for the MADV_HUGEPAGE areas too: it makes no sense to swap out a
kvm virtual machine, and have it suffer an unbearable slowdown, just so that
another one with guest physical memory marked MADV_HUGEPAGE can run 30% faster
on memory intensive workloads. If a transparent hugepage allocation fails, the
slowdown is minor and there is a full fallback, so kswapd should never be
asked to swap out memory to allow the high order allocation to succeed.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
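For readers who want to poke at the logic outside the kernel, here is a
minimal user-space sketch of the two changes below. It is an illustration
under stated assumptions, not kernel code: the SKETCH_* names, the stub, the
counter and flag_fits() are hypothetical, gfp_t is modelled as a plain
unsigned int, and only the values 0x400000u, 22 and 23 are taken from the
patch. It shows why the new flag (bit 22) needs the shift bumped to 23, and
that the slow path skips the kswapd wakeup when the flag is set.

```c
#include <assert.h>
#include <stdio.h>

typedef unsigned int sketch_gfp_t;	/* stand-in for the kernel's gfp_t */

/* Value copied from the patch: 0x400000 is bit 22. */
#define SKETCH_GFP_NO_KSWAPD	((sketch_gfp_t)0x400000u)

/* Same shape as __GFP_BITS_MASK, with the shift as a parameter. */
#define SKETCH_GFP_BITS_MASK(shift)	((sketch_gfp_t)((1u << (shift)) - 1))

static int kswapd_wakeups;	/* hypothetical counter, sketch only */

/*
 * Stub standing in for wake_all_kswapd(). The real function takes
 * (order, zonelist, high_zoneidx); here we only count the calls.
 */
static void wake_all_kswapd_stub(unsigned int order)
{
	(void)order;
	kswapd_wakeups++;
}

/* Mirrors the check the patch adds at the "restart:" label. */
static void slowpath_wake(sketch_gfp_t gfp_mask, unsigned int order)
{
	if (!(gfp_mask & SKETCH_GFP_NO_KSWAPD))
		wake_all_kswapd_stub(order);
}

/* Returns 1 if the new flag survives masking with the given shift. */
static int flag_fits(unsigned int shift)
{
	return (SKETCH_GFP_NO_KSWAPD & SKETCH_GFP_BITS_MASK(shift)) != 0;
}

#ifdef SKETCH_MAIN
int main(void)
{
	/* Order 9 is only an example value for a high order request. */
	slowpath_wake(0, 9);
	slowpath_wake(SKETCH_GFP_NO_KSWAPD, 9);
	printf("wakeups=%d fits22=%d fits23=%d\n",
	       kswapd_wakeups, flag_fits(22), flag_fits(23));
	return 0;
}
#endif
```

Building with -DSKETCH_MAIN gives a small standalone program. With a shift of
22 the mask stops at bit 21 and would drop the new flag, which is why the
patch raises __GFP_BITS_SHIFT together with adding the bit.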
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -59,13 +59,15 @@ struct vm_area_struct;
 #define __GFP_NOTRACK ((__force gfp_t)0)
 #endif

+#define __GFP_NO_KSWAPD ((__force gfp_t)0x400000u)
+
 /*
  * This may seem redundant, but it's a way of annotating false positives vs.
  * allocations that simply cannot be supported (e.g. page tables).
  */
 #define __GFP_NOTRACK_FALSE_POSITIVE (__GFP_NOTRACK)

-#define __GFP_BITS_SHIFT 22 /* Room for 22 __GFP_FOO bits */
+#define __GFP_BITS_SHIFT 23 /* Room for 23 __GFP_FOO bits */
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))

 /* This equals 0, but use constants in case they ever change */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1829,7 +1829,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, u
 		goto nopage;

 restart:
-	wake_all_kswapd(order, zonelist, high_zoneidx);
+	if (!(gfp_mask & __GFP_NO_KSWAPD))
+		wake_all_kswapd(order, zonelist, high_zoneidx);

 	/*
 	 * OK, we're below the kswapd watermark and have kicked background