From: Robin Holt <holt@sgi.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Lameter <cl@linux-foundation.org>,
Robin Holt <holt@sgi.com>
Subject: Re: [PATCH 4/4] zone_reclaim_mode is always 0 by default
Date: Thu, 14 May 2009 06:48:27 -0500
Message-ID: <20090514114827.GN7601@sgi.com>
In-Reply-To: <20090514170721.9B75.A69D9226@jp.fujitsu.com>

> Unfortunately no.
> Zone reclaim has two weaknesses by design.
>
> 1.
> Zone reclaim doesn't work well when the working set size > local node size,
> and that can happen easily on a small machine.
> When it happens, zone reclaim drops the process's own memory.
>
> Plus, zone reclaim also doesn't fit a DB server; its processes have a large
> working set.

A large DB server is not your typical desktop application either.

> 2.
> Zone reclaim has an inter-zone balancing issue.
>
> Example: an x86_64 2-node 8GB machine has the following zone assignment:
>
> node 0 (DMA32): 3GB
> node 0 (Normal): 1GB
> node 1 (Normal): 4GB
>
> If the page is allocated from DMA32, you are lucky; DMA32 isn't reclaimed
> so frequently. But if it comes from node 0 Normal, you are unlucky:
> that zone is reclaimed very frequently although it is smaller than the others.

I have seen that behavior on some of our mismatched large systems as well,
although I never had one so imbalanced, because ia64 only has the Normal zone.

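To put rough numbers on the quoted layout, here is a small user-space sketch (not kernel code). The struct, the array, and the helper names are invented for illustration, and it reads the quoted layout as node 0 holding DMA32 plus a small Normal zone. It only computes what share of its node each zone covers; it does not model reclaim itself.

```c
#include <stddef.h>

/* Hypothetical description of one zone; not a kernel structure. */
struct zone_info {
	int node;              /* NUMA node the zone belongs to */
	const char *name;      /* zone type, for readability only */
	unsigned long size_mb; /* zone size in megabytes */
};

/* The 2-node, 8GB x86_64 layout from the quoted example. */
static const struct zone_info example_zones[] = {
	{ 0, "DMA32",  3072 },
	{ 0, "Normal", 1024 },
	{ 1, "Normal", 4096 },
};

/* Sum the sizes of all zones that belong to the given node. */
static unsigned long node_size_mb(const struct zone_info *z, size_t n, int node)
{
	unsigned long total = 0;

	for (size_t i = 0; i < n; i++)
		if (z[i].node == node)
			total += z[i].size_mb;
	return total;
}

/*
 * Percentage of its own node's memory that zone idx covers.
 * Assumes idx is valid and the zone has a non-zero size.
 */
static unsigned long zone_share_pct(const struct zone_info *z, size_t n, size_t idx)
{
	return z[idx].size_mb * 100 / node_size_mb(z, n, z[idx].node);
}
```

With this layout, zone_share_pct(example_zones, 3, 1) comes out at 25: the Normal zone on node 0 is a quarter of its node, while the Normal zone on node 1 is the whole node.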
> I know my patch changes the large-server default, but I believe the Linux
> default kernel parameters should be adapted to desktop and entry-level machines.

If this imbalance is an x86_64-only problem, then we could do something
simple like the following untested patch. This leaves the default
unchanged for everyone except x86_64.

Robin
------------------------------------------------------------------------

Disable zone reclaim by default on x86_64, even when the node distance is
large. This handles the imbalanced zone sizes where the majority of the
memory on node 0 is DMA32, leaving a small Normal zone which will be
aggressively reclaimed.

For other architectures, we leave the default behavior unchanged.

Signed-off-by: Robin Holt <holt@sgi.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
---
 arch/x86/include/asm/topology.h |    2 ++
 include/linux/topology.h        |    5 +++++
 mm/page_alloc.c                 |    2 +-
 3 files changed, 8 insertions(+), 1 deletion(-)

Index: page_reclaim_mode/arch/x86/include/asm/topology.h
===================================================================
--- page_reclaim_mode.orig/arch/x86/include/asm/topology.h 2009-05-14 06:44:20.118925713 -0500
+++ page_reclaim_mode/arch/x86/include/asm/topology.h 2009-05-14 06:44:21.251067716 -0500
@@ -128,6 +128,8 @@ extern unsigned long node_remap_size[];
 #endif
+#define DEFAULT_ZONE_RECLAIM_MODE 0
+
 /* sched_domains SD_NODE_INIT for NUMA machines */
 #define SD_NODE_INIT (struct sched_domain) { \
 	.min_interval = 8, \
Index: page_reclaim_mode/include/linux/topology.h
===================================================================
--- page_reclaim_mode.orig/include/linux/topology.h 2009-05-14 06:44:20.070919619 -0500
+++ page_reclaim_mode/include/linux/topology.h 2009-05-14 06:44:21.279071382 -0500
@@ -61,6 +61,11 @@ int arch_update_cpu_topology(void);
  */
 #define RECLAIM_DISTANCE 20
 #endif
+
+#ifndef DEFAULT_ZONE_RECLAIM_MODE
+#define DEFAULT_ZONE_RECLAIM_MODE 1
+#endif
+
 #ifndef PENALTY_FOR_NODE_WITH_CPUS
 #define PENALTY_FOR_NODE_WITH_CPUS (1)
 #endif
Index: page_reclaim_mode/mm/page_alloc.c
===================================================================
--- page_reclaim_mode.orig/mm/page_alloc.c 2009-05-14 06:44:20.138928363 -0500
+++ page_reclaim_mode/mm/page_alloc.c 2009-05-14 06:44:21.311075244 -0500
@@ -2331,7 +2331,7 @@ static void build_zonelists(pg_data_t *p
 		 * to reclaim pages in a zone before going off node.
 		 */
 		if (distance > RECLAIM_DISTANCE)
-			zone_reclaim_mode = 1;
+			zone_reclaim_mode = DEFAULT_ZONE_RECLAIM_MODE;
 
 		/*
 		 * We don't want to pressure a particular node.
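
For anyone who wants to see how the override is meant to resolve without building a kernel, here is a rough user-space model. The MODEL_X86_64 switch and the pick_zone_reclaim_mode() helper are made up for illustration and are not part of the patch. Only the two macro names and the distance comparison come from the diff above.

```c
/*
 * Stand-in for arch/x86/include/asm/topology.h.
 * MODEL_X86_64 is a hypothetical switch: build with -DMODEL_X86_64
 * to model an architecture that overrides the default.
 */
#ifdef MODEL_X86_64
#define DEFAULT_ZONE_RECLAIM_MODE 0
#endif

/* Stand-in for include/linux/topology.h: generic fallbacks. */
#ifndef RECLAIM_DISTANCE
#define RECLAIM_DISTANCE 20
#endif

#ifndef DEFAULT_ZONE_RECLAIM_MODE
#define DEFAULT_ZONE_RECLAIM_MODE 1
#endif

/*
 * Hypothetical helper mirroring the check in build_zonelists():
 * a node farther away than RECLAIM_DISTANCE selects the per-arch
 * default; otherwise the current mode is left alone.
 */
static int pick_zone_reclaim_mode(int current_mode, int distance)
{
	if (distance > RECLAIM_DISTANCE)
		return DEFAULT_ZONE_RECLAIM_MODE;
	return current_mode;
}
```

Because the generic header only defines DEFAULT_ZONE_RECLAIM_MODE when the architecture has not, the arch header has to be included first for the override to take effect. In this model, a distance of exactly 20 leaves the mode untouched, since the comparison is strictly greater-than.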