From: Wu Fengguang <fengguang.wu@intel.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>, Robin Holt <holt@sgi.com>,
	"Zhang, Yanmin" <yanmin.zhang@intel.com>,
	"linux-ia64@vger.kernel.org" <linux-ia64@vger.kernel.org>,
	"linuxppc-dev@ozlabs.org" <linuxppc-dev@ozlabs.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v4] zone_reclaim is always 0 by default
Date: Thu, 04 Jun 2009 10:59:59 +0000
Message-ID: <20090604105959.GA22118@localhost>
In-Reply-To: <20090604192236.9761.A69D9226@jp.fujitsu.com>

On Thu, Jun 04, 2009 at 06:23:15PM +0800, KOSAKI Motohiro wrote:
> 
> The current Linux policy is that zone_reclaim_mode is enabled by default if the machine
> has a large remote node distance, because until recently we could assume that a large
> distance meant a large server.
> 
> Unfortunately, recent x86 CPUs (e.g. Core i7, Opteron) have a P2P transport
> memory controller, IOW they are seen as NUMA from the software's view.
> Some Core i7 machines have a large remote node distance.
> 
> Yanmin reported that zone_reclaim_mode=1 causes a large apache regression.
> 
>     One Nehalem machine has 12GB memory,
>     but there is always 2GB free although applications access lots of files.
>     Eventually we located the root cause as zone_reclaim_mode=1.
> 
> Actually, zone_reclaim_mode=1 means "I dislike remote node allocation more than
> disk access". It improves performance for HPC workloads,
> but it degrades performance for desktops, file servers and web servers.
> 
> In general, workload-dependent configuration shouldn't be put into the default settings.
> 
> However, the current code has been in place for about two years. High-end POWER and IA64 HPC machines
> (only) use this setting.
> 
> Thus, x86 and almost all other architectures change the default setting, and only POWER and ia64
> keep the current configuration for backward compatibility.

The above lines are too long. Limiting them to 72 columns would
generally be better, since git-log may add leading whitespace.
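
If it helps, the 72-column suggestion can be checked mechanically. The
following is only a rough sketch: count_overlong_lines() is a
hypothetical helper, not part of git or the kernel tree, and it counts
bytes rather than display columns (so a tab counts as one).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return how many lines in `text` are longer
 * than `limit` bytes. Lines are separated by '\n'; the newline itself
 * is not counted toward the length.
 */
size_t count_overlong_lines(const char *text, size_t limit)
{
	size_t count = 0;

	while (*text) {
		/* Length of the current line, up to the next '\n' or NUL. */
		size_t len = strcspn(text, "\n");

		if (len > limit)
			count++;

		text += len;
		if (*text == '\n')
			text++;	/* step over the line terminator */
	}
	return count;
}
```

Calling it on a changelog with limit 72 gives the number of lines that
would need rewrapping.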

Thank you for all the efforts!

Acked-by: Wu Fengguang <fengguang.wu@intel.com>

> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Christoph Lameter <cl@linux-foundation.org>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Robin Holt <holt@sgi.com>
> Cc: "Zhang, Yanmin" <yanmin.zhang@intel.com>
> Cc: Wu Fengguang <fengguang.wu@intel.com>
> Cc: linux-ia64@vger.kernel.org
> Cc: linuxppc-dev@ozlabs.org
> ---
>  arch/powerpc/include/asm/topology.h |    6 ++++++
>  include/linux/topology.h            |    7 +------
>  2 files changed, 7 insertions(+), 6 deletions(-)
> 
> Index: b/include/linux/topology.h
> ===================================================================
> --- a/include/linux/topology.h
> +++ b/include/linux/topology.h
> @@ -54,12 +54,7 @@ int arch_update_cpu_topology(void);
>  #define node_distance(from,to)	((from) == (to) ? LOCAL_DISTANCE : REMOTE_DISTANCE)
>  #endif
>  #ifndef RECLAIM_DISTANCE
> -/*
> - * If the distance between nodes in a system is larger than RECLAIM_DISTANCE
> - * (in whatever arch specific measurement units returned by node_distance())
> - * then switch on zone reclaim on boot.
> - */
> -#define RECLAIM_DISTANCE 20
> +#define RECLAIM_DISTANCE INT_MAX
>  #endif
>  #ifndef PENALTY_FOR_NODE_WITH_CPUS
>  #define PENALTY_FOR_NODE_WITH_CPUS	(1)
> Index: b/arch/powerpc/include/asm/topology.h
> ===================================================================
> --- a/arch/powerpc/include/asm/topology.h
> +++ b/arch/powerpc/include/asm/topology.h
> @@ -10,6 +10,12 @@ struct device_node;
>  
>  #include <asm/mmzone.h>
>  
> +/*
> + * Distance above which we begin to use zone reclaim

s/begin to/default to/ ?

> + */
> +#define RECLAIM_DISTANCE 20
> +
> +
>  static inline int cpu_to_node(int cpu)
>  {
>  	return numa_cpu_lookup_table[cpu];
> 
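
To illustrate the effect of the two defines, here is a minimal
userspace sketch. It assumes the rule stated in the removed comment
(switch on zone reclaim at boot if some node distance is larger than
RECLAIM_DISTANCE). default_zone_reclaim_mode() and the two macro names
are hypothetical and exist only for this example; the distance values
used with it are illustrative, not measured.

```c
#include <limits.h>
#include <stddef.h>

/* Thresholds from the patch: generic default and the powerpc override. */
#define RECLAIM_DISTANCE_GENERIC INT_MAX
#define RECLAIM_DISTANCE_POWERPC 20

/*
 * Hypothetical model, not kernel code: return 1 if any of the `n`
 * node distances is strictly larger than `reclaim_distance`, which
 * under the assumed rule means zone reclaim defaults to on; return 0
 * otherwise.
 */
int default_zone_reclaim_mode(const int *distances, size_t n,
			      int reclaim_distance)
{
	for (size_t i = 0; i < n; i++) {
		if (distances[i] > reclaim_distance)
			return 1;
	}
	return 0;
}
```

With INT_MAX as the threshold no int distance can exceed it, so the
default is always 0 on architectures that do not override
RECLAIM_DISTANCE, while a threshold of 20 still enables it for any
distance above 20.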

