From: Andrew Morton <akpm@linux-foundation.org>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>, Christoph Lameter <cl@linux.com>,
David Rientjes <rientjes@google.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [PATCH resend^2] mm: increase RECLAIM_DISTANCE to 30
Date: Mon, 11 Apr 2011 14:19:50 -0700
Message-ID: <20110411141950.46d3d6da.akpm@linux-foundation.org>
In-Reply-To: <20110411172004.0361.A69D9226@jp.fujitsu.com>
On Mon, 11 Apr 2011 17:19:31 +0900 (JST)
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> Recently, Robert Mueller reported zone_reclaim_mode doesn't work

It's time for some nagging.

I'm trying to work out what the user-visible effect of this problem
was, but it isn't described in the changelog, there is no link to
any report, there isn't even a Reported-by: or a Cc:, and a search for
Robert in linux-mm and linux-kernel turned up blank.

> properly on his new NUMA server (Dual Xeon E5520 + Intel S5520UR MB).
> He is using Cyrus IMAPd and it's built on a very traditional
> single-process model.
>
> * a master process which reads config files and manages the other
> processes
> * multiple imapd processes, one per connection
> * multiple pop3d processes, one per connection
> * multiple lmtpd processes, one per connection
> * periodical "cleanup" processes.
>
> As a result, there are thousands of independent processes. The problem
> is that recent Intel motherboards turn on zone_reclaim_mode by default,
> and traditional prefork-model software doesn't work well with it.
> Unfortunately, this model is still typical even in the 21st
> century. We can't ignore it.
>
> This patch raises the zone_reclaim_mode threshold to 30. 30 has no
> specific meaning, but 20 means one-hop QPI/HyperTransport, and such
> relatively cheap 2-4 socket machines are often used for traditional
> servers like the one above. The intention is that these machines don't
> use zone_reclaim_mode.
>
> Note: ia64 and Power have arch-specific RECLAIM_DISTANCE definitions,
> so this patch doesn't change the behavior of such high-end NUMA machines.
>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Acked-by: Christoph Lameter <cl@linux.com>
> Acked-by: David Rientjes <rientjes@google.com>
> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
> include/linux/topology.h | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/include/linux/topology.h b/include/linux/topology.h
> index b91a40e..fc839bf 100644
> --- a/include/linux/topology.h
> +++ b/include/linux/topology.h
> @@ -60,7 +60,7 @@ int arch_update_cpu_topology(void);
> * (in whatever arch specific measurement units returned by node_distance())
> * then switch on zone reclaim on boot.
> */
> -#define RECLAIM_DISTANCE 20
> +#define RECLAIM_DISTANCE 30
Any time we tweak a magic number to improve one platform, we risk
causing deterioration on another. Do we know that this risk is low
with this patch?
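
One way to reason about which machines the change touches is to compare a
node distance table against the old and new thresholds. The sketch below is
illustrative only: zone_reclaim_would_enable() is a hypothetical helper, not
a kernel function, and it assumes the boot-time check switches zone reclaim
on when any node_distance() value is strictly greater than RECLAIM_DISTANCE.
The distance values in the usage note are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not kernel code): given an n x n table of node
 * distances stored row by row, report whether any entry exceeds the
 * threshold.  Assumption: the kernel enables zone reclaim at boot when
 * a distance is strictly greater than RECLAIM_DISTANCE.
 */
bool zone_reclaim_would_enable(const int *dist, size_t n, int threshold)
{
	for (size_t i = 0; i < n; i++)
		for (size_t j = 0; j < n; j++)
			if (dist[i * n + j] > threshold)
				return true;
	return false;
}
```

For a hypothetical two-node table {10, 21, 21, 10}, the helper returns true
with a threshold of 20 and false with 30. A table containing a distance of 40
returns true with either threshold, so a machine like that would see no change.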

Also, what are we doing setting

	zone_reclaim_mode = 1;

when we have nice enumerated constants for this?  It should be

	zone_reclaim_mode = RECLAIM_ZONE;

or, pedantically but clearer:

	zone_reclaim_mode = RECLAIM_ZONE & ~RECLAIM_WRITE & ~RECLAIM_SWAP;
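
When the cleared bits are spelled out like this, the complement has to be
bitwise (~). A logical not (!) turns each nonzero constant into 0 and the
whole expression collapses to 0. A minimal userspace sketch of the
difference follows. The constant values are assumed to mirror the kernel's
definitions and are redefined locally. The two function names are
hypothetical.

```c
/*
 * Assumed to match the kernel's zone reclaim bits; redefined here so the
 * sketch builds outside the kernel tree.
 */
#define RECLAIM_OFF	0
#define RECLAIM_ZONE	(1 << 0)	/* run zone reclaim */
#define RECLAIM_WRITE	(1 << 1)	/* write out dirty pages during reclaim */
#define RECLAIM_SWAP	(1 << 2)	/* swap pages during reclaim */

/* Logical not: !RECLAIM_WRITE and !RECLAIM_SWAP are both 0. */
int mode_with_logical_not(void)
{
	return RECLAIM_ZONE & !RECLAIM_WRITE & !RECLAIM_SWAP;
}

/* Bitwise not: masks out only the WRITE and SWAP bits. */
int mode_with_bitwise_not(void)
{
	return RECLAIM_ZONE & ~RECLAIM_WRITE & ~RECLAIM_SWAP;
}
```

Under these assumed values, mode_with_bitwise_not() yields RECLAIM_ZONE (1),
while mode_with_logical_not() yields RECLAIM_OFF (0).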

Finally, we shouldn't be playing these guessing games in the kernel at
all - we'll always get it wrong for some platforms and for some
workloads.  zone_reclaim_mode is tunable at runtime and we should be
encouraging administrators, integrators and distros to *use* this
ability.  That might mean having to write some tools to empirically
determine the optimum setting for a particular machine.
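
As a starting point for that kind of tooling, here is a rough sketch that
reads the current setting and decodes its bits. It assumes the tunable is
exposed at /proc/sys/vm/zone_reclaim_mode and that bits 1, 2 and 4 mean zone
reclaim, write-out and swap respectively. Both function names are
hypothetical, and the sketch only inspects the value; it makes no attempt to
measure which setting is best.

```c
#include <stdio.h>
#include <stddef.h>

/* Assumed bit layout of vm.zone_reclaim_mode; redefined for userspace. */
#define RECLAIM_ZONE	(1 << 0)
#define RECLAIM_WRITE	(1 << 1)
#define RECLAIM_SWAP	(1 << 2)

/*
 * Read an integer setting from a sysctl-style file, for example
 * "/proc/sys/vm/zone_reclaim_mode".  Returns -1 if the file cannot be
 * opened or does not contain a number.
 */
int read_zone_reclaim_mode(const char *path)
{
	FILE *f = fopen(path, "r");
	int mode;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &mode) != 1)
		mode = -1;
	fclose(f);
	return mode;
}

/* Write a short description of the mode bits into buf. */
void describe_zone_reclaim_mode(int mode, char *buf, size_t len)
{
	if (mode == 0) {
		snprintf(buf, len, "off");
		return;
	}
	snprintf(buf, len, "%s%s%s",
		 (mode & RECLAIM_ZONE) ? "zone " : "",
		 (mode & RECLAIM_WRITE) ? "write " : "",
		 (mode & RECLAIM_SWAP) ? "swap " : "");
}
```

A caller would pass the result of read_zone_reclaim_mode() to
describe_zone_reclaim_mode() after checking for -1. Changing the setting is
a matter of writing a new value to the same file as root.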