From: Randy Dunlap <randy.dunlap@oracle.com>
To: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Paul Jackson <pj@sgi.com>, Christoph Lameter <clameter@sgi.com>,
Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
Andi Kleen <ak@suse.de>,
linux-kernel@vger.kernel.org
Subject: Re: [patch 4/4 v2] mempolicy: update NUMA memory policy documentation
Date: Mon, 11 Feb 2008 12:14:50 -0800
Message-ID: <47B0ACBA.4090207@oracle.com>
In-Reply-To: <alpine.DEB.1.00.0802111205040.19986@chino.kir.corp.google.com>

David Rientjes wrote:
> Updates Documentation/vm/numa_memory_policy.txt and
> Documentation/filesystems/tmpfs.txt to describe optional mempolicy mode
> flags.
>
> Cc: Paul Jackson <pj@sgi.com>
> Cc: Christoph Lameter <clameter@sgi.com>
> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
> Cc: Andi Kleen <ak@suse.de>
> Cc: Randy Dunlap <randy.dunlap@oracle.com>
> Signed-off-by: David Rientjes <rientjes@google.com>

Acked-by: Randy Dunlap <randy.dunlap@oracle.com>

Thanks.

> ---
> Includes fixes to problems identified by Randy Dunlap.
>
> Documentation/filesystems/tmpfs.txt | 11 ++++++++
> Documentation/vm/numa_memory_policy.txt | 41 +++++++++++++++++++++++++++----
> 2 files changed, 47 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/filesystems/tmpfs.txt b/Documentation/filesystems/tmpfs.txt
> --- a/Documentation/filesystems/tmpfs.txt
> +++ b/Documentation/filesystems/tmpfs.txt
> @@ -92,6 +92,17 @@ NodeList format is a comma-separated list of decimal numbers and ranges,
> a range being two hyphen-separated decimal numbers, the smallest and
> largest node numbers in the range. For example, mpol=bind:0-3,5,7,9-15
>
> +It is possible to specify a static NodeList by appending '=static' to
> +the memory policy mode in the mpol= argument. This will require that
> +tasks or VMA's restricted to a subset of allowed nodes are only allowed
> +to effect the memory policy over those nodes. No remapping of the
> +NodeList when the policy is rebound, which is the default behavior, is
> +allowed when '=static' is specified. For example:
> +
> +mpol=bind=static:NodeList will only allocate from each node in
> + the NodeList without remapping the
> + NodeList if the policy is rebound
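
For readers unfamiliar with the NodeList format described above, here is a
minimal sketch in C of how a string such as "0-3,5,7,9-15" could be turned
into a node bitmask. parse_nodelist() is a hypothetical helper written for
illustration, limited to nodes 0-63; it is not the kernel's own parser and
it does not handle the "=static" suffix.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a NodeList such as "0-3,5,7,9-15" into a 64-bit mask, where
 * bit n set means node n is listed. Returns 0 on success, -1 on a
 * malformed list, a reversed range, or a node number above 63.
 * Hypothetical helper for illustration only.
 */
static int parse_nodelist(const char *s, uint64_t *mask)
{
    uint64_t m = 0;

    assert(s != NULL && mask != NULL);
    if (*s == '\0')
        return -1;

    for (;;) {
        char *end;
        unsigned long lo, hi;

        /* Each element must start with a decimal digit. */
        if (*s < '0' || *s > '9')
            return -1;
        lo = strtoul(s, &end, 10);
        hi = lo;
        s = end;

        /* Optional "-hi" part turns the element into a range. */
        if (*s == '-') {
            s++;
            if (*s < '0' || *s > '9')
                return -1;
            hi = strtoul(s, &end, 10);
            s = end;
        }
        if (hi < lo || hi > 63)
            return -1;

        for (unsigned long n = lo; n <= hi; n++)
            m |= UINT64_C(1) << n;

        if (*s == '\0')
            break;
        if (*s != ',')
            return -1;
        s++;
    }

    *mask = m;
    return 0;
}
```

With the example list from the text, nodes 0-3, 5, 7 and 9-15 end up set in
the mask; a reversed range such as "3-1" is rejected.
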
> +
> Note that trying to mount a tmpfs with an mpol option will fail if the
> running kernel does not support NUMA; and will fail if its nodelist
> specifies a node which is not online. If your system relies on that
> diff --git a/Documentation/vm/numa_memory_policy.txt b/Documentation/vm/numa_memory_policy.txt
> --- a/Documentation/vm/numa_memory_policy.txt
> +++ b/Documentation/vm/numa_memory_policy.txt
> @@ -135,9 +135,11 @@ most general to most specific:
>
> Components of Memory Policies
>
> - A Linux memory policy is a tuple consisting of a "mode" and an optional set
> - of nodes. The mode determine the behavior of the policy, while the
> - optional set of nodes can be viewed as the arguments to the behavior.
> + A Linux memory policy consists of a "mode", optional mode flags, and an
> + optional set of nodes. The mode determines the behavior of the policy,
> + the optional mode flags determine the behavior of the mode, and the
> + optional set of nodes can be viewed as the arguments to the policy
> + behavior.
>
> Internally, memory policies are implemented by a reference counted
> structure, struct mempolicy. Details of this structure will be discussed
> @@ -145,7 +147,12 @@ Components of Memory Policies
>
> Note: in some functions AND in the struct mempolicy itself, the mode
> is called "policy". However, to avoid confusion with the policy tuple,
> - this document will continue to use the term "mode".
> + this document will continue to use the term "mode". Since the mode and
> + optional mode flags are stored in the same struct mempolicy member
> + (specifically, pol->policy), you must use mpol_mode(pol->policy) to
> + access only the mode and mpol_flags(pol->policy) to access only the
> + flags. Any function with a formal of type enum mempolicy_mode only
> + refers to the mode.
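
To make the "mode and flags share one member" point concrete, here is a
minimal sketch of packing and unpacking such a value. All names (sk_pack,
sk_mode, sk_flags, the SK_ constants) and the flag bit position are
assumptions chosen for illustration; the real mpol_mode() and mpol_flags()
are defined earlier in this series and may differ.

```c
#include <assert.h>

/* Illustrative stand-ins; not the definitions from the patch series. */
enum sk_mempolicy_mode { SK_DEFAULT, SK_PREFERRED, SK_BIND, SK_INTERLEAVE };

#define SK_F_STATIC_NODES (1u << 15)       /* assumed bit position */
#define SK_MODE_FLAGS     SK_F_STATIC_NODES /* all known flag bits   */

/* Combine a mode and optional flags into the single stored value. */
static unsigned int sk_pack(enum sk_mempolicy_mode mode, unsigned int flags)
{
    assert((flags & ~SK_MODE_FLAGS) == 0); /* reject unknown flag bits */
    return (unsigned int)mode | flags;
}

/* Extract only the mode by masking off the flag bits. */
static enum sk_mempolicy_mode sk_mode(unsigned int policy)
{
    return (enum sk_mempolicy_mode)(policy & ~SK_MODE_FLAGS);
}

/* Extract only the flags by masking off the mode bits. */
static unsigned int sk_flags(unsigned int policy)
{
    return policy & SK_MODE_FLAGS;
}
```

Comparing the stored value directly against a mode constant would fail
whenever a flag is set, which is why the accessors are needed.
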
>
> Linux memory policy supports the following 4 behavioral modes:
>
> @@ -231,6 +238,28 @@ Components of Memory Policies
> the temporary interleaved system default policy works in this
> mode.
>
> + Linux memory policy supports the following optional mode flag:
> +
> + MPOL_F_STATIC_NODES: This flag specifies that the nodemask passed by
> + the user should not be remapped if the task or VMA's set of accessible
> + nodes changes after the memory policy has been defined.
> +
> + Without this flag, anytime a mempolicy is rebound because of a
> + change in the set of accessible nodes, the node (Preferred) or
> + nodemask (Bind, Interleave) is remapped to the new set of
> + accessible nodes. This may result in nodes being used that were
> + previously undesired. With this flag, the policy is either
> + effected over the user's specified nodemask or the Default
> + behavior is used.
> +
> + For example, consider a task that is attached to a cpuset with
> + mems 1-3 that sets an Interleave policy over the same set. If
> + the cpuset's mems change to 3-5, the Interleave will now occur
> + over nodes 3, 4, and 5. With this flag, however, since only
> + node 3 is accessible from the user's nodemask, the "interleave"
> + only occurs over that node. If no nodes from the user's
> + nodemask are now accessible, the Default behavior is used.
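
The cpuset example above can be modelled with plain bitmasks. This is a
simplified sketch, not the kernel's rebind code: rebind_remap() and
rebind_static() are hypothetical names, node sets are limited to 64 nodes,
and the remap moves each node by its ordinal position within the old
allowed set (nodes outside the old allowed set are simply dropped here).

```c
#include <assert.h>
#include <stdint.h>

/* Number of set bits in mask. */
static int mask_weight(uint64_t mask)
{
    int w = 0;

    for (; mask; mask &= mask - 1)
        w++;
    return w;
}

/* Index of the n-th (0-based) set bit; requires n < mask_weight(mask). */
static int nth_set_bit(uint64_t mask, int n)
{
    assert(n >= 0 && n < mask_weight(mask));
    for (int bit = 0; bit < 64; bit++) {
        if ((mask >> bit) & 1) {
            if (n == 0)
                return bit;
            n--;
        }
    }
    return -1; /* not reached when the assertion holds */
}

/*
 * Model of the default behavior (no flag): the k-th node of the old
 * allowed set maps to the k-th node of the new allowed set, wrapping
 * if the new set is smaller.
 */
static uint64_t rebind_remap(uint64_t user, uint64_t old_allowed,
                             uint64_t new_allowed)
{
    uint64_t out = 0;
    int w = mask_weight(new_allowed);
    int pos = 0;

    if (w == 0)
        return 0;
    for (int node = 0; node < 64; node++) {
        if (!((old_allowed >> node) & 1))
            continue;
        if ((user >> node) & 1)
            out |= UINT64_C(1) << nth_set_bit(new_allowed, pos % w);
        pos++;
    }
    return out;
}

/*
 * Model of MPOL_F_STATIC_NODES: keep only the user's nodes that are
 * still accessible. A result of 0 stands for "fall back to Default".
 */
static uint64_t rebind_static(uint64_t user, uint64_t new_allowed)
{
    return user & new_allowed;
}
```

With the user's mask covering nodes 1-3 and the cpuset moving from mems 1-3
to mems 3-5, the remap model yields nodes 3-5 while the static model yields
node 3 only; if the cpuset moved to mems 4-5 instead, the static model
yields an empty mask, corresponding to the Default behavior.
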
> +
> MEMORY POLICY APIs
>
> Linux supports 3 system calls for controlling memory policy. These APIS
> @@ -251,7 +280,9 @@ Set [Task] Memory Policy:
> Set's the calling task's "task/process memory policy" to mode
> specified by the 'mode' argument and the set of nodes defined
> by 'nmask'. 'nmask' points to a bit mask of node ids containing
> - at least 'maxnode' ids.
> + at least 'maxnode' ids. Optional mode flags may be passed by
> + combining the 'mode' argument with the flag (for example:
> + MPOL_INTERLEAVE | MPOL_F_STATIC_NODES).
>
> See the set_mempolicy(2) man page for more details
>
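
For the set_mempolicy() paragraph, a userspace call with the flag might look
like the sketch below. Assumptions: the constant values are copied from the
mainline <linux/mempolicy.h> as I know it and are redefined locally with an
SK_ prefix (the values in this version of the series may differ), the raw
syscall is used instead of the libnuma wrapper, and
set_static_interleave() is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed values; check them against your kernel headers. */
#define SK_MPOL_INTERLEAVE     3
#define SK_MPOL_F_STATIC_NODES (1 << 15)

/* The 'mode' argument is the mode OR-ed with the optional flag. */
static int static_interleave_mode(void)
{
    return SK_MPOL_INTERLEAVE | SK_MPOL_F_STATIC_NODES;
}

/*
 * Request a static Interleave policy over the nodes set in *nmask.
 * Returns 0 on success or a negative errno value on failure.
 */
static long set_static_interleave(const unsigned long *nmask,
                                  unsigned long maxnode)
{
    assert(nmask != NULL);
#ifdef SYS_set_mempolicy
    if (syscall(SYS_set_mempolicy, static_interleave_mode(),
                nmask, maxnode) == -1)
        return -errno;
    return 0;
#else
    (void)maxnode;
    return -ENOSYS; /* no such syscall on this platform */
#endif
}
```

A caller interleaving over nodes 1-3 would pass a mask of 0x0E with maxnode
set to 8 * sizeof(unsigned long). On a kernel without NUMA support, or
without this series applied, the call is expected to fail rather than
silently ignore the flag.
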
--
~Randy