From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Michal Hocko <mhocko@suse.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
Ben Widawsky <ben.widawsky@intel.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Feng Tang <feng.tang@intel.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Mel Gorman <mgorman@techsingularity.net>,
Mike Kravetz <mike.kravetz@oracle.com>,
Randy Dunlap <rdunlap@infradead.org>,
Vlastimil Babka <vbabka@suse.cz>, Andi Kleen <ak@linux.intel.com>,
Dan Williams <dan.j.williams@intel.com>,
Huang Ying <ying.huang@intel.com>,
linux-api@vger.kernel.org
Subject: Re: [RFC PATCH] mm/mempolicy: add MPOL_PREFERRED_STRICT memory policy
Date: Wed, 13 Oct 2021 19:27:03 +0530
Message-ID: <291424a2-c962-533e-c755-e4239fd55f5d@linux.ibm.com>
In-Reply-To: <9a0baa59-f316-103f-3030-990cd91d1813@linux.ibm.com>
On 10/13/21 18:28, Aneesh Kumar K.V wrote:
> On 10/13/21 18:20, Michal Hocko wrote:
>> On Wed 13-10-21 18:05:49, Aneesh Kumar K.V wrote:
>>> On 10/13/21 16:18, Michal Hocko wrote:
>>>> On Wed 13-10-21 12:42:34, Michal Hocko wrote:
>>>>> [Cc linux-api]
>>>>>
>>>>> On Wed 13-10-21 15:15:39, Aneesh Kumar K.V wrote:
>>>>>> This mempolicy mode can be used with either the set_mempolicy(2)
>>>>>> or mbind(2) interfaces. Like the MPOL_PREFERRED interface, it
>>>>>> allows an application to set a preference node from which the kernel
>>>>>> will fulfill memory allocation requests. Unlike the MPOL_PREFERRED
>>>>>> mode, it takes a set of nodes. The nodes in the nodemask are used as
>>>>>> fallback allocation nodes if memory is not available on the
>>>>>> preferred node. Unlike MPOL_PREFERRED_MANY, it will not fall back
>>>>>> memory allocations to all nodes in the system. Like the MPOL_BIND
>>>>>> interface, it works over a set of nodes and will cause a SIGSEGV or
>>>>>> invoke the OOM killer if memory is not available on those preferred
>>>>>> nodes.
>>>>>>
>>>>>> This patch helps applications to hint a memory allocation preference
>>>>>> node and fallback to _only_ a set of nodes if the memory is not
>>>>>> available on the preferred node. Fallback allocation is attempted
>>>>>> from the node which is nearest to the preferred node.
>>>>>>
>>>>>> This new memory policy helps applications to have explicit control
>>>>>> on slow memory allocation and avoids default fallback to slow memory
>>>>>> NUMA nodes. The difference with MPOL_BIND is the ability to specify
>>>>>> a preferred node which is the first node in the nodemask argument
>>>>>> passed.
>>>>
>>>> I am sorry but I do not understand the semantic difference from
>>>> MPOL_BIND. Could you be more specific please?
>>>>
>>>
>>>
>>>
>>> MPOL_BIND
>>> This mode specifies that memory must come from the set of
>>> nodes specified by the policy. Memory will be allocated from
>>> the node in the set with sufficient free memory that is
>>> closest to the node where the allocation takes place.
>>>
>>>
>>> MPOL_PREFERRED_STRICT
>>> This mode specifies that the allocation should be attempted
>>> from the first node specified in the nodemask of the policy.
>>> If that allocation fails, the kernel will search other nodes
>>> in the nodemask, in order of increasing distance from the
>>> preferred node based on information provided by the platform
>>> firmware.
>>>
>>> The difference is the ability to specify the preferred node as the first
>>> node in the nodemask and all fallback allocations are based on the
>>> distance from the preferred node. With MPOL_BIND they are based on the
>>> node where the allocation takes place.
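To make the difference concrete, here is a rough user-space model of the
two fallback orders. It is only a sketch, not the patch: the distance
table is made up for illustration, and fallback_order() is a hypothetical
helper, not a kernel function. The only thing that changes between the
two policies is the node the distances are measured from.

```c
#include <assert.h>
#include <stddef.h>

#define NR_NODES 4

/*
 * Hypothetical distance table in the style of a firmware-provided SLIT:
 * 10 means local, larger means farther. The values are invented.
 */
static const int node_distance[NR_NODES][NR_NODES] = {
	{ 10, 20, 40, 80 },
	{ 20, 10, 40, 80 },
	{ 40, 40, 10, 20 },
	{ 80, 80, 20, 10 },
};

/*
 * Fill order[] with the nodes set in nodemask, sorted by increasing
 * distance from origin. Ties go to the lower node id.
 * Returns the number of entries written.
 *
 * MPOL_BIND model:             origin = node where the allocation runs
 * MPOL_PREFERRED_STRICT model: origin = preferred node
 */
static size_t fallback_order(unsigned long nodemask, int origin,
			     int order[NR_NODES])
{
	size_t n = 0;

	assert(origin >= 0 && origin < NR_NODES);

	for (int node = 0; node < NR_NODES; node++) {
		size_t i;

		if (!(nodemask & (1UL << node)))
			continue;

		/* Insertion sort keyed on distance from origin. */
		i = n++;
		while (i > 0 && node_distance[origin][order[i - 1]] >
				node_distance[origin][node]) {
			order[i] = order[i - 1];
			i--;
		}
		order[i] = node;
	}
	return n;
}
```

With a nodemask of nodes 0, 2 and 3 and a task running on node 3, the
MPOL_BIND model walks the nodes starting from node 3, while the
MPOL_PREFERRED_STRICT model with preferred node 0 starts from node 0,
regardless of where the task runs.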
>>
>> OK, this makes it more clear. Thanks!
>>
>> I am still not sure the semantic makes sense though. Why should
>> the lowest node in the nodemask have any special meaning? What if it is
>> a node with a higher number that somebody prefers to start with?
>>
>
> That is true. I haven't been able to find an easy way to specify the
> preferred node other than expressing it as first node in the node mask.
> Yes, it limits the usage of the policy. Any alternate suggestion?
>
> We could do
>
> set_mempolicy(MPOL_PREFERRED, nodemask(nodeX))
> set_mempolicy(MPOL_PREFERRED_EXTEND, nodemask(fallback nodemask for
>               above PREFERRED policy))
>
> But that really complicates the interface?
>
>
Another option is to keep this mbind(2)-specific and overload the flags
argument to carry the preferred node id:

mbind(va, len, MPOL_PREFERRED_STRICT, nodemask, max_node, preferred_node);
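As a rough illustration of what a caller would prepare for such a call,
the sketch below builds an mbind(2)-style nodemask and checks that the
preferred node is part of it. build_nodemask() and preferred_in_mask()
are hypothetical helpers. Whether the preferred node would have to be a
member of the nodemask is an assumption on my side, not something the
RFC settles. The mbind() call itself is only shown in a comment, because
MPOL_PREFERRED_STRICT and the overloaded flags argument are a proposal
and are not accepted by current kernels.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Set the bit for every listed node in an mbind(2)-style nodemask of
 * mask_words unsigned longs. Returns 0 on success, -1 if a node id
 * does not fit into the mask.
 */
static int build_nodemask(unsigned long *mask, size_t mask_words,
			  const int *nodes, size_t nr_nodes)
{
	assert(mask != NULL && nodes != NULL);

	memset(mask, 0, mask_words * sizeof(*mask));
	for (size_t i = 0; i < nr_nodes; i++) {
		if (nodes[i] < 0 ||
		    (size_t)nodes[i] >= mask_words * BITS_PER_LONG)
			return -1;
		mask[nodes[i] / BITS_PER_LONG] |=
			1UL << (nodes[i] % BITS_PER_LONG);
	}
	return 0;
}

/* Return 1 if the preferred node is set in the nodemask, 0 otherwise. */
static int preferred_in_mask(const unsigned long *mask, size_t mask_words,
			     int preferred)
{
	if (preferred < 0 ||
	    (size_t)preferred >= mask_words * BITS_PER_LONG)
		return 0;
	return (mask[preferred / BITS_PER_LONG] >>
		(preferred % BITS_PER_LONG)) & 1UL;
}

/*
 * Proposed usage (hypothetical, not a merged interface):
 *
 *   mbind(va, len, MPOL_PREFERRED_STRICT, mask, max_node, preferred_node);
 */
```

Passing the preferred node separately means it no longer has to be the
lowest-numbered node in the mask, for example nodes {1, 3, 5} with node 5
preferred.
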
-aneesh