From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Michal Hocko <mhocko@suse.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	Ben Widawsky <ben.widawsky@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Feng Tang <feng.tang@intel.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Randy Dunlap <rdunlap@infradead.org>,
	Vlastimil Babka <vbabka@suse.cz>, Andi Kleen <ak@linux.intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Huang Ying <ying.huang@intel.com>,
	linux-api@vger.kernel.org
Subject: Re: [RFC PATCH] mm/mempolicy: add MPOL_PREFERRED_STRICT memory policy
Date: Wed, 13 Oct 2021 19:27:03 +0530
Message-ID: <291424a2-c962-533e-c755-e4239fd55f5d@linux.ibm.com>
In-Reply-To: <9a0baa59-f316-103f-3030-990cd91d1813@linux.ibm.com>

On 10/13/21 18:28, Aneesh Kumar K.V wrote:
> On 10/13/21 18:20, Michal Hocko wrote:
>> On Wed 13-10-21 18:05:49, Aneesh Kumar K.V wrote:
>>> On 10/13/21 16:18, Michal Hocko wrote:
>>>> On Wed 13-10-21 12:42:34, Michal Hocko wrote:
>>>>> [Cc linux-api]
>>>>>
>>>>> On Wed 13-10-21 15:15:39, Aneesh Kumar K.V wrote:
>>>>>> This mempolicy mode can be used with either the set_mempolicy(2)
>>>>>> or mbind(2) interfaces.  Like the MPOL_PREFERRED interface, it
>>>>>> allows an application to set a preference node from which the kernel
>>>>>> will fulfill memory allocation requests. Unlike the MPOL_PREFERRED mode,
>>>>>> it takes a set of nodes. The nodes in the nodemask are used as fallback
>>>>>> allocation nodes if memory is not available on the preferred node.
>>>>>> Unlike MPOL_PREFERRED_MANY, it will not fall back memory allocations
>>>>>> to all nodes in the system. Like the MPOL_BIND interface, it works over a
>>>>>> set of nodes and will cause a SIGSEGV or invoke the OOM killer if
>>>>>> memory is not available on those preferred nodes.
>>>>>>
>>>>>> This patch helps applications to hint a memory allocation preference
>>>>>> node and fallback to _only_ a set of nodes if the memory is not
>>>>>> available on the preferred node.  Fallback allocation is attempted
>>>>>> from the node which is nearest to the preferred node.
>>>>>>
>>>>>> This new memory policy helps applications to have explicit control
>>>>>> on slow memory allocation and avoids default fallback to slow memory
>>>>>> NUMA nodes. The difference with MPOL_BIND is the ability to specify
>>>>>> a preferred node which is the first node in the nodemask argument
>>>>>> passed.
>>>>
>>>> I am sorry but I do not understand the semantic difference from
>>>> MPOL_BIND. Could you be more specific please?
>>>>
>>>
>>>
>>>
>>> MPOL_BIND
>>>     This mode specifies that memory must come from the set of
>>>     nodes specified by the policy.  Memory will be allocated from
>>>     the node in the set with sufficient free memory that is
>>>     closest to the node where the allocation takes place.
>>>
>>>
>>> MPOL_PREFERRED_STRICT
>>>     This mode specifies that the allocation should be attempted
>>>     from the first node specified in the nodemask of the policy.
>>>     If that allocation fails, the kernel will search other nodes
>>>     in the nodemask, in order of increasing distance from the
>>>     preferred node based on information provided by the platform
>>>     firmware.
>>>
>>> The difference is the ability to specify the preferred node as the first
>>> node in the nodemask and all fallback allocations are based on the distance
>>> from the preferred node. With MPOL_BIND they are based on the node where
>>> the allocation takes place.
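
To make the two fallback orders described above concrete, here is a
minimal userspace sketch. It is not kernel code and not part of the
patch. It assumes a made-up 4-node distance table in the style of what
firmware would report (10 = local), and the helper names first_node()
and fallback_order() are hypothetical, chosen only for this
illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NR_NODES 4	/* hypothetical machine size, illustration only */

/* Made-up distance table: 10 means local, larger means farther away. */
static const int node_distance[NR_NODES][NR_NODES] = {
	{ 10, 20, 40, 80 },
	{ 20, 10, 40, 80 },
	{ 40, 40, 10, 80 },
	{ 80, 80, 80, 10 },
};

/* Lowest node set in the mask, or -1 if the mask is empty. */
static int first_node(unsigned long nodemask)
{
	for (int node = 0; node < NR_NODES; node++)
		if (nodemask & (1UL << node))
			return node;
	return -1;
}

/*
 * Fill order[] with the nodes set in nodemask, sorted by increasing
 * distance from origin. Ties keep the lower node id first.
 * Returns the number of entries written.
 */
static size_t fallback_order(unsigned long nodemask, int origin,
			     int order[NR_NODES])
{
	size_t n = 0;

	assert(origin >= 0 && origin < NR_NODES);

	for (int node = 0; node < NR_NODES; node++)
		if (nodemask & (1UL << node))
			order[n++] = node;

	/* Stable insertion sort keyed on distance from origin. */
	for (size_t i = 1; i < n; i++) {
		int cur = order[i];
		size_t j = i;

		while (j > 0 && node_distance[origin][order[j - 1]] >
				node_distance[origin][cur]) {
			order[j] = order[j - 1];
			j--;
		}
		order[j] = cur;
	}
	return n;
}
```

With a nodemask of nodes 1-3 (0xE), passing first_node(0xE) as the
origin models the proposed policy and yields 1, 2, 3 regardless of
where the task runs. Passing the node the task is currently running on
models the MPOL_BIND description above: from node 3 the same mask
yields 3, 1, 2 with this table.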
>>
>> OK, this makes it more clear. Thanks!
>>
>> I am still not sure the semantic makes sense though. Why should
>> the lowest node in the nodemask have any special meaning? What if it is
>> a node with a higher number that somebody prefers to start with?
>>
> 
> That is true. I haven't been able to find an easy way to specify the 
> preferred node other than expressing it as first node in the node mask. 
> Yes, it limits the usage of the policy. Any alternate suggestion?
> 
> We could do
> set_mempolicy(MPOL_PREFERRED, nodemask(nodeX))
> set_mempolicy(MPOL_PREFERRED_EXTEND, nodemask(fallback nodemask for
> above PREFERRED policy))
> 
> But that really complicates the interface?
> 
>

Another option is to keep this mbind(2) specific and overload flags to 
be the preferred nodeid.

mbind(va, len, MPOL_PREFERRED_STRICT, nodemask, max_node, preferred_node);
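
From userspace that could look roughly like the sketch below. This is
only an illustration of the idea, not a tested interface: the mode
constant is a placeholder with a made-up value (no kernel header
defines it, and existing kernels are expected to reject it), and
nodemask_set() and mbind_preferred_strict() are hypothetical helper
names. It goes through syscall(2) so it does not depend on libnuma.

```c
#define _GNU_SOURCE
#include <limits.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * HYPOTHETICAL placeholder for the proposed mode. The value is made up
 * for illustration and does not come from any kernel header.
 */
#define MPOL_PREFERRED_STRICT_PLACEHOLDER 100

#define BITS_PER_WORD (CHAR_BIT * sizeof(unsigned long))

/* Set the bit for one node in a caller-provided nodemask array. */
static void nodemask_set(unsigned long *mask, unsigned int node)
{
	mask[node / BITS_PER_WORD] |= 1UL << (node % BITS_PER_WORD);
}

/*
 * Sketch of the overloaded call: the last mbind(2) argument, normally
 * the MPOL_MF_* flags, would carry the preferred node id instead.
 * Returns the raw syscall result (-1 with errno set on failure).
 */
static long mbind_preferred_strict(void *addr, unsigned long len,
				   const unsigned long *nodemask,
				   unsigned long maxnode,
				   unsigned int preferred_node)
{
	return syscall(SYS_mbind, addr, len,
		       MPOL_PREFERRED_STRICT_PLACEHOLDER,
		       nodemask, maxnode, preferred_node);
}
```

A caller would build the fallback set with nodemask_set(mask, 1) and
nodemask_set(mask, 3), then pass 3 as preferred_node, which removes the
"lowest node is special" restriction. The obvious cost is that the
MPOL_MF_* flags can no longer be passed for this mode.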

  -aneesh


