From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Michal Hocko <mhocko@suse.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
Ben Widawsky <ben.widawsky@intel.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Feng Tang <feng.tang@intel.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Mel Gorman <mgorman@techsingularity.net>,
Mike Kravetz <mike.kravetz@oracle.com>,
Randy Dunlap <rdunlap@infradead.org>,
Vlastimil Babka <vbabka@suse.cz>, Andi Kleen <ak@linux.intel.com>,
Dan Williams <dan.j.williams@intel.com>,
Huang Ying <ying.huang@intel.com>,
linux-api@vger.kernel.org
Subject: Re: [PATCH v5 2/3] mm/mempolicy: add set_mempolicy_home_node syscall
Date: Mon, 29 Nov 2021 19:17:06 +0530 [thread overview]
Message-ID: <8735nfcbvp.fsf@linux.ibm.com> (raw)
In-Reply-To: <YaTLVCKl9t5RCfQR@dhcp22.suse.cz>
Michal Hocko <mhocko@suse.com> writes:
> On Mon 29-11-21 16:16:05, Aneesh Kumar K.V wrote:
>> Michal Hocko <mhocko@suse.com> writes:
>>
>> > On Tue 16-11-21 12:12:37, Aneesh Kumar K.V wrote:
> [...]
>> >> +SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, len,
>> >> + unsigned long, home_node, unsigned long, flags)
>> >> +{
>> >> + struct mm_struct *mm = current->mm;
>> >> + struct vm_area_struct *vma;
>> >> + struct mempolicy *new;
>> >> + unsigned long vmstart;
>> >> + unsigned long vmend;
>> >> + unsigned long end;
>> >> + int err = -ENOENT;
>> >> +
>> >> + if (start & ~PAGE_MASK)
>> >> + return -EINVAL;
>> >> + /*
>> >> + * flags is used for future extension if any.
>> >> + */
>> >> + if (flags != 0)
>> >> + return -EINVAL;
>> >> +
>> >> + if (!node_online(home_node))
>> >> + return -EINVAL;
>> >
>> > You really want to check the home_node before dereferencing the mask.
>> >
>>
>> Any reason why we want to check for home node first?
>
> Because the given node is an index to node_states[N_ONLINE] bitmap. I do
> not think we do range checking there.
Will add this
if (home_node >= MAX_NUMNODES || !node_online(home_node))
return -EINVAL;
>
>> >> + len = (len + PAGE_SIZE - 1) & PAGE_MASK;
>> >> + end = start + len;
>> >> +
>> >> + if (end < start)
>> >> + return -EINVAL;
>> >> + if (end == start)
>> >> + return 0;
>> >> + mmap_write_lock(mm);
>> >> + vma = find_vma(mm, start);
>> >> + for (; vma && vma->vm_start < end; vma = vma->vm_next) {
>> >> +
>> >> + vmstart = max(start, vma->vm_start);
>> >> + vmend = min(end, vma->vm_end);
>> >> + new = mpol_dup(vma_policy(vma));
>> >> + if (IS_ERR(new)) {
>> >> + err = PTR_ERR(new);
>> >> + break;
>> >> + }
>> >> + /*
>> >> + * Only update home node if there is an existing vma policy
>> >> + */
>> >> + if (!new)
>> >> + continue;
>> >
>> > Your changelog only mentions MPOL_BIND and MPOL_PREFERRED_MANY as
>> > supported but you seem to be applying the home node to all existing
>> > policieso
>>
>>
>> The restriction is done in policy_node.
>>
>> @@ -1801,6 +1856,11 @@ static int policy_node(gfp_t gfp, struct mempolicy *policy, int nd)
>> WARN_ON_ONCE(policy->mode == MPOL_BIND && (gfp & __GFP_THISNODE));
>> }
>>
>> + if ((policy->mode == MPOL_BIND ||
>> + policy->mode == MPOL_PREFERRED_MANY) &&
>> + policy->home_node != NUMA_NO_NODE)
>> + return policy->home_node;
>> +
>> return nd;
>> }
>
> But you do allow to set the home node also for other policies and that
> means that a default MPOL_INTERLEAVE would be different from the one
> with home_node set up even though they behave exactly the same.
I agree that there is no error returned if we try to set the home_node
for other memory policies. But there should not be any behaviour
differences. We ignore home_node for policies other than BIND and
PREFERRED_MANY.
The reason I avoided erroring out for other policies was to simplify the
error handling. For example, for a range of addr with a mix of memory
policy MPOLD_BIND and MPOL_INTERLEAVE what should be the state after the
above system call? We could say, we ignore setting home_node for vma
with policy MPOL_INTERLEAVE and leave the home node set for vma with
policy MPOL_BIND. Or should we make the system call return error also
undoing the changes done for vmas for which we have set the home_node?
-aneesh
next prev parent reply other threads:[~2021-11-29 13:49 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20211116064238.727454-1-aneesh.kumar@linux.ibm.com>
2021-11-16 6:42 ` [PATCH v5 1/3] mm/mempolicy: use policy_node helper with MPOL_PREFERRED_MANY Aneesh Kumar K.V
2021-11-29 10:11 ` Michal Hocko
2021-11-29 10:12 ` [PATCH 4/3] mm: drop node from alloc_pages_vma Michal Hocko
2021-11-16 6:42 ` [PATCH v5 2/3] mm/mempolicy: add set_mempolicy_home_node syscall Aneesh Kumar K.V
2021-11-29 10:32 ` Michal Hocko
2021-11-29 10:46 ` Aneesh Kumar K.V
2021-11-29 12:45 ` Michal Hocko
2021-11-29 13:47 ` Aneesh Kumar K.V [this message]
2021-11-29 14:52 ` Michal Hocko
2021-11-29 14:59 ` Aneesh Kumar K.V
2021-11-29 15:19 ` Michal Hocko
2021-11-29 22:02 ` Andrew Morton
2021-11-30 8:59 ` Aneesh Kumar K.V
2021-11-30 9:59 ` Michal Hocko
2021-12-01 3:00 ` Andrew Morton
2021-12-01 6:22 ` Aneesh Kumar K.V
2021-12-01 0:47 ` Daniel Jordan
2021-12-01 6:15 ` Aneesh Kumar K.V
2021-12-01 16:22 ` Daniel Jordan
2021-11-16 6:42 ` [PATCH v5 3/3] mm/mempolicy: wire up syscall set_mempolicy_home_node Aneesh Kumar K.V
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8735nfcbvp.fsf@linux.ibm.com \
--to=aneesh.kumar@linux.ibm.com \
--cc=aarcange@redhat.com \
--cc=ak@linux.intel.com \
--cc=akpm@linux-foundation.org \
--cc=ben.widawsky@intel.com \
--cc=dan.j.williams@intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=feng.tang@intel.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@techsingularity.net \
--cc=mhocko@suse.com \
--cc=mike.kravetz@oracle.com \
--cc=rdunlap@infradead.org \
--cc=vbabka@suse.cz \
--cc=ying.huang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).