linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Michal Hocko <mhocko@suse.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	Ben Widawsky <ben.widawsky@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Feng Tang <feng.tang@intel.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Randy Dunlap <rdunlap@infradead.org>,
	Vlastimil Babka <vbabka@suse.cz>, Andi Kleen <ak@linux.intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Huang Ying <ying.huang@intel.com>,
	linux-api@vger.kernel.org
Subject: Re: [PATCH v5 2/3] mm/mempolicy: add set_mempolicy_home_node syscall
Date: Mon, 29 Nov 2021 19:17:06 +0530	[thread overview]
Message-ID: <8735nfcbvp.fsf@linux.ibm.com> (raw)
In-Reply-To: <YaTLVCKl9t5RCfQR@dhcp22.suse.cz>

Michal Hocko <mhocko@suse.com> writes:

> On Mon 29-11-21 16:16:05, Aneesh Kumar K.V wrote:
>> Michal Hocko <mhocko@suse.com> writes:
>> 
>> > On Tue 16-11-21 12:12:37, Aneesh Kumar K.V wrote:
> [...]
>> >> +SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, len,
>> >> +		unsigned long, home_node, unsigned long, flags)
>> >> +{
>> >> +	struct mm_struct *mm = current->mm;
>> >> +	struct vm_area_struct *vma;
>> >> +	struct mempolicy *new;
>> >> +	unsigned long vmstart;
>> >> +	unsigned long vmend;
>> >> +	unsigned long end;
>> >> +	int err = -ENOENT;
>> >> +
>> >> +	if (start & ~PAGE_MASK)
>> >> +		return -EINVAL;
>> >> +	/*
>> >> +	 * flags is used for future extension if any.
>> >> +	 */
>> >> +	if (flags != 0)
>> >> +		return -EINVAL;
>> >> +
>> >> +	if (!node_online(home_node))
>> >> +		return -EINVAL;
>> >
>> > You really want to check the home_node before dereferencing the mask.
>> >
>> 
>> Any reason why we want to check for home node first?
>
> Because the given node is an index to node_states[N_ONLINE] bitmap. I do
> not think we do range checking there.

Will add this

	if (home_node >= MAX_NUMNODES || !node_online(home_node))
		return -EINVAL;



>
>> >> +	len = (len + PAGE_SIZE - 1) & PAGE_MASK;
>> >> +	end = start + len;
>> >> +
>> >> +	if (end < start)
>> >> +		return -EINVAL;
>> >> +	if (end == start)
>> >> +		return 0;
>> >> +	mmap_write_lock(mm);
>> >> +	vma = find_vma(mm, start);
>> >> +	for (; vma && vma->vm_start < end;  vma = vma->vm_next) {
>> >> +
>> >> +		vmstart = max(start, vma->vm_start);
>> >> +		vmend   = min(end, vma->vm_end);
>> >> +		new = mpol_dup(vma_policy(vma));
>> >> +		if (IS_ERR(new)) {
>> >> +			err = PTR_ERR(new);
>> >> +			break;
>> >> +		}
>> >> +		/*
>> >> +		 * Only update home node if there is an existing vma policy
>> >> +		 */
>> >> +		if (!new)
>> >> +			continue;
>> >
>> > Your changelog only mentions MPOL_BIND and MPOL_PREFERRED_MANY as
>> > supported but you seem to be applying the home node to all existing
>> > policieso
>> 
>> 
>> The restriction is done in policy_node. 
>> 
>> @@ -1801,6 +1856,11 @@ static int policy_node(gfp_t gfp, struct mempolicy *policy, int nd)
>> 		WARN_ON_ONCE(policy->mode == MPOL_BIND && (gfp & __GFP_THISNODE));
>> 	}
>> 
>> +	if ((policy->mode == MPOL_BIND ||
>> +	     policy->mode == MPOL_PREFERRED_MANY) &&
>> +	    policy->home_node != NUMA_NO_NODE)
>> +		return policy->home_node;
>> +
>> 	return nd;
>>  }
>
> But you do allow to set the home node also for other policies and that
> means that a default MPOL_INTERLEAVE would be different from the one
> with home_node set up even though they behave exactly the same.

I agree that there is no error returned if we try to set the home_node
for other memory policies. But there should not be any behaviour
differences. We ignore home_node for policies other than BIND and
PREFERRED_MANY.

The reason I avoided erroring out for other policies was to simplify the
error handling. For example, for a range of addr with a mix of memory
policy MPOLD_BIND and MPOL_INTERLEAVE what should be the state after the
above system call? We could say, we ignore setting home_node for vma
with policy MPOL_INTERLEAVE and leave the home node set for vma with
policy MPOL_BIND. Or should we make the system call return error also
undoing the changes done for vmas for which we have set the home_node?

-aneesh

  reply	other threads:[~2021-11-29 13:49 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20211116064238.727454-1-aneesh.kumar@linux.ibm.com>
2021-11-16  6:42 ` [PATCH v5 1/3] mm/mempolicy: use policy_node helper with MPOL_PREFERRED_MANY Aneesh Kumar K.V
2021-11-29 10:11   ` Michal Hocko
2021-11-29 10:12   ` [PATCH 4/3] mm: drop node from alloc_pages_vma Michal Hocko
2021-11-16  6:42 ` [PATCH v5 2/3] mm/mempolicy: add set_mempolicy_home_node syscall Aneesh Kumar K.V
2021-11-29 10:32   ` Michal Hocko
2021-11-29 10:46     ` Aneesh Kumar K.V
2021-11-29 12:45       ` Michal Hocko
2021-11-29 13:47         ` Aneesh Kumar K.V [this message]
2021-11-29 14:52           ` Michal Hocko
2021-11-29 14:59             ` Aneesh Kumar K.V
2021-11-29 15:19               ` Michal Hocko
2021-11-29 22:02   ` Andrew Morton
2021-11-30  8:59     ` Aneesh Kumar K.V
2021-11-30  9:59       ` Michal Hocko
2021-12-01  3:00       ` Andrew Morton
2021-12-01  6:22         ` Aneesh Kumar K.V
2021-12-01  0:47   ` Daniel Jordan
2021-12-01  6:15     ` Aneesh Kumar K.V
2021-12-01 16:22       ` Daniel Jordan
2021-11-16  6:42 ` [PATCH v5 3/3] mm/mempolicy: wire up syscall set_mempolicy_home_node Aneesh Kumar K.V

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8735nfcbvp.fsf@linux.ibm.com \
    --to=aneesh.kumar@linux.ibm.com \
    --cc=aarcange@redhat.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=ben.widawsky@intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=feng.tang@intel.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=rdunlap@infradead.org \
    --cc=vbabka@suse.cz \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).