All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Michal Hocko <mhocko@suse.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	Ben Widawsky <ben.widawsky@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Feng Tang <feng.tang@intel.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Randy Dunlap <rdunlap@infradead.org>,
	Vlastimil Babka <vbabka@suse.cz>, Andi Kleen <ak@linux.intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Huang Ying <ying.huang@intel.com>,
	linux-api@vger.kernel.org
Subject: Re: [PATCH v5 2/3] mm/mempolicy: add set_mempolicy_home_node syscall
Date: Mon, 29 Nov 2021 19:17:06 +0530	[thread overview]
Message-ID: <8735nfcbvp.fsf@linux.ibm.com> (raw)
In-Reply-To: <YaTLVCKl9t5RCfQR@dhcp22.suse.cz>

Michal Hocko <mhocko@suse.com> writes:

> On Mon 29-11-21 16:16:05, Aneesh Kumar K.V wrote:
>> Michal Hocko <mhocko@suse.com> writes:
>> 
>> > On Tue 16-11-21 12:12:37, Aneesh Kumar K.V wrote:
> [...]
>> >> +SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, len,
>> >> +		unsigned long, home_node, unsigned long, flags)
>> >> +{
>> >> +	struct mm_struct *mm = current->mm;
>> >> +	struct vm_area_struct *vma;
>> >> +	struct mempolicy *new;
>> >> +	unsigned long vmstart;
>> >> +	unsigned long vmend;
>> >> +	unsigned long end;
>> >> +	int err = -ENOENT;
>> >> +
>> >> +	if (start & ~PAGE_MASK)
>> >> +		return -EINVAL;
>> >> +	/*
>> >> +	 * flags is used for future extension if any.
>> >> +	 */
>> >> +	if (flags != 0)
>> >> +		return -EINVAL;
>> >> +
>> >> +	if (!node_online(home_node))
>> >> +		return -EINVAL;
>> >
>> > You really want to check the home_node before dereferencing the mask.
>> >
>> 
>> Any reason why we want to check for home node first?
>
> Because the given node is an index to node_states[N_ONLINE] bitmap. I do
> not think we do range checking there.

Will add this

	if (home_node >= MAX_NUMNODES || !node_online(home_node))
		return -EINVAL;



>
>> >> +	len = (len + PAGE_SIZE - 1) & PAGE_MASK;
>> >> +	end = start + len;
>> >> +
>> >> +	if (end < start)
>> >> +		return -EINVAL;
>> >> +	if (end == start)
>> >> +		return 0;
>> >> +	mmap_write_lock(mm);
>> >> +	vma = find_vma(mm, start);
>> >> +	for (; vma && vma->vm_start < end;  vma = vma->vm_next) {
>> >> +
>> >> +		vmstart = max(start, vma->vm_start);
>> >> +		vmend   = min(end, vma->vm_end);
>> >> +		new = mpol_dup(vma_policy(vma));
>> >> +		if (IS_ERR(new)) {
>> >> +			err = PTR_ERR(new);
>> >> +			break;
>> >> +		}
>> >> +		/*
>> >> +		 * Only update home node if there is an existing vma policy
>> >> +		 */
>> >> +		if (!new)
>> >> +			continue;
>> >
>> > Your changelog only mentions MPOL_BIND and MPOL_PREFERRED_MANY as
>> > supported but you seem to be applying the home node to all existing
>> > policieso
>> 
>> 
>> The restriction is done in policy_node. 
>> 
>> @@ -1801,6 +1856,11 @@ static int policy_node(gfp_t gfp, struct mempolicy *policy, int nd)
>> 		WARN_ON_ONCE(policy->mode == MPOL_BIND && (gfp & __GFP_THISNODE));
>> 	}
>> 
>> +	if ((policy->mode == MPOL_BIND ||
>> +	     policy->mode == MPOL_PREFERRED_MANY) &&
>> +	    policy->home_node != NUMA_NO_NODE)
>> +		return policy->home_node;
>> +
>> 	return nd;
>>  }
>
> But you do allow to set the home node also for other policies and that
> means that a default MPOL_INTERLEAVE would be different from the one
> with home_node set up even though they behave exactly the same.

I agree that there is no error returned if we try to set the home_node
for other memory policies. But there should not be any behaviour
differences. We ignore home_node for policies other than BIND and
PREFERRED_MANY.

The reason I avoided erroring out for other policies was to simplify the
error handling. For example, for a range of addr with a mix of memory
policy MPOLD_BIND and MPOL_INTERLEAVE what should be the state after the
above system call? We could say, we ignore setting home_node for vma
with policy MPOL_INTERLEAVE and leave the home node set for vma with
policy MPOL_BIND. Or should we make the system call return error also
undoing the changes done for vmas for which we have set the home_node?

-aneesh

  reply	other threads:[~2021-11-29 13:49 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-16  6:42 [PATCH v5 0/3] mm: add new syscall set_mempolicy_home_node Aneesh Kumar K.V
2021-11-16  6:42 ` [PATCH v5 1/3] mm/mempolicy: use policy_node helper with MPOL_PREFERRED_MANY Aneesh Kumar K.V
2021-11-29 10:11   ` Michal Hocko
2021-11-29 10:12   ` [PATCH 4/3] mm: drop node from alloc_pages_vma Michal Hocko
2021-11-16  6:42 ` [PATCH v5 2/3] mm/mempolicy: add set_mempolicy_home_node syscall Aneesh Kumar K.V
2021-11-29 10:32   ` Michal Hocko
2021-11-29 10:46     ` Aneesh Kumar K.V
2021-11-29 12:45       ` Michal Hocko
2021-11-29 13:47         ` Aneesh Kumar K.V [this message]
2021-11-29 14:52           ` Michal Hocko
2021-11-29 14:59             ` Aneesh Kumar K.V
2021-11-29 15:19               ` Michal Hocko
2021-11-29 22:02   ` Andrew Morton
2021-11-30  8:59     ` Aneesh Kumar K.V
2021-11-30  9:59       ` Michal Hocko
2021-12-01  3:00       ` Andrew Morton
2021-12-01  6:22         ` Aneesh Kumar K.V
2021-12-01  0:47   ` Daniel Jordan
2021-12-01  6:15     ` Aneesh Kumar K.V
2021-12-01 16:22       ` Daniel Jordan
2021-11-16  6:42 ` [PATCH v5 3/3] mm/mempolicy: wire up syscall set_mempolicy_home_node Aneesh Kumar K.V
2021-11-29  8:37 ` [PATCH v5 0/3] mm: add new " Aneesh Kumar K.V

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8735nfcbvp.fsf@linux.ibm.com \
    --to=aneesh.kumar@linux.ibm.com \
    --cc=aarcange@redhat.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=ben.widawsky@intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=feng.tang@intel.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=rdunlap@infradead.org \
    --cc=vbabka@suse.cz \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.