All of lore.kernel.org
 help / color / mirror / Atom feed
From: Feng Tang <feng.tang@intel.com>
To: Muchun Song <songmuchun@bytedance.com>
Cc: "Hocko, Michal" <mhocko@suse.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"bwidawsk@kernel.org" <bwidawsk@kernel.org>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm: mempolicy: fix policy_nodemask() for MPOL_PREFERRED_MANY case
Date: Tue, 2 Aug 2022 13:52:05 +0800	[thread overview]
Message-ID: <Yui7hWZYMX31ktOr@feng-skl> (raw)
In-Reply-To: <YuidPA9knCOoaT0c@FVFYT0MHHV2J>

On Tue, Aug 02, 2022 at 11:42:52AM +0800, Muchun Song wrote:
> On Mon, Aug 01, 2022 at 05:26:23PM +0800, Feng Tang wrote:
> > On Mon, Aug 01, 2022 at 05:06:14PM +0800, Michal Hocko wrote:
> > > On Mon 01-08-22 16:42:07, Muchun Song wrote:
> > > > policy_nodemask() is supposed to be returned a nodemask representing a mempolicy
> > > > for filtering nodes for page allocation, which is a hard restriction (see the user
> > > > of allowed_mems_nr() in hugetlb.c).  However, MPOL_PREFERRED_MANY is a preferred
> > > > mode not a hard restriction.  Now it breaks the user of HugeTLB.  Remove it from
> > > > policy_nodemask() to fix it, which will not affect current users of policy_nodemask()
> > > > since all of the users already have handled the case of MPOL_PREFERRED_MANY before
> > > > calling it.  BTW, it is found by code inspection.
> > > 
> > > I am not sure this is the right fix. It is quite true that
> > > policy_nodemask is a tricky function to use. It pretends to have a
> > > higher level logic but all existing users are expected to be policy
> > > aware and they special case allocation for each policy. That would mean
> > > that hugetlb should do the same.
> > 
> > Yes, when I worked on the MPOL_PREFERRED_MANY patches, I was also
> > confused about policy_nodemask(), as it is never a 'strict' one as
> > the old code is:
> > 
> > 	if (unlikely(mode == MPOL_BIND) &&
> > 		apply_policy_zone(policy, gfp_zone(gfp)) &&
> > 		cpuset_nodemask_valid_mems_allowed(&policy->nodes))
> > 		return &policy->nodes;
> > 
> > 	return NULL
> > 
> > Even when the MPOL_BIND's nodes is not allowed by cpuset, it will 
> > still return NULL (equals all nodes).
> >
> 
> Well, I agree policy_nodemask() is really confusing because of the
> shortage of comments and the weird logic.
> 
> > From the semantics of allowed_mems_nr(), I think it does get changed
> > a little by b27abaccf8e8. And to enforce the 'strict' semantic for
> > 'allowed', we may need a more strict nodemask API for it.
> >
> 
> Maybe this is a good idea to fix this, e.g. introducing a new helper
> to return the strict allowed nodemask.

Yep. 

I had another thought to add one global all-zero nodemask, for API like
policy_nodemask(), it has 2 types of return value:
* a nodemask with some bits set
* NULL (means all nodes)

Here a new type of zero nodemask (a gloabl variable)can be created to
indicate no qualified node.

> > > I haven't checked the actual behavior implications for hugetlb here. Is
> > > MPOL_PREFERRED_MANY even supported for hugetlb? Does this change make it
> > > work? From a quick look this just ignores MPOL_PREFERRED_MANY
> > > completely.
> > 
> > IIRC, the hugetlb will hornor MPOL_PREFERRED_MANY. And I can double
> > check and report back if otherwise.
> >
> > > > Fixes: b27abaccf8e8 ("mm/mempolicy: add MPOL_PREFERRED_MANY for multiple preferred nodes")
> > > > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > > > ---
> > > >  mm/mempolicy.c | 3 ---
> > > >  1 file changed, 3 deletions(-)
> > > > 
> > > > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > > > index 6c27acb6cd63..4deec7e598c6 100644
> > > > --- a/mm/mempolicy.c
> > > > +++ b/mm/mempolicy.c
> > > > @@ -1845,9 +1845,6 @@ nodemask_t *policy_nodemask(gfp_t gfp, struct mempolicy *policy)
> > > >  		cpuset_nodemask_valid_mems_allowed(&policy->nodes))
> > > >  		return &policy->nodes;
> > > >  
> > > > -	if (mode == MPOL_PREFERRED_MANY)
> > > > -		return &policy->nodes;
> > 
> > I think it will make MPOL_PREFERRED_MANY not usable.
> >
> 
> Sorry, I didn't got what you mean here. Could you explain more details
> about why it is not usable?
 
I thought alloc_pages() will rely on policy_nodemask(), which was wrong
as I forgot the MPOL_PREFERRED_MANY has a dedicated function
alloc_pages_preferred_many() to handle it. Sorry for the confusion.

Thanks,
Feng

> Thanks.
> 
> > Thanks,
> > Feng
> > 
> > > > -
> > > >  	return NULL;
> > > >  }
> > > >  
> > > > -- 
> > > > 2.11.0
> > > 
> > > -- 
> > > Michal Hocko
> > > SUSE Labs
> > 
> 


  reply	other threads:[~2022-08-02  5:52 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-01  8:42 [PATCH] mm: mempolicy: fix policy_nodemask() for MPOL_PREFERRED_MANY case Muchun Song
2022-08-01  9:06 ` Michal Hocko
2022-08-01  9:26   ` Feng Tang
2022-08-02  3:42     ` Muchun Song
2022-08-02  5:52       ` Feng Tang [this message]
2022-08-02  6:40         ` Muchun Song
2022-08-02  7:39           ` Feng Tang
2022-08-02  9:02             ` Michal Hocko
2022-08-03  6:41               ` Feng Tang
2022-08-03  7:36                 ` Michal Hocko
2022-08-03 17:14                   ` Feng Tang
2022-08-03 11:28                     ` Michal Hocko
2022-08-03 20:43                       ` Feng Tang
2022-08-03 12:56                         ` Michal Hocko
2022-08-03 21:08                           ` Feng Tang
2022-08-03 13:21                             ` Michal Hocko
2022-08-04  8:27                               ` Feng Tang
2022-08-04 10:43                                 ` Michal Hocko
2022-08-04 13:03                                   ` [PATCH] mm/hugetlb: add dedicated func to get 'allowed' nodemask for current process Feng Tang
2022-08-04 13:36                                     ` Michal Hocko
2022-08-04 22:37                                       ` Andrew Morton
2022-08-05  0:06                                         ` Feng Tang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yui7hWZYMX31ktOr@feng-skl \
    --to=feng.tang@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=bwidawsk@kernel.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=songmuchun@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.