All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yury Norov <ynorov@nvidia.com>
To: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@kernel.org>, Zi Yan <ziy@nvidia.com>,
	Matthew Brost <matthew.brost@intel.com>,
	Rakie Kim <rakie.kim@sk.com>, Byungchul Park <byungchul@sk.com>,
	Gregory Price <gourry@gourry.net>,
	Ying Huang <ying.huang@linux.alibaba.com>,
	Alistair Popple <apopple@nvidia.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Farhad Alemi <farhad.alemi@berkeley.edu>,
	Waiman Long <longman@redhat.com>,
	Rasmus Villemoes <linux@rasmusvillemoes.dk>,
	cgroups@vger.kernel.org
Subject: Re: [PATCH] mm: don't allow empty relative nodemask in mpol_relative_nodemask()
Date: Fri, 29 May 2026 13:47:12 -0400	[thread overview]
Message-ID: <ahnRIDBk4bQ3xX2q@yury> (raw)
In-Reply-To: <20260529152616.2308736-1-joshua.hahnjy@gmail.com>

On Fri, May 29, 2026 at 08:26:15AM -0700, Joshua Hahn wrote:
> On Thu, 28 May 2026 12:41:33 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> > On Thu, 28 May 2026 15:03:37 -0400 Yury Norov <ynorov@nvidia.com> wrote:
> > 
> > > Reassigning nodes relative an empty user-provided nodemask is useless,
> > > and triggers divide-by-zero in the function.
> > > 
> > > Reported-by: Farhad Alemi <farhad.alemi@berkeley.edu>
> > > Link: https://lore.kernel.org/all/CA+0ovCgxbZkXa+OU8w3s84R3KNPNxxRfmsNR-udh+afQBbGNmw@mail.gmail.com/
> > 
> > Thanks both.
> > 
> > It looks like this is very old code, so we'll be wanting a cc:stable in
> > this.
> > 
> > > --- a/mm/mempolicy.c
> > > +++ b/mm/mempolicy.c
> > > @@ -370,8 +370,13 @@ static inline int mpol_store_user_nodemask(const struct mempolicy *pol)
> > >  static void mpol_relative_nodemask(nodemask_t *ret, const nodemask_t *orig,
> > >  				   const nodemask_t *rel)
> > >  {
> > > +	unsigned int w = nodes_weight(*rel);
> > >  	nodemask_t tmp;
> > > -	nodes_fold(tmp, *orig, nodes_weight(*rel));
> > > +
> > > +	if (w == 0)
> > > +		return -EINVAL;
> > > +
> > > +	nodes_fold(tmp, *orig, w);
> > >  	nodes_onto(*ret, tmp, *rel);
> > >  }
> > 
> > I suspect we should address this at the mpol level - it should never
> > have got that far.  Hopefully the mempolicy maintainers can have a
> > think.
> 
> Hello Andrew, hello Yury,
> 
> I agree with Andrew here.
> mpol_relative_nodemask is called from two places, the first being
> mpol_rebind_nodemask which is the calling function seen in the bug report as
> well.
> 
> The other place is mpol_set_nodemask, which has a helpful comment that notes:
> "mpol_set_nodemask is called after mpol_new() [...snip...] mpol_new() has
> already validated the nodes parameter with respect to the policy mode and
> flags".
> 
> So it seems like we are missing the big if-else if-else if block from mpol_new
> in other places that should in fact have it, like mpol_rebind_nodemask.
> 
> The approach proposed here of just checking whether the node weight is 0
> won't work for a few cases, namely for MPOL_DEFAULT and MPOL_PREFERRED where
> empty nodemasks are actually allowed. So what should really be done here is to
> do the full policy-nodemask checking section in mpol_new and call that from
> mpol_set_nodemask as well.
> 
> Thank you for taking a shot at fixing the bug report, please let me know what
> you think! Have a great day : -)

Hi Joshua.

Indeed, quick and dirty shot.

The problem is that nodes_fold() can't work with the sz == 0. In
other words, folding to a 0-bit bitmap is an error. We don't check
that on bitmaps level because it's an internal helper, and it's a
caller's responsibility to validate the parameters.

nodes_onto(), or more specifically bitmap_onto(), is a different
story. In case of empty relmap, the function actually clears all the
bits in dst and returns.

I see 2 options to move this forward.

1. Simply disallow empty relmap in mpol_relative_nodemask(). There's
no valid cases for it, AFAIK, so the nodes_fold() limitation looks
reasonable. We can consider it as a new policy.

We've got 2 users for mpol_relative_nodemask(). In mpol_set_nodemask()
we can simply propagate the error; and in mpol_rebind_nodemask() we
can throw a warning and do nothing.

2. Follow the spirit of the nodes_onto(), and in case of empty
relmask, clean the ret mask and bail out

I'm in a favor for the 1st option, because empty relmask looks buggy
anyways.

> The approach proposed here of just checking whether the node weight is 0
> won't work for a few cases, namely for MPOL_DEFAULT and MPOL_PREFERRED where
> empty nodemasks are actually allowed.

Not sure I understand this. The mpol_relative_nodemask() is called
only if MPOL_F_RELATIVE_NODES is set. In mpol_rebind_nodemask(), if
both MPOL_F_STATIC_NODES and MPOL_F_RELATIVE_NODES are set, the former
wins. How would the RELATIVE mode mess with the others?

The mpol_new() code seemingly tries to disable empty nodes in case of
MPOL_DEFAILT and MPOL_PREFERRED + MPOL_F_RELATIVE_NODES, but obviously
it doesn't work very well in the rebind case.

Anyways, I'm not really deep in mempolicy domain, so please educate me if
I miss something.

Thanks,
Yury


  reply	other threads:[~2026-05-29 17:47 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-28 19:03 [PATCH] mm: don't allow empty relative nodemask in mpol_relative_nodemask() Yury Norov
2026-05-28 19:37 ` Waiman Long
2026-05-28 19:40   ` Yury Norov
2026-05-28 19:37 ` Matthew Wilcox
2026-05-28 19:41 ` Andrew Morton
2026-05-29 15:26   ` Joshua Hahn
2026-05-29 17:47     ` Yury Norov [this message]
2026-05-29 18:40       ` Joshua Hahn
2026-06-01 14:32       ` David Hildenbrand (Arm)
2026-06-02  8:44         ` Gregory Price
2026-06-02  9:19           ` David Hildenbrand (Arm)
2026-06-02  9:54             ` Gregory Price
2026-06-02 15:01               ` Farhad Alemi
2026-06-05 15:18                 ` David Hildenbrand (Arm)
2026-05-29  8:47 ` kernel test robot
2026-05-29  8:58 ` kernel test robot
2026-05-29 12:45 ` kernel test robot
2026-05-29 12:47 ` kernel test robot
2026-06-01 14:06 ` David Hildenbrand (Arm)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ahnRIDBk4bQ3xX2q@yury \
    --to=ynorov@nvidia.com \
    --cc=akpm@linux-foundation.org \
    --cc=apopple@nvidia.com \
    --cc=byungchul@sk.com \
    --cc=cgroups@vger.kernel.org \
    --cc=david@kernel.org \
    --cc=farhad.alemi@berkeley.edu \
    --cc=gourry@gourry.net \
    --cc=joshua.hahnjy@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux@rasmusvillemoes.dk \
    --cc=longman@redhat.com \
    --cc=matthew.brost@intel.com \
    --cc=rakie.kim@sk.com \
    --cc=ying.huang@linux.alibaba.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.