From: Michal Hocko <mhocko@suse.com>
To: Feng Tang <feng.tang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Matthew Wilcox <willy@infradead.org>,
	Mel Gorman <mgorman@suse.de>,
	dave.hansen@intel.com, ying.huang@intel.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/2] mm: fix OOMs for binding workloads to movable zone only node
Date: Wed, 4 Nov 2020 09:53:43 +0100
Message-ID: <20201104085343.GA18718@dhcp22.suse.cz>
In-Reply-To: <20201104084021.GB15700@shbuild999.sh.intel.com>

On Wed 04-11-20 16:40:21, Feng Tang wrote:
> On Wed, Nov 04, 2020 at 08:58:19AM +0100, Michal Hocko wrote:
> > On Wed 04-11-20 15:38:26, Feng Tang wrote:
> > [...]
> > > > Could you be more specific about the usecase here? Why do you need a
> > > > binding to a pure movable node? 
> > > 
> > > One common configuration for a platform is small size of DRAM plus huge
> > > size of PMEM (which is slower but cheaper), and my guess of their use
> > > is to try to lead the bulk of user space allocation (GFP_HIGHUSER_MOVABLE)
> > > to the PMEM node, and to let DRAM be used as little as possible.
> > 
> > While this is possible, it is a tricky configuration. It essentially
> > gets us back to 32b and highmem...
> 
> :) Another possible case is a similar binding on a memory-hotpluggable
> platform, which has one unpluggable node and several other nodes configured
> as movable only, so they can be hot-removed when needed.

Yes, another way to shoot yourself in the foot ;)

> > As I've said in reply to your second patch, I think we can make the oom
> > killer behavior more sensible in these misconfigured cases, but I do not
> > think we want to break the cpuset isolation for such a configuration.
> 
> Do you mean we skip the killing and just let the allocation fail? We
> checked the oom killer code first: when the oom happens, both the DRAM
> node and the unmovable node have lots of free memory, and killing a process
> won't improve the situation.

We already skip the oom killer and fail the allocation for lowmem requests.
This is similar in some sense. Another option would be to kill the
allocating context, which would potentially have fewer corner cases because
some allocation failures might be unexpected.
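
To make the comparison concrete, here is a rough userspace C sketch of the
decision described above. It is not kernel code: the enum values and
oom_decide() with its cpuset_movable_only parameter are hypothetical names
made up for illustration. The first check models the existing behavior
(lowmem requests fail rather than invoke the oom killer). The second check
models the "similar" case from this thread, a request that cannot use the
movable zone while its cpuset holds only movable memory, and is only an
assumption about how such a check could look.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical zone ordering, lowest to highest, for illustration only. */
enum zone_idx { Z_DMA, Z_DMA32, Z_NORMAL, Z_MOVABLE };

enum oom_action { OOM_FAIL_ALLOC, OOM_KILL_VICTIM };

/*
 * Decide what to do once reclaim has failed.
 * highest_zone:        highest zone the request is allowed to use
 * cpuset_movable_only: the allowed nodes contain only movable-zone memory
 */
enum oom_action oom_decide(enum zone_idx highest_zone,
			   bool cpuset_movable_only)
{
	assert(highest_zone <= Z_MOVABLE);

	/* Existing behavior: never kill on behalf of a lowmem request. */
	if (highest_zone < Z_NORMAL)
		return OOM_FAIL_ALLOC;

	/*
	 * Assumed analogue: the request cannot use the movable zone and the
	 * cpuset offers nothing else, so killing tasks would not free any
	 * memory this request could use.
	 */
	if (cpuset_movable_only && highest_zone < Z_MOVABLE)
		return OOM_FAIL_ALLOC;

	return OOM_KILL_VICTIM;
}
```

The other option mentioned above, killing the allocating context instead of
failing the request, is not modeled here.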

> (The following is copied from your comments on 2/2)
> > This allows memory allocations to spill over to any other node which
> > has Normal (or other lower) zones, and as such it breaks cpuset isolation.
> > As I've pointed out in the reply to your cover letter, it seems that
> > this is more of a misconfiguration than a bug.
> 
> For this use case (running a docker container), the spilling is already
> happening. I traced its memory allocation requests: many of them are
> movable, and they fall back to the normal node naturally with the current

Could you be more specific? This sounds like a bug. Allocations
shouldn't spill over to a node which is not in the cpuset. There are a few
exceptions like IRQ context, but that shouldn't happen regularly.

> code. Only a few got blocked, as many __alloc_pages_nodemask calls are
> made with a 'NULL' nodemask parameter.
> 
> And I made this RFC patch inspired by code in __alloc_pages_may_oom():
> 
> 	if (gfp_mask & __GFP_NOFAIL)
> 		page = __alloc_pages_cpuset_fallback(gfp_mask, order,
> 				ALLOC_NO_WATERMARKS, ac);

I am not really sure I follow here. __GFP_NOFAIL is a special beast
because such an allocation must not fail. Breaking node affinity is the
only option left. This shouldn't be something used for regular
allocation requests.
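
As a rough illustration of that distinction, here is a userspace C sketch
(not kernel code). struct node, pick_node() and alloc_node() are
hypothetical names invented for this example. It assumes a simplified model
in which every request first tries only the nodes in its cpuset, and only a
request marked as nofail retries with the cpuset restriction dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a memory node. */
struct node {
	bool in_cpuset;		/* node is allowed by the task's cpuset */
	int free_pages;		/* pages currently available */
};

/* Return the index of the first usable node, or -1 if there is none. */
int pick_node(const struct node *nodes, size_t n, bool enforce_cpuset)
{
	assert(nodes != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (enforce_cpuset && !nodes[i].in_cpuset)
			continue;
		if (nodes[i].free_pages > 0)
			return (int)i;
	}
	return -1;
}

/*
 * Regular requests stay inside the cpuset and may fail (-1).
 * Only a nofail request is allowed to break node affinity.
 */
int alloc_node(const struct node *nodes, size_t n, bool nofail)
{
	int nid = pick_node(nodes, n, true);

	if (nid < 0 && nofail)
		nid = pick_node(nodes, n, false);
	return nid;
}
```

With one node outside the cpuset that has free pages and one exhausted node
inside it, a regular request fails, while a nofail request lands on the
outside node.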
-- 
Michal Hocko
SUSE Labs

