From: Johannes Weiner <hannes@cmpxchg.org>
To: Shakeel Butt <shakeelb@google.com>
Cc: Greg Thelen <gthelen@google.com>, Roman Gushchin <guro@fb.com>,
Michal Hocko <mhocko@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, cgroups@vger.kernel.org,
linux-kernel@vger.kernel.org,
syzbot+13f93c99c06988391efe@syzkaller.appspotmail.com
Subject: Re: [PATCH] mm: vmscan: memcontrol: remove mem_cgroup_select_victim_node()
Date: Wed, 30 Oct 2019 13:44:55 -0400 [thread overview]
Message-ID: <20191030174455.GA45135@cmpxchg.org> (raw)
In-Reply-To: <20191029234753.224143-1-shakeelb@google.com>
On Tue, Oct 29, 2019 at 04:47:53PM -0700, Shakeel Butt wrote:
> Since commit 1ba6fc9af35b ("mm: vmscan: do not share cgroup iteration
> between reclaimers"), the memcg reclaim does not bail out earlier based
> on sc->nr_reclaimed and will traverse all the nodes. All the reclaimable
> pages of the memcg on all the nodes will be scanned relative to the
> reclaim priority. So, there is no need to maintain state regarding which
> node to start the memcg reclaim from. Also KCSAN complains data races in
> the code maintaining the state.
>
> This patch effectively reverts the commit 889976dbcb12 ("memcg: reclaim
> memory from nodes in round-robin order") and the commit 453a9bf347f1
> ("memcg: fix numa scan information update to be triggered by memory
> event").
>
> Signed-off-by: Shakeel Butt <shakeelb@google.com>
> Reported-by: <syzbot+13f93c99c06988391efe@syzkaller.appspotmail.com>
Excellent, thanks Shakeel!
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Just a request on this bit:
> @@ -3360,16 +3358,9 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
> .may_unmap = 1,
> .may_swap = may_swap,
> };
> + struct zonelist *zonelist = node_zonelist(numa_node_id(), sc.gfp_mask);
>
> set_task_reclaim_state(current, &sc.reclaim_state);
> - /*
> - * Unlike direct reclaim via alloc_pages(), memcg's reclaim doesn't
> - * take care of from where we get pages. So the node where we start the
> - * scan does not need to be the current node.
> - */
> - nid = mem_cgroup_select_victim_node(memcg);
> -
> - zonelist = &NODE_DATA(nid)->node_zonelists[ZONELIST_FALLBACK];
This works, but it *is* somewhat fragile if we decide to add bail-out
conditions to reclaim again. And some numa nodes receiving slightly
less pressure than others could be quite tricky to debug.
Can we add a comment here that points out the assumption that the
zonelist walk is comprehensive, and that all nodes receive equal
reclaim pressure?
Also, I think we should use sc.gfp_mask & ~__GFP_THISNODE, so that
allocations with a physical node preference still do node-agnostic
reclaim for the purpose of cgroup accounting.
next prev parent reply other threads:[~2019-10-30 17:44 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-29 23:47 [PATCH] mm: vmscan: memcontrol: remove mem_cgroup_select_victim_node() Shakeel Butt
2019-10-29 23:51 ` Roman Gushchin
2019-10-30 0:47 ` Andrew Morton
2019-10-30 0:54 ` Andrew Morton
2019-10-30 1:15 ` Shakeel Butt
2019-10-30 11:49 ` Michal Hocko
2019-10-30 17:44 ` Johannes Weiner [this message]
2019-10-30 17:53 ` Michal Hocko
2019-10-30 18:06 ` Johannes Weiner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191030174455.GA45135@cmpxchg.org \
--to=hannes@cmpxchg.org \
--cc=akpm@linux-foundation.org \
--cc=cgroups@vger.kernel.org \
--cc=gthelen@google.com \
--cc=guro@fb.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=shakeelb@google.com \
--cc=syzbot+13f93c99c06988391efe@syzkaller.appspotmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.