From: Johannes Weiner <hannes@cmpxchg.org>
To: Shakeel Butt <shakeelb@google.com>
Cc: Roman Gushchin <guro@fb.com>, Linux MM <linux-mm@kvack.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
David Rientjes <rientjes@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Tejun Heo <tj@kernel.org>,
kernel-team@fb.com, Cgroups <cgroups@vger.kernel.org>,
linux-doc@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RESEND v12 3/6] mm, oom: cgroup-aware OOM killer
Date: Tue, 31 Oct 2017 14:44:11 -0400
Message-ID: <20171031184411.GA641@cmpxchg.org>
In-Reply-To: <CALvZod5tVoX20Lir=4jnWMXzsEGhh1qCbi73j5vs_n6ViR80yw@mail.gmail.com>
On Tue, Oct 31, 2017 at 10:50:43AM -0700, Shakeel Butt wrote:
> On Tue, Oct 31, 2017 at 9:40 AM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> > On Tue, Oct 31, 2017 at 08:04:19AM -0700, Shakeel Butt wrote:
> >> > +
> >> > +static void select_victim_memcg(struct mem_cgroup *root, struct oom_control *oc)
> >> > +{
> >> > +	struct mem_cgroup *iter;
> >> > +
> >> > +	oc->chosen_memcg = NULL;
> >> > +	oc->chosen_points = 0;
> >> > +
> >> > +	/*
> >> > +	 * The oom_score is calculated for leaf memory cgroups (including
> >> > +	 * the root memcg).
> >> > +	 */
> >> > +	rcu_read_lock();
> >> > +	for_each_mem_cgroup_tree(iter, root) {
> >> > +		long score;
> >> > +
> >> > +		if (memcg_has_children(iter) && iter != root_mem_cgroup)
> >> > +			continue;
> >> > +
> >>
> >> Cgroup v2 does not support charge migration between memcgs. So, there
> >> can be intermediate nodes which may contain the major charge of the
> >> processes in their leaf descendants. Skipping such intermediate nodes
> >> will kind of protect such processes from oom-killer (lower on the list
> >> to be killed). Is it ok to not handle such scenario? If yes, shouldn't
> >> we document it?
> >
> > Tasks cannot be in intermediate nodes, so the only way you can end up
> > in a situation like this is to start tasks fully, let them fault in
> > their full workingset, then create child groups and move them there.
> >
> > That has attribution problems much wider than the OOM killer: any
> > local limits you would set on a leaf cgroup like this ALSO won't
> > control the memory of its tasks - as it's all sitting in the parent.
> >
> > We created the "no internal competition" rule exactly to prevent this
> > situation.
>
> Rather than the "no internal competition" restriction I think "charge
> migration" would have resolved that situation? Also "no internal
> competition" restriction (I am assuming 'no internal competition' is
> no tasks in internal nodes, please correct me if I am wrong) has made
> "charge migration" hard to implement and thus not added in cgroup v2.
>
> I know this is parallel discussion and excuse my ignorance, what are
> other reasons behind "no internal competition" specifically for memory
> controller?
Sorry, but this is completely off-topic.

The rationale for this decision is in Documentation/cgroup-v2.txt.
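
For readers following the quoted exchange, the scenario Shakeel describes can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the patch's code: `struct toy_memcg`, `toy_scan()`, `toy_select_victim()`, and `toy_example_victim()` are hypothetical names, and the score is assumed to be just the memory charged locally to a group, which is simpler than whatever the real patch computes. The only part taken from the thread is the shape of the check that skips non-root groups with children.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a memory cgroup; NOT the kernel's struct mem_cgroup. */
struct toy_memcg {
	const char *name;
	long local_charge;		/* pages charged directly to this group */
	struct toy_memcg *first_child;	/* NULL for a leaf */
	struct toy_memcg *next_sibling;
};

/* Depth-first walk that scores only leaves and the system root. */
static void toy_scan(struct toy_memcg *node, const struct toy_memcg *sys_root,
		     struct toy_memcg **chosen, long *chosen_points)
{
	struct toy_memcg *child;

	/*
	 * Same shape as the quoted check: a group with children is skipped
	 * unless it is the root. Assumption: score == local charge.
	 */
	if (!node->first_child || node == sys_root) {
		if (node->local_charge > *chosen_points) {
			*chosen_points = node->local_charge;
			*chosen = node;
		}
	}

	for (child = node->first_child; child; child = child->next_sibling)
		toy_scan(child, sys_root, chosen, chosen_points);
}

/* Returns the highest-scoring scored group, or NULL if every score is 0. */
static struct toy_memcg *toy_select_victim(struct toy_memcg *sys_root)
{
	struct toy_memcg *chosen = NULL;
	long chosen_points = 0;

	assert(sys_root != NULL);
	toy_scan(sys_root, sys_root, &chosen, &chosen_points);
	return chosen;
}

/*
 * Example hierarchy: tasks faulted in 900 pages while in P, then were
 * moved into the new children A and B, so the charge stayed in P.
 *
 *   root(0)
 *     P(900) -> A(50), B(100)
 *     C(300)
 */
static struct toy_memcg *toy_example_victim(void)
{
	static struct toy_memcg b = { "B", 100, NULL, NULL };
	static struct toy_memcg a = { "A", 50, NULL, &b };
	static struct toy_memcg c = { "C", 300, NULL, NULL };
	static struct toy_memcg p = { "P", 900, &a, &c };
	static struct toy_memcg root = { "root", 0, &p, NULL };

	return toy_select_victim(&root);
}
```

In this model, `toy_example_victim()` selects C, although the subtree under P accounts for far more memory. That is the attribution problem discussed above: the 900 pages sitting in P are invisible both to a leaf-only selection and to any limit set on A or B.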