From: Vladimir Davydov <vdavydov@virtuozzo.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
David Rientjes <rientjes@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] mm: oom: deduplicate victim selection code for memcg and global oom
Date: Wed, 8 Jun 2016 16:52:04 +0300 [thread overview]
Message-ID: <20160608135204.GA30465@esperanza> (raw)
In-Reply-To: <20160608083334.GF22570@dhcp22.suse.cz>
On Wed, Jun 08, 2016 at 10:33:34AM +0200, Michal Hocko wrote:
> On Fri 27-05-16 17:17:42, Vladimir Davydov wrote:
> [...]
> > @@ -970,26 +1028,25 @@ bool out_of_memory(struct oom_control *oc)
> > !oom_unkillable_task(current, NULL, oc->nodemask) &&
> > current->signal->oom_score_adj != OOM_SCORE_ADJ_MIN) {
> > get_task_struct(current);
> > - oom_kill_process(oc, current, 0, totalpages,
> > - "Out of memory (oom_kill_allocating_task)");
> > + oom_kill_process(oc, current, 0, totalpages);
> > return true;
> > }
>
> Do we really want to introduce sysctl_oom_kill_allocating_task to memcg
> as well?
Not sure, but why not? We take into account dump_tasks and panic_on_oom
on memcg oom so why should we treat this sysctl differently?
> The heuristic is quite dubious even for the global context IMHO
> because it leads to a very random behavior.
>
> > p = select_bad_process(oc, &points, totalpages);
> > /* Found nothing?!?! Either we hang forever, or we panic. */
> > - if (!p && !is_sysrq_oom(oc)) {
> > + if (!p && !is_sysrq_oom(oc) && !oc->memcg) {
> > dump_header(oc, NULL);
> > panic("Out of memory and no killable processes...\n");
> > }
> > if (p && p != (void *)-1UL) {
> > - oom_kill_process(oc, p, points, totalpages, "Out of memory");
> > + oom_kill_process(oc, p, points, totalpages);
> > /*
> > * Give the killed process a good chance to exit before trying
> > * to allocate memory again.
> > */
> > schedule_timeout_killable(1);
> > }
> > - return true;
> > + return !!p;
> > }
>
> Now if you look at out_of_memory() the only shared "heuristic" with the
> memcg part is the bypass for the exiting tasks.
bypass exiting task (task_will_free_mem)
check for panic (check_panic_on_oom)
oom badness evaluation (oom_scan_process_thread or oom_evaluate_task
after your patch)
points calculation + kill (oom_kill_process)
And if you need to modify any of these function calls or add yet another
check, you have to do it twice. Ugly.
> Plus both need the oom_lock.
I believe locking could be unified for global/memcg oom cases too.
> You have to special case oom notifiers, panic on no victim handling and
> I guess the oom_kill_allocating task is not intentional either. So I
> am not really sure this is an improvement. I even hate how we conflate
> sysrq vs. regular global oom context together but my cleanup for that
> has failed in the past.
>
> The victim selection code can be reduced because it is basically
> shared between the two, only the iterator differs. But I guess that
> can be eliminated by a simple helper.
IMHO exporting a bunch of very oom-specific helpers (like those I
enumerated above), partially revealing oom implementation, instead of
well defined memcg helpers that could be reused anywhere else looks
ugly. It's like having shrink_zone implementation both in vmscan.c and
memcontrol.c with shrink_slab, shrink_lruvec, etc. exported, because we
need to iterate over cgroups there.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: Vladimir Davydov <vdavydov@virtuozzo.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
David Rientjes <rientjes@google.com>,
Johannes Weiner <hannes@cmpxchg.org>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2] mm: oom: deduplicate victim selection code for memcg and global oom
Date: Wed, 8 Jun 2016 16:52:04 +0300 [thread overview]
Message-ID: <20160608135204.GA30465@esperanza> (raw)
In-Reply-To: <20160608083334.GF22570@dhcp22.suse.cz>
On Wed, Jun 08, 2016 at 10:33:34AM +0200, Michal Hocko wrote:
> On Fri 27-05-16 17:17:42, Vladimir Davydov wrote:
> [...]
> > @@ -970,26 +1028,25 @@ bool out_of_memory(struct oom_control *oc)
> > !oom_unkillable_task(current, NULL, oc->nodemask) &&
> > current->signal->oom_score_adj != OOM_SCORE_ADJ_MIN) {
> > get_task_struct(current);
> > - oom_kill_process(oc, current, 0, totalpages,
> > - "Out of memory (oom_kill_allocating_task)");
> > + oom_kill_process(oc, current, 0, totalpages);
> > return true;
> > }
>
> Do we really want to introduce sysctl_oom_kill_allocating_task to memcg
> as well?
Not sure, but why not? We take into account dump_tasks and panic_on_oom
on memcg oom so why should we treat this sysctl differently?
> The heuristic is quite dubious even for the global context IMHO
> because it leads to a very random behavior.
>
> > p = select_bad_process(oc, &points, totalpages);
> > /* Found nothing?!?! Either we hang forever, or we panic. */
> > - if (!p && !is_sysrq_oom(oc)) {
> > + if (!p && !is_sysrq_oom(oc) && !oc->memcg) {
> > dump_header(oc, NULL);
> > panic("Out of memory and no killable processes...\n");
> > }
> > if (p && p != (void *)-1UL) {
> > - oom_kill_process(oc, p, points, totalpages, "Out of memory");
> > + oom_kill_process(oc, p, points, totalpages);
> > /*
> > * Give the killed process a good chance to exit before trying
> > * to allocate memory again.
> > */
> > schedule_timeout_killable(1);
> > }
> > - return true;
> > + return !!p;
> > }
>
> Now if you look at out_of_memory() the only shared "heuristic" with the
> memcg part is the bypass for the exiting tasks.
bypass exiting task (task_will_free_mem)
check for panic (check_panic_on_oom)
oom badness evaluation (oom_scan_process_thread or oom_evaluate_task
after your patch)
points calculation + kill (oom_kill_process)
And if you need to modify any of these function calls or add yet another
check, you have to do it twice. Ugly.
> Plus both need the oom_lock.
I believe locking could be unified for global/memcg oom cases too.
> You have to special case oom notifiers, panic on no victim handling and
> I guess the oom_kill_allocating task is not intentional either. So I
> am not really sure this is an improvement. I even hate how we conflate
> sysrq vs. regular global oom context together but my cleanup for that
> has failed in the past.
>
> The victim selection code can be reduced because it is basically
> shared between the two, only the iterator differs. But I guess that
> can be eliminated by a simple helper.
IMHO exporting a bunch of very oom-specific helpers (like those I
enumerated above), partially revealing oom implementation, instead of
well defined memcg helpers that could be reused anywhere else looks
ugly. It's like having shrink_zone implementation both in vmscan.c and
memcontrol.c with shrink_slab, shrink_lruvec, etc. exported, because we
need to iterate over cgroups there.
next prev parent reply other threads:[~2016-06-08 13:52 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-05-27 14:17 [PATCH 1/2] mm: oom: add memcg to oom_control Vladimir Davydov
2016-05-27 14:17 ` Vladimir Davydov
2016-05-27 14:17 ` [PATCH 2/2] mm: oom: deduplicate victim selection code for memcg and global oom Vladimir Davydov
2016-05-27 14:17 ` Vladimir Davydov
2016-05-27 14:26 ` Michal Hocko
2016-05-27 14:26 ` Michal Hocko
2016-05-27 14:45 ` Vladimir Davydov
2016-05-27 14:45 ` Vladimir Davydov
2016-05-27 14:52 ` Michal Hocko
2016-05-27 14:52 ` Michal Hocko
2016-05-27 17:23 ` Johannes Weiner
2016-05-27 17:23 ` Johannes Weiner
2016-06-08 8:33 ` Michal Hocko
2016-06-08 8:33 ` Michal Hocko
2016-06-08 11:18 ` Tetsuo Handa
2016-06-08 11:18 ` Tetsuo Handa
2016-06-08 11:24 ` Tetsuo Handa
2016-06-08 11:24 ` Tetsuo Handa
2016-06-08 14:21 ` Michal Hocko
2016-06-08 14:21 ` Michal Hocko
2016-06-08 13:52 ` Vladimir Davydov [this message]
2016-06-08 13:52 ` Vladimir Davydov
2016-06-08 14:46 ` Michal Hocko
2016-06-08 14:46 ` Michal Hocko
2016-05-27 14:34 ` [PATCH 1/2] mm: oom: add memcg to oom_control Michal Hocko
2016-05-27 14:34 ` Michal Hocko
2016-05-27 17:20 ` Johannes Weiner
2016-05-27 17:20 ` Johannes Weiner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160608135204.GA30465@esperanza \
--to=vdavydov@virtuozzo.com \
--cc=akpm@linux-foundation.org \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=penguin-kernel@I-love.SAKURA.ne.jp \
--cc=rientjes@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.