From: Michal Hocko <mhocko@suse.cz>
To: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
Andrew Morton <akpm@linux-foundation.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
cgroups@vger.kernel.org
Subject: Re: [patch 1/2] mm, memcg: avoid oom notification when current needs access to memory reserves
Date: Thu, 12 Dec 2013 11:50:03 +0100
Message-ID: <20131212105003.GC2630@dhcp22.suse.cz>
In-Reply-To: <20131212103159.GB2630@dhcp22.suse.cz>
On Thu 12-12-13 11:31:59, Michal Hocko wrote:
[...]
> > > Anyway.
> > > Does the reclaim make any sense for PF_EXITING tasks? Shouldn't we
> > > simply bypass charges of these tasks automatically. Those tasks will
> > > free some memory anyway so why to trigger reclaim and potentially OOM
> > > in the first place? Do we need to go via TIF_MEMDIE loop in the first
> > > place?
> > >
> >
> > I don't see any reason to make an optimization there since they will get
> > TIF_MEMDIE set if reclaim has failed on one of their charges or if it
> > results in a system oom through the page allocator's oom killer.
>
> This all will happen after MEM_CGROUP_RECLAIM_RETRIES full reclaim
> rounds. Is it really worth the additional overhead just to later say "OK
> go ahead and skip charges"?
> And for the !oom memcg it might reclaim some pages which could have
> stayed on LRUs just to free some memory a little bit later and release
> the memory pressure.
> So I would rather go with
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index c72b03bf9679..fee25c5934d2 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2692,7 +2693,8 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
>  	 * MEMDIE process.
>  	 */
>  	if (unlikely(test_thread_flag(TIF_MEMDIE)
> -		     || fatal_signal_pending(current)))
> +		     || fatal_signal_pending(current))
> +		     || current->flags & PF_EXITING)
>  		goto bypass;
>
>  	if (unlikely(task_in_memcg_oom(current)))
>
> rather than the later checks down the oom_synchronize paths. The comment
> already mentions dying process...

Here it is with the full changelog. I will repost it in a separate
thread if you are OK with this.
---
From 6e24846f531bb1cc2e68383abf9a6c63b4ee1f78 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Thu, 12 Dec 2013 11:37:27 +0100
Subject: [PATCH] memcg: Do not hang on OOM when killed by userspace OOM

Eric has reported that he regularly sees task(s) stuck in the memcg OOM
handler. The only way out is to

	echo 0 > $GROUP/memory.oom_control

His usecase is:
- Set up a hierarchy with memory and the freezer
  (disable kernel oom and have a process watch for oom).
- In that memory cgroup add a process with one thread per cpu.
- In one thread slowly allocate memory, once per second (I think it is
  16M of RAM), and mlock and dirty it (just to force the pages into RAM
  and keep them there).
- When oom is reached, loop:
  * attempt to freeze all of the tasks.
  * if frozen, send every task SIGKILL, unfreeze, remove the directory
    in cgroupfs.

Eric has then pinpointed the issue to be memcg specific.

All tasks are sitting on the memcg_oom_waitq when memcg oom is disabled.
Those that have received a fatal signal will bypass the charge and
should continue on their way out. The tricky part is that the exit path
might trigger a page fault (e.g. exit_robust_list), and thus a memcg
charge, while its memcg is still under OOM because nobody has released
any charges yet. Unlike with the in-kernel OOM handler, the exiting task
doesn't get TIF_MEMDIE set, so it doesn't shortcut further charges and
falls into the memcg OOM again without any way out of it, as there are
no fatal signals pending anymore.

This patch fixes the issue by checking PF_EXITING early in
__mem_cgroup_try_charge and bypassing the charge, the same as if the
task had a fatal signal pending or TIF_MEMDIE set.

Normally exiting (not killed) tasks will bypass the charge now as well,
but this should be OK: the task is leaving and will release its memory,
and increasing the memory pressure just to release it a moment later
seems a dubious waste of cycles. Besides that, charges after
exit_signals should be rare.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
mm/memcontrol.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c72b03bf9679..98900c070045 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2692,7 +2692,8 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
 	 * MEMDIE process.
 	 */
 	if (unlikely(test_thread_flag(TIF_MEMDIE)
-		     || fatal_signal_pending(current)))
+		     || fatal_signal_pending(current))
+		     || current->flags & PF_EXITING)
 		goto bypass;
 
 	if (unlikely(task_in_memcg_oom(current)))
--
1.8.4.4
--
Michal Hocko
SUSE Labs
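
For anybody who wants to see the difference between the old and the new
check without reading the charge path, here is a rough userspace model of
the condition changed in the patch above. It is only a sketch: struct
model_task and all function names are made up for illustration and the
three booleans merely stand in for test_thread_flag(TIF_MEMDIE),
fatal_signal_pending(current) and current->flags & PF_EXITING. It assumes,
as described in the changelog, that a task killed by the userspace handler
no longer has a fatal signal pending once it is in the exit path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the task state the check looks at. */
struct model_task {
	bool tif_memdie;		/* models test_thread_flag(TIF_MEMDIE) */
	bool fatal_signal_pending;	/* models fatal_signal_pending(current) */
	bool pf_exiting;		/* models current->flags & PF_EXITING */
};

/* The condition before the patch. */
static bool bypass_before(const struct model_task *t)
{
	return t->tif_memdie || t->fatal_signal_pending;
}

/* The condition after the patch: exiting tasks bypass the charge too. */
static bool bypass_after(const struct model_task *t)
{
	return bypass_before(t) || t->pf_exiting;
}

/*
 * A task killed by the userspace OOM handler which is already in the
 * exit path: no fatal signal is pending anymore, TIF_MEMDIE was never
 * set because the in-kernel OOM killer is disabled, PF_EXITING is set.
 */
static struct model_task killed_task_in_exit_path(void)
{
	struct model_task t = {
		.tif_memdie = false,
		.fatal_signal_pending = false,
		.pf_exiting = true,
	};
	return t;
}

/* Walk through the three interesting states. */
static void check_model(void)
{
	struct model_task running = { false, false, false };
	struct model_task just_killed = { false, true, false };
	struct model_task exiting = killed_task_in_exit_path();

	/* A regular task is charged normally with both variants. */
	assert(!bypass_before(&running) && !bypass_after(&running));
	/* A pending SIGKILL bypasses the charge with both variants. */
	assert(bypass_before(&just_killed) && bypass_after(&just_killed));
	/* Only the new check lets the exiting task skip the charge. */
	assert(!bypass_before(&exiting) && bypass_after(&exiting));
}
```

Calling check_model() from a main() should pass silently. The last
assertion is the case from Eric's report: with the old condition the task
would go on to the memcg OOM path again.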