From: Michal Hocko <mhocko@kernel.org>
To: David Rientjes <rientjes@google.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
linux-mm@kvack.org, Roman Gushchin <guro@fb.com>
Subject: Re: [PATCH 4/4] mm, oom: Fix unnecessary killing of additional processes.
Date: Tue, 21 Aug 2018 08:09:52 +0200
Message-ID: <20180821060952.GU29735@dhcp22.suse.cz>
In-Reply-To: <alpine.DEB.2.21.1808201429400.58458@chino.kir.corp.google.com>

On Mon 20-08-18 14:31:04, David Rientjes wrote:
> On Mon, 20 Aug 2018, Michal Hocko wrote:
>
> > > The oom reaper will always be unable to free some memory, such as page
> > > tables. If it can't grab mm->mmap_sem in a reasonable amount of time, it
> > > also can give up early. The munlock() case is another example. We
> > > experience unnecessary oom killing during free_pgtables() where the
> > > single-threaded exit_mmap() is freeing an enormous amount of page tables
> > > (usually a malloc implementation such as tcmalloc that does not free
> > > virtual memory) and other processes are faulting faster than we can free.
> > > It's a combination of a multiprocessor system and a lot of virtual memory
> > > from the original victim. This is the same case as being unable to
> > > munlock quickly enough in exit_mmap() to free the memory.
> > >
> > > We must wait until free_pgtables() completes in exit_mmap() before killing
> > > additional processes in the large majority (99.96% of cases from my data)
> > > of instances where oom livelock does not occur. In the remainder of
> > > situations, livelock has been prevented by what the oom reaper has been
> > > able to free. We can, of course, not do free_pgtables() from the oom
> > > reaper. So my approach was to allow for a reasonable amount of time for
> > > the victim to free a lot of memory before declaring that additional
> > > processes must be oom killed. It would be functionally similar to having
> > > the oom reaper retry many, many more times than 10 and having a linked
> > > list of mm_structs to reap. I don't care one way or another if it's a
> > > timeout based solution or many, many retries that have schedule_timeout()
> > > that yields the same time period in the end.
> >
> > I would really keep the current retry logic with an extension to allow
> > to keep retrying or hand over to exit_mmap when we know it is past the
> > last moment of blocking.
> >
>
> Ok, so it appears you're suggesting a per-mm counter of oom reaper retries
> and once it reaches a certain threshold, either give up and set
> MMF_OOM_SKIP or declare that exit_mmap() is responsible for it. That's
> fine, but obviously I'll be suggesting that the threshold is rather large.
> So if I adjust my patch to be a retry counter rather than timestamp, do
> you have any other reservations?

It absolutely has to be an internal thing without any user API to set
it. Also, I still haven't heard any specific argument why the oom
reaper would need to do a per-task attempt and loop over all victims on
the list. Maybe you have some examples, though.

I believe that we really need to think about the handover between the
two paths first and only build a more elaborate retry logic on top of
it.
--
Michal Hocko
SUSE Labs