From: Michal Hocko <mhocko@kernel.org>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: David Rientjes <rientjes@google.com>,
linux-mm@kvack.org, Roman Gushchin <guro@fb.com>
Subject: Re: [PATCH 4/4] mm, oom: Fix unnecessary killing of additional processes.
Date: Fri, 10 Aug 2018 13:16:04 +0200 [thread overview]
Message-ID: <20180810111604.GA1644@dhcp22.suse.cz> (raw)
In-Reply-To: <be42a7c0-015e-2992-a40d-20af21e8c0fc@i-love.sakura.ne.jp>
On Fri 10-08-18 19:54:39, Tetsuo Handa wrote:
> On 2018/08/10 18:07, Michal Hocko wrote:
> >> Yes, of course, it would be easy to rely on exit_mmap() to set
> >> MMF_OOM_SKIP itself and have the oom reaper drop the task from its list
> >> when we are assured of forward progress. What polling are you proposing
> >> other than a timeout based mechanism to do this?
> >
> > I was thinking about doing something like the following
> > - oom_reaper checks the amount of victim's memory after it is done with
> > reaping (e.g. by calling oom_badness before and after).
>
> OK. We can apply
>
> +static inline unsigned long oom_victim_mm_score(struct mm_struct *mm)
> +{
> + return get_mm_rss(mm) + get_mm_counter(mm, MM_SWAPENTS) +
> + mm_pgtables_bytes(mm) / PAGE_SIZE;
> +}
>
> and
>
> - points = get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS) +
> - mm_pgtables_bytes(p->mm) / PAGE_SIZE;
> + points = oom_victim_mm_score(p->mm);
>
> part, can't we?
>
> > If it wasn't able to
> > reclaim much then return false and keep retrying with the existing
> > mechanism
>
> How do you decide whether oom_reaper() was not able to reclaim much?
Just a rule of thumb. If it freed at least few kBs then we should be good
to MMF_OOM_SKIP.
> > - once a flag (e.g. MMF_OOM_MMAP) is set it bails out and won't set the
> > MMF_OOM_SKIP flag.
>
> Unless oom_victim_mm_score() becomes close to 0, setting MMF_OOM_SKIP is
> considered premature. oom_reaper() will have to keep retrying....
there absolutely have to be a cap for retrying. Otherwise you have
lockup scenarios back when the memory is mostly consumed by page tables.
> >> We could set a MMF_EXIT_MMAP in exit_mmap() to specify that it will
> >> complete free_pgtables() for that mm. The problem is the same: when does
> >> the oom reaper decide to set MMF_OOM_SKIP because MMF_EXIT_MMAP has not
> >> been set in a timely manner?
> >
> > reuse the current retry policy which is the number of attempts rather
> > than any timeout.
>
> And this is really I can't understand. The number of attempts multiplied
> by retry interval _is_ nothing but timeout.
Yes it is a timeout but it is not the time that matters. It is that we
have tried sufficient times. Looks at it this way. You can retry 5 times
in 10s or just once. Depending on what is going on in the system. I
would really prefer the behavior to be deterministic.
> We are already using timeout based decision, with some attempt to reclaim
> memory if conditions are met.
Timeout based decision is when you, well, make a decision after a
certain time passes. And we do not do that.
> >> If this is an argument that the oom reaper should loop checking for
> >> MMF_EXIT_MMAP and doing schedule_timeout(1) a set number of times rather
> >> than just setting the jiffies in the mm itself, that's just implementing
> >> the same thing and doing so in a way where the oom reaper stalls operating
> >> on a single mm rather than round-robin iterating over mm's in my patch.
> >
> > I've said earlier that I do not mind doing round robin in the oom repaer
> > but this is certainly more complex than what we do now and I haven't
> > seen any actual example where it would matter. OOM reaper is a safely
> > measure. Nothing should fall apart if it is slow.
>
> The OOM reaper can fail if allocating threads have high priority. You seem to
> assume that realtime threads won't trigger OOM path. But since !PF_WQ_WORKER
> threads do only cond_resched() due to your "the cargo cult programming" refusal,
> and like Andrew Morton commented
>
> cond_resched() is a no-op in the presence of realtime policy threads
> and using to attempt to yield to a different thread it in this fashion
> is broken.
>
> at "mm: disable preemption before swapcache_free" thread, we can't guarantee
> that allocating threads shall give the OOM reaper enough CPU resource for
> making forward progress. And my direct OOM reaping proposal was also refused
> by you. I really dislike counting OOM reaper as a safety measure.
Well, yeah, you can screw up your system with real time priority tasks
all you want. I really fail to see why you are bringing that up now
though. Yet another offtopic?
--
Michal Hocko
SUSE Labs
next prev parent reply other threads:[~2018-08-10 11:16 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-04 13:29 [PATCH 1/4] mm, oom: Remove wake_oom_reaper() Tetsuo Handa
2018-08-04 13:29 ` [PATCH 2/4] mm, oom: Check pending victims earlier in out_of_memory() Tetsuo Handa
2018-08-04 13:29 ` [PATCH 3/4] mm, oom: Remove unused "abort" path Tetsuo Handa
2018-08-04 13:29 ` [PATCH 4/4] mm, oom: Fix unnecessary killing of additional processes Tetsuo Handa
2018-08-06 13:45 ` Michal Hocko
2018-08-06 20:19 ` David Rientjes
2018-08-06 20:51 ` Michal Hocko
2018-08-09 20:16 ` David Rientjes
2018-08-10 9:07 ` Michal Hocko
2018-08-10 10:54 ` Tetsuo Handa
2018-08-10 11:16 ` Michal Hocko [this message]
2018-08-11 3:12 ` Tetsuo Handa
2018-08-14 11:33 ` Michal Hocko
2018-08-19 14:23 ` Tetsuo Handa
2018-08-20 5:54 ` Michal Hocko
2018-08-20 22:03 ` Tetsuo Handa
2018-08-21 6:16 ` Michal Hocko
2018-08-21 13:39 ` Tetsuo Handa
2018-08-19 23:45 ` David Rientjes
2018-08-20 6:07 ` Michal Hocko
2018-08-20 21:31 ` David Rientjes
2018-08-21 6:09 ` Michal Hocko
2018-08-21 17:20 ` David Rientjes
2018-08-22 8:03 ` Michal Hocko
2018-08-22 20:54 ` David Rientjes
2018-09-01 11:48 ` Tetsuo Handa
2018-09-06 11:35 ` Michal Hocko
2018-09-06 11:50 ` Tetsuo Handa
2018-09-06 12:05 ` Michal Hocko
2018-09-06 13:40 ` Tetsuo Handa
2018-09-06 13:56 ` Michal Hocko
2018-09-06 14:06 ` Tetsuo Handa
2018-09-06 14:16 ` Michal Hocko
2018-09-06 21:13 ` Tetsuo Handa
2018-09-07 11:10 ` Michal Hocko
2018-09-07 11:36 ` Tetsuo Handa
2018-09-07 11:51 ` Michal Hocko
2018-09-07 13:30 ` Tetsuo Handa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180810111604.GA1644@dhcp22.suse.cz \
--to=mhocko@kernel.org \
--cc=guro@fb.com \
--cc=linux-mm@kvack.org \
--cc=penguin-kernel@i-love.sakura.ne.jp \
--cc=rientjes@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).