From: Michal Hocko <mhocko@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
David Rientjes <rientjes@google.com>,
Oleg Nesterov <oleg@redhat.com>, Kyle Walker <kwalker@redhat.com>,
Christoph Lameter <cl@linux.com>,
Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Vladimir Davydov <vdavydov@parallels.com>,
linux-mm <linux-mm@kvack.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Stanislav Kozina <skozina@redhat.com>
Subject: Re: can't oom-kill zap the victim's memory?
Date: Mon, 5 Oct 2015 16:44:04 +0200
Message-ID: <20151005144404.GD7023@dhcp22.suse.cz>
In-Reply-To: <CA+55aFw=OLSdh-5Ut2vjy=4Yf1fTXqpzoDHdF7XnT5gDHs6sYA@mail.gmail.com>
On Fri 02-10-15 15:01:06, Linus Torvalds wrote:
> On Fri, Oct 2, 2015 at 8:36 AM, Michal Hocko <mhocko@kernel.org> wrote:
> >
> > Have they been reported/fixed? All kernel paths doing an allocation are
> > _supposed_ to check and handle ENOMEM. If they are not then they are
> > buggy and should be fixed.
>
> No. Stop this theoretical idiocy.
>
> We've tried it. I objected before people tried it, and it turns out
> that it was a horrible idea.
>
> Small kernel allocations should basically never fail, because we end
> up needing memory for random things, and if a kmalloc() fails it's
> because some application is using too much memory, and the application
> should be killed. Never should the kernel allocation fail. It really
> is that simple. If we are out of memory, that does not mean that we
> should start failing random kernel things.
But you do realize that killing a task as a memory reclaim technique is
not 100% reliable, right?
Any task might be blocked in an uninterruptible context (e.g. a mutex)
waiting for completion which depends on the allocation success. The page
allocator (resp. OOM killer) is not aware of these dependencies and I am
really skeptical it will ever be because dependency tracking is way too
expensive. So killing a task doesn't guarantee forward progress.
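To make the dependency problem concrete, here is a minimal userspace sketch (a toy model, not kernel code). It assumes each task sleeps on at most one lock holder, and every name in it (toy_task, waits_on, needs_allocation, kill_makes_progress) is hypothetical. It answers one question: can the killed victim exit without some allocation succeeding first?

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model for illustration only; none of this is kernel API. */
struct toy_task {
    int waits_on;          /* index of the task holding the lock we sleep on, or -1 */
    bool needs_allocation; /* true if this task is itself stuck waiting for memory */
};

/*
 * Return true if the victim can exit (and so release its memory) without
 * any allocation succeeding first. Only the tasks the victim transitively
 * waits on are checked, because those are the dependencies the OOM killer
 * cannot see.
 */
bool kill_makes_progress(const struct toy_task *tasks, size_t ntasks, size_t victim)
{
    int cur = tasks[victim].waits_on;

    for (size_t hops = 0; cur >= 0 && hops < ntasks; hops++) {
        if (tasks[cur].needs_allocation)
            return false;          /* lock holder waits for memory: deadlock */
        cur = tasks[cur].waits_on; /* follow the chain to the next holder */
    }
    /* cur < 0: the chain ended at a runnable task; otherwise it is a cycle. */
    return cur < 0;
}
```

With a victim that sleeps on a lock whose holder is waiting for memory, the function returns false: the kill alone frees nothing. Doing this walk for real is the dependency tracking described above as too expensive.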
So I can see basically only a few ways out of this deadlock situation.
Either we face reality and allow small allocations (without
__GFP_NOFAIL) to fail after all attempts to reclaim memory have failed
(so even after the OOM killer hasn't made any progress).
Or we can start killing other tasks but this might end up in the same
state and the time to resolve the problem might be basically unbounded
(it is trivial to construct loads where hundreds of tasks are bashing
against a single i_mutex and all of them depending on an allocation...).
Or we can panic/reboot the system if the OOM situation cannot be solved
within a selected timeout.
There are other ways to micro-optimize the current implementation by
playing with memory reserves but all that is just postponing the final
disaster and there is still a point of no further progress that we have
to deal with somehow.
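The three ways out listed above can be written down as a small decision function. This is only a sketch under stated assumptions: the enum names, the toy_oom_state fields, and the timeout parameter are all invented for illustration and do not correspond to the actual page allocator slow path.

```c
#include <stdbool.h>

/* Hypothetical names throughout; this models the options, not real mm code. */
enum toy_policy {
    TOY_POLICY_FAIL_SMALL,    /* let small allocations fail once the OOM killer stalls */
    TOY_POLICY_KILL_MORE,     /* keep selecting further victims */
    TOY_POLICY_TIMEOUT_PANIC  /* panic/reboot once a timeout expires */
};

enum toy_outcome { TOY_RETRY, TOY_FAIL_ALLOCATION, TOY_KILL_ANOTHER, TOY_PANIC };

struct toy_oom_state {
    bool nofail;             /* caller asked for __GFP_NOFAIL-like semantics */
    bool reclaim_progress;   /* reclaim freed something this round */
    bool oom_kill_progress;  /* the last OOM victim actually released memory */
    unsigned int stalled_ms; /* how long we have gone without any progress */
};

enum toy_outcome toy_oom_decide(enum toy_policy policy,
                                const struct toy_oom_state *st,
                                unsigned int timeout_ms)
{
    /* As long as anything makes progress, just retry the allocation. */
    if (st->reclaim_progress || st->oom_kill_progress)
        return TOY_RETRY;

    switch (policy) {
    case TOY_POLICY_FAIL_SMALL:
        /* Only callers that declared they cannot fail keep looping. */
        return st->nofail ? TOY_RETRY : TOY_FAIL_ALLOCATION;
    case TOY_POLICY_KILL_MORE:
        /* May end up in the same state; time to resolve is unbounded. */
        return TOY_KILL_ANOTHER;
    case TOY_POLICY_TIMEOUT_PANIC:
        return st->stalled_ms >= timeout_ms ? TOY_PANIC : TOY_RETRY;
    }
    return TOY_RETRY;
}
```

The point of the sketch is that every branch needs some answer for the state where neither reclaim nor the OOM killer has made progress. Playing with memory reserves only delays reaching that state.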
> So this "people should check for allocation failures" is bullshit.
> It's a computer science myth. It's simply not true in all cases.
Sure it is not true in _all_ cases. If some paths cannot fail they can
use __GFP_NOFAIL for that purpose. The point is that most allocations
_can_ handle the failure. People are taught to check for allocation
failures. We even have scripts/coccinelle/null/kmerr.cocci which helps,
to some degree, to detect slab allocator users that do not check the
returned pointer.
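For reference, the check-and-handle pattern in question looks roughly like this. It is a userspace analog, assuming malloc() as a stand-in for kmalloc(). The toy_buf type, the injected allocator parameter, and toy_failing_alloc are hypothetical and exist only so the failure path can be exercised.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer type; malloc() stands in for kmalloc() here. */
struct toy_buf {
    char *data;
    size_t len;
};

/* Allocator stand-in that always fails, to exercise the error path. */
void *toy_failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}

/*
 * Allocate and zero a buffer. On allocation failure, leave *buf in a
 * well-defined empty state and report -ENOMEM so the caller can unwind.
 */
int toy_buf_init(struct toy_buf *buf, size_t len, void *(*alloc)(size_t))
{
    buf->data = alloc(len);
    if (!buf->data) {
        buf->len = 0;
        return -ENOMEM;
    }
    memset(buf->data, 0, len);
    buf->len = len;
    return 0;
}

void toy_buf_destroy(struct toy_buf *buf)
{
    free(buf->data); /* free(NULL) is a no-op, so this is safe after failure */
    buf->data = NULL;
    buf->len = 0;
}
```

A caller that gets -ENOMEM simply propagates it upward. A path that genuinely cannot unwind is the case that __GFP_NOFAIL is meant to mark.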
> Kernel allocators that know that they do large allocations (ie bigger
> than a few pages) need to be able to handle the failure, but not the
> general case. Also, kernel allocators that know they have a good
> fallback (eg they try a large allocation first but can fall back to a
> smaller one) should use __GFP_NORETRY, but again, that does *not* in
> any way mean that general kernel allocations should randomly fail.
>
> So no. The answer is ABSOLUTELY NOT "everybody should check allocation
> failure". Get over it. I refuse to go through that circus again. It's
> stupid.
>
> Linus
--
Michal Hocko
SUSE Labs