From: Michal Hocko <mhocko@kernel.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Mel Gorman <mgorman@suse.de>,
David Rientjes <rientjes@google.com>,
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
Rik van Riel <riel@redhat.com>, Hugh Dickins <hughd@google.com>,
Oleg Nesterov <oleg@redhat.com>,
Andrea Arcangeli <andrea@kernel.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH] mm, oom: introduce oom reaper
Date: Thu, 26 Nov 2015 12:08:49 +0100
Message-ID: <20151126110849.GC7953@dhcp22.suse.cz>
In-Reply-To: <20151125200806.GA13388@cmpxchg.org>
On Wed 25-11-15 15:08:06, Johannes Weiner wrote:
> Hi Michal,
>
> I think whatever we end up doing to smoothen things for the "common
> case" (as much as OOM kills can be considered common), we need a plan
> to resolve the memory deadlock situations in a finite amount of time.
>
> Eventually we have to attempt killing another task. Or kill all of
> them to save the kernel.
>
> It just strikes me as odd to start with smoothening the common case,
> rather than making it functionally correct first.
I believe there is no universally correct solution for this
problem. The OOM killer is a heuristic, and a destructive one, so I
think we should limit it as much as possible. I do agree that we should
allow an administrator to define a policy for when things go terribly
wrong, e.g. panic or emergency reboot after the system has been
thrashing on OOM for more than a defined amount of time. But I think
that this is orthogonal to this patch. This patch should remove one
large class of potential deadlocks and corner cases without too much
cost or maintenance burden. It doesn't remove the need for a last
resort solution though.
> On Wed, Nov 25, 2015 at 04:56:58PM +0100, Michal Hocko wrote:
> > A kernel thread has been chosen because we need a reliable way of
> > invocation so workqueue context is not appropriate because all the
> > workers might be busy (e.g. allocating memory). Kswapd which sounds
> > like another good fit is not appropriate as well because it might get
> > blocked on locks during reclaim as well.
>
> Why not do it directly from the allocating context? I.e. when entering
> the OOM killer and finding a lingering TIF_MEMDIE from a previous kill
> just reap its memory directly then and there. It's not like the
> allocating task has anything else to do in the meantime...
One reason is that we have to exclude a race with exit_mmap, so we have
to increase mm_users, but we cannot mmput in this context because we
might deadlock. So we have to tear down from a different context.
Another reason is that the address space of the victim might be really
large, and reaping on behalf of one (random) task might be really
unfair wrt. others. Doing that from a kernel thread sounds like an easy
and relatively cheap way to work around both issues.
>
> > @@ -1123,7 +1126,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
> > continue;
> > }
> > /* If details->check_mapping, we leave swap entries. */
> > - if (unlikely(details))
> > + if (unlikely(details || !details->check_swap_entries))
> > continue;
>
> &&
Oops, thanks for catching this! I was playing with the condition and
rearranged the code multiple times before posting.
Thanks!
---
diff --git a/mm/memory.c b/mm/memory.c
index 4750d7e942a3..49cafa195527 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1125,8 +1125,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 			}
 			continue;
 		}
-		/* If details->check_mapping, we leave swap entries. */
-		if (unlikely(details || !details->check_swap_entries))
+		/* only check swap_entries if explicitly asked for in details */
+		if (unlikely(details && !details->check_swap_entries))
 			continue;
 
 		entry = pte_to_swp_entry(ptent);
--
Michal Hocko
SUSE Labs