linux-mm.kvack.org archive mirror
* [patch] oom: prevent unnecessary oom kills or kernel panics
@ 2011-03-01 19:09 David Rientjes
  2011-03-03  1:20 ` KOSAKI Motohiro
  0 siblings, 1 reply; 53+ messages in thread
From: David Rientjes @ 2011-03-01 19:09 UTC
  To: Andrew Morton
  Cc: KOSAKI Motohiro, KAMEZAWA Hiroyuki, Oleg Nesterov, Hugh Dickins,
	linux-mm

This patch prevents unnecessary oom kills or kernel panics by reverting
two commits:

	495789a5 (oom: make oom_score to per-process value)
	cef1d352 (oom: multi threaded process coredump don't make deadlock)

First, 495789a5 (oom: make oom_score to per-process value) ignores the
fact that all threads in a thread group do not necessarily exit at the
same time.

It is imperative that select_bad_process() detect threads that are in the
exit path, specifically those with PF_EXITING set, to prevent needlessly
killing additional tasks.  If a process is oom killed and the thread
group leader exits, select_bad_process() cannot detect the other threads
that are PF_EXITING by iterating over only processes.  Thus, it currently
chooses another task unnecessarily for oom kill or panics the machine
when nothing else is eligible.

By iterating over threads instead, it is possible to detect threads that
are exiting and nominate them for oom kill so they get access to memory
reserves.
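
The difference between the two iteration styles can be sketched in user
space.  This is a toy model, not kernel code: struct toy_task, toy_scan()
and TOY_PF_EXITING are hypothetical names, and the model assumes the
exited group leader no longer has an mm while a sibling thread still does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag value; a stand-in for the kernel's PF_EXITING. */
#define TOY_PF_EXITING 0x4u

/* Hypothetical, heavily simplified task; not the kernel's task_struct. */
struct toy_task {
	int pid;
	int tgid;		/* the group leader has pid == tgid */
	unsigned int flags;
	bool has_mm;		/* models p->mm != NULL */
};

/* Mirrors the (p->flags & PF_EXITING) && p->mm test in the patch. */
static bool toy_exiting_with_mm(const struct toy_task *t)
{
	return (t->flags & TOY_PF_EXITING) && t->has_mm;
}

/*
 * Return the first exiting task that still has an mm, or NULL.
 * leaders_only == true visits group leaders only (process iteration);
 * leaders_only == false visits every thread (thread iteration).
 */
static const struct toy_task *toy_scan(const struct toy_task *tasks,
				       size_t n, bool leaders_only)
{
	assert(tasks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (leaders_only && tasks[i].pid != tasks[i].tgid)
			continue;
		if (toy_exiting_with_mm(&tasks[i]))
			return &tasks[i];
	}
	return NULL;
}
```

With a group whose leader has exited (flag set, no mm) and whose second
thread is still exiting with an mm, toy_scan(..., true) finds nothing
while toy_scan(..., false) returns the second thread.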

Second, cef1d352 (oom: multi threaded process coredump don't make
deadlock) erroneously avoids making the oom killer a no-op when an
eligible thread other than current is found to be exiting.  We want to
detect this situation so that we may allow that exiting thread time to
exit and free its memory; if it is able to exit on its own, that should
free memory so current is no longer oom.  If it is not able to exit on its
own, the oom killer will nominate it for oom kill which, in this case,
only means it will get access to memory reserves.
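
The effect of dropping the thread_group_empty() condition can be
illustrated with another user-space toy, again under stated assumptions:
toy_check_exiting(), enum toy_verdict and the group_threads field are
hypothetical, and require_empty_group only approximates the condition
this patch removes.

```c
#include <stdbool.h>

/* Illustrative flag value; a stand-in for the kernel's PF_EXITING. */
#define TOY_PF_EXITING 0x4u

/* Hypothetical, heavily simplified task; not the kernel's task_struct. */
struct toy_task {
	int pid;
	unsigned int flags;
	bool has_mm;		/* models p->mm != NULL */
	int group_threads;	/* threads in the group, including this one */
};

enum toy_verdict {
	TOY_KEEP_SCANNING,	/* not an exiting task: score it as usual */
	TOY_DEFER,		/* oom killer becomes a no-op for now */
	TOY_NOMINATE_CURRENT,	/* current is exiting: nominate it */
};

/*
 * require_empty_group == true approximates the check being reverted,
 * which only honoured PF_EXITING for single-threaded groups.
 */
static enum toy_verdict toy_check_exiting(const struct toy_task *p,
					  const struct toy_task *current,
					  bool require_empty_group)
{
	bool exiting = (p->flags & TOY_PF_EXITING) && p->has_mm;

	if (require_empty_group && p->group_threads != 1)
		exiting = false;
	if (!exiting)
		return TOY_KEEP_SCANNING;

	return p != current ? TOY_DEFER : TOY_NOMINATE_CURRENT;
}
```

For an exiting thread other than current in a three-thread group, the
model defers without the group-empty requirement and keeps scanning for
another victim with it.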

Without this change, it is easy for the oom killer to unnecessarily
target tasks when all threads of a victim don't exit before the thread
group leader or, in the worst case, panic the machine.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -292,11 +292,11 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 		unsigned long totalpages, struct mem_cgroup *mem,
 		const nodemask_t *nodemask)
 {
-	struct task_struct *p;
+	struct task_struct *g, *p;
 	struct task_struct *chosen = NULL;
 	*ppoints = 0;
 
-	for_each_process(p) {
+	do_each_thread(g, p) {
 		unsigned int points;
 
 		if (oom_unkillable_task(p, mem, nodemask))
@@ -324,7 +324,7 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 		 * the process of exiting and releasing its resources.
 		 * Otherwise we could get an easy OOM deadlock.
 		 */
-		if (thread_group_empty(p) && (p->flags & PF_EXITING) && p->mm) {
+		if ((p->flags & PF_EXITING) && p->mm) {
 			if (p != current)
 				return ERR_PTR(-1UL);
 
@@ -337,7 +337,7 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 			chosen = p;
 			*ppoints = points;
 		}
-	}
+	} while_each_thread(g, p);
 
 	return chosen;
 }


Thread overview: 53+ messages
2011-03-01 19:09 [patch] oom: prevent unnecessary oom kills or kernel panics David Rientjes
2011-03-03  1:20 ` KOSAKI Motohiro
2011-03-03 19:53   ` David Rientjes
2011-03-06 11:14     ` KOSAKI Motohiro
2011-03-06 22:06       ` David Rientjes
2011-03-08  0:24         ` KOSAKI Motohiro
2011-03-08  2:01         ` KOSAKI Motohiro
2011-03-08 13:42   ` Oleg Nesterov
2011-03-08 23:57     ` David Rientjes
2011-03-09 10:36       ` KOSAKI Motohiro
2011-03-09 11:06       ` Oleg Nesterov
2011-03-09 20:32         ` David Rientjes
2011-03-10 12:05           ` Oleg Nesterov
2011-03-10 15:40             ` [PATCH 0/1] Was: " Oleg Nesterov
2011-03-10 15:41               ` [PATCH 1/1] oom_kill_task: mark every thread as TIF_MEMDIE Oleg Nesterov
2011-03-13  1:08                 ` David Rientjes
2011-03-10 16:36               ` [PATCH 0/1] select_bad_process: improve the PF_EXITING check Oleg Nesterov
2011-03-10 16:37                 ` [PATCH 1/1] " Oleg Nesterov
2011-03-10 16:40                 ` [PATCH 0/1] " Oleg Nesterov
2011-03-10 17:18                   ` [PATCH v2 " Oleg Nesterov
2011-03-10 17:19                     ` [PATCH v2 1/1] " Oleg Nesterov
2011-03-13  1:06             ` [patch] oom: prevent unnecessary oom kills or kernel panics David Rientjes
2011-03-09 23:19       ` Andrew Morton
2011-03-11 19:45         ` David Rientjes
2011-03-12 12:34           ` Oleg Nesterov
2011-03-12 13:43             ` [PATCH 0/3] oom: TIF_MEMDIE/PF_EXITING fixes Oleg Nesterov
2011-03-12 13:44               ` [PATCH 1/3] oom: oom_kill_task: mark every thread as TIF_MEMDIE Oleg Nesterov
2011-03-13  1:14                 ` David Rientjes
2011-03-12 13:44               ` [PATCH 2/3] oom: select_bad_process: improve the PF_EXITING check Oleg Nesterov
2011-03-12 13:44               ` [PATCH 3/3] oom: select_bad_process: use same_thread_group() Oleg Nesterov
2011-03-12 19:40               ` [PATCH 0/3] oom: TIF_MEMDIE/PF_EXITING fixes Hugh Dickins
2011-03-13  8:53                 ` KOSAKI Motohiro
2011-03-13 21:27                 ` Oleg Nesterov
2011-03-14 19:04                   ` [PATCH 0/3 for 2.6.38] oom: fixes Oleg Nesterov
2011-03-14 19:04                     ` [PATCH 1/3 for 2.6.38] oom: oom_kill_process: don't set TIF_MEMDIE if !p->mm Oleg Nesterov
2011-03-14 19:35                       ` Linus Torvalds
2011-03-14 20:31                         ` Oleg Nesterov
2011-03-14 20:32                         ` David Rientjes
2011-03-15 19:12                           ` Oleg Nesterov
2011-03-15 19:51                             ` David Rientjes
2011-03-14 20:22                       ` David Rientjes
2011-03-15 18:53                         ` Oleg Nesterov
2011-03-15 19:54                           ` David Rientjes
2011-03-15 21:16                             ` Oleg Nesterov
2011-03-14 19:05                     ` [PATCH 2/3 for 2.6.38] oom: select_bad_process: ignore TIF_MEMDIE zombies Oleg Nesterov
2011-03-14 20:50                       ` David Rientjes
2011-03-14 19:05                     ` [PATCH 3/3 for 2.6.38] oom: oom_kill_process: fix the child_points logic Oleg Nesterov
2011-03-14 20:41                       ` David Rientjes
2011-03-15 19:21                         ` Oleg Nesterov
2011-03-13 11:36               ` [PATCH 0/3] oom: TIF_MEMDIE/PF_EXITING fixes KOSAKI Motohiro
2011-03-13  1:11             ` [patch] oom: prevent unnecessary oom kills or kernel panics David Rientjes
2011-03-13  1:15               ` [patch -mm] oom: avoid deferring oom killer if exiting task is being traced David Rientjes
2011-03-14 17:40                 ` Oleg Nesterov
