linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm: skip zombie in OOM-killer
@ 2011-03-05  0:51 Andrey Vagin
  2011-03-06  2:44 ` David Rientjes
  2011-03-06 10:37 ` KOSAKI Motohiro
  0 siblings, 2 replies; 11+ messages in thread
From: Andrey Vagin @ 2011-03-05  0:51 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Rientjes, KOSAKI Motohiro, avagin, linux-mm, linux-kernel

When we check that task has flag TIF_MEMDIE, we forgot check that
it has mm. A task may be zombie and a parent may wait a memor.

v2: Check that task doesn't have mm one time and skip it immediately

Signed-off-by: Andrey Vagin <avagin@openvz.org>
---
 mm/oom_kill.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 7dcca55..b1ce3ba 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -299,6 +299,9 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 	for_each_process(p) {
 		unsigned int points;
 
+		if (!p->mm)
+			continue;
+
 		if (oom_unkillable_task(p, mem, nodemask))
 			continue;
 
@@ -324,7 +327,7 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 		 * the process of exiting and releasing its resources.
 		 * Otherwise we could get an easy OOM deadlock.
 		 */
-		if (thread_group_empty(p) && (p->flags & PF_EXITING) && p->mm) {
+		if (thread_group_empty(p) && (p->flags & PF_EXITING)) {
 			if (p != current)
 				return ERR_PTR(-1UL);
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: skip zombie in OOM-killer
  2011-03-05  0:51 [PATCH] mm: skip zombie in OOM-killer Andrey Vagin
@ 2011-03-06  2:44 ` David Rientjes
  2011-03-06 10:37 ` KOSAKI Motohiro
  1 sibling, 0 replies; 11+ messages in thread
From: David Rientjes @ 2011-03-06  2:44 UTC (permalink / raw)
  To: Andrey Vagin; +Cc: Andrew Morton, KOSAKI Motohiro, linux-mm, linux-kernel

On Sat, 5 Mar 2011, Andrey Vagin wrote:

> When we check that task has flag TIF_MEMDIE, we forgot check that
> it has mm. A task may be zombie and a parent may wait a memor.
> 
> v2: Check that task doesn't have mm one time and skip it immediately
> 
> Signed-off-by: Andrey Vagin <avagin@openvz.org>

Acked-by: David Rientjes <rientjes@google.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: skip zombie in OOM-killer
  2011-03-05  0:51 [PATCH] mm: skip zombie in OOM-killer Andrey Vagin
  2011-03-06  2:44 ` David Rientjes
@ 2011-03-06 10:37 ` KOSAKI Motohiro
  2011-03-06 22:03   ` David Rientjes
  1 sibling, 1 reply; 11+ messages in thread
From: KOSAKI Motohiro @ 2011-03-06 10:37 UTC (permalink / raw)
  To: Andrey Vagin
  Cc: kosaki.motohiro, Andrew Morton, David Rientjes, linux-mm,
	linux-kernel

> When we check that task has flag TIF_MEMDIE, we forgot check that
> it has mm. A task may be zombie and a parent may wait a memor.
> 
> v2: Check that task doesn't have mm one time and skip it immediately
> 
> Signed-off-by: Andrey Vagin <avagin@openvz.org>

This seems incorrect. Do you have a reprodusable testcasae?
Your patch only care thread group leader state, but current code
care all thread in the process. Please look at oom_badness() and 
find_lock_task_mm(). 




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: skip zombie in OOM-killer
  2011-03-06 10:37 ` KOSAKI Motohiro
@ 2011-03-06 22:03   ` David Rientjes
  2011-03-07 11:55     ` Andrew Vagin
  2011-03-08  2:07     ` KOSAKI Motohiro
  0 siblings, 2 replies; 11+ messages in thread
From: David Rientjes @ 2011-03-06 22:03 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: Andrey Vagin, Andrew Morton, linux-mm, linux-kernel

On Sun, 6 Mar 2011, KOSAKI Motohiro wrote:

> > When we check that task has flag TIF_MEMDIE, we forgot check that
> > it has mm. A task may be zombie and a parent may wait a memor.
> > 
> > v2: Check that task doesn't have mm one time and skip it immediately
> > 
> > Signed-off-by: Andrey Vagin <avagin@openvz.org>
> 
> This seems incorrect. Do you have a reprodusable testcasae?
> Your patch only care thread group leader state, but current code
> care all thread in the process. Please look at oom_badness() and 
> find_lock_task_mm(). 
> 

That's all irrelevant, the test for TIF_MEMDIE specifically makes the oom 
killer a complete no-op when an eligible task is found to have been oom 
killed to prevent needlessly killing additional tasks.  oom_badness() and 
find_lock_task_mm() have nothing to do with that check to return 
ERR_PTR(-1UL) from select_bad_process().

Andrey is patching the case where an eligible TIF_MEMDIE process is found 
but it has already detached its ->mm.  In combination with the patch 
posted to linux-mm, oom: prevent unnecessary oom kills or kernel panics, 
which makes select_bad_process() iterate over all threads, it is an 
effective solution.

Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: skip zombie in OOM-killer
  2011-03-06 22:03   ` David Rientjes
@ 2011-03-07 11:55     ` Andrew Vagin
  2011-03-07 20:36       ` David Rientjes
  2011-03-08  1:24       ` KOSAKI Motohiro
  2011-03-08  2:07     ` KOSAKI Motohiro
  1 sibling, 2 replies; 11+ messages in thread
From: Andrew Vagin @ 2011-03-07 11:55 UTC (permalink / raw)
  To: David Rientjes
  Cc: KOSAKI Motohiro, Andrey Vagin, Andrew Morton, linux-mm,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2172 bytes --]

2011/3/7 David Rientjes <rientjes@google.com>:
> On Sun, 6 Mar 2011, KOSAKI Motohiro wrote:
>
>> > When we check that task has flag TIF_MEMDIE, we forgot check that
>> > it has mm. A task may be zombie and a parent may wait a memor.
>> >
>> > v2: Check that task doesn't have mm one time and skip it immediately
>> >
>> > Signed-off-by: Andrey Vagin <avagin@openvz.org>
>>
>> This seems incorrect. Do you have a reprodusable testcasae?
>> Your patch only care thread group leader state, but current code
>> care all thread in the process. Please look at oom_badness() and
>> find_lock_task_mm().
>>
>
> That's all irrelevant, the test for TIF_MEMDIE specifically makes the oom
> killer a complete no-op when an eligible task is found to have been oom
> killed to prevent needlessly killing additional tasks.  oom_badness() and
> find_lock_task_mm() have nothing to do with that check to return
> ERR_PTR(-1UL) from select_bad_process().
>
> Andrey is patching the case where an eligible TIF_MEMDIE process is found
> but it has already detached its ->mm.  In combination with the patch
> posted to linux-mm, oom: prevent unnecessary oom kills or kernel panics,
> which makes select_bad_process() iterate over all threads, it is an
> effective solution.

Probably you said about the first version of my patch.
This version is incorrect because of
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=dd8e8f405ca386c7ce7cbb996ccd985d283b0e03

but my first patch is correct and it has a simple reproducer(I
attached it). You can execute it and your kernel hangs up, because the
parent doesn't wait children, but the one child (zombie) will have
flag TIF_MEMDIE, oom_killer will kill nobody


The link on the first patch:
http://groups.google.com/group/linux.kernel/browse_thread/thread/b9c6ddf34d1671ab/2941e1877ca4f626?lnk=raot&pli=1
>
> Thanks.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

[-- Attachment #2: memeater_killer.py --]
[-- Type: application/octet-stream, Size: 780 bytes --]

import sys, time, mmap, os
from subprocess import Popen, PIPE
import random

global mem_size

def info(msg):
	pid = os.getpid()
	print >> sys.stderr, "%s: %s" % (pid, msg)
	sys.stderr.flush()



def memory_loop(cmd = "a"):
	"""
	cmd may be:
		c: check memory
		else: touch memory
	"""
	c = 0
	for j in xrange(0, mem_size):
		if cmd == "c":
			if f[j<<12] != chr(j % 255):
				info("Data corruption")
				sys.exit(1)
		else:
			f[j<<12] = chr(j % 255)
for i in xrange(20):
	pid = os.fork()
	time.sleep(1)
	if (pid == 0):
		sys.stdout.write("mmap\n")
		sys.stdout.flush()
		mem_size = 400 * 1024
		f = mmap.mmap(-1, mem_size << 12, mmap.MAP_ANONYMOUS|mmap.MAP_PRIVATE)
		memory_loop()
		time.sleep(100)
		f.close()
		sys.stdout.write("ummap\n")
		sys.stdout.flush()
time.sleep(100)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: skip zombie in OOM-killer
  2011-03-07 11:55     ` Andrew Vagin
@ 2011-03-07 20:36       ` David Rientjes
  2011-03-07 21:52         ` Andrew Morton
  2011-03-08  1:24       ` KOSAKI Motohiro
  1 sibling, 1 reply; 11+ messages in thread
From: David Rientjes @ 2011-03-07 20:36 UTC (permalink / raw)
  To: Andrew Vagin
  Cc: KOSAKI Motohiro, Andrey Vagin, Andrew Morton, linux-mm,
	linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2140 bytes --]

On Mon, 7 Mar 2011, Andrew Vagin wrote:

> > Andrey is patching the case where an eligible TIF_MEMDIE process is found
> > but it has already detached its ->mm.  In combination with the patch
> > posted to linux-mm, oom: prevent unnecessary oom kills or kernel panics,
> > which makes select_bad_process() iterate over all threads, it is an
> > effective solution.
> 
> Probably you said about the first version of my patch.
> This version is incorrect because of
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=dd8e8f405ca386c7ce7cbb996ccd985d283b0e03
> 
> but my first patch is correct and it has a simple reproducer(I
> attached it). You can execute it and your kernel hangs up, because the
> parent doesn't wait children, but the one child (zombie) will have
> flag TIF_MEMDIE, oom_killer will kill nobody
> 

The second version of your patch works fine in combination with the 
pending "oom: prevent unnecessary oom kills or kernel panics" patch from 
linux-mm (included below).  Try your test case with both this patch and 
the second version of your patch.

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -292,11 +292,11 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 		unsigned long totalpages, struct mem_cgroup *mem,
 		const nodemask_t *nodemask)
 {
-	struct task_struct *p;
+	struct task_struct *g, *p;
 	struct task_struct *chosen = NULL;
 	*ppoints = 0;
 
-	for_each_process(p) {
+	do_each_thread(g, p) {
 		unsigned int points;
 
 		if (oom_unkillable_task(p, mem, nodemask))
@@ -324,7 +324,7 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 		 * the process of exiting and releasing its resources.
 		 * Otherwise we could get an easy OOM deadlock.
 		 */
-		if (thread_group_empty(p) && (p->flags & PF_EXITING) && p->mm) {
+		if ((p->flags & PF_EXITING) && p->mm) {
 			if (p != current)
 				return ERR_PTR(-1UL);
 
@@ -337,7 +337,7 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 			chosen = p;
 			*ppoints = points;
 		}
-	}
+	} while_each_thread(g, p);
 
 	return chosen;
 }

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: skip zombie in OOM-killer
  2011-03-07 20:36       ` David Rientjes
@ 2011-03-07 21:52         ` Andrew Morton
  2011-03-07 23:43           ` David Rientjes
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2011-03-07 21:52 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Vagin, KOSAKI Motohiro, Andrey Vagin, linux-mm,
	linux-kernel

On Mon, 7 Mar 2011 12:36:49 -0800 (PST)
David Rientjes <rientjes@google.com> wrote:

> On Mon, 7 Mar 2011, Andrew Vagin wrote:
> 
> > > Andrey is patching the case where an eligible TIF_MEMDIE process is found
> > > but it has already detached its ->mm. __In combination with the patch
> > > posted to linux-mm, oom: prevent unnecessary oom kills or kernel panics,
> > > which makes select_bad_process() iterate over all threads, it is an
> > > effective solution.
> > 
> > Probably you said about the first version of my patch.
> > This version is incorrect because of
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=dd8e8f405ca386c7ce7cbb996ccd985d283b0e03
> > 
> > but my first patch is correct and it has a simple reproducer(I
> > attached it). You can execute it and your kernel hangs up, because the
> > parent doesn't wait children, but the one child (zombie) will have
> > flag TIF_MEMDIE, oom_killer will kill nobody
> > 
> 
> The second version of your patch works fine in combination with the 
> pending "oom: prevent unnecessary oom kills or kernel panics" patch from 
> linux-mm (included below).

Andrew's v2 doesn't apply on top of
oom-prevent-unnecessary-oom-kills-or-kernel-panics.patch and I'm
disinclined to fix that up and merge some untested patch combination.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: skip zombie in OOM-killer
  2011-03-07 21:52         ` Andrew Morton
@ 2011-03-07 23:43           ` David Rientjes
  0 siblings, 0 replies; 11+ messages in thread
From: David Rientjes @ 2011-03-07 23:43 UTC (permalink / raw)
  To: Andrew Morton, Andrey Vagin
  Cc: Andrew Vagin, KOSAKI Motohiro, linux-mm, linux-kernel

On Mon, 7 Mar 2011, Andrew Morton wrote:

> Andrew's v2 doesn't apply on top of
> oom-prevent-unnecessary-oom-kills-or-kernel-panics.patch and I'm
> disinclined to fix that up and merge some untested patch combination.
> 

Ok.  Andrey, I rebased your patch on top of the latest -mm tree 
(mmotm-2011-03-02-16-52 with 
oom-prevent-unnecessary-oom-kills-or-kernel-panics.patch from 
http://marc.info/?l=linux-mm-commits&m=129953480527038&q=raw) and rewrote 
the changelog.  They'll both apply on top of Linus' -git even without 
mmotm.  Could you try this out on your testcase?

Thanks!


oom: skip zombies when iterating tasklist

From: Andrey Vagin <avagin@openvz.org>

We shouldn't defer oom killing if a thread has already detached its ->mm
and still has TIF_MEMDIE set.  Memory needs to be freed, so find kill
other threads that pin the same ->mm or find another task to kill.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -299,6 +299,8 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 	do_each_thread(g, p) {
 		unsigned int points;
 
+		if (!p->mm)
+			continue;
 		if (oom_unkillable_task(p, mem, nodemask))
 			continue;
 
@@ -324,7 +326,7 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 		 * the process of exiting and releasing its resources.
 		 * Otherwise we could get an easy OOM deadlock.
 		 */
-		if ((p->flags & PF_EXITING) && p->mm) {
+		if (p->flags & PF_EXITING) {
 			if (p != current)
 				return ERR_PTR(-1UL);
 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: skip zombie in OOM-killer
  2011-03-07 11:55     ` Andrew Vagin
  2011-03-07 20:36       ` David Rientjes
@ 2011-03-08  1:24       ` KOSAKI Motohiro
  1 sibling, 0 replies; 11+ messages in thread
From: KOSAKI Motohiro @ 2011-03-08  1:24 UTC (permalink / raw)
  To: Andrew Vagin
  Cc: kosaki.motohiro, David Rientjes, Andrey Vagin, Andrew Morton,
	linux-mm, linux-kernel

> 2011/3/7 David Rientjes <rientjes@google.com>:
> > On Sun, 6 Mar 2011, KOSAKI Motohiro wrote:
> >
> >> > When we check that task has flag TIF_MEMDIE, we forgot check that
> >> > it has mm. A task may be zombie and a parent may wait a memor.
> >> >
> >> > v2: Check that task doesn't have mm one time and skip it immediately
> >> >
> >> > Signed-off-by: Andrey Vagin <avagin@openvz.org>
> >>
> >> This seems incorrect. Do you have a reprodusable testcasae?
> >> Your patch only care thread group leader state, but current code
> >> care all thread in the process. Please look at oom_badness() and
> >> find_lock_task_mm().
> >>
> >
> > That's all irrelevant, the test for TIF_MEMDIE specifically makes the oom
> > killer a complete no-op when an eligible task is found to have been oom
> > killed to prevent needlessly killing additional tasks.  oom_badness() and
> > find_lock_task_mm() have nothing to do with that check to return
> > ERR_PTR(-1UL) from select_bad_process().
> >
> > Andrey is patching the case where an eligible TIF_MEMDIE process is found
> > but it has already detached its ->mm.  In combination with the patch
> > posted to linux-mm, oom: prevent unnecessary oom kills or kernel panics,
> > which makes select_bad_process() iterate over all threads, it is an
> > effective solution.
> 
> Probably you said about the first version of my patch.
> This version is incorrect because of
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=dd8e8f405ca386c7ce7cbb996ccd985d283b0e03
> 
> but my first patch is correct and it has a simple reproducer(I
> attached it). You can execute it and your kernel hangs up, because the
> parent doesn't wait children, but the one child (zombie) will have
> flag TIF_MEMDIE, oom_killer will kill nobody
> 
> 
> The link on the first patch:
> http://groups.google.com/group/linux.kernel/browse_thread/thread/b9c6ddf34d1671ab/2941e1877ca4f626?lnk=raot&pli=1

OK. I can ack this.
TIF_MEMDIE  mean the process have been receive SIGKILL therefore we can assume it
as per process flag.




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: skip zombie in OOM-killer
  2011-03-06 22:03   ` David Rientjes
  2011-03-07 11:55     ` Andrew Vagin
@ 2011-03-08  2:07     ` KOSAKI Motohiro
  2011-03-08  2:10       ` KOSAKI Motohiro
  1 sibling, 1 reply; 11+ messages in thread
From: KOSAKI Motohiro @ 2011-03-08  2:07 UTC (permalink / raw)
  To: David Rientjes
  Cc: kosaki.motohiro, Andrey Vagin, Andrew Morton, linux-mm,
	linux-kernel

> On Sun, 6 Mar 2011, KOSAKI Motohiro wrote:
> 
> > > When we check that task has flag TIF_MEMDIE, we forgot check that
> > > it has mm. A task may be zombie and a parent may wait a memor.
> > > 
> > > v2: Check that task doesn't have mm one time and skip it immediately
> > > 
> > > Signed-off-by: Andrey Vagin <avagin@openvz.org>
> > 
> > This seems incorrect. Do you have a reprodusable testcasae?
> > Your patch only care thread group leader state, but current code
> > care all thread in the process. Please look at oom_badness() and 
> > find_lock_task_mm(). 
> > 
> 
> That's all irrelevant, the test for TIF_MEMDIE specifically makes the oom 
> killer a complete no-op when an eligible task is found to have been oom 
> killed to prevent needlessly killing additional tasks.  oom_badness() and 
> find_lock_task_mm() have nothing to do with that check to return 
> ERR_PTR(-1UL) from select_bad_process().

I don't understand you think which task is eligible and unnecessary.
But, Look! Andrey is not talking about zombie process case. But, this v2
patch have factored out other tasks too. This IS the problem. No need
unrelated talk.

> 
> Andrey is patching the case where an eligible TIF_MEMDIE process is found 
> but it has already detached its ->mm.  In combination with the patch 
> posted to linux-mm, oom: prevent unnecessary oom kills or kernel panics, 
> which makes select_bad_process() iterate over all threads, it is an 
> effective solution.

Guys, It was alread NAKed. I've already talk kind explanation. Why do
you bother to look actual code. Why do you continue to talk funny your
dream?




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: skip zombie in OOM-killer
  2011-03-08  2:07     ` KOSAKI Motohiro
@ 2011-03-08  2:10       ` KOSAKI Motohiro
  0 siblings, 0 replies; 11+ messages in thread
From: KOSAKI Motohiro @ 2011-03-08  2:10 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: kosaki.motohiro, David Rientjes, Andrey Vagin, Andrew Morton,
	linux-mm, linux-kernel

> I don't understand you think which task is eligible and unnecessary.
> But, Look! Andrey is not talking about zombie process case. But, this v2
> patch have factored out other tasks too. This IS the problem. No need
             ^^^^^^^^
             filter.

I need to rest.



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2011-03-08  2:10 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-03-05  0:51 [PATCH] mm: skip zombie in OOM-killer Andrey Vagin
2011-03-06  2:44 ` David Rientjes
2011-03-06 10:37 ` KOSAKI Motohiro
2011-03-06 22:03   ` David Rientjes
2011-03-07 11:55     ` Andrew Vagin
2011-03-07 20:36       ` David Rientjes
2011-03-07 21:52         ` Andrew Morton
2011-03-07 23:43           ` David Rientjes
2011-03-08  1:24       ` KOSAKI Motohiro
2011-03-08  2:07     ` KOSAKI Motohiro
2011-03-08  2:10       ` KOSAKI Motohiro

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).