* [PATCH -mm] Kill existing current task quickly @ 2010-02-16 15:59 Minchan Kim 2010-02-16 22:03 ` David Rientjes 0 siblings, 1 reply; 5+ messages in thread From: Minchan Kim @ 2010-02-16 15:59 UTC (permalink / raw) To: Andrew Morton; +Cc: Nick Piggin, linux-kernel, linux-mm, David Rientjes I am not sure why didn't we break the loop until now. As looking git log, I found it is removed by Nick at b78483a. He mentioned "introduced a problem". If I miss something, pz, correct me. == CUT_HERE == [PATCH -mm] Kill existing current task quickly If we found current task is existing but didn't set TIF_MEMDIE during OOM victim selection, let's stop unnecessary looping for getting high badness score task and go ahead for killing current. This patch would make side effect skip OOM_DISABLE test. But It's okay since the task is existing and oom_kill_process doesn't show any killing message since __oom_kill_task will interrupt it in oom_kill_process. Signed-off-by: Minchan Kim <minchan.kim@gmail.com> Cc: Nick Piggin <npiggin@suse.de> --- mm/oom_kill.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 3618be3..5c21398 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -295,6 +295,7 @@ static struct task_struct *select_bad_process(unsigned long *ppoints, chosen = p; *ppoints = ULONG_MAX; + break; } if (p->signal->oom_adj == OOM_DISABLE) -- 1.6.5 -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH -mm] Kill existing current task quickly 2010-02-16 15:59 [PATCH -mm] Kill existing current task quickly Minchan Kim @ 2010-02-16 22:03 ` David Rientjes 2010-02-17 6:26 ` Minchan Kim 0 siblings, 1 reply; 5+ messages in thread From: David Rientjes @ 2010-02-16 22:03 UTC (permalink / raw) To: Minchan Kim; +Cc: Andrew Morton, Nick Piggin, linux-kernel, linux-mm On Wed, 17 Feb 2010, Minchan Kim wrote: > If we found current task is existing but didn't set TIF_MEMDIE > during OOM victim selection, let's stop unnecessary looping for > getting high badness score task and go ahead for killing current. > > This patch would make side effect skip OOM_DISABLE test. > But It's okay since the task is existing and oom_kill_process > doesn't show any killing message since __oom_kill_task will > interrupt it in oom_kill_process. > > Signed-off-by: Minchan Kim <minchan.kim@gmail.com> > Cc: Nick Piggin <npiggin@suse.de> > --- > mm/oom_kill.c | 1 + > 1 files changed, 1 insertions(+), 0 deletions(-) > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > index 3618be3..5c21398 100644 > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -295,6 +295,7 @@ static struct task_struct > *select_bad_process(unsigned long *ppoints, > > chosen = p; > *ppoints = ULONG_MAX; > + break; > } > > if (p->signal->oom_adj == OOM_DISABLE) No, we don't want to break because there may be other candidate tasks that have TIF_MEMDIE set that will be detected if we keep scanning. Returning ERR_PTR(-1UL) from select_bad_process() has a special meaning: it means we return to the page allocator without doing anything. We don't want more than one candidate task to ever have TIF_MEMDIE at a time, otherwise they can deplete all memory reserves and not make any forward progress. So we always have to iterate the entire tasklist unless we find an already oom killed task with access to memory reserves (to prevent needlessly killing additional tasks before the first had a chance to exit and free its memory) or a different candidate task is exiting so we'll be freeing memory shortly (or it will be invoking the oom killer itself as current and then get chosen as the victim). -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH -mm] Kill existing current task quickly 2010-02-16 22:03 ` David Rientjes @ 2010-02-17 6:26 ` Minchan Kim 2010-02-17 9:36 ` David Rientjes 0 siblings, 1 reply; 5+ messages in thread From: Minchan Kim @ 2010-02-17 6:26 UTC (permalink / raw) To: David Rientjes; +Cc: Andrew Morton, Nick Piggin, linux-kernel, linux-mm [-- Attachment #1: Type: text/plain, Size: 6440 bytes --] On Wed, Feb 17, 2010 at 7:03 AM, David Rientjes <rientjes@google.com> wrote: > On Wed, 17 Feb 2010, Minchan Kim wrote: > >> If we found current task is existing but didn't set TIF_MEMDIE >> during OOM victim selection, let's stop unnecessary looping for >> getting high badness score task and go ahead for killing current. >> >> This patch would make side effect skip OOM_DISABLE test. >> But It's okay since the task is existing and oom_kill_process >> doesn't show any killing message since __oom_kill_task will >> interrupt it in oom_kill_process. >> >> Signed-off-by: Minchan Kim <minchan.kim@gmail.com> >> Cc: Nick Piggin <npiggin@suse.de> >> --- >> mm/oom_kill.c | 1 + >> 1 files changed, 1 insertions(+), 0 deletions(-) >> >> diff --git a/mm/oom_kill.c b/mm/oom_kill.c >> index 3618be3..5c21398 100644 >> --- a/mm/oom_kill.c >> +++ b/mm/oom_kill.c >> @@ -295,6 +295,7 @@ static struct task_struct >> *select_bad_process(unsigned long *ppoints, >> >> chosen = p; >> *ppoints = ULONG_MAX; >> + break; >> } >> >> if (p->signal->oom_adj == OOM_DISABLE) > > No, we don't want to break because there may be other candidate tasks that > have TIF_MEMDIE set that will be detected if we keep scanning. Returning > ERR_PTR(-1UL) from select_bad_process() has a special meaning: it means we > return to the page allocator without doing anything. We don't want more > than one candidate task to ever have TIF_MEMDIE at a time, otherwise they > can deplete all memory reserves and not make any forward progress. So we > always have to iterate the entire tasklist unless we find an already oom > killed task with access to memory reserves (to prevent needlessly killing > additional tasks before the first had a chance to exit and free its > memory) or a different candidate task is exiting so we'll be freeing > memory shortly (or it will be invoking the oom killer itself as current > and then get chosen as the victim). > Thanks you very much for the kind explanation, David. How about this? I can't use smtp port. so at a loss, I have to send patch by attachment in webmail which would mangle my tab space. Pz, forgive my wrong behavior. If God help me, below inlined patch isn't mangled. :) I think exit_mm's new test about TIF_MEMDIE isn't big overhead. That's because is is on cache line due to exit_signal's pending check. And expand task_lock in __oom_kill_task isn't big, I think. That's because __oom_kill_task isn't called frequently. -- CUT_HERE -- Date: Wed, 17 Feb 2010 23:59:40 +0900 Subject: [PATCH] [PATCH -mm] Kill existing current task quickly If we found current task is existing but didn't set TIF_MEMDIE during OOM victim selection, let's stop unnecessary looping for getting high badness score task and go ahead for killing current. For forkbomb scenarion, there are many processes on system so tasklist scanning time might be much big. Sometime we used to use oom_kill_allocating_task to avoid expensive task list scanning time. For it, we have to make sure there are not TIF_MEMDIE's another tasks. This patch introduces nr_memdie for counting TIF_MEMDIE's tasks. This patch would make side effect skip OOM_DISABLE test. But It's okay since the task is existing and oom_kill_process doesn't show any killing message since __oom_kill_task will interrupt it in oom_kill_process. Signed-off-by: Minchan Kim <minchan.kim@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: Nick Piggin <npiggin@suse.de> --- include/linux/oom.h | 2 ++ kernel/exit.c | 3 +++ mm/oom_kill.c | 12 +++++++++--- 3 files changed, 14 insertions(+), 3 deletions(-) diff --git a/include/linux/oom.h b/include/linux/oom.h index 505aa9e..9babcce 100644 --- a/include/linux/oom.h +++ b/include/linux/oom.h @@ -48,5 +48,7 @@ extern int sysctl_panic_on_oom; extern int sysctl_oom_kill_allocating_task; extern int sysctl_oom_dump_tasks; +extern unsigned int nr_memdie; + #endif /* __KERNEL__*/ #endif /* _INCLUDE_LINUX_OOM_H */ diff --git a/kernel/exit.c b/kernel/exit.c index c5305fc..932c67b 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -50,6 +50,7 @@ #include <linux/perf_event.h> #include <trace/events/sched.h> #include <linux/hw_breakpoint.h> +#include <linux/oom.h> #include <asm/uaccess.h> #include <asm/unistd.h> @@ -685,6 +686,8 @@ static void exit_mm(struct task_struct * tsk) /* more a memory barrier than a real lock */ task_lock(tsk); tsk->mm = NULL; + if (test_thread_flag(TIF_MEMDIE)) + nr_memdie--; up_read(&mm->mmap_sem); enter_lazy_tlb(mm, current); /* We don't want this task to be frozen prematurely */ diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 3618be3..d5e3d70 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -32,6 +32,8 @@ int sysctl_panic_on_oom; int sysctl_oom_kill_allocating_task; int sysctl_oom_dump_tasks; static DEFINE_SPINLOCK(zone_scan_lock); + +unsigned int nr_memdie; /* count of TIF_MEMDIE processes */ /* #define DEBUG */ /* @@ -295,6 +297,8 @@ static struct task_struct *select_bad_process(unsigned long *ppoints, chosen = p; *ppoints = ULONG_MAX; + if (nr_memdie == 0) + break; } if (p->signal->oom_adj == OOM_DISABLE) @@ -403,8 +407,6 @@ static void __oom_kill_task(struct task_struct *p, int verbose) K(p->mm->total_vm), K(get_mm_counter(p->mm, MM_ANONPAGES)), K(get_mm_counter(p->mm, MM_FILEPAGES))); - task_unlock(p); - /* * We give our sacrificial lamb high priority and access to * all the memory it needs. That way it should be able to @@ -412,7 +414,11 @@ static void __oom_kill_task(struct task_struct *p, int verbose) */ p->rt.time_slice = HZ; set_tsk_thread_flag(p, TIF_MEMDIE); - + /* + * nr_memdie is protected by task_lock. + */ + nr_memdie++; + task_unlock(p); force_sig(SIGKILL, p); } -- 1.6.0.4 -- Kind regards, Minchan Kim [-- Attachment #2: 0001-Kill-existing-current-task-quickly.patch --] [-- Type: text/x-diff, Size: 3334 bytes --] Date: Wed, 17 Feb 2010 23:59:40 +0900 Subject: [PATCH] [PATCH -mm] Kill existing current task quickly If we found current task is existing but didn't set TIF_MEMDIE during OOM victim selection, let's stop unnecessary looping for getting high badness score task and go ahead for killing current. For forkbomb scenarion, there are many processes on system so tasklist scanning time might be much big. Sometime we used to use oom_kill_allocating_task to avoid expensive task list scanning time. For it, we have to make sure there are not TIF_MEMDIE's another tasks. This patch introduces nr_memdie for counting TIF_MEMDIE's tasks. This patch would make side effect skip OOM_DISABLE test. But It's okay since the task is existing and oom_kill_process doesn't show any killing message since __oom_kill_task will interrupt it in oom_kill_process. Signed-off-by: Minchan Kim <minchan.kim@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: Nick Piggin <npiggin@suse.de> --- include/linux/oom.h | 2 ++ kernel/exit.c | 3 +++ mm/oom_kill.c | 12 +++++++++--- 3 files changed, 14 insertions(+), 3 deletions(-) diff --git a/include/linux/oom.h b/include/linux/oom.h index 505aa9e..9babcce 100644 --- a/include/linux/oom.h +++ b/include/linux/oom.h @@ -48,5 +48,7 @@ extern int sysctl_panic_on_oom; extern int sysctl_oom_kill_allocating_task; extern int sysctl_oom_dump_tasks; +extern unsigned int nr_memdie; + #endif /* __KERNEL__*/ #endif /* _INCLUDE_LINUX_OOM_H */ diff --git a/kernel/exit.c b/kernel/exit.c index c5305fc..932c67b 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -50,6 +50,7 @@ #include <linux/perf_event.h> #include <trace/events/sched.h> #include <linux/hw_breakpoint.h> +#include <linux/oom.h> #include <asm/uaccess.h> #include <asm/unistd.h> @@ -685,6 +686,8 @@ static void exit_mm(struct task_struct * tsk) /* more a memory barrier than a real lock */ task_lock(tsk); tsk->mm = NULL; + if (test_thread_flag(TIF_MEMDIE)) + nr_memdie--; up_read(&mm->mmap_sem); enter_lazy_tlb(mm, current); /* We don't want this task to be frozen prematurely */ diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 3618be3..d5e3d70 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -32,6 +32,8 @@ int sysctl_panic_on_oom; int sysctl_oom_kill_allocating_task; int sysctl_oom_dump_tasks; static DEFINE_SPINLOCK(zone_scan_lock); + +unsigned int nr_memdie; /* count of TIF_MEMDIE processes */ /* #define DEBUG */ /* @@ -295,6 +297,8 @@ static struct task_struct *select_bad_process(unsigned long *ppoints, chosen = p; *ppoints = ULONG_MAX; + if (nr_memdie == 0) + break; } if (p->signal->oom_adj == OOM_DISABLE) @@ -403,8 +407,6 @@ static void __oom_kill_task(struct task_struct *p, int verbose) K(p->mm->total_vm), K(get_mm_counter(p->mm, MM_ANONPAGES)), K(get_mm_counter(p->mm, MM_FILEPAGES))); - task_unlock(p); - /* * We give our sacrificial lamb high priority and access to * all the memory it needs. That way it should be able to @@ -412,7 +414,11 @@ static void __oom_kill_task(struct task_struct *p, int verbose) */ p->rt.time_slice = HZ; set_tsk_thread_flag(p, TIF_MEMDIE); - + /* + * nr_memdie is protected by task_lock. + */ + nr_memdie++; + task_unlock(p); force_sig(SIGKILL, p); } -- 1.6.0.4 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH -mm] Kill existing current task quickly 2010-02-17 6:26 ` Minchan Kim @ 2010-02-17 9:36 ` David Rientjes 2010-02-17 10:55 ` Minchan Kim 0 siblings, 1 reply; 5+ messages in thread From: David Rientjes @ 2010-02-17 9:36 UTC (permalink / raw) To: Minchan Kim; +Cc: Andrew Morton, Nick Piggin, linux-kernel, linux-mm On Wed, 17 Feb 2010, Minchan Kim wrote: > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > index 3618be3..d5e3d70 100644 > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -32,6 +32,8 @@ int sysctl_panic_on_oom; > int sysctl_oom_kill_allocating_task; > int sysctl_oom_dump_tasks; > static DEFINE_SPINLOCK(zone_scan_lock); > + > +unsigned int nr_memdie; /* count of TIF_MEMDIE processes */ > /* #define DEBUG */ > > /* > @@ -295,6 +297,8 @@ static struct task_struct > *select_bad_process(unsigned long *ppoints, > > chosen = p; > *ppoints = ULONG_MAX; > + if (nr_memdie == 0) > + break; > } > > if (p->signal->oom_adj == OOM_DISABLE) Nack, finding a candidate task with TIF_MEMDIE set is not the only time we return ERR_PTR(-1UL) from select_bad_process(): we also do it if any other task other than current is PF_EXITING. Thus, we _must_ continue the tasklist scan to avoid needlessly killing current simply because it was the first PF_EXITING task in the tasklist. > @@ -403,8 +407,6 @@ static void __oom_kill_task(struct task_struct *p, > int verbose) > K(p->mm->total_vm), > K(get_mm_counter(p->mm, MM_ANONPAGES)), > K(get_mm_counter(p->mm, MM_FILEPAGES))); > - task_unlock(p); > - > /* > * We give our sacrificial lamb high priority and access to > * all the memory it needs. That way it should be able to > @@ -412,7 +414,11 @@ static void __oom_kill_task(struct task_struct > *p, int verbose) > */ > p->rt.time_slice = HZ; > set_tsk_thread_flag(p, TIF_MEMDIE); > - > + /* > + * nr_memdie is protected by task_lock. > + */ > + nr_memdie++; > + task_unlock(p); > force_sig(SIGKILL, p); > } > task_lock() is a per-task entity, i.e. each task_struct has an alloc_lock spinlock. This cannot protect a global variable. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH -mm] Kill existing current task quickly 2010-02-17 9:36 ` David Rientjes @ 2010-02-17 10:55 ` Minchan Kim 0 siblings, 0 replies; 5+ messages in thread From: Minchan Kim @ 2010-02-17 10:55 UTC (permalink / raw) To: David Rientjes; +Cc: Andrew Morton, Nick Piggin, linux-kernel, linux-mm On Wed, 2010-02-17 at 01:36 -0800, David Rientjes wrote: > On Wed, 17 Feb 2010, Minchan Kim wrote: > > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > > index 3618be3..d5e3d70 100644 > > --- a/mm/oom_kill.c > > +++ b/mm/oom_kill.c > > @@ -32,6 +32,8 @@ int sysctl_panic_on_oom; > > int sysctl_oom_kill_allocating_task; > > int sysctl_oom_dump_tasks; > > static DEFINE_SPINLOCK(zone_scan_lock); > > + > > +unsigned int nr_memdie; /* count of TIF_MEMDIE processes */ > > /* #define DEBUG */ > > > > /* > > @@ -295,6 +297,8 @@ static struct task_struct > > *select_bad_process(unsigned long *ppoints, > > > > chosen = p; > > *ppoints = ULONG_MAX; > > + if (nr_memdie == 0) > > + break; > > } > > > > if (p->signal->oom_adj == OOM_DISABLE) > > Nack, finding a candidate task with TIF_MEMDIE set is not the only time we > return ERR_PTR(-1UL) from select_bad_process(): we also do it if any other > task other than current is PF_EXITING. Thus, we _must_ continue the > tasklist scan to avoid needlessly killing current simply because it was > the first PF_EXITING task in the tasklist. Okay. Sorry for missing PF_EXITING tasks. > > > @@ -403,8 +407,6 @@ static void __oom_kill_task(struct task_struct *p, > > int verbose) > > K(p->mm->total_vm), > > K(get_mm_counter(p->mm, MM_ANONPAGES)), > > K(get_mm_counter(p->mm, MM_FILEPAGES))); > > - task_unlock(p); > > - > > /* > > * We give our sacrificial lamb high priority and access to > > * all the memory it needs. That way it should be able to > > @@ -412,7 +414,11 @@ static void __oom_kill_task(struct task_struct > > *p, int verbose) > > */ > > p->rt.time_slice = HZ; > > set_tsk_thread_flag(p, TIF_MEMDIE); > > - > > + /* > > + * nr_memdie is protected by task_lock. > > + */ > > + nr_memdie++; > > + task_unlock(p); > > force_sig(SIGKILL, p); > > } > > > > task_lock() is a per-task entity, i.e. each task_struct has an alloc_lock > spinlock. This cannot protect a global variable. Yes. It was utterly dumb lock usage. Thanks for the quick reply. -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2010-02-17 10:55 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-02-16 15:59 [PATCH -mm] Kill existing current task quickly Minchan Kim 2010-02-16 22:03 ` David Rientjes 2010-02-17 6:26 ` Minchan Kim 2010-02-17 9:36 ` David Rientjes 2010-02-17 10:55 ` Minchan Kim
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).