linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	anfei <anfei.zhou@gmail.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	nishimura@mxp.nes.nec.co.jp,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Mel Gorman <mel@csn.ul.ie>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [patch] oom: give current access to memory reserves if it has been killed
Date: Thu, 1 Apr 2010 17:26:38 +0200	[thread overview]
Message-ID: <20100401152638.GC14603@redhat.com> (raw)
In-Reply-To: <alpine.DEB.2.00.1004010044320.6285@chino.kir.corp.google.com>

On 04/01, David Rientjes wrote:
>
> On Thu, 1 Apr 2010, Oleg Nesterov wrote:
>
> > Why? You ignored this part:
> >
> > 	Say, right after exit_mm() we are doing acct_process(), and f_op->write()
> > 	needs a page. So, you are saying that in this case __page_cache_alloc()
> > 	can never trigger out_of_memory() ?
> >
> > why this is not possible?
> >
>
> It can, but the check for p->mm is sufficient since exit_notify()

Yes, but I meant out_of_memory()->__oom_kill_task(current). OK, we
already discussed this in the previous emails.

> We cannot rely on oom_badness() to filter this task because we still
> select it as our chosen task even with a badness score of 0 if !chosen

Yes, see another email from me.

> Your point about p->mm being non-NULL for kthreads using use_mm() is
> taken, we should probably just change the is_global_init() check in
> select_bad_process() to p->flags & PF_KTHREAD and ensure we reject
> oom_kill_process() for them.

Yes, but we have to check both is_global_init() and PF_KTHREAD.

The "patch" I sent checks PF_KTHREAD in find_lock_task_mm(), but as I
said select_bad_process() is the better place.

> > OK, a bad user does
> >
> > 	int sleep_forever(void *)
> > 	{
> > 		pause();
> > 	}
> >
> > 	int main(void)
> > 	{
> > 		pthread_create(sleep_forever);
> > 		syscall(__NR_exit);
> > 	}
> >
> > Now, every time select_bad_process() is called it will find this process
> > and PF_EXITING is true, so it just returns ERR_PTR(-1UL). And note that
> > this process is not going to exit.
> >
>
> Hmm, so it looks like we need to filter on !p->mm before checking for
> PF_EXITING so that tasks that are EXIT_ZOMBIE won't make the oom killer
> into a no-op.

As it was already discussed, it is not easy to check !p->mm. Once
again, we must not filter out the task just because its ->mm == NULL.

Probably the best change for now is

	- if (p->flags & PF_EXITING) {
	+ if (p->flags & PF_EXITING && p->mm) {

This is not perfect too, but much better.

> > > > Say, oom_forkbomb_penalty() does list_for_each_entry(tsk->children).
> > > > Again, this is not right even if we forget about !child->mm check.
> > > > This list_for_each_entry() can only see the processes forked by the
> > > > main thread.
> > > >
> > >
> > > That's the intention.
> >
> > Why? shouldn't oom_badness() return the same result for any thread
> > in thread group? We should take all childs into account.
> >
>
> oom_forkbomb_penalty() only cares about first-descendant children that
> do not share the same memory,

I see, but the code doesn't really do this. I mean, it doesn't really
see the first-descendant children, only those which were forked by the
main thread.

Look. We have a main thread M and the sub-thread T. T forks a lot of
processes which use a lot of memory. These processes _are_ the first
descendant children of the M+T thread group, they should be accounted.
But M->children list is empty.

oom_forkbomb_penalty() and oom_kill_process() should do

	t = tsk;
	do {
		list_for_each_entry(child, &t->children, sibling) {
			... take child into account ...
		}
	} while_each_thread(tsk, t);


> > > > Hmm. Why oom_forkbomb_penalty() does thread_group_cputime() under
> > > > task_lock() ? It seems, ->alloc_lock() is only needed for get_mm_rss().
> > > >
> [...snip...]
> We need task_lock() to ensure child->mm hasn't detached between the check
> for child->mm == tsk->mm and get_mm_rss(child->mm).  So I'm not sure what
> you're trying to improve with this variation, it's a tradeoff between
> calling thread_group_cputime() under task_lock() for a subset of a task's
> threads when we already need to hold task_lock() anyway vs. calling it for
> all threads unconditionally.

See the patch below. Yes, this is minor, but it is always good to avoid
the unnecessary locks, and thread_group_cputime() is O(N).

Not only for performance reasons. This allows to change the locking in
thread_group_cputime() if needed without fear to deadlock with task_lock().

Oleg.

--- x/mm/oom_kill.c
+++ x/mm/oom_kill.c
@@ -97,13 +97,16 @@ static unsigned long oom_forkbomb_penalt
 		return 0;
 	list_for_each_entry(child, &tsk->children, sibling) {
 		struct task_cputime task_time;
-		unsigned long runtime;
+		unsigned long runtime, this_rss;
 
 		task_lock(child);
 		if (!child->mm || child->mm == tsk->mm) {
 			task_unlock(child);
 			continue;
 		}
+		this_rss = get_mm_rss(child->mm);
+		task_unlock(child);
+
 		thread_group_cputime(child, &task_time);
 		runtime = cputime_to_jiffies(task_time.utime) +
 			  cputime_to_jiffies(task_time.stime);
@@ -113,10 +116,9 @@ static unsigned long oom_forkbomb_penalt
 		 * get to execute at all in such cases anyway.
 		 */
 		if (runtime < HZ) {
-			child_rss += get_mm_rss(child->mm);
+			child_rss += this_rss;
 			forkcount++;
 		}
-		task_unlock(child);
 	}
 
 	/*

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2010-04-01 15:28 UTC|newest]

Thread overview: 115+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-03-24 16:25 [PATCH] oom killer: break from infinite loop Anfei Zhou
2010-03-25  2:51 ` KOSAKI Motohiro
2010-03-26 22:08 ` Andrew Morton
2010-03-26 22:33   ` Oleg Nesterov
2010-03-28 14:55     ` anfei
2010-03-28 16:28       ` Oleg Nesterov
2010-03-28 21:21         ` David Rientjes
2010-03-29 11:21           ` Oleg Nesterov
2010-03-29 20:49             ` [patch] oom: give current access to memory reserves if it has been killed David Rientjes
2010-03-30 15:46               ` Oleg Nesterov
2010-03-30 20:26                 ` David Rientjes
2010-03-31 17:58                   ` Oleg Nesterov
2010-03-31 20:47                     ` Oleg Nesterov
2010-04-01  8:35                       ` David Rientjes
2010-04-01  8:57                         ` [patch -mm] oom: hold tasklist_lock when dumping tasks David Rientjes
2010-04-01 14:27                           ` Oleg Nesterov
2010-04-01 19:16                             ` David Rientjes
2010-04-01 13:59                         ` [patch] oom: give current access to memory reserves if it has been killed Oleg Nesterov
2010-04-01 19:12                           ` David Rientjes
2010-04-02 11:14                             ` Oleg Nesterov
2010-04-02 18:30                               ` [PATCH -mm 0/4] oom: linux has threads Oleg Nesterov
2010-04-02 18:31                                 ` [PATCH -mm 1/4] oom: select_bad_process: check PF_KTHREAD instead of !mm to skip kthreads Oleg Nesterov
2010-04-02 19:05                                   ` David Rientjes
2010-04-02 18:32                                 ` [PATCH -mm 2/4] oom: select_bad_process: PF_EXITING check should take ->mm into account Oleg Nesterov
2010-04-06 11:42                                   ` anfei
2010-04-06 12:18                                     ` Oleg Nesterov
2010-04-06 13:05                                       ` anfei
2010-04-06 13:38                                         ` Oleg Nesterov
2010-04-02 18:32                                 ` [PATCH -mm 3/4] oom: introduce find_lock_task_mm() to fix !mm false positives Oleg Nesterov
2010-04-02 18:33                                 ` [PATCH -mm 4/4] oom: oom_forkbomb_penalty: move thread_group_cputime() out of task_lock() Oleg Nesterov
2010-04-02 19:04                                   ` David Rientjes
2010-04-05 14:23                                 ` [PATCH -mm] oom: select_bad_process: never choose tasks with badness == 0 Oleg Nesterov
2010-04-02 19:02                               ` [patch] oom: give current access to memory reserves if it has been killed David Rientjes
2010-04-02 19:14                                 ` Oleg Nesterov
2010-04-02 19:46                                   ` David Rientjes
2010-04-02 19:54                                     ` [patch -mm] oom: exclude tasks with badness score of 0 from being selected David Rientjes
2010-04-02 21:04                                       ` Oleg Nesterov
2010-04-02 21:22                                         ` [patch -mm v2] " David Rientjes
2010-04-02 20:55                                     ` [patch] oom: give current access to memory reserves if it has been killed Oleg Nesterov
2010-03-31 21:07                     ` David Rientjes
2010-03-31 22:50                       ` Oleg Nesterov
2010-03-31 23:30                         ` Oleg Nesterov
2010-03-31 23:48                           ` David Rientjes
2010-04-01 14:39                             ` Oleg Nesterov
2010-04-01 18:58                               ` David Rientjes
2010-04-01  8:25                         ` David Rientjes
2010-04-01 15:26                           ` Oleg Nesterov [this message]
2010-04-08 21:08                             ` David Rientjes
2010-04-09 12:38                               ` Oleg Nesterov
2010-03-30 16:39               ` [PATCH] oom: fix the unsafe proc_oom_score()->badness() call Oleg Nesterov
2010-03-30 17:43                 ` [PATCH -mm] proc: don't take ->siglock for /proc/pid/oom_adj Oleg Nesterov
2010-03-30 20:30                   ` David Rientjes
2010-03-31  9:17                     ` Oleg Nesterov
2010-03-31 18:59                     ` Oleg Nesterov
2010-03-31 21:14                       ` David Rientjes
2010-03-31 23:00                         ` Oleg Nesterov
2010-04-01  8:32                           ` David Rientjes
2010-04-01 15:37                             ` Oleg Nesterov
2010-04-01 19:04                               ` David Rientjes
2010-03-30 20:32                 ` [PATCH] oom: fix the unsafe proc_oom_score()->badness() call David Rientjes
2010-03-31  9:16                   ` Oleg Nesterov
2010-03-31 20:17                     ` Oleg Nesterov
2010-04-01  7:41                       ` David Rientjes
2010-04-01 13:13                         ` [PATCH 0/1] oom: fix the unsafe usage of badness() in proc_oom_score() Oleg Nesterov
2010-04-01 13:13                           ` [PATCH 1/1] " Oleg Nesterov
2010-04-01 19:03                             ` David Rientjes
2010-03-29 14:06           ` [PATCH] oom killer: break from infinite loop anfei
2010-03-29 20:01             ` David Rientjes
2010-03-30 14:29               ` anfei
2010-03-30 20:29                 ` David Rientjes
2010-03-31  0:57                   ` KAMEZAWA Hiroyuki
2010-03-31  6:07                     ` David Rientjes
2010-03-31  6:13                       ` KAMEZAWA Hiroyuki
2010-03-31  6:30                         ` Balbir Singh
2010-03-31  6:31                           ` KAMEZAWA Hiroyuki
2010-03-31  7:04                             ` David Rientjes
2010-03-31  6:32                           ` David Rientjes
2010-03-31  7:08                             ` [patch -mm] memcg: make oom killer a no-op when no killable task can be found David Rientjes
2010-03-31  7:08                               ` KAMEZAWA Hiroyuki
2010-03-31  8:04                               ` Balbir Singh
2010-03-31 10:38                                 ` David Rientjes
2010-04-04 23:28                               ` David Rientjes
2010-04-05 21:30                                 ` Andrew Morton
2010-04-05 22:40                                   ` David Rientjes
2010-04-05 22:49                                     ` Andrew Morton
2010-04-05 23:01                                       ` David Rientjes
2010-04-06 12:08                                         ` KOSAKI Motohiro
2010-04-06 21:47                                           ` David Rientjes
2010-04-07  0:20                                             ` KAMEZAWA Hiroyuki
2010-04-07 13:29                                               ` KOSAKI Motohiro
2010-04-08 18:05                                                 ` David Rientjes
2010-04-21 19:17                                                   ` Andrew Morton
2010-04-21 22:04                                                     ` David Rientjes
2010-04-22  0:23                                                       ` KAMEZAWA Hiroyuki
2010-04-22  8:34                                                         ` David Rientjes
2010-04-27 22:58                                                       ` [patch -mm] oom: reintroduce and deprecate oom_kill_allocating_task David Rientjes
2010-04-28  0:57                                                         ` KAMEZAWA Hiroyuki
2010-04-22  7:23                                                     ` [patch -mm] memcg: make oom killer a no-op when no killable task can be found Nick Piggin
2010-04-22  7:25                                                       ` KAMEZAWA Hiroyuki
2010-04-22 10:09                                                         ` Nick Piggin
2010-04-22 10:27                                                           ` KAMEZAWA Hiroyuki
2010-04-22 21:11                                                             ` David Rientjes
2010-04-22 10:28                                                           ` David Rientjes
2010-04-22 15:39                                                             ` Nick Piggin
2010-04-22 21:09                                                               ` David Rientjes
2010-05-04 23:55                                                     ` David Rientjes
2010-04-08 17:36                                               ` David Rientjes
2010-04-02 10:17           ` [PATCH] oom killer: break from infinite loop Mel Gorman
2010-04-04 23:26             ` David Rientjes
2010-04-05 10:47               ` Mel Gorman
2010-04-06 22:40                 ` David Rientjes
2010-03-29 11:31         ` anfei
2010-03-29 11:46           ` Oleg Nesterov
2010-03-29 12:09             ` anfei
2010-03-28  2:46 ` David Rientjes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100401152638.GC14603@redhat.com \
    --to=oleg@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=anfei.zhou@gmail.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mel@csn.ul.ie \
    --cc=nishimura@mxp.nes.nec.co.jp \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).