Re: [PATCH 6/6] mm, oom: fortify task_will_free_mem

From: Oleg Nesterov <oleg@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	David Rientjes <rientjes@google.com>,
	Vladimir Davydov <vdavydov@parallels.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 6/6] mm, oom: fortify task_will_free_mem
Date: Wed, 1 Jun 2016 00:29:33 +0200
Message-ID: <20160531222933.GD26582@redhat.com>
In-Reply-To: <20160531074624.GE26128@dhcp22.suse.cz>

On 05/31, Michal Hocko wrote:
>
> On Mon 30-05-16 19:35:05, Oleg Nesterov wrote:
> >
> > Well, let me suggest this again. I think it should do
> >
> >
> > 	if (SIGNAL_GROUP_COREDUMP)
> > 		return false;
> >
> > 	if (SIGNAL_GROUP_EXIT)
> > 		return true;
> >
> > 	if (thread_group_empty() && PF_EXITING)
> > 		return true;
> >
> > 	return false;
> >
> > we do not need fatal_signal_pending(), in this case SIGNAL_GROUP_EXIT should
> > be set (ignoring some bugs with sub-namespaces which we need to fix anyway).
>
> OK, so we shouldn't care about race when the fatal_signal is set on the
> task until it reaches do_group_exit?

If fatal_signal_pending() is true then (ignoring exec and coredump)
SIGNAL_GROUP_EXIT is already set (again, ignoring the bugs with sub-namespace
inits).

At the same time, SIGKILL can already be dequeued by the time the task exits,
so fatal_signal_pending() can give a "false negative".
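
To illustrate the check suggested above, here is a minimal userspace sketch.
It is not kernel code: struct mock_task, the MOCK_* flag values and
mock_will_free_mem() are hypothetical stand-ins for task_struct, the real
flags and the real helper. The point is only that the result does not depend
on whether SIGKILL is still pending.

```c
#include <stdbool.h>

/* Mock flag values for this sketch only; not taken from kernel headers. */
#define MOCK_SIGNAL_GROUP_EXIT     0x1u
#define MOCK_SIGNAL_GROUP_COREDUMP 0x2u
#define MOCK_PF_EXITING            0x1u

/* Hypothetical stand-in for the few task_struct bits the check looks at. */
struct mock_task {
	unsigned int signal_flags; /* models task->signal->flags */
	unsigned int flags;        /* models task->flags */
	bool group_empty;          /* models thread_group_empty(task) */
	bool sigkill_pending;      /* models fatal_signal_pending(task) */
};

/* The suggested predicate; note that sigkill_pending is never consulted. */
static bool mock_will_free_mem(const struct mock_task *t)
{
	/* A coredumping group can stay around for a long time. */
	if (t->signal_flags & MOCK_SIGNAL_GROUP_COREDUMP)
		return false;

	/* The whole thread group is already being killed or is exiting. */
	if (t->signal_flags & MOCK_SIGNAL_GROUP_EXIT)
		return true;

	/* A single-threaded process that is already on its way out. */
	if (t->group_empty && (t->flags & MOCK_PF_EXITING))
		return true;

	return false;
}
```

With this model, a task whose SIGKILL was already dequeued (sigkill_pending
is false) but whose group has MOCK_SIGNAL_GROUP_EXIT set is still reported as
about to free its memory.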

> > And. I think this needs smp_rmb() at the end of the loop (assuming we have the
> > process_shares_mm() check here). We need it to ensure that we read p->mm before
> > we read next_task(), to avoid the race with exit() + clone(CLONE_VM).
>
> Why don't we need the same barrier in oom_kill_process?

Because it calls do_send_sig_info() which takes ->siglock and copy_process()
takes the same lock. Not a barrier, but acts the same way.

> Which barrier it
> would pair with?

With the barrier implied by list_add_tail_rcu(&p->tasks, &init_task.tasks).

> Anyway I think this would deserve its own patch.
> Barriers are always tricky and it is better to have them in a small
> patch with a full explanation.

OK, agreed.


I am not sure I can read the new patch correctly, it depends on the previous
changes... but afaics it looks good.

Cosmetic/subjective nit, feel free to ignore,

> +bool task_will_free_mem(struct task_struct *task)
> +{
> +	struct mm_struct *mm = NULL;

unnecessary initialization ;)

> +	struct task_struct *p;
> +	bool ret;
> +
> +	/*
> +	 * If the process has passed exit_mm we have to skip it because
> +	 * we have lost a link to other tasks sharing this mm, we do not
> +	 * have anything to reap and the task might then get stuck waiting
> +	 * for parent as zombie and we do not want it to hold TIF_MEMDIE
> +	 */
> +	p = find_lock_task_mm(task);
> +	if (!p)
> +		return false;
> +
> +	if (!__task_will_free_mem(p)) {
> +		task_unlock(p);
> +		return false;
> +	}

We can call the first __task_will_free_mem() check before find_lock_task_mm().
In the likely case (I think) it should return false, so we would not need to
take the task lock at all.

And since __task_will_free_mem() has no other callers, perhaps it should go into
oom_kill.c too.
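
A rough userspace sketch of that ordering, under loud assumptions: every name
below (struct sketch_task, sketch_check(), sketch_find_lock_task_mm(), ...)
is a hypothetical mock, the lock is just a counter, and the rest of the
function (the checks on other processes sharing the mm) is left out. It only
shows that the cheap check runs first and the lock is skipped when it fails.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mock of a task; not the kernel's task_struct. */
struct sketch_task {
	bool exiting;   /* what the mocked __task_will_free_mem() tests */
	bool has_mm;    /* false once the task has passed exit_mm() */
	int lock_count; /* how many times the mock task lock was taken */
};

/* Models the lockless __task_will_free_mem() check. */
static bool sketch_check(const struct sketch_task *t)
{
	return t->exiting;
}

/* Models find_lock_task_mm(): returns the locked task, or NULL if no mm. */
static struct sketch_task *sketch_find_lock_task_mm(struct sketch_task *t)
{
	if (!t->has_mm)
		return NULL;
	t->lock_count++;
	return t;
}

/* Models task_unlock(); the mock lock has nothing to release. */
static void sketch_task_unlock(struct sketch_task *t)
{
	(void)t;
}

static bool sketch_task_will_free_mem(struct sketch_task *task)
{
	struct sketch_task *p;

	/* Cheap check first; expected to fail in the common case. */
	if (!sketch_check(task))
		return false;

	p = sketch_find_lock_task_mm(task);
	if (!p)
		return false;

	/* The remaining checks from the patch are omitted in this sketch. */
	sketch_task_unlock(p);
	return true;
}
```

For a task that is not exiting, sketch_task_will_free_mem() returns false
with lock_count still 0.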

Oleg.
