Re: [PATCH] mm,oom: use per signal_struct flag rather than clear TIF_MEMDIE

From: Oleg Nesterov <oleg@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	linux-mm@kvack.org, vdavydov@virtuozzo.com, rientjes@google.com
Subject: Re: [PATCH] mm,oom: use per signal_struct flag rather than clear TIF_MEMDIE
Date: Wed, 29 Jun 2016 22:01:08 +0200
Message-ID: <20160629200108.GA19253@redhat.com>
In-Reply-To: <20160629083314.GA27153@dhcp22.suse.cz>

On 06/29, Michal Hocko wrote:
>
> > > +void mark_oom_victim(struct task_struct *tsk, struct mm_struct *mm)
> > >  {
> > >  	WARN_ON(oom_killer_disabled);
> > >  	/* OOM killer might race with memcg OOM */
> > >  	if (test_and_set_tsk_thread_flag(tsk, TIF_MEMDIE))
> > >  		return;
> > > +
> > >  	atomic_inc(&tsk->signal->oom_victims);
> > > +
> > > +	/* oom_mm is bound to the signal struct life time */
> > > +	if (!tsk->signal->oom_mm) {
> > > +		atomic_inc(&mm->mm_count);
> > > +		tsk->signal->oom_mm = mm;
> >
> > Looks racy, but it is not because we rely on oom_lock? Perhaps a comment
> > makes sense.
>
> mark_oom_victim will be called only for the current or under the
> task_lock so it should be stable. Except for...

I meant that the code looks racy because two threads can see ->oom_mm == NULL
at the same time, and in that case we do an extra atomic_inc(mm_count).
But I guess oom_lock saves us, so the code is correct but not clear.
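
To make the interleaving above concrete, here is a minimal single-threaded
userspace sketch. It is only a model: struct model_mm, struct model_signal
and every model_*() helper are hypothetical stand-ins, not kernel code, and
the "threads" are simulated by ordering the check and the set steps by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for mm_struct and signal_struct. */
struct model_mm { int mm_count; };
struct model_signal { struct model_mm *oom_mm; };

/* Step 1 of the check-then-set: is ->oom_mm still unset? */
static bool model_sees_null(const struct model_signal *sig)
{
	return sig->oom_mm == NULL;
}

/* Step 2: take the reference and publish the pointer. */
static void model_pin(struct model_signal *sig, struct model_mm *mm)
{
	mm->mm_count++;
	sig->oom_mm = mm;
}

/* Unserialized: both callers run the check before either one pins. */
static int model_racy_count(void)
{
	struct model_mm mm = { .mm_count = 1 };
	struct model_signal sig = { .oom_mm = NULL };
	bool a = model_sees_null(&sig);
	bool b = model_sees_null(&sig);

	if (a)
		model_pin(&sig, &mm);
	if (b)
		model_pin(&sig, &mm);	/* the extra increment */
	return mm.mm_count;
}

/* Serialized, as if under one lock: each check+pin pair runs to completion. */
static int model_serialized_count(void)
{
	struct model_mm mm = { .mm_count = 1 };
	struct model_signal sig = { .oom_mm = NULL };

	if (model_sees_null(&sig))
		model_pin(&sig, &mm);
	if (model_sees_null(&sig))
		model_pin(&sig, &mm);
	return mm.mm_count;
}
```

Starting from a count of 1, the unserialized ordering ends at 3 (one
reference too many), while the serialized ordering ends at 2.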

> > > @@ -838,8 +826,8 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
> > >  	 * If the task is already exiting, don't alarm the sysadmin or kill
> > >  	 * its children or threads, just set TIF_MEMDIE so it can die quickly
> > >  	 */
> > > -	if (task_will_free_mem(p)) {
> > > -		mark_oom_victim(p);
> > > +	if (mm && task_will_free_mem(p)) {
> > > +		mark_oom_victim(p, mm);
>
> This one. I didn't bother to cover it for the example patch but I have a
> plan to address that. There are two possible ways. One is to pin
> mm_count in oom_badness() so that we have a guarantee that it will not

I thought about this too. And I think that select_bad_process() should even
return the mm_struct, or at least a task_lock'ed task, for a start.

> > And this looks really racy at first glance. Suppose that this memory hog execs
> > (this changes its ->mm) and then exits so that task_will_free_mem() == T, in
> > this case "mm" has nothing to do with tsk->mm and it can be already freed.
>
> Hmm, I didn't think about exec case. And I guess we have never cared
> about that race. We just select a task and then kill it.

And I guess we want to fix this too. It is not that important, but it
looks like a minor security problem.
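
The exec case and the "pin the mm at selection time" idea discussed above
can be pictured with another small userspace model. Everything here is an
assumption made for illustration: the model_*() names are hypothetical, a
single counter stands in for the kernel's reference counting, and "freed"
is just a flag rather than a real free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; one counter models the mm's lifetime. */
struct model_mm { int mm_count; bool freed; };
struct model_task { struct model_mm *mm; };

static void model_mmgrab(struct model_mm *mm)
{
	mm->mm_count++;
}

static void model_mmdrop(struct model_mm *mm)
{
	assert(mm->mm_count > 0);
	if (--mm->mm_count == 0)
		mm->freed = true;	/* stands in for freeing the struct */
}

/* exec: the task switches to a new mm and drops its old one. */
static void model_exec(struct model_task *t, struct model_mm *new_mm)
{
	struct model_mm *old = t->mm;

	model_mmgrab(new_mm);
	t->mm = new_mm;
	model_mmdrop(old);
}

/* Selection that hands back a pinned mm instead of a bare pointer. */
static struct model_mm *model_select_and_pin(struct model_task *t)
{
	struct model_mm *mm = t->mm;

	if (mm)
		model_mmgrab(mm);
	return mm;
}

/* Is the mm remembered at selection time still alive after an exec? */
static bool model_mm_survives_exec(bool pin)
{
	struct model_mm old_mm = { .mm_count = 1, .freed = false };
	struct model_mm new_mm = { .mm_count = 0, .freed = false };
	struct model_task task = { .mm = &old_mm };
	struct model_mm *selected = pin ? model_select_and_pin(&task) : task.mm;

	model_exec(&task, &new_mm);

	/* Either way, the remembered mm is no longer the task's mm. */
	assert(selected != task.mm);

	bool alive = !selected->freed;

	if (pin)
		model_mmdrop(selected);
	return alive;
}
```

In this model the unpinned pointer refers to an mm that is already marked
freed after the exec, while the pinned one stays valid until the selector
drops its reference. In both cases the remembered mm no longer matches the
task's current mm.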

And this is another indication that almost everything oom-kill.c does with
task_struct is wrong ;) Ideally it should only use task_struct to send the
SIGKILL, and now that we kill all users of victim->mm we can hopefully do
this later.

Btw, do we still need this list_for_each_entry(child, &t->children, sibling)
loop in oom_kill_process()?

> I would be more worried about the use
> after free.

Yes, yes, this is what I meant.

Oleg.
