From: Vladimir Davydov <vdavydov@virtuozzo.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	David Rientjes <rientjes@google.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 3/6] mm, oom_adj: make sure processes sharing mm have same view of oom_score_adj
Date: Mon, 30 May 2016 11:47:53 +0300
Message-ID: <20160530084753.GH26059@esperanza>
In-Reply-To: <20160530070705.GD22928@dhcp22.suse.cz>

On Mon, May 30, 2016 at 09:07:05AM +0200, Michal Hocko wrote:
> On Fri 27-05-16 19:18:21, Vladimir Davydov wrote:
> > On Fri, May 27, 2016 at 01:18:03PM +0200, Michal Hocko wrote:
> > ...
> > > @@ -1087,7 +1105,25 @@ static int __set_oom_adj(struct file *file, int oom_adj, bool legacy)
> > >  	unlock_task_sighand(task, &flags);
> > >  err_put_task:
> > >  	put_task_struct(task);
> > > +
> > > +	if (mm) {
> > > +		struct task_struct *p;
> > > +
> > > +		rcu_read_lock();
> > > +		for_each_process(p) {
> > > +			task_lock(p);
> > > +			if (!p->vfork_done && process_shares_mm(p, mm)) {
> > > +				p->signal->oom_score_adj = oom_adj;
> > > +				if (!legacy && has_capability_noaudit(current, CAP_SYS_RESOURCE))
> > > +					p->signal->oom_score_adj_min = (short)oom_adj;
> > > +			}
> > > +			task_unlock(p);
> > 
> > I.e. you write to /proc/pid1/oom_score_adj and get
> > /proc/pid2/oom_score_adj updated if pid1 and pid2 share mm?
> > IMO that looks unexpected from userspace pov.
> 
> How much different it is from threads in the same thread group?
> Processes sharing the mm without signals is a rather weird threading
> model isn't it?

I think so too. I wouldn't be surprised if it turned out that nobody had
ever used it. But maybe there's someone out there who does.

> Currently we just lie to users about their oom_score_adj
> in this weird corner case.

Hmm, looks like a bug, but nobody has ever complained about it.

> The only exception was OOM_SCORE_ADJ_MIN
> where we really didn't kill the task but all other values are simply
> ignored in practice.
> 
> > May be, we'd better add mm->oom_score_adj and set it to the min
> > signal->oom_score_adj over all processes sharing it? This would
> > require iterating over all processes every time oom_score_adj gets
> > updated, but that's a slow path.
> 
> Not sure I understand. So you would prefer that mm->oom_score_adj might
> disagree with p->signal->oom_score_adj?

No, I wouldn't. I'd rather agree that oom_score_adj should be per mm,
because we choose the victim based solely on mm stats.

What I mean is that we don't touch p->signal->oom_score_adj of other
tasks sharing the mm, but instead store the minimal oom_score_adj over
all tasks sharing the mm in the mm_struct whenever a task's
oom_score_adj is modified, and use mm->oom_score_adj instead of
signal->oom_score_adj in the oom killer code. This would save us from
any accusations of user API modifications
and it would also make the oom code a bit easier to follow IMHO.
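
To make the idea concrete, here is a minimal userspace sketch in C. It
is not kernel code: struct mm_model, struct proc_model and both
functions are hypothetical stand-ins, the oom_score_adj field in the mm
is the proposed addition rather than an existing one, and locking and
RCU are left out entirely. It only shows the bookkeeping: a write
changes the target process alone, and the mm-wide value is recomputed
as the minimum over all sharers.

```c
#include <assert.h>
#include <stddef.h>

#define OOM_SCORE_ADJ_MIN (-1000)
#define OOM_SCORE_ADJ_MAX 1000

/* Simplified stand-in for mm_struct; oom_score_adj is the proposed field. */
struct mm_model {
	short oom_score_adj;	/* minimum over all processes sharing this mm */
};

/* Simplified stand-in for a process; oom_score_adj models
 * p->signal->oom_score_adj. */
struct proc_model {
	int pid;
	short oom_score_adj;
	struct mm_model *mm;
};

/* Recompute mm->oom_score_adj as the minimum over every sharer.
 * This walks all processes, which is acceptable on a slow path. */
static void mm_update_oom_score_adj(struct mm_model *mm,
				    const struct proc_model *procs, size_t n)
{
	short min = OOM_SCORE_ADJ_MAX;
	size_t i;

	assert(mm != NULL);
	for (i = 0; i < n; i++)
		if (procs[i].mm == mm && procs[i].oom_score_adj < min)
			min = procs[i].oom_score_adj;
	mm->oom_score_adj = min;
}

/* Models a write to /proc/<pid>/oom_score_adj: only the target process
 * is modified; other sharers keep the value they set themselves.
 * Returns 0 on success, -1 on a bad value or an unknown pid. */
static int set_oom_score_adj(struct proc_model *procs, size_t n,
			     int pid, int adj)
{
	size_t i;

	if (adj < OOM_SCORE_ADJ_MIN || adj > OOM_SCORE_ADJ_MAX)
		return -1;
	for (i = 0; i < n; i++) {
		if (procs[i].pid != pid)
			continue;
		procs[i].oom_score_adj = (short)adj;
		mm_update_oom_score_adj(procs[i].mm, procs, n);
		return 0;
	}
	return -1;
}
```

With two processes sharing one mm and both starting at 0, writing -500
for the first leaves the second's own value at 0 while the mm-wide
value becomes -500, which is what the oom killer would then consult.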

