All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mandeep Singh Baines <msb@chromium.org>
To: David Rientjes <rientjes@google.com>
Cc: Mandeep Singh Baines <msb@chromium.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Rik van Riel <riel@redhat.com>, Ying Han <yinghan@google.com>,
	linux-kernel@vger.kernel.org, gspencer@chromium.org,
	piman@chromium.org, wad@chromium.org, olofj@chromium.org
Subject: Re: [PATCH] oom: create a resource limit for oom_adj
Date: Thu, 11 Nov 2010 10:30:50 -0800	[thread overview]
Message-ID: <20101111183050.GI7363@google.com> (raw)
In-Reply-To: <alpine.DEB.2.00.1011102328080.7571@chino.kir.corp.google.com>

David Rientjes (rientjes@google.com) wrote:
> On Wed, 10 Nov 2010, Mandeep Singh Baines wrote:
> 
> > For ChromiumOS, we'd like to be able to oom_adj a process up/down
> > as its leaves/enters the foreground. Currently, it is not possible
> > to oom_adj down without CAP_SYS_RESOURCE. This patch creates a new
> > resource limit, RLIMIT_OOMADJ, which is works in a similar fashion
> > to RLIMIT_NICE. This allows a process's oom_adj to be lowered
> > without CAP_SYS_RESOURCE as long as the new value is greater
> > than the resource limit.
> > 
> 
> First of all, oom_adj is deprecated and scheduled for removal in a couple 
> of years (see Documentation/feature-removal-schedule.txt) so any work in 
> this area should be targeting oom_score_adj instead.
> 

Ah. Thanks for the pointer.

> What is the anticipated use case for this?  We know that you want to lower 
> oom_adj without CAP_SYS_RESOURCE, but what's the expected behavior when an 
> app moves from foreground to background?  I assume it's something like 

The focus here is the web browser's tabs. In our case, each is a process. If
OOM is going to kill a process, you'd rather it kill the tab you looked at
hours ago instead of the one you're looking at now. So you'd like to have a
policy where the LRU tab gets killed first. We'd like to use oom_score_adj
as the mechanism to implement an LRU policy like this.

> having an oom_adj of 0 in the background and +15 in the foreground.  If 
> so, does /proc/sys/vm/oom_kill_allocating_task get you most of what you're 
> looking for?
> 

As explained above, oom_kill_allocating_task won't give us what we want.

> I'm wondering if we can avoid yet another resource limit for something 
> like this.
> 
> > Alternative considered:
> > 
> > * a setuid binary
> > * a daemon with CAP_SYS_RESOURCE
> > 
> > Since you don't wan't all processes to be able to reduce their
> > oom_adj, a setuid or daemon implementation would be complex. The
> > alternatives also have much higher overhead.
> > 
> 
> What do you anticipate will be writing to oom_score_adj with this patch, 
> the app itself?
> 

A process in the browser session will do the adusting. We'd rather not give
it CAP_SYS_RESOURCE. It should only be allowed to change oom_score_adj up
and down within the bounds set by the administrator. Analagous to renice()
which we also do using a similar policy.

> > Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
> > ---
> >  fs/proc/base.c                 |   12 ++++++++++--
> >  include/asm-generic/resource.h |    5 ++++-
> >  2 files changed, 14 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index f3d02ca..4384013 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -462,6 +462,7 @@ static const struct limit_names lnames[RLIM_NLIMITS] = {
> >  	[RLIMIT_NICE] = {"Max nice priority", NULL},
> >  	[RLIMIT_RTPRIO] = {"Max realtime priority", NULL},
> >  	[RLIMIT_RTTIME] = {"Max realtime timeout", "us"},
> > +	[RLIMIT_OOMADJ] = {"Max OOM adjust", NULL},
> 
> s/Max/Min, right?
> 

This is a MAX value because of how resource limits work. On the other hand,
it is really controlling the minimum oom_adj. So its a toss up for me.
More than happy to change if you prefer Min.

> >  };
> >  
> >  /* Display limits for a process */
> > @@ -1057,8 +1058,15 @@ static ssize_t oom_adjust_write(struct file *file, const char __user *buf,
> >  	}
> >  
> >  	if (oom_adjust < task->signal->oom_adj && !capable(CAP_SYS_RESOURCE)) {
> > -		err = -EACCES;
> > -		goto err_sighand;
> > +		/* convert oom_adj [15,-17] to rlimit style value [1,33] */
> > +		long oom_rlim = OOM_ADJUST_MAX + 1 - oom_adjust;
> > +
> 
> Ouch, that's a rather unfortunate mapping.
> 

Unfortunate but unavoidable. The resource limit code checks to see if the
new limit is greater than the limit. This code was based on the can_nice()
code in sched.c.

> > +		if (oom_rlim > task->signal->rlim[RLIMIT_OOMADJ].rlim_cur) {
> > +			unlock_task_sighand(task, &flags);
> > +			put_task_struct(task);
> > +			err = -EACCES;
> > +			goto err_sighand;
> 
> err_sighand has duplicate unlock_task_sighand() and put_task_struct(); 
> since you're missing the task_unlock(task) here, just using goto 
> err_sighand would suffice.
> 

D'oh. Forward port error. I should be more careful. Thanks for catching:)

> > +		}
> >  	}
> >  
> >  	if (oom_adjust != task->signal->oom_adj) {

Thank you for reviewing this patch.

Should I send an updated oom_score_adj patch?

Regards,
Mandeep

  reply	other threads:[~2010-11-11 18:31 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-11  4:35 [PATCH] oom: create a resource limit for oom_adj Mandeep Singh Baines
2010-11-11  7:35 ` David Rientjes
2010-11-11 18:30   ` Mandeep Singh Baines [this message]
2010-11-11 20:57     ` David Rientjes
2010-11-11 22:25       ` Mandeep Singh Baines
2010-11-11 23:19         ` David Rientjes
2010-11-11 23:56           ` Mandeep Singh Baines
2010-11-13  0:46             ` [PATCH] oom: allow a non-CAP_SYS_RESOURCE proces to oom_score_adj down Mandeep Singh Baines
2010-11-14  1:37               ` David Rientjes
2010-11-15 22:01                 ` [PATCH v2] " Mandeep Singh Baines
2010-11-15 22:06                   ` David Rientjes
2010-11-16  0:03                     ` [PATCH v3] " Mandeep Singh Baines
2010-11-14  5:07 ` [PATCH] oom: create a resource limit for oom_adj KOSAKI Motohiro
  -- strict thread matches above, loose matches on Subject: below --
2010-11-11  5:19 Figo.zhang
     [not found] <fNx73-1cI-1@gated-at.bofh.it>
     [not found] ` <fNzVf-5UY-3@gated-at.bofh.it>
     [not found]   ` <fNKdY-6FU-11@gated-at.bofh.it>
     [not found]     ` <fNMps-1S1-21@gated-at.bofh.it>
2010-11-11 23:15       ` Bodo Eggert
2010-11-11 23:21         ` David Rientjes
2010-11-14  5:07         ` KOSAKI Motohiro
2010-11-14 21:42           ` David Rientjes
2010-11-23  7:16             ` KOSAKI Motohiro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101111183050.GI7363@google.com \
    --to=msb@chromium.org \
    --cc=akpm@linux-foundation.org \
    --cc=gspencer@chromium.org \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=olofj@chromium.org \
    --cc=piman@chromium.org \
    --cc=riel@redhat.com \
    --cc=rientjes@google.com \
    --cc=wad@chromium.org \
    --cc=yinghan@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.