From: Mandeep Singh Baines <msb@chromium.org>
To: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Rik van Riel <riel@redhat.com>, Ying Han <yinghan@google.com>,
linux-kernel@vger.kernel.org, gspencer@chromium.org,
piman@chromium.org, wad@chromium.org, olofj@chromium.org,
Bodo Eggert <7eggert@web.de>
Subject: [PATCH v2] oom: allow a non-CAP_SYS_RESOURCE process to oom_score_adj down
Date: Mon, 15 Nov 2010 14:01:51 -0800
Message-ID: <20101115220150.GR7363@google.com>
In-Reply-To: <alpine.DEB.2.00.1011131735240.27212@chino.kir.corp.google.com>
Hi,
Attached is V2 of this patch. Since V1:
* Added documentation in Documentation/filesystems/proc.txt
* Copy oom_score_adj_min value across a fork
An alternative to this patch would be to expose oom_score_adj_min via
proc (suggested by Kosaki Motohiro). You could allow non-CAP_SYS_RESOURCE
processes to irreversibly increase the minimum. That would give you
similar semantics to a resource limit and would give you a bit more
control but it means adding another proc variable.
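To make the alternative concrete, here is a minimal user-space sketch of how a separately exposed minimum could behave. It is not kernel code: struct oom_limits, model_set_min() and model_set_adj() are hypothetical names, the privileged flag stands in for capable(CAP_SYS_RESOURCE), and clamping the current value up to a raised floor is an assumption rather than something settled in this thread.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-process OOM fields. */
struct oom_limits {
	int oom_score_adj;	/* current adjustment */
	int oom_score_adj_min;	/* floor, exposed via its own proc file */
};

/*
 * Hypothetical write handler for the floor. An unprivileged writer may
 * only raise it, so the change cannot be undone without privilege.
 */
static int model_set_min(struct oom_limits *l, int new_min, bool privileged)
{
	if (new_min < l->oom_score_adj_min && !privileged)
		return -EACCES;

	l->oom_score_adj_min = new_min;
	/* Assumption: keep the current value at or above the floor. */
	if (l->oom_score_adj < new_min)
		l->oom_score_adj = new_min;
	return 0;
}

/* Hypothetical write handler for the value: bounded below by the floor. */
static int model_set_adj(struct oom_limits *l, int value, bool privileged)
{
	if (value < l->oom_score_adj_min && !privileged)
		return -EACCES;

	l->oom_score_adj = value;
	return 0;
}
```

Under these assumptions the floor behaves like a soft resource limit: any process can tighten it on itself, but only a CAP_SYS_RESOURCE writer can relax it.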
Regards,
Mandeep
---
We'd like to be able to adjust a process's oom_score_adj up/down as it
enters/leaves the foreground. Currently, it is not possible to lower
oom_score_adj without CAP_SYS_RESOURCE. This patch allows a task to decrease
its oom_score_adj back to the value that a CAP_SYS_RESOURCE thread last set,
or to the value it inherited at fork. Assuming the thread that forked
it has an oom_score_adj of 0, each tab process could decrease its value back to
0 upon activation unless a CAP_SYS_RESOURCE thread elevated it to
something higher.
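The intended semantics can be sketched as a small user-space model of the check this patch puts in oom_score_adj_write() and the copy it adds to copy_signal(). This is an illustration only: struct oom_state, model_oom_score_adj_write() and model_fork() are hypothetical names, the privileged flag stands in for capable(CAP_SYS_RESOURCE), and the range validation and locking done by the real handler are left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two fields in signal_struct. */
struct oom_state {
	int oom_score_adj;	/* current adjustment */
	int oom_score_adj_min;	/* lowest value an unprivileged writer may set */
};

/*
 * Models the permission check after this patch. Returns 0 on success
 * or -EACCES when an unprivileged writer tries to go below the minimum.
 */
static int model_oom_score_adj_write(struct oom_state *s, int value,
				     bool privileged)
{
	if (value < s->oom_score_adj_min && !privileged)
		return -EACCES;

	s->oom_score_adj = value;
	/* Only a privileged write moves the minimum. */
	if (privileged)
		s->oom_score_adj_min = value;
	return 0;
}

/* Models copy_signal(): the child inherits both fields unchanged. */
static struct oom_state model_fork(const struct oom_state *parent)
{
	return *parent;
}
```

In this model a tab forked from a parent at 0 can raise itself to 500 when backgrounded and return to 0 when activated, but it cannot go below 0. If a privileged writer later sets 300, the tab can no longer drop below 300.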
Alternatives considered:
* a setuid binary
* a daemon with CAP_SYS_RESOURCE
Since you don't want all processes to be able to reduce their
oom_score_adj, a setuid binary or daemon implementation would be complex.
The alternatives also have much higher overhead.
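By contrast, with this patch a process can adjust itself with a single write to its own proc file. The helper below is a rough user-space sketch of that write, not part of the patch: write_oom_score_adj() is a hypothetical name, and the path is a parameter (normally "/proc/self/oom_score_adj") so the helper can be tried against an ordinary file.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Write 'value' to the oom_score_adj file at 'path'.
 * Returns 0 on success or a negative errno value.
 */
static int write_oom_score_adj(const char *path, int value)
{
	FILE *f = fopen(path, "w");
	int err = 0;

	if (!f)
		return -errno;
	if (fprintf(f, "%d\n", value) < 0)
		err = -EIO;
	/*
	 * stdio buffers the data, so a write refused by the kernel
	 * (for example with EACCES) typically surfaces at fclose().
	 */
	if (fclose(f) != 0 && err == 0)
		err = -errno;
	return err;
}
```

A tab process might call write_oom_score_adj("/proc/self/oom_score_adj", 500) when it moves to the background and write 0 again on activation, treating a negative return as a refused or failed write.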
This patch was updated from the original patch based on feedback from
David Rientjes <rientjes@google.com>.
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
---
Documentation/filesystems/proc.txt | 4 ++++
fs/proc/base.c | 4 +++-
include/linux/sched.h | 2 ++
kernel/fork.c | 1 +
4 files changed, 10 insertions(+), 1 deletions(-)
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index e73df27..7139c50 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1296,6 +1296,10 @@ scaled linearly with /proc/<pid>/oom_score_adj.
Writing to /proc/<pid>/oom_score_adj or /proc/<pid>/oom_adj will change the
other with its scaled value.
+The value of /proc/<pid>/oom_score_adj may be reduced no lower than the last
+value set by a CAP_SYS_RESOURCE process. To reduce the value any lower
+requires CAP_SYS_RESOURCE.
+
NOTICE: /proc/<pid>/oom_adj is deprecated and will be removed, please see
Documentation/feature-removal-schedule.txt.
diff --git a/fs/proc/base.c b/fs/proc/base.c
index f3d02ca..e617413 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1164,7 +1164,7 @@ static ssize_t oom_score_adj_write(struct file *file, const char __user *buf,
goto err_task_lock;
}
- if (oom_score_adj < task->signal->oom_score_adj &&
+ if (oom_score_adj < task->signal->oom_score_adj_min &&
!capable(CAP_SYS_RESOURCE)) {
err = -EACCES;
goto err_sighand;
@@ -1177,6 +1177,8 @@ static ssize_t oom_score_adj_write(struct file *file, const char __user *buf,
atomic_dec(&task->mm->oom_disable_count);
}
task->signal->oom_score_adj = oom_score_adj;
+ if (capable(CAP_SYS_RESOURCE))
+ task->signal->oom_score_adj_min = oom_score_adj;
/*
* Scale /proc/pid/oom_adj appropriately ensuring that OOM_DISABLE is
* always attainable.
diff --git a/include/linux/sched.h b/include/linux/sched.h
index f53cdf2..2a71ee0 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -626,6 +626,8 @@ struct signal_struct {
int oom_adj; /* OOM kill score adjustment (bit shift) */
int oom_score_adj; /* OOM kill score adjustment */
+ int oom_score_adj_min; /* OOM kill score adjustment minimum value.
+ * Only settable by CAP_SYS_RESOURCE. */
struct mutex cred_guard_mutex; /* guard against foreign influences on
* credential calculations
diff --git a/kernel/fork.c b/kernel/fork.c
index 3b159c5..0979527 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -907,6 +907,7 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
sig->oom_adj = current->signal->oom_adj;
sig->oom_score_adj = current->signal->oom_score_adj;
+ sig->oom_score_adj_min = current->signal->oom_score_adj_min;
mutex_init(&sig->cred_guard_mutex);
--
1.7.3.1