From: Jiri Slaby <jirislaby@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>,
LKML <linux-kernel@vger.kernel.org>,
Neil Horman <nhorman@tuxdriver.com>,
Oleg Nesterov <oleg@redhat.com>
Subject: Resource limits interface proposal [was: pull request for writable limits]
Date: Wed, 05 May 2010 14:12:54 +0200
Message-ID: <4BE160C6.90404@gmail.com>
In-Reply-To: <alpine.LFD.2.00.1003211128520.18017@i5.linux-foundation.org>
Hi.
On 03/21/2010 07:38 PM, Linus Torvalds wrote:
> Or even just _one_ system call that takes two pointers, and can do an
> atomic replace-and-return-the-old-value, like 'sigaction()' does, ie
> something like
>
> int prlimit64(pid, limit, const struct rlimit64 *new, struct rlimit64 *old);
>
> wouldn't that be a nice generic interface?
So I ended up considering these possibilities:
1) The internal representation of limits stays as it is in signal_struct,
i.e. long limits with infinity being ~0ul. This is the least intrusive
solution. The new prlimit64 will convert rlimit64 to rlimit and pass it
down to do_prlimit. With setrlimit and getrlimit reduced to wrappers, it
will look like:
prlimit64(pid, resource, new64, old64) ->
    new = convert_to_rlim(new64)
    tsk = find_task(pid)
    do_prlimit(tsk, resource, new, old)
    old64 = convert_to_rlim64(old)

setrlimit(resource, rlim) ->
    do_prlimit(current, resource, rlim, NULL)

getrlimit(resource, rlim) ->
    do_prlimit(current, resource, NULL, rlim)
with appropriate copy_{from,to}_user. (And setrlimit+getrlimit will be
scheduled for removal with all the compat crap around them.)
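
To make the replace-and-return-old semantics concrete, here is a minimal user-space sketch in C. It is not the kernel code: struct task_sketch, struct rlimit_sketch, RLIM_NLIMITS_SKETCH and all *_sketch functions are hypothetical names made up for illustration, and permission checks, locking and copy_{from,to}_user are left out.

```c
#include <errno.h>
#include <stddef.h>

#define RLIM_NLIMITS_SKETCH 16  /* hypothetical table size for this sketch */

struct rlimit_sketch {
	unsigned long rlim_cur;
	unsigned long rlim_max;
};

/* Stand-in for the per-task limits table; not the real signal_struct. */
struct task_sketch {
	struct rlimit_sketch rlim[RLIM_NLIMITS_SKETCH];
};

/*
 * Replace-and-return-old in the spirit of sigaction(): either pointer may
 * be NULL. The new value is copied first so that new_rlim and old_rlim
 * may point to the same object.
 */
static int do_prlimit_sketch(struct task_sketch *tsk, unsigned int resource,
			     const struct rlimit_sketch *new_rlim,
			     struct rlimit_sketch *old_rlim)
{
	struct rlimit_sketch tmp = { 0, 0 };

	if (resource >= RLIM_NLIMITS_SKETCH)
		return -EINVAL;
	if (new_rlim) {
		tmp = *new_rlim;
		/* The soft limit must not exceed the hard limit. */
		if (tmp.rlim_cur > tmp.rlim_max)
			return -EINVAL;
	}
	if (old_rlim)
		*old_rlim = tsk->rlim[resource];
	if (new_rlim)
		tsk->rlim[resource] = tmp;
	return 0;
}

/* setrlimit and getrlimit become thin wrappers around do_prlimit. */
static int setrlimit_sketch(struct task_sketch *cur, unsigned int resource,
			    const struct rlimit_sketch *rlim)
{
	return do_prlimit_sketch(cur, resource, rlim, NULL);
}

static int getrlimit_sketch(struct task_sketch *cur, unsigned int resource,
			    struct rlimit_sketch *rlim)
{
	return do_prlimit_sketch(cur, resource, NULL, rlim);
}
```

Passing both pointers to do_prlimit_sketch swaps in the new limit and hands back the previous one in a single call, which is the point of the combined interface.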
It may also be that rlimit64 will contain flags like:
#define RLIM64_CUR_INFINITY	0x00000001
#define RLIM64_MAX_INFINITY	0x00000002

struct rlimit64 {
	__u64 rlim_cur;
	__u64 rlim_max;
	__u32 flags;
};
if I understood Alexey correctly that he wants to separate limit values
from infinity. The flags will then also be converted to ~0ul when
converting from rlimit64 to rlimit above.
The drawback is that when a user on 32-bit passes down a value
>= (1ULL << 32), the call must fail with EINVAL because the value does
not fit into a long.
The pros are: no locking, no magic, longs are naturally atomic. And
sys_prlimit64 still takes an arch-independent parameter.
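
The conversion in option 1 could look roughly like the following user-space sketch. The assumptions are mine: klong_t stands in for unsigned long on a 32-bit kernel so that the overflow path can be exercised on any host, and the *_sketch names are hypothetical. Only the two flag values come from the proposal above.

```c
#include <errno.h>
#include <stdint.h>

#define RLIM64_CUR_INFINITY 0x00000001
#define RLIM64_MAX_INFINITY 0x00000002

struct rlimit64_sketch {
	uint64_t rlim_cur;
	uint64_t rlim_max;
	uint32_t flags;
};

/* Assumption: models "unsigned long" on a 32-bit kernel. */
typedef uint32_t klong_t;
#define KLONG_INFINITY ((klong_t)~0u)	/* plays the role of ~0ul */

struct rlimit_sketch {
	klong_t rlim_cur;
	klong_t rlim_max;
};

static int convert_one_sketch(uint64_t val, int infinite, klong_t *out)
{
	if (infinite) {
		*out = KLONG_INFINITY;
		return 0;
	}
	/* val >= (1ULL << 32) cannot be represented in a 32-bit long. */
	if (val > UINT32_MAX)
		return -EINVAL;
	/*
	 * Note: a finite 0xffffffff is indistinguishable from infinity
	 * after conversion; this sketch does not try to resolve that.
	 */
	*out = (klong_t)val;
	return 0;
}

static int convert_to_rlim_sketch(const struct rlimit64_sketch *in,
				  struct rlimit_sketch *out)
{
	struct rlimit_sketch tmp;
	int ret;

	ret = convert_one_sketch(in->rlim_cur,
				 !!(in->flags & RLIM64_CUR_INFINITY),
				 &tmp.rlim_cur);
	if (ret)
		return ret;
	ret = convert_one_sketch(in->rlim_max,
				 !!(in->flags & RLIM64_MAX_INFINITY),
				 &tmp.rlim_max);
	if (ret)
		return ret;
	*out = tmp;	/* only written when both halves converted */
	return 0;
}
```

When an infinity flag is set, the corresponding value field is ignored, so user space would not need a magic value to express "no limit".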
2) Introduce an rlimit lock and move every user to rlimit helpers which
lock the accesses appropriately, making the locking a nop when
BITS_PER_LONG == 64. Then we can have rlimit64 in signal_struct and
everything will operate on 64-bit limit values.
If we decide to separate infinity from the value with the flags above, we
should also reconsider what infinity will be. Much code just assumes that
rlimit.rlim_{cur,max} holds the highest possible value and doesn't account
for something like rlimit64.flags. This will result in the locks not being
a nop on 64-bit, because we want fresh rlim_cur+flags and rlim_max+flags
pairs. We could also have the flags solely in the syscall interface and
treat ~0ULL as infinity internally.
In this case the situation will be:
prlimit64(pid, resource, new64, old64) ->
    tsk = find_task(pid)
    do_prlimit(tsk, resource, new64, old64)

setrlimit(resource, rlim) ->
    rlim64 = convert_to_rlim64(rlim)
    do_prlimit(current, resource, rlim64, NULL)

getrlimit(resource, rlim) ->
    do_prlimit(current, resource, NULL, rlim64)
    rlim = convert_to_rlim(rlim64)
We cannot fail in prlimit64 due to limited space in longs on 32-bit;
however, we add locking which may slow things down. I have no idea how
contended the lock will be, but as rlimits are used in the scheduler and
filesystem core, it might affect performance. I can measure this if it is
of interest.
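
A rough user-space model of the helpers from option 2 follows. Everything here is a sketch under my own assumptions: struct signal_sketch, the rlim_lock/rlim_unlock pair and the accessors are hypothetical names, a C11 atomic_flag spinlock stands in for whatever lock the kernel would use, and the 64-bit branch relies on the kernel convention that aligned long-sized accesses do not tear (strict ISO C does not guarantee that for plain variables).

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdint.h>

#define RLIM_NLIMITS_SKETCH 16	/* hypothetical table size */

struct rlimit64_sketch {
	uint64_t rlim_cur;
	uint64_t rlim_max;
};

/* Stand-in for signal_struct; initialize rlim_lock with ATOMIC_FLAG_INIT. */
struct signal_sketch {
	atomic_flag rlim_lock;
	struct rlimit64_sketch rlim[RLIM_NLIMITS_SKETCH];
};

#if ULONG_MAX > 0xffffffffUL
/* 64-bit longs: a single limit is read or written in one access, no lock. */
static inline void rlim_lock(struct signal_sketch *sig) { (void)sig; }
static inline void rlim_unlock(struct signal_sketch *sig) { (void)sig; }
#else
/* 32-bit longs: a 64-bit limit needs two accesses, so serialize them. */
static inline void rlim_lock(struct signal_sketch *sig)
{
	while (atomic_flag_test_and_set_explicit(&sig->rlim_lock,
						 memory_order_acquire))
		;	/* spin until the holder clears the flag */
}

static inline void rlim_unlock(struct signal_sketch *sig)
{
	atomic_flag_clear_explicit(&sig->rlim_lock, memory_order_release);
}
#endif

/* Every user goes through helpers like these instead of touching rlim[]. */
static inline uint64_t rlimit_cur_sketch(struct signal_sketch *sig,
					 unsigned int resource)
{
	uint64_t val;

	rlim_lock(sig);
	val = sig->rlim[resource].rlim_cur;
	rlim_unlock(sig);
	return val;
}

static inline void rlimit_set_cur_sketch(struct signal_sketch *sig,
					 unsigned int resource, uint64_t val)
{
	rlim_lock(sig);
	sig->rlim[resource].rlim_cur = val;
	rlim_unlock(sig);
}
```

If the infinity flags were stored next to the values, the no-op branch would no longer be sufficient, since a reader would need the value and its flag as a consistent pair.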
3) [inspired by an idea from Jan Kara, who knows how inode handling works]
This is similar to 2), but we avoid locks in the same way the
inode->i_size accessors do.
It doesn't solve the case of separate flags though.
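
For readers unfamiliar with the i_size accessors: on 32-bit SMP kernels they protect the 64-bit size with a sequence counter rather than a lock. Below is a minimal user-space sketch of that idea applied to a single limit value, using C11 atomics. The names (struct rlim_seq_sketch and friends) are hypothetical, writers are assumed to be serialized by the caller, and default sequentially consistent atomics are used for simplicity where real kernel code would use cheaper barriers.

```c
#include <stdatomic.h>
#include <stdint.h>

struct rlim_seq_sketch {
	atomic_uint seq;	/* even: stable, odd: write in progress */
	_Atomic uint32_t lo;	/* low half of the 64-bit limit */
	_Atomic uint32_t hi;	/* high half of the 64-bit limit */
};

static void rlim_seq_init(struct rlim_seq_sketch *r, uint64_t val)
{
	atomic_init(&r->seq, 0u);
	atomic_init(&r->lo, (uint32_t)val);
	atomic_init(&r->hi, (uint32_t)(val >> 32));
}

/* Writer side; assumes only one writer runs at a time. */
static void rlim_seq_write(struct rlim_seq_sketch *r, uint64_t val)
{
	atomic_fetch_add(&r->seq, 1u);		/* becomes odd */
	atomic_store(&r->lo, (uint32_t)val);
	atomic_store(&r->hi, (uint32_t)(val >> 32));
	atomic_fetch_add(&r->seq, 1u);		/* becomes even again */
}

/* Reader side: lockless, retries if a write overlapped the read. */
static uint64_t rlim_seq_read(struct rlim_seq_sketch *r)
{
	unsigned int s1, s2;
	uint32_t lo, hi;

	do {
		s1 = atomic_load(&r->seq);
		lo = atomic_load(&r->lo);
		hi = atomic_load(&r->hi);
		s2 = atomic_load(&r->seq);
	} while ((s1 & 1u) || s1 != s2);

	return ((uint64_t)hi << 32) | lo;
}
```

Readers never block writers in this scheme; they only pay for a retry when an update happens during the read.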
Just a side note: we cannot use the rlimit64 name, as it is already
reserved in glibc headers for limits handling.
I would appreciate any comments.
thanks,
--
js
suse labs