From: Jiri Slaby <jirislaby@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>,
LKML <linux-kernel@vger.kernel.org>,
Neil Horman <nhorman@tuxdriver.com>,
Oleg Nesterov <oleg@redhat.com>
Subject: Resource limits interface proposal [was: pull request for writable limits]
Date: Wed, 05 May 2010 14:12:54 +0200
Message-ID: <4BE160C6.90404@gmail.com>
In-Reply-To: <alpine.LFD.2.00.1003211128520.18017@i5.linux-foundation.org>
Hi.
On 03/21/2010 07:38 PM, Linus Torvalds wrote:
> Or even just _one_ system call that takes two pointers, and can do an
> atomic replace-and-return-the-old-value, like 'sigaction()' does, ie
> something like
>
> int prlimit64(pid, limit, const struct rlimit64 *new, struct rlimit64 *old);
>
> wouldn't that be a nice generic interface?
So I ended up thinking about these possibilities:
1) The internal representation of limits stays as it is in signal_struct,
i.e. long limits with infinity being ~0ul. This is the least intrusive
solution. The new prlimit64 will convert rlimit64 to rlimit and pass
it down to do_prlimit. With setrlimit and getrlimit as mere wrappers,
it will look like:
prlimit64(pid, resource, new64, old64) ->
    new = convert_to_rlim(new64)
    tsk = find_task(pid)
    do_prlimit(tsk, resource, new, old)
    old64 = convert_to_rlim64(old)
setrlimit(resource, rlim) ->
    do_prlimit(current, resource, rlim, NULL)
getrlimit(resource, rlim) ->
    do_prlimit(current, resource, NULL, rlim)
with appropriate copy_{from,to}_user. (And setrlimit+getrlimit will be
scheduled for removal with all the compat crap around them.)
It may also be that rlimit64 will contain flags like:
#define RLIM64_CUR_INFINITY 0x00000001
#define RLIM64_MAX_INFINITY 0x00000002
struct rlimit64 {
    __u64 rlim_cur;
    __u64 rlim_max;
    __u32 flags;
};
if I understood Alexey correctly that he wants to separate limit values
from infinity? The flags will then be converted to ~0ul when converting
from rlimit64 to rlimit above, too.
The drawback is that when a 32-bit user passes down a value
>= (1ULL << 32), EINVAL shall occur.
The pros are: no locking, no magic, longs are naturally atomic. And
sys_prlimit64 still gets an arch-independent parameter.
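To make the conversion in 1) concrete, here is a minimal userspace
sketch, not the actual patch. All names are hypothetical: klong_t models
a 32-bit kernel unsigned long, struct rlimit64_sketch stands in for the
proposed rlimit64 with flags, and convert_to_rlim is only my reading of
the step named in the flow above.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: models 'unsigned long' on a 32-bit kernel. */
typedef uint32_t klong_t;
#define KRLIM_INFINITY ((klong_t)~0u)   /* ~0ul on 32-bit */

#define RLIM64_CUR_INFINITY 0x00000001
#define RLIM64_MAX_INFINITY 0x00000002

/* Hypothetical stand-in for the proposed rlimit64 with flags. */
struct rlimit64_sketch {
	uint64_t rlim_cur;
	uint64_t rlim_max;
	uint32_t flags;
};

/* Hypothetical stand-in for the long-based internal rlimit. */
struct krlimit {
	klong_t rlim_cur;
	klong_t rlim_max;
};

static int convert_one(uint64_t val, int infinite, klong_t *out)
{
	if (infinite) {
		*out = KRLIM_INFINITY;
		return 0;
	}
	/* A value >= (1ULL << 32) does not fit into a 32-bit long. */
	if (val > (uint64_t)KRLIM_INFINITY)
		return -EINVAL;
	/* Note: val == ~0u still aliases infinity in this sketch. */
	*out = (klong_t)val;
	return 0;
}

/* Convert the 64-bit user structure to the internal representation. */
static int convert_to_rlim(const struct rlimit64_sketch *in,
			   struct krlimit *out)
{
	int ret;

	ret = convert_one(in->rlim_cur, in->flags & RLIM64_CUR_INFINITY,
			  &out->rlim_cur);
	if (ret)
		return ret;
	return convert_one(in->rlim_max, in->flags & RLIM64_MAX_INFINITY,
			   &out->rlim_max);
}
```

The infinity flag is checked before the range check, so a caller that
sets the flag never hits EINVAL regardless of what the value field holds.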
2) Introduce an rlimit lock and move every user to the rlimit helpers,
which lock the accesses appropriately, making the locking a nop when
BITS_PER_LONG == 64. Then we can have rlimit64 in signal_struct and
everything will happen on 64-bit limit values.
If we decide to separate infinity from the value with the flags above,
we should also reconsider what infinity will be. Much code just assumes
that rlimit.rlim_{cur,max} holds the highest possible value and does not
reckon with something like rlimit64.flags. This will result in the locks
not being a nop on 64-bit, because we want fresh rlim_cur+flags and
rlim_max+flags pairs. We could also have the flags solely in the syscall
interface and let ~0ULL count as infinity internally.
In this case the situation will be
prlimit64(pid, resource, new64, old64) ->
    tsk = find_task(pid)
    do_prlimit(tsk, resource, new64, old64)
setrlimit(resource, rlim) ->
    rlim64 = convert_to_rlim64(rlim)
    do_prlimit(current, resource, rlim64, NULL)
getrlimit(resource, rlim) ->
    do_prlimit(current, resource, NULL, rlim64)
    rlim = convert_to_rlim(rlim64)
We cannot fail in prlimit64 due to limited space in longs on 32-bit;
however, we add locking, which may slow things down. I have no idea how
contended the lock will be, but as rlimits are used in the scheduler and
filesystem core, it might affect performance. I can measure this if it
is of interest.
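A rough userspace sketch of what the helpers in 2) could look like. The
names (struct rlim64, rlim64_get_cur, rlim64_set, rlim_lock) are made up
for illustration, a pthread mutex stands in for whatever kernel lock
would be used, and ULONG_MAX is checked in place of BITS_PER_LONG. It
covers only the variant without separate flags, where a single 64-bit
load is enough on 64-bit.

```c
#include <limits.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical 64-bit internal limit pair. */
struct rlim64 {
	uint64_t rlim_cur;
	uint64_t rlim_max;
};

#if ULONG_MAX > 0xFFFFFFFFUL
/* 64-bit longs: a single 64-bit access is atomic, locking is a nop. */
static inline void rlim_lock_acquire(void) { }
static inline void rlim_lock_release(void) { }
#else
/* 32-bit longs: a 64-bit value takes two accesses, so serialize them. */
static pthread_mutex_t rlim_lock = PTHREAD_MUTEX_INITIALIZER;
static inline void rlim_lock_acquire(void) { pthread_mutex_lock(&rlim_lock); }
static inline void rlim_lock_release(void) { pthread_mutex_unlock(&rlim_lock); }
#endif

/* Every user goes through helpers like these instead of raw access. */
static inline uint64_t rlim64_get_cur(const struct rlim64 *r)
{
	uint64_t val;

	rlim_lock_acquire();
	val = r->rlim_cur;
	rlim_lock_release();
	return val;
}

static inline uint64_t rlim64_get_max(const struct rlim64 *r)
{
	uint64_t val;

	rlim_lock_acquire();
	val = r->rlim_max;
	rlim_lock_release();
	return val;
}

static inline void rlim64_set(struct rlim64 *r, uint64_t cur, uint64_t max)
{
	rlim_lock_acquire();
	r->rlim_cur = cur;
	r->rlim_max = max;
	rlim_lock_release();
}
```

With separate flags, the 64-bit branch could not stay empty, since a
value and its flag would have to be read as a consistent pair.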
3) [inspired by an idea from Jan Kara, who knows how inode handling
works] This is similar to 2), we just avoid locks the same way the
inode->i_size accessors do.
It doesn't solve the case of separate flags though.
Just a side note: we cannot use the rlimit64 name, as it is already
reserved in glibc headers for limits handling.
I would appreciate any comments.
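For reference, the i_size accessors on 32-bit SMP use a sequence counter
so that readers retry instead of taking a lock. Below is a userspace
approximation of that pattern with C11 atomics. It is a sketch under
assumptions: struct rlim_seq and both function names are hypothetical,
the 64-bit value is split into two 32-bit halves to model a 32-bit
machine, and writers are assumed to be serialized by the caller.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical lock-free 64-bit limit for a machine with 32-bit loads. */
struct rlim_seq {
	atomic_uint seq;        /* odd while a write is in progress */
	_Atomic uint32_t lo;    /* low half of the limit */
	_Atomic uint32_t hi;    /* high half of the limit */
};

/* Writers must be serialized externally (assumption of this sketch). */
static void rlim_seq_write(struct rlim_seq *s, uint64_t val)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	/* Make the counter odd: readers now know a write is in flight. */
	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->lo, (uint32_t)val, memory_order_relaxed);
	atomic_store_explicit(&s->hi, (uint32_t)(val >> 32),
			      memory_order_relaxed);
	/* Make the counter even again and publish both halves. */
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Readers take no lock; they retry if a write raced with them. */
static uint64_t rlim_seq_read(struct rlim_seq *s)
{
	unsigned int begin, end;
	uint32_t lo, hi;

	do {
		begin = atomic_load_explicit(&s->seq, memory_order_acquire);
		lo = atomic_load_explicit(&s->lo, memory_order_relaxed);
		hi = atomic_load_explicit(&s->hi, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		end = atomic_load_explicit(&s->seq, memory_order_relaxed);
	} while ((begin & 1) || begin != end);

	return ((uint64_t)hi << 32) | lo;
}
```

This protects one 64-bit value at a time. Reading a value together with
a separate flag would need both under the same counter, which is the
case the sentence above says remains unsolved.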
thanks,
--
js
suse labs