public inbox for linux-kernel@vger.kernel.org
* [GIT] writable_limits for 2.6.36
@ 2010-08-07 12:15 Jiri Slaby
  2010-08-10 16:01 ` Linus Torvalds
  0 siblings, 1 reply; 10+ messages in thread
From: Jiri Slaby @ 2010-08-07 12:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: LKML, Oleg Nesterov, Andrew Morton

Hello,

please consider pulling the following repository for 2.6.36. It introduces
a new syscall for arch-independent resource limit handling and adds
support for changing limits at runtime. This feature is needed mostly by
daemons serving databases and similar services, where limits need to be
changed on production systems without restarting the services.
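
[ Editorial illustration, not part of the original message: a minimal
sketch of how user space might adjust a limit of a running process
through the new syscall. It assumes a glibc that provides the prlimit()
wrapper (declared in <sys/resource.h> when _GNU_SOURCE is defined); the
helper name set_soft_nofile() is hypothetical. ]

```c
#define _GNU_SOURCE           /* exposes prlimit() in glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/types.h>

/*
 * Hypothetical helper: set the soft RLIMIT_NOFILE of process `pid`
 * (0 means the calling process) to `new_soft`, keeping the hard limit.
 * The limits in force before the change are stored in `*old`.
 * Returns 0 on success, -1 on failure with errno set by prlimit().
 */
int set_soft_nofile(pid_t pid, rlim_t new_soft, struct rlimit *old)
{
    struct rlimit wanted;

    assert(old != NULL);      /* caller must supply storage */

    /* Read the current limits first so the hard limit can be preserved. */
    if (prlimit(pid, RLIMIT_NOFILE, NULL, old) != 0)
        return -1;

    wanted.rlim_cur = new_soft;
    wanted.rlim_max = old->rlim_max;

    /* Apply the new soft limit; the target process keeps running. */
    return prlimit(pid, RLIMIT_NOFILE, &wanted, NULL);
}
```

[ Changing the limits of another process generally requires matching
credentials or CAP_SYS_RESOURCE, so a call with a foreign pid may fail
with EPERM. ]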

The following changes since commit 2f7989efd4398d92b8adffce2e07dd043a0895fe:

  Merge master.kernel.org:/home/rmk/linux-2.6-arm (2010-07-14 17:28:13 -0700)

are available in the git repository at:

  git://decibel.fi.muni.cz/~xslaby/linux writable_limits

Jiri Slaby (10):
      rlimits: security, add task_struct to setrlimit
      rlimits: add task_struct to update_rlimit_cpu
      rlimits: split sys_setrlimit
      rlimits: allow setrlimit to non-current tasks
      rlimits: do security check under task_lock
      rlimits: add rlimit64 structure
      rlimits: redo do_setrlimit to more generic do_prlimit
      rlimits: switch more rlimit syscalls to do_prlimit
      rlimits: implement prlimit64 syscall
      unistd: add __NR_prlimit64 syscall numbers

Oleg Nesterov (2):
      rlimits: make sure ->rlim_max never grows in sys_setrlimit
      rlimits: selinux, do rlimits changes under task_lock

 arch/x86/ia32/ia32entry.S          |    1 +
 arch/x86/include/asm/unistd_32.h   |    3 +-
 arch/x86/include/asm/unistd_64.h   |    2 +
 arch/x86/kernel/syscall_table_32.S |    1 +
 include/asm-generic/unistd.h       |    4 +-
 include/linux/posix-timers.h       |    2 +-
 include/linux/resource.h           |    9 ++
 include/linux/security.h           |    9 +-
 include/linux/syscalls.h           |    4 +
 kernel/compat.c                    |   17 +---
 kernel/posix-cpu-timers.c          |    8 +-
 kernel/sys.c                       |  202 ++++++++++++++++++++++++++++--------
 security/capability.c              |    3 +-
 security/security.c                |    5 +-
 security/selinux/hooks.c           |   12 ++-
 15 files changed, 207 insertions(+), 75 deletions(-)

thanks,
-- 
js
suse labs


* Re: [GIT] writable_limits for 2.6.36
  2010-08-07 12:15 [GIT] writable_limits for 2.6.36 Jiri Slaby
@ 2010-08-10 16:01 ` Linus Torvalds
  2010-08-10 16:21   ` Chris Metcalf
  2010-08-10 16:24   ` Linus Torvalds
  0 siblings, 2 replies; 10+ messages in thread
From: Linus Torvalds @ 2010-08-10 16:01 UTC (permalink / raw)
  To: Jiri Slaby, Chris Metcalf, Arnd Bergmann
  Cc: LKML, Oleg Nesterov, Andrew Morton

2010/8/7 Jiri Slaby <jslaby@suse.cz>:
>
> please consider the following repository for 2.6.36. It introduces a new
> syscall for arch independent resource limits handling. It also adds a
> support for runtime limits changing. This feature is needed mostly by
> daemons servicing databases and similar service where limits are needed
> to be changed without services being restarted on production systems.

Ok, so the code looks fine, and I don't have any real objections any
more. I don't know how much use this will get, but it doesn't appear
to be "wrong" in any way. So I was going to pull it.

However, in the meantime we have commit 5360bd776f73 ("Fix up the
"generic" unistd.h ABI to be more useful") that clashes with it. Now,
the conflict is trivial to resolve, and I could do that easily - it's
not a technical problem. But that commit's code comment says

  + * Architectures may provide up to 16 syscalls of their own
  + * starting with this value.
  + */
  +#define __NR_arch_specific_syscall 244

and the new writable rlimits syscall is obviously 244.

Now, looking at it all, I think that commit was badly done - not
leaving any room for new generic system calls is pretty iffy. And if I
had happened to take the Tilera merge later, I'd have had no problems
with just changing it. As is, though, I want to check with Arnd and
Chris first.

Arnd, Chris - how about making the "arch-specific" system calls start
at 256 or something? Or even higher, like 512? Yes, it makes the
system call array bigger, but is that really a problem? Especially as
we start the "deprecated" system calls at 1024, it would seem to make
sense to raise it to 512, and leave the low numbers for the "regular"
system calls.

[ I'm leaving the quoted email for the edification of Chris/Arnd that
I added to the discussion ]

                                   Linus

---
>
>  git://decibel.fi.muni.cz/~xslaby/linux writable_limits
>
> Jiri Slaby (10):
>      rlimits: security, add task_struct to setrlimit
>      rlimits: add task_struct to update_rlimit_cpu
>      rlimits: split sys_setrlimit
>      rlimits: allow setrlimit to non-current tasks
>      rlimits: do security check under task_lock
>      rlimits: add rlimit64 structure
>      rlimits: redo do_setrlimit to more generic do_prlimit
>      rlimits: switch more rlimit syscalls to do_prlimit
>      rlimits: implement prlimit64 syscall
>      unistd: add __NR_prlimit64 syscall numbers
>
> Oleg Nesterov (2):
>      rlimits: make sure ->rlim_max never grows in sys_setrlimit
>      rlimits: selinux, do rlimits changes under task_lock
>
>  arch/x86/ia32/ia32entry.S          |    1 +
>  arch/x86/include/asm/unistd_32.h   |    3 +-
>  arch/x86/include/asm/unistd_64.h   |    2 +
>  arch/x86/kernel/syscall_table_32.S |    1 +
>  include/asm-generic/unistd.h       |    4 +-
>  include/linux/posix-timers.h       |    2 +-
>  include/linux/resource.h           |    9 ++
>  include/linux/security.h           |    9 +-
>  include/linux/syscalls.h           |    4 +
>  kernel/compat.c                    |   17 +---
>  kernel/posix-cpu-timers.c          |    8 +-
>  kernel/sys.c                       |  202 ++++++++++++++++++++++++++++--------
>  security/capability.c              |    3 +-
>  security/security.c                |    5 +-
>  security/selinux/hooks.c           |   12 ++-
>  15 files changed, 207 insertions(+), 75 deletions(-)
>
> thanks,
> --
> js
> suse labs
>


* Re: [GIT] writable_limits for 2.6.36
  2010-08-10 16:01 ` Linus Torvalds
@ 2010-08-10 16:21   ` Chris Metcalf
  2010-08-10 16:43     ` Jiri Slaby
  2010-08-10 18:50     ` Arnd Bergmann
  2010-08-10 16:24   ` Linus Torvalds
  1 sibling, 2 replies; 10+ messages in thread
From: Chris Metcalf @ 2010-08-10 16:21 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jiri Slaby, Arnd Bergmann, LKML, Oleg Nesterov, Andrew Morton

On 8/10/2010 12:01 PM, Linus Torvalds wrote:
> 2010/8/7 Jiri Slaby <jslaby@suse.cz>:
>   
>> please consider the following repository for 2.6.36. It introduces a new
>> syscall for arch independent resource limits handling. It also adds a
>> support for runtime limits changing. This feature is needed mostly by
>> daemons servicing databases and similar service where limits are needed
>> to be changed without services being restarted on production systems.
>>     
> Ok, so the code looks fine, and I don't have any real objections any
> more. I don't know how much use this will get, but it doesn't appear
> to be "wrong" in any way. So I was going to pull it.
>
> However, in the meantime we have commit 5360bd776f73 ("Fix up the
> "generic" unistd.h ABI to be more useful") that clashes with it. Now,
> the conflict is trivial to resolve, and I could do that easily - it's
> not a technical problem. But that commit code comments say
>
>   + * Architectures may provide up to 16 syscalls of their own
>   + * starting with this value.
>   + */
>   +#define __NR_arch_specific_syscall 244
>
> and the new writable rlimits syscall is obviously 244.
>   

Jiri and I actually discussed this back on July 20th on LKML when it
first conflicted in linux-next, and at the time he said he'd move
prlimit64 to 261 in <asm-generic/unistd.h>.  It looks like what actually
stuck in linux-next was different, however.  It's partly my fault for
not following up on this.

> Now, looking at it all, I think that commit was badly done - not
> leaving any room for new generic system calls is pretty iffy. And if I
> had happened to take the Tilera merge later, I'd have had no problems
> with just changing it. As is, though, I want to check with Arnd and
> Chris first.
>   

In any case, obviously the larger question is how many
architecture-specific syscalls are appropriate, and where they should be
located in the syscall number space.  To be clear, the model for new
generic system calls is that they just continue on after the 16
architecture-specific ones, and in fact __NR_wait4 is already an example
of just this -- done that way to avoid making trouble for the "score"
architecture, since it was deprecated and then later un-deprecated.  So
new generic syscalls are not a problem.
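
[ Editorial illustration, not part of the original message: a small
sketch of the numbering arithmetic described above. Only the base value
244 and the count of 16 come from the quoted header comment; the macro
and function names are hypothetical and do not exist in the kernel. ]

```c
#include <assert.h>

/* Value of __NR_arch_specific_syscall quoted in the thread. */
#define NR_ARCH_SPECIFIC_BASE   244
/* "Architectures may provide up to 16 syscalls of their own". */
#define NR_ARCH_SPECIFIC_COUNT  16

/* First number at which generic syscalls continue after the arch block. */
#define NR_GENERIC_RESUME (NR_ARCH_SPECIFIC_BASE + NR_ARCH_SPECIFIC_COUNT)

/* Nonzero if `nr` falls inside the architecture-specific window. */
static inline int nr_is_arch_specific(int nr)
{
    return nr >= NR_ARCH_SPECIFIC_BASE && nr < NR_GENERIC_RESUME;
}

/* Index (0..15) of an arch-specific number within its window. */
static inline int nr_arch_slot(int nr)
{
    assert(nr_is_arch_specific(nr));
    return nr - NR_ARCH_SPECIFIC_BASE;
}
```

[ Under these assumptions the window covers 244 through 259, which is
why a generic syscall placed at 244 collides with the first
arch-specific slot, and why generic numbers resume at 260. ]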

There is definitely some tension between allowing architectures free
rein with their own set of unlimited additional syscalls on the one
hand, and having a contiguous and small array of syscalls on the other
hand.  I suspect it's slightly nicer to have a contiguous and small
array, as long as we've provided enough room for architectures to add
extra syscalls, but I'm not strongly married to this position.

For what it's worth, from Tilera's point of view we can certainly
tolerate changes in this area; we have not released any of this new
syscall ABI stuff to customers yet, so thrashing this just involves an
internal flag day for our developers, which is not too big a deal.

> Arnd, Chris - how about making the "arch-specific" system calls start
> at 256 or something? Or even higher, like 512? Yes, it makes the
> system call array bigger, but is that really a problem? Especially as
> we start the "deprecated" system calls at 1024, it would seem to make
> sense to raise it to 512, and leave the low numbers for the "regular"
> system calls.
>
> [ I'm leaving the quoted email for the edification of Chris/Arnd that
> I added to the discussion ]
>
>                                    Linus
>
> ---
>   
>>  git://decibel.fi.muni.cz/~xslaby/linux writable_limits
>>
>> Jiri Slaby (10):
>>      rlimits: security, add task_struct to setrlimit
>>      rlimits: add task_struct to update_rlimit_cpu
>>      rlimits: split sys_setrlimit
>>      rlimits: allow setrlimit to non-current tasks
>>      rlimits: do security check under task_lock
>>      rlimits: add rlimit64 structure
>>      rlimits: redo do_setrlimit to more generic do_prlimit
>>      rlimits: switch more rlimit syscalls to do_prlimit
>>      rlimits: implement prlimit64 syscall
>>      unistd: add __NR_prlimit64 syscall numbers
>>
>> Oleg Nesterov (2):
>>      rlimits: make sure ->rlim_max never grows in sys_setrlimit
>>      rlimits: selinux, do rlimits changes under task_lock
>>
>>  arch/x86/ia32/ia32entry.S          |    1 +
>>  arch/x86/include/asm/unistd_32.h   |    3 +-
>>  arch/x86/include/asm/unistd_64.h   |    2 +
>>  arch/x86/kernel/syscall_table_32.S |    1 +
>>  include/asm-generic/unistd.h       |    4 +-
>>  include/linux/posix-timers.h       |    2 +-
>>  include/linux/resource.h           |    9 ++
>>  include/linux/security.h           |    9 +-
>>  include/linux/syscalls.h           |    4 +
>>  kernel/compat.c                    |   17 +---
>>  kernel/posix-cpu-timers.c          |    8 +-
>>  kernel/sys.c                       |  202 ++++++++++++++++++++++++++++--------
>>  security/capability.c              |    3 +-
>>  security/security.c                |    5 +-
>>  security/selinux/hooks.c           |   12 ++-
>>  15 files changed, 207 insertions(+), 75 deletions(-)
>>
>> thanks,
>> --
>> js
>> suse labs
>>
>>     

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com



* Re: [GIT] writable_limits for 2.6.36
  2010-08-10 16:01 ` Linus Torvalds
  2010-08-10 16:21   ` Chris Metcalf
@ 2010-08-10 16:24   ` Linus Torvalds
  1 sibling, 0 replies; 10+ messages in thread
From: Linus Torvalds @ 2010-08-10 16:24 UTC (permalink / raw)
  To: Jiri Slaby, Chris Metcalf, Arnd Bergmann
  Cc: LKML, Oleg Nesterov, Andrew Morton

On Tue, Aug 10, 2010 at 9:01 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Ok, so the code looks fine, and I don't have any real objections any
> more. I don't know how much use this will get, but it doesn't appear
> to be "wrong" in any way. So I was going to pull it.
>
> However, in the meantime we have commit 5360bd776f73 ("Fix up the
> "generic" unistd.h ABI to be more useful") that clashes with it. Now,
> the conflict is trivial to resolve, and I could do that easily - it's
> not a technical problem. But that commit code comments say
>
>  + * Architectures may provide up to 16 syscalls of their own
>  + * starting with this value.
>  + */
>  +#define __NR_arch_specific_syscall 244
>
> and the new writable rlimits syscall is obviously 244.

I should have clarified that. The new asm-generic prlimit64 system
call was added at the end (as 244), not in general. Only tilera and
score use that "generic" unistd.h file currently, and score doesn't do
any other system calls, which is why it's really only arch/tile that
is affected by this. Of course, new architectures are likely to use
that model, but we don't care about those yet.

I still think that starting the arch-specific ones at 512 is likely
the right model. I just wanted to clarify in case somebody thought
that x86 put a new system call at 244.

                        Linus


* Re: [GIT] writable_limits for 2.6.36
  2010-08-10 16:21   ` Chris Metcalf
@ 2010-08-10 16:43     ` Jiri Slaby
  2010-08-10 18:50     ` Arnd Bergmann
  1 sibling, 0 replies; 10+ messages in thread
From: Jiri Slaby @ 2010-08-10 16:43 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Linus Torvalds, Arnd Bergmann, LKML, Oleg Nesterov, Andrew Morton

On 08/10/2010 06:21 PM, Chris Metcalf wrote:
> On 8/10/2010 12:01 PM, Linus Torvalds wrote:
>> 2010/8/7 Jiri Slaby <jslaby@suse.cz>:
>>   
>>> please consider the following repository for 2.6.36. It introduces a new
>>> syscall for arch independent resource limits handling. It also adds a
>>> support for runtime limits changing. This feature is needed mostly by
>>> daemons servicing databases and similar service where limits are needed
>>> to be changed without services being restarted on production systems.
>>>     
>> Ok, so the code looks fine, and I don't have any real objections any
>> more. I don't know how much use this will get, but it doesn't appear
>> to be "wrong" in any way. So I was going to pull it.

Ok, thanks.

>> However, in the meantime we have commit 5360bd776f73 ("Fix up the
>> "generic" unistd.h ABI to be more useful") that clashes with it. Now,
>> the conflict is trivial to resolve, and I could do that easily - it's
>> not a technical problem. But that commit code comments say
>>
>>   + * Architectures may provide up to 16 syscalls of their own
>>   + * starting with this value.
>>   + */
>>   +#define __NR_arch_specific_syscall 244
>>
>> and the new writable rlimits syscall is obviously 244.
>>   
> 
> Jiri and I actually discussed this back on July 20th on LKML when it
> first conflicted in linux-next, and at the time he said he'd move
> prlimit64 to 261 in <asm-generic/unistd.h>.  It looks like what actually
> stuck in linux-next was different, however.  It's partly my fault for
> not following up on this.

I would have done that if your tree had reached Linus's tree earlier, so
that I could rebase my tree on top of it. Otherwise there was not much I
could do about it.

The resolving (merge) in -next is done by Stephen, so he probably
misunderstood us. (Oh, I could have a for-next branch where I would
merge your tree to solve the -next merging done by Stephen, but it
wouldn't solve the situation we got into now.)

thanks,
-- 
js
suse labs


* Re: [GIT] writable_limits for 2.6.36
  2010-08-10 16:21   ` Chris Metcalf
  2010-08-10 16:43     ` Jiri Slaby
@ 2010-08-10 18:50     ` Arnd Bergmann
  2010-08-10 19:12       ` Linus Torvalds
  1 sibling, 1 reply; 10+ messages in thread
From: Arnd Bergmann @ 2010-08-10 18:50 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Linus Torvalds, Jiri Slaby, LKML, Oleg Nesterov, Andrew Morton

On Tuesday 10 August 2010, Chris Metcalf wrote:
> In any case, obviously the larger question is how many
> architecture-specific syscalls are appropriate, and where they should be
> located in the syscall number space.  To be clear, the model for new
> generic system calls is that they just continue on after the 16
> architecture-specific ones, and in fact __NR_wait4 is already an example
> of just this -- done that way to avoid making trouble for the "score"
> architecture, since it was deprecated and then later un-deprecated.  So
> new generic syscalls are not a problem.

Right. The writable_rlimits syscall should just go after wait4 at 262.

In retrospect, it would have been nicer to have the architecture specific
syscalls start at zero, but it's too late for that. Since we don't have
an architecture with more than a handful of arch specific calls, I think
16 will get us a very long way, while trying to leave "enough" space
between the generic and the arch specific calls would result either
in wasting space in the table or choosing a value that is too small.

	Arnd


* Re: [GIT] writable_limits for 2.6.36
  2010-08-10 18:50     ` Arnd Bergmann
@ 2010-08-10 19:12       ` Linus Torvalds
  2010-08-10 19:43         ` Chris Metcalf
  2010-08-11  2:39         ` Arnd Bergmann
  0 siblings, 2 replies; 10+ messages in thread
From: Linus Torvalds @ 2010-08-10 19:12 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Chris Metcalf, Jiri Slaby, LKML, Oleg Nesterov, Andrew Morton

On Tue, Aug 10, 2010 at 11:50 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>
> Right. The writable_rlimits syscall should just go after wait4 at 262.

.. and in the meantime I added the notify tree too, so now the
x86(-64) numbers also clashed.

So I just moved the prlimit64() system call, both on x86[-64] and in
asm-generic/unistd.h

Pushed out. Guys, please verify that it looks ok.

                    Linus


* Re: [GIT] writable_limits for 2.6.36
  2010-08-10 19:12       ` Linus Torvalds
@ 2010-08-10 19:43         ` Chris Metcalf
  2010-08-10 21:44           ` Jiri Slaby
  2010-08-11  2:39         ` Arnd Bergmann
  1 sibling, 1 reply; 10+ messages in thread
From: Chris Metcalf @ 2010-08-10 19:43 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arnd Bergmann, Jiri Slaby, LKML, Oleg Nesterov, Andrew Morton

On 8/10/2010 3:12 PM, Linus Torvalds wrote:
> On Tue, Aug 10, 2010 at 11:50 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>   
>> Right. The writable_rlimits syscall should just go after wait4 at 262.
>>     
> .. and in the meantime I added the notify tree too, so now the
> x86(-64) numbers also clashed.
>
> So I just moved the prlimit64() system call, both on x86[-64] and in
> asm-generic/unistd.h
>
> Pushed out. Guys, please verify that it looks ok.
>   

It looks good in asm-generic/unistd.h; thanks.

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com



* Re: [GIT] writable_limits for 2.6.36
  2010-08-10 19:43         ` Chris Metcalf
@ 2010-08-10 21:44           ` Jiri Slaby
  0 siblings, 0 replies; 10+ messages in thread
From: Jiri Slaby @ 2010-08-10 21:44 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Linus Torvalds, Arnd Bergmann, LKML, Oleg Nesterov, Andrew Morton

On 08/10/2010 09:43 PM, Chris Metcalf wrote:
> On 8/10/2010 3:12 PM, Linus Torvalds wrote:
>> So I just moved the prlimit64() system call, both on x86[-64] and in
>> asm-generic/unistd.h
>>
>> Pushed out. Guys, please verify that it looks ok.
>>   
> 
> It looks good in asm-generic/unistd.h; thanks.

To me too, except the nits below.

---
 arch/x86/ia32/ia32entry.S          |    2 +-
 arch/x86/kernel/syscall_table_32.S |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index 91dc4bb..b9472ec 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -844,5 +844,5 @@ ia32_sys_call_table:
        .quad compat_sys_recvmmsg
        .quad sys_fanotify_init
        .quad sys32_fanotify_mark
-       .quad sys_prlimit64
+       .quad sys_prlimit64                     /* 340 */
 ia32_syscall_end:
diff --git a/arch/x86/kernel/syscall_table_32.S b/arch/x86/kernel/syscall_table_32.S
index 4802acc..b35786d 100644
--- a/arch/x86/kernel/syscall_table_32.S
+++ b/arch/x86/kernel/syscall_table_32.S
@@ -339,4 +339,4 @@ ENTRY(sys_call_table)
        .long sys_recvmmsg
        .long sys_fanotify_init
        .long sys_fanotify_mark
-       .long sys_prlimit64
+       .long sys_prlimit64             /* 340 */
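
[ Editorial illustration, not part of the original patch: the table
slots annotated above are what a raw syscall number selects. A minimal
sketch of calling prlimit64 by number from user space follows. It
assumes the libc headers define SYS_prlimit64 in <sys/syscall.h>; the
struct krlimit64 and raw_prlimit64_get() names are hypothetical, with
the struct mirroring the two 64-bit fields of the kernel's rlimit64. ]

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/resource.h>   /* RLIMIT_* constants, getrlimit() */
#include <sys/syscall.h>    /* SYS_prlimit64 */
#include <sys/types.h>
#include <unistd.h>         /* syscall() */

/* Hypothetical local mirror of the kernel's struct rlimit64. */
struct krlimit64 {
    uint64_t rlim_cur;      /* soft limit */
    uint64_t rlim_max;      /* hard limit */
};

/*
 * Query the limits of `pid` (0 means the caller) for `resource` by
 * invoking the syscall number directly, bypassing any libc wrapper.
 * Returns 0 on success, -1 on failure with errno set.
 */
static long raw_prlimit64_get(pid_t pid, int resource, struct krlimit64 *out)
{
    assert(out != NULL);
    /* A NULL new-limit pointer means "read only". */
    return syscall(SYS_prlimit64, (long)pid, (long)resource, NULL, out);
}
```

[ Using the SYS_prlimit64 constant rather than a literal keeps the code
independent of the per-architecture number, which differs between the
32-bit x86 table shown here and other ABIs. ]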



* Re: [GIT] writable_limits for 2.6.36
  2010-08-10 19:12       ` Linus Torvalds
  2010-08-10 19:43         ` Chris Metcalf
@ 2010-08-11  2:39         ` Arnd Bergmann
  1 sibling, 0 replies; 10+ messages in thread
From: Arnd Bergmann @ 2010-08-11  2:39 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Chris Metcalf, Jiri Slaby, LKML, Oleg Nesterov, Andrew Morton

On Tuesday 10 August 2010, Linus Torvalds wrote:
> On Tue, Aug 10, 2010 at 11:50 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>
> So I just moved the prlimit64() system call, both on x86[-64] and in
> asm-generic/unistd.h
> 
> Pushed out. Guys, please verify that it looks ok.

Looks good, thanks!

	Arnd
