* [PATCH 0/2] sched: SCHED_BATCH fixes @ 2011-02-22 21:04 Darren Hart 2011-02-22 21:04 ` [PATCH 1/2] sched: allow SCHED_BATCH to preempt SCHED_IDLE tasks Darren Hart 2011-02-22 21:04 ` [PATCH 2/2] sched: allow users with rtprio rlimit to change from SCHED_IDLE policy Darren Hart 0 siblings, 2 replies; 17+ messages in thread From: Darren Hart @ 2011-02-22 21:04 UTC (permalink / raw) To: Linux Kernel Mailing List Cc: Peter Zijlstra, Ingo Molnar, richard.purdie, dvhart The following patches tweak the scheduling behavior surrounding SCHED_BATCH tasks slightly. While experimenting on ways to improve performance of a build system I stumbled across what I thought to be oversights in the implementation. Specifically with respect to rtprio privileges and batch-idle task interaction. -- Darren Hart Intel Open Source Technology Center Yocto Project - Linux Kernel ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/2] sched: allow SCHED_BATCH to preempt SCHED_IDLE tasks 2011-02-22 21:04 [PATCH 0/2] sched: SCHED_BATCH fixes Darren Hart @ 2011-02-22 21:04 ` Darren Hart 2011-02-23 4:20 ` Mike Galbraith 2011-03-04 11:49 ` [tip:sched/core] sched: Allow " tip-bot for Darren Hart 2011-02-22 21:04 ` [PATCH 2/2] sched: allow users with rtprio rlimit to change from SCHED_IDLE policy Darren Hart 1 sibling, 2 replies; 17+ messages in thread From: Darren Hart @ 2011-02-22 21:04 UTC (permalink / raw) To: Linux Kernel Mailing List Cc: Peter Zijlstra, Ingo Molnar, richard.purdie, dvhart Perform the test for SCHED_IDLE before testing for SCHED_BATCH (and ensure idle tasks don't preempt idle tasks) so the non-interactive, but still important, SCHED_BATCH tasks will run in favor of the very low priority SCHED_IDLE tasks. Signed-off-by: Darren Hart <dvhart@linux.intel.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Ingo Molnar <mingo@elte.hu> CC: Richard Purdie <richard.purdie@linuxfoundation.org> --- kernel/sched_fair.c | 12 +++++++----- 1 files changed, 7 insertions(+), 5 deletions(-) diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c index 0c26e2d..ff04bbd 100644 --- a/kernel/sched_fair.c +++ b/kernel/sched_fair.c @@ -1857,16 +1857,18 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p, int wake_ if (test_tsk_need_resched(curr)) return; + /* Idle tasks are by definition preempted by non-idle tasks. */ + if (unlikely(curr->policy == SCHED_IDLE) && + likely(p->policy != SCHED_IDLE)) + goto preempt; + /* - * Batch and idle tasks do not preempt (their preemption is driven by - * the tick): + * Batch and idle tasks do not preempt non-idle tasks (their preemption + * is driven by the tick): */ if (unlikely(p->policy != SCHED_NORMAL)) return; - /* Idle tasks are by definition preempted by everybody. */ - if (unlikely(curr->policy == SCHED_IDLE)) - goto preempt; if (!sched_feat(WAKEUP_PREEMPT)) return; -- 1.7.1 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] sched: allow SCHED_BATCH to preempt SCHED_IDLE tasks 2011-02-22 21:04 ` [PATCH 1/2] sched: allow SCHED_BATCH to preempt SCHED_IDLE tasks Darren Hart @ 2011-02-23 4:20 ` Mike Galbraith 2011-02-23 5:31 ` Mike Galbraith 2011-02-23 5:33 ` Darren Hart 2011-03-04 11:49 ` [tip:sched/core] sched: Allow " tip-bot for Darren Hart 1 sibling, 2 replies; 17+ messages in thread From: Mike Galbraith @ 2011-02-23 4:20 UTC (permalink / raw) To: Darren Hart Cc: Linux Kernel Mailing List, Peter Zijlstra, Ingo Molnar, richard.purdie On Tue, 2011-02-22 at 13:04 -0800, Darren Hart wrote: > Perform the test for SCHED_IDLE before testing for SCHED_BATCH (and ensure idle > tasks don't preempt idle tasks) so the non-interactive, but still important, > SCHED_BATCH tasks will run in favor of the very low priority SCHED_IDLE tasks. Yeah, that could be construed as a fairness fix for light SCHED_BATCH vs a heavy SCHED_IDLE. It should lower latencies for both when mixed. Acked-by: Mike Galbraith <efault@gmx.de> Nit below. > Signed-off-by: Darren Hart <dvhart@linux.intel.com> > CC: Peter Zijlstra <peterz@infradead.org> > CC: Ingo Molnar <mingo@elte.hu> > CC: Richard Purdie <richard.purdie@linuxfoundation.org> > --- > kernel/sched_fair.c | 12 +++++++----- > 1 files changed, 7 insertions(+), 5 deletions(-) > > diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c > index 0c26e2d..ff04bbd 100644 > --- a/kernel/sched_fair.c > +++ b/kernel/sched_fair.c > @@ -1857,16 +1857,18 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p, int wake_ > if (test_tsk_need_resched(curr)) > return; > > + /* Idle tasks are by definition preempted by non-idle tasks. */ > + if (unlikely(curr->policy == SCHED_IDLE) && > + likely(p->policy != SCHED_IDLE)) > + goto preempt; > + if (unlikely(curr->policy == SCHED_IDLE && p->policy != curr->policy)) goto preempt; Looks better to me. -Mike ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] sched: allow SCHED_BATCH to preempt SCHED_IDLE tasks 2011-02-23 4:20 ` Mike Galbraith @ 2011-02-23 5:31 ` Mike Galbraith 2011-02-23 5:33 ` Darren Hart 1 sibling, 0 replies; 17+ messages in thread From: Mike Galbraith @ 2011-02-23 5:31 UTC (permalink / raw) To: Darren Hart Cc: Linux Kernel Mailing List, Peter Zijlstra, Ingo Molnar, richard.purdie On Wed, 2011-02-23 at 05:20 +0100, Mike Galbraith wrote: > On Tue, 2011-02-22 at 13:04 -0800, Darren Hart wrote: > > Perform the test for SCHED_IDLE before testing for SCHED_BATCH (and ensure idle > > tasks don't preempt idle tasks) so the non-interactive, but still important, > > SCHED_BATCH tasks will run in favor of the very low priority SCHED_IDLE tasks. > > Yeah, that could be construed as a fairness fix for light SCHED_BATCH vs > a heavy SCHED_IDLE. It should lower latencies for both when mixed. Hm. Seems SCHED_IDLE _always_ being preempted is a potential terminal starvation bug though, unless preempt_tick() checks spread to guarantee that the preempted task will eventually get the CPU back, even in the face of massive non-idle wakeup driven load.. which it does not. (idle task holds resource?) Maybe my imagination has had too much java though. -Mike ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] sched: allow SCHED_BATCH to preempt SCHED_IDLE tasks 2011-02-23 4:20 ` Mike Galbraith 2011-02-23 5:31 ` Mike Galbraith @ 2011-02-23 5:33 ` Darren Hart 1 sibling, 0 replies; 17+ messages in thread From: Darren Hart @ 2011-02-23 5:33 UTC (permalink / raw) To: Mike Galbraith Cc: Linux Kernel Mailing List, Peter Zijlstra, Ingo Molnar, richard.purdie On 02/22/2011 08:20 PM, Mike Galbraith wrote: > On Tue, 2011-02-22 at 13:04 -0800, Darren Hart wrote: >> Perform the test for SCHED_IDLE before testing for SCHED_BATCH (and ensure idle >> tasks don't preempt idle tasks) so the non-interactive, but still important, >> SCHED_BATCH tasks will run in favor of the very low priority SCHED_IDLE tasks. > > Yeah, that could be construed as a fairness fix for light SCHED_BATCH vs > a heavy SCHED_IDLE. It should lower latencies for both when mixed. > > Acked-by: Mike Galbraith<efault@gmx.de> > > Nit below. > >> Signed-off-by: Darren Hart<dvhart@linux.intel.com> >> CC: Peter Zijlstra<peterz@infradead.org> >> CC: Ingo Molnar<mingo@elte.hu> >> CC: Richard Purdie<richard.purdie@linuxfoundation.org> >> --- >> kernel/sched_fair.c | 12 +++++++----- >> 1 files changed, 7 insertions(+), 5 deletions(-) >> >> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c >> index 0c26e2d..ff04bbd 100644 >> --- a/kernel/sched_fair.c >> +++ b/kernel/sched_fair.c >> @@ -1857,16 +1857,18 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p, int wake_ >> if (test_tsk_need_resched(curr)) >> return; >> >> + /* Idle tasks are by definition preempted by non-idle tasks. */ >> + if (unlikely(curr->policy == SCHED_IDLE)&& >> + likely(p->policy != SCHED_IDLE)) >> + goto preempt; >> + > > if (unlikely(curr->policy == SCHED_IDLE&& p->policy != curr->policy)) > goto preempt; > > Looks better to me. I have no opinion on the unlikely/likely optimizations. I chose the way I did as I thought it was more consistent with the existing code. I'll leave that to Peter and Ingo - let me know if I should resend. -- Darren Hart Intel Open Source Technology Center Yocto Project - Linux Kernel ^ permalink raw reply [flat|nested] 17+ messages in thread
* [tip:sched/core] sched: Allow SCHED_BATCH to preempt SCHED_IDLE tasks 2011-02-22 21:04 ` [PATCH 1/2] sched: allow SCHED_BATCH to preempt SCHED_IDLE tasks Darren Hart 2011-02-23 4:20 ` Mike Galbraith @ 2011-03-04 11:49 ` tip-bot for Darren Hart 1 sibling, 0 replies; 17+ messages in thread From: tip-bot for Darren Hart @ 2011-03-04 11:49 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, hpa, mingo, a.p.zijlstra, dvhart, efault, richard.purdie, tglx, mingo Commit-ID: a2f5c9ab79f78e8b91ac993e0543d65b661dd19b Gitweb: http://git.kernel.org/tip/a2f5c9ab79f78e8b91ac993e0543d65b661dd19b Author: Darren Hart <dvhart@linux.intel.com> AuthorDate: Tue, 22 Feb 2011 13:04:33 -0800 Committer: Ingo Molnar <mingo@elte.hu> CommitDate: Fri, 4 Mar 2011 11:14:29 +0100 sched: Allow SCHED_BATCH to preempt SCHED_IDLE tasks Perform the test for SCHED_IDLE before testing for SCHED_BATCH (and ensure idle tasks don't preempt idle tasks) so the non-interactive, but still important, SCHED_BATCH tasks will run in favor of the very low priority SCHED_IDLE tasks. Signed-off-by: Darren Hart <dvhart@linux.intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Mike Galbraith <efault@gmx.de> Cc: Richard Purdie <richard.purdie@linuxfoundation.org> LKML-Reference: <1298408674-3130-2-git-send-email-dvhart@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> --- kernel/sched_fair.c | 12 +++++++----- 1 files changed, 7 insertions(+), 5 deletions(-) diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c index 3a88dee..1438e13 100644 --- a/kernel/sched_fair.c +++ b/kernel/sched_fair.c @@ -1870,16 +1870,18 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p, int wake_ if (test_tsk_need_resched(curr)) return; + /* Idle tasks are by definition preempted by non-idle tasks. */ + if (unlikely(curr->policy == SCHED_IDLE) && + likely(p->policy != SCHED_IDLE)) + goto preempt; + /* - * Batch and idle tasks do not preempt (their preemption is driven by - * the tick): + * Batch and idle tasks do not preempt non-idle tasks (their preemption + * is driven by the tick): */ if (unlikely(p->policy != SCHED_NORMAL)) return; - /* Idle tasks are by definition preempted by everybody. */ - if (unlikely(curr->policy == SCHED_IDLE)) - goto preempt; if (!sched_feat(WAKEUP_PREEMPT)) return; ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 2/2] sched: allow users with rtprio rlimit to change from SCHED_IDLE policy 2011-02-22 21:04 [PATCH 0/2] sched: SCHED_BATCH fixes Darren Hart 2011-02-22 21:04 ` [PATCH 1/2] sched: allow SCHED_BATCH to preempt SCHED_IDLE tasks Darren Hart @ 2011-02-22 21:04 ` Darren Hart 2011-02-23 11:03 ` Peter Zijlstra 1 sibling, 1 reply; 17+ messages in thread From: Darren Hart @ 2011-02-22 21:04 UTC (permalink / raw) To: Linux Kernel Mailing List Cc: Peter Zijlstra, Ingo Molnar, richard.purdie, dvhart As it stands, users with rtprio rlimit permissions can change their policy from SCHED_OTHER to SCHED_FIFO and back. They can change to SCHED_IDLE, but not back to SCHED_FIFO. If they have the rtprio permission, they should be able to. Once in SCHED_FIFO, they could go back to SCHED_OTHER. This patch allows users with rtprio permission to change out of SCHED_IDLE. This patch allows the following test-case to pass for users with rtprio permissions with or without the ifdef block: int main() { int ret; struct sched_param sp; /* switch to SCHED_FIFO */ sp.sched_priority = 1; ret = sched_setscheduler(0, SCHED_FIFO, &sp); printf("setscheduler FIFO: %d\n", ret); if (ret) return ret; /* switch to SCHED_IDLE */ sp.sched_priority = 0; ret = sched_setscheduler(0, SCHED_IDLE, &sp); printf("setscheduler IDLE: %d\n", ret); if (ret) return ret; /* switch to SCHED_FIFO */ sp.sched_priority = 1; ret = sched_setscheduler(0, SCHED_FIFO, &sp); printf("setscheduler FIFO: %d\n", ret); if (ret) return ret; /* switch back to SCHED_OTHER */ sp.sched_priority = 0; ret = sched_setscheduler(0, SCHED_OTHER, &sp); printf("setscheduler OTHER: %d\n", ret); return ret; } Signed-off-by: Darren Hart <dvhart@linux.intel.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Ingo Molnar <mingo@elte.hu> CC: Richard Purdie <richard.purdie@linuxfoundation.org> --- kernel/sched.c | 12 ++++++++---- 1 files changed, 8 insertions(+), 4 deletions(-) diff --git a/kernel/sched.c b/kernel/sched.c index 18d38e4..ec6943e 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -4822,12 +4822,16 @@ recheck: param->sched_priority > rlim_rtprio) return -EPERM; } + /* - * Like positive nice levels, dont allow tasks to - * move out of SCHED_IDLE either: + * Like positive nice levels, don't allow tasks to move out of + * SCHED_IDLE either. Make an exception if the user can set rt + * policy normally. */ - if (p->policy == SCHED_IDLE && policy != SCHED_IDLE) - return -EPERM; + if (p->policy == SCHED_IDLE && policy != SCHED_IDLE) { + if (!task_rlimit(p, RLIMIT_RTPRIO)) + return -EPERM; + } /* can't change other user's priorities */ if (!check_same_owner(p)) -- 1.7.1 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] sched: allow users with rtprio rlimit to change from SCHED_IDLE policy 2011-02-22 21:04 ` [PATCH 2/2] sched: allow users with rtprio rlimit to change from SCHED_IDLE policy Darren Hart @ 2011-02-23 11:03 ` Peter Zijlstra 2011-02-23 11:13 ` Ingo Molnar 0 siblings, 1 reply; 17+ messages in thread From: Peter Zijlstra @ 2011-02-23 11:03 UTC (permalink / raw) To: Darren Hart; +Cc: Linux Kernel Mailing List, Ingo Molnar, richard.purdie On Tue, 2011-02-22 at 13:04 -0800, Darren Hart wrote: > As it stands, users with rtprio rlimit permissions can change their policy from > SCHED_OTHER to SCHED_FIFO and back. They can change to SCHED_IDLE, but not back > to SCHED_FIFO. If they have the rtprio permission, they should be able to. Once > in SCHED_FIFO, they could go back to SCHED_OTHER. This patch allows users with > rtprio permission to change out of SCHED_IDLE. > Ingo, can you remember the rationale for this? The fact is that SCHED_IDLE is very near nice-20, and we can do: peterz@twins:~$ renice 5 -p $$ 1867: old priority 0, new priority 5 peterz@twins:~$ renice 0 -p $$ 1867: old priority 5, new priority 0 Which would suggest that we should be able to return to SCHED_OTHER RLIMIT_NICE-20. Hmm? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] sched: allow users with rtprio rlimit to change from SCHED_IDLE policy 2011-02-23 11:03 ` Peter Zijlstra @ 2011-02-23 11:13 ` Ingo Molnar 2011-02-23 11:17 ` Peter Zijlstra 0 siblings, 1 reply; 17+ messages in thread From: Ingo Molnar @ 2011-02-23 11:13 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Darren Hart, Linux Kernel Mailing List, richard.purdie * Peter Zijlstra <peterz@infradead.org> wrote: > On Tue, 2011-02-22 at 13:04 -0800, Darren Hart wrote: > > As it stands, users with rtprio rlimit permissions can change their policy from > > SCHED_OTHER to SCHED_FIFO and back. They can change to SCHED_IDLE, but not back > > to SCHED_FIFO. If they have the rtprio permission, they should be able to. Once > > in SCHED_FIFO, they could go back to SCHED_OTHER. This patch allows users with > > rtprio permission to change out of SCHED_IDLE. > > > > Ingo, can you remember the rationale for this? > > The fact is that SCHED_IDLE is very near nice-20, and we can do: > > peterz@twins:~$ renice 5 -p $$ > 1867: old priority 0, new priority 5 > peterz@twins:~$ renice 0 -p $$ > 1867: old priority 5, new priority 0 > > Which would suggest that we should be able to return to SCHED_OTHER > RLIMIT_NICE-20. I dont remember anything subtle there - most likely we just forgot about that spot when adding RLIMIT_RTPRIO support. Thanks, Ingo ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] sched: allow users with rtprio rlimit to change from SCHED_IDLE policy 2011-02-23 11:13 ` Ingo Molnar @ 2011-02-23 11:17 ` Peter Zijlstra 2011-02-23 11:35 ` Ingo Molnar 2011-02-23 15:52 ` Darren Hart 0 siblings, 2 replies; 17+ messages in thread From: Peter Zijlstra @ 2011-02-23 11:17 UTC (permalink / raw) To: Ingo Molnar; +Cc: Darren Hart, Linux Kernel Mailing List, richard.purdie On Wed, 2011-02-23 at 12:13 +0100, Ingo Molnar wrote: > * Peter Zijlstra <peterz@infradead.org> wrote: > > > On Tue, 2011-02-22 at 13:04 -0800, Darren Hart wrote: > > > As it stands, users with rtprio rlimit permissions can change their policy from > > > SCHED_OTHER to SCHED_FIFO and back. They can change to SCHED_IDLE, but not back > > > to SCHED_FIFO. If they have the rtprio permission, they should be able to. Once > > > in SCHED_FIFO, they could go back to SCHED_OTHER. This patch allows users with > > > rtprio permission to change out of SCHED_IDLE. > > > > > > > Ingo, can you remember the rationale for this? > > > > The fact is that SCHED_IDLE is very near nice-20, and we can do: > > > > peterz@twins:~$ renice 5 -p $$ > > 1867: old priority 0, new priority 5 > > peterz@twins:~$ renice 0 -p $$ > > 1867: old priority 5, new priority 0 > > > > Which would suggest that we should be able to return to SCHED_OTHER > > RLIMIT_NICE-20. > > I dont remember anything subtle there - most likely we just forgot about that spot > when adding RLIMIT_RTPRIO support. Ah, I was arguing we should allow it regardless of RLIMIT_RTPRIO, based on RLIMIT_NICE, it is after all a change to SCHED_OTHER, not SCHED_FIFO/RR. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] sched: allow users with rtprio rlimit to change from SCHED_IDLE policy 2011-02-23 11:17 ` Peter Zijlstra @ 2011-02-23 11:35 ` Ingo Molnar 2011-02-23 15:52 ` Darren Hart 1 sibling, 0 replies; 17+ messages in thread From: Ingo Molnar @ 2011-02-23 11:35 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Darren Hart, Linux Kernel Mailing List, richard.purdie * Peter Zijlstra <peterz@infradead.org> wrote: > On Wed, 2011-02-23 at 12:13 +0100, Ingo Molnar wrote: > > * Peter Zijlstra <peterz@infradead.org> wrote: > > > > > On Tue, 2011-02-22 at 13:04 -0800, Darren Hart wrote: > > > > As it stands, users with rtprio rlimit permissions can change their policy from > > > > SCHED_OTHER to SCHED_FIFO and back. They can change to SCHED_IDLE, but not back > > > > to SCHED_FIFO. If they have the rtprio permission, they should be able to. Once > > > > in SCHED_FIFO, they could go back to SCHED_OTHER. This patch allows users with > > > > rtprio permission to change out of SCHED_IDLE. > > > > > > > > > > Ingo, can you remember the rationale for this? > > > > > > The fact is that SCHED_IDLE is very near nice-20, and we can do: > > > > > > peterz@twins:~$ renice 5 -p $$ > > > 1867: old priority 0, new priority 5 > > > peterz@twins:~$ renice 0 -p $$ > > > 1867: old priority 5, new priority 0 > > > > > > Which would suggest that we should be able to return to SCHED_OTHER > > > RLIMIT_NICE-20. > > > > I dont remember anything subtle there - most likely we just forgot about that spot > > when adding RLIMIT_RTPRIO support. > > Ah, I was arguing we should allow it regardless of RLIMIT_RTPRIO, based on > RLIMIT_NICE, it is after all a change to SCHED_OTHER, not SCHED_FIFO/RR. Sure. We just went for the most restrictive conditions - it's hard to add security holes that way ;-) Ingo ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] sched: allow users with rtprio rlimit to change from SCHED_IDLE policy 2011-02-23 11:17 ` Peter Zijlstra 2011-02-23 11:35 ` Ingo Molnar @ 2011-02-23 15:52 ` Darren Hart 2011-02-23 16:00 ` Peter Zijlstra 1 sibling, 1 reply; 17+ messages in thread From: Darren Hart @ 2011-02-23 15:52 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Ingo Molnar, Linux Kernel Mailing List, richard.purdie On 02/23/2011 03:17 AM, Peter Zijlstra wrote: > On Wed, 2011-02-23 at 12:13 +0100, Ingo Molnar wrote: >> * Peter Zijlstra<peterz@infradead.org> wrote: >> >>> On Tue, 2011-02-22 at 13:04 -0800, Darren Hart wrote: >>>> As it stands, users with rtprio rlimit permissions can change their policy from >>>> SCHED_OTHER to SCHED_FIFO and back. They can change to SCHED_IDLE, but not back >>>> to SCHED_FIFO. If they have the rtprio permission, they should be able to. Once >>>> in SCHED_FIFO, they could go back to SCHED_OTHER. This patch allows users with >>>> rtprio permission to change out of SCHED_IDLE. >>>> >>> >>> Ingo, can you remember the rationale for this? >>> >>> The fact is that SCHED_IDLE is very near nice-20, and we can do: >>> >>> peterz@twins:~$ renice 5 -p $$ >>> 1867: old priority 0, new priority 5 >>> peterz@twins:~$ renice 0 -p $$ >>> 1867: old priority 5, new priority 0 >>> >>> Which would suggest that we should be able to return to SCHED_OTHER >>> RLIMIT_NICE-20. >> >> I dont remember anything subtle there - most likely we just forgot about that spot >> when adding RLIMIT_RTPRIO support. > > Ah, I was arguing we should allow it regardless of RLIMIT_RTPRIO, based > on RLIMIT_NICE, it is after all a change to SCHED_OTHER, not > SCHED_FIFO/RR. So we need an OR test of RLIMIT_NICE | RLIMIT_RTPRIO ? The reason I keep coming back to RTPRIO is it allows the user to change to SCHED_(FIFO|RR), and from there they can change to anything they want - so why force two steps? Perhaps the argument is to keep the meaning of the RLIMITs precise, and if you want to go from IDLE->OTHER you had better properly set RLIMIT_NICE - maybe I just convinced myself. Shall I respin the patch to reflect that? -- Darren Hart Intel Open Source Technology Center Yocto Project - Linux Kernel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] sched: allow users with rtprio rlimit to change from SCHED_IDLE policy 2011-02-23 15:52 ` Darren Hart @ 2011-02-23 16:00 ` Peter Zijlstra 2011-02-23 16:07 ` Darren Hart 2011-02-23 21:28 ` Darren Hart 0 siblings, 2 replies; 17+ messages in thread From: Peter Zijlstra @ 2011-02-23 16:00 UTC (permalink / raw) To: Darren Hart; +Cc: Ingo Molnar, Linux Kernel Mailing List, richard.purdie On Wed, 2011-02-23 at 07:52 -0800, Darren Hart wrote: > On 02/23/2011 03:17 AM, Peter Zijlstra wrote: > > On Wed, 2011-02-23 at 12:13 +0100, Ingo Molnar wrote: > >> * Peter Zijlstra<peterz@infradead.org> wrote: > >> > >>> On Tue, 2011-02-22 at 13:04 -0800, Darren Hart wrote: > >>>> As it stands, users with rtprio rlimit permissions can change their policy from > >>>> SCHED_OTHER to SCHED_FIFO and back. They can change to SCHED_IDLE, but not back > >>>> to SCHED_FIFO. If they have the rtprio permission, they should be able to. Once > >>>> in SCHED_FIFO, they could go back to SCHED_OTHER. This patch allows users with > >>>> rtprio permission to change out of SCHED_IDLE. > >>>> > >>> > >>> Ingo, can you remember the rationale for this? > >>> > >>> The fact is that SCHED_IDLE is very near nice-20, and we can do: > >>> > >>> peterz@twins:~$ renice 5 -p $$ > >>> 1867: old priority 0, new priority 5 > >>> peterz@twins:~$ renice 0 -p $$ > >>> 1867: old priority 5, new priority 0 > >>> > >>> Which would suggest that we should be able to return to SCHED_OTHER > >>> RLIMIT_NICE-20. > >> > >> I dont remember anything subtle there - most likely we just forgot about that spot > >> when adding RLIMIT_RTPRIO support. > > > > Ah, I was arguing we should allow it regardless of RLIMIT_RTPRIO, based > > on RLIMIT_NICE, it is after all a change to SCHED_OTHER, not > > SCHED_FIFO/RR. > > So we need an OR test of RLIMIT_NICE | RLIMIT_RTPRIO ? Just RLIMIT_NICE I think. > The reason I keep > coming back to RTPRIO is it allows the user to change to > SCHED_(FIFO|RR), and from there they can change to anything they want - Hmm,. is that so? I would think that even if you're SCHED_FIFO changing back to SCHED_OTHER ought to make you respect RLIMIT_NICE. That is, even if you're a SCHED_FIFO-1 task due to RLIMIT_RTPRIO, when you switch back to SCHED_OTHER I would expect you not to be able to switch to a lower nice than RLIMIT_NICE-20. Now, if this isn't actually so I think we ought to make it so. > so why force two steps? Perhaps the argument is to keep the meaning of > the RLIMITs precise, and if you want to go from IDLE->OTHER you had > better properly set RLIMIT_NICE - maybe I just convinced myself. Right > Shall I respin the patch to reflect that? Please. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] sched: allow users with rtprio rlimit to change from SCHED_IDLE policy 2011-02-23 16:00 ` Peter Zijlstra @ 2011-02-23 16:07 ` Darren Hart 2011-02-23 21:28 ` Darren Hart 1 sibling, 0 replies; 17+ messages in thread From: Darren Hart @ 2011-02-23 16:07 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Ingo Molnar, Linux Kernel Mailing List, richard.purdie On 02/23/2011 08:00 AM, Peter Zijlstra wrote: > On Wed, 2011-02-23 at 07:52 -0800, Darren Hart wrote: >> On 02/23/2011 03:17 AM, Peter Zijlstra wrote: >>> On Wed, 2011-02-23 at 12:13 +0100, Ingo Molnar wrote: >>>> * Peter Zijlstra<peterz@infradead.org> wrote: >>>> >>>>> On Tue, 2011-02-22 at 13:04 -0800, Darren Hart wrote: >>>>>> As it stands, users with rtprio rlimit permissions can change their policy from >>>>>> SCHED_OTHER to SCHED_FIFO and back. They can change to SCHED_IDLE, but not back >>>>>> to SCHED_FIFO. If they have the rtprio permission, they should be able to. Once >>>>>> in SCHED_FIFO, they could go back to SCHED_OTHER. This patch allows users with >>>>>> rtprio permission to change out of SCHED_IDLE. >>>>>> >>>>> >>>>> Ingo, can you remember the rationale for this? >>>>> >>>>> The fact is that SCHED_IDLE is very near nice-20, and we can do: >>>>> >>>>> peterz@twins:~$ renice 5 -p $$ >>>>> 1867: old priority 0, new priority 5 >>>>> peterz@twins:~$ renice 0 -p $$ >>>>> 1867: old priority 5, new priority 0 >>>>> >>>>> Which would suggest that we should be able to return to SCHED_OTHER >>>>> RLIMIT_NICE-20. >>>> >>>> I dont remember anything subtle there - most likely we just forgot about that spot >>>> when adding RLIMIT_RTPRIO support. >>> >>> Ah, I was arguing we should allow it regardless of RLIMIT_RTPRIO, based >>> on RLIMIT_NICE, it is after all a change to SCHED_OTHER, not >>> SCHED_FIFO/RR. >> >> So we need an OR test of RLIMIT_NICE | RLIMIT_RTPRIO ? > > Just RLIMIT_NICE I think. > >> The reason I keep >> coming back to RTPRIO is it allows the user to change to >> SCHED_(FIFO|RR), and from there they can change to anything they want - > > Hmm,. is that so? I would think that even if you're SCHED_FIFO changing > back to SCHED_OTHER ought to make you respect RLIMIT_NICE. > > That is, even if you're a SCHED_FIFO-1 task due to RLIMIT_RTPRIO, when > you switch back to SCHED_OTHER I would expect you not to be able to > switch to a lower nice than RLIMIT_NICE-20. > > Now, if this isn't actually so I think we ought to make it so. I was just thinking in terms of POLICY, not priority or nice level. I'll do some experimenting and see where the limits are. > >> so why force two steps? Perhaps the argument is to keep the meaning of >> the RLIMITs precise, and if you want to go from IDLE->OTHER you had >> better properly set RLIMIT_NICE - maybe I just convinced myself. > > Right > >> Shall I respin the patch to reflect that? > > Please. Will do. -- Darren Hart Intel Open Source Technology Center Yocto Project - Linux Kernel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] sched: allow users with rtprio rlimit to change from SCHED_IDLE policy 2011-02-23 16:00 ` Peter Zijlstra 2011-02-23 16:07 ` Darren Hart @ 2011-02-23 21:28 ` Darren Hart 2011-02-24 11:49 ` Peter Zijlstra 2011-03-04 11:49 ` [tip:sched/core] sched: Allow users with sufficient RLIMIT_NICE " tip-bot for Darren Hart 1 sibling, 2 replies; 17+ messages in thread From: Darren Hart @ 2011-02-23 21:28 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Ingo Molnar, Linux Kernel Mailing List, richard.purdie On 02/23/2011 08:00 AM, Peter Zijlstra wrote: > On Wed, 2011-02-23 at 07:52 -0800, Darren Hart wrote: >> On 02/23/2011 03:17 AM, Peter Zijlstra wrote: >>> On Wed, 2011-02-23 at 12:13 +0100, Ingo Molnar wrote: >>>> * Peter Zijlstra<peterz@infradead.org> wrote: >>>> >>>>> On Tue, 2011-02-22 at 13:04 -0800, Darren Hart wrote: >>>>>> As it stands, users with rtprio rlimit permissions can change their policy from >>>>>> SCHED_OTHER to SCHED_FIFO and back. They can change to SCHED_IDLE, but not back >>>>>> to SCHED_FIFO. If they have the rtprio permission, they should be able to. Once >>>>>> in SCHED_FIFO, they could go back to SCHED_OTHER. This patch allows users with >>>>>> rtprio permission to change out of SCHED_IDLE. >>>>>> >>>>> >>>>> Ingo, can you remember the rationale for this? >>>>> >>>>> The fact is that SCHED_IDLE is very near nice-20, and we can do: >>>>> >>>>> peterz@twins:~$ renice 5 -p $$ >>>>> 1867: old priority 0, new priority 5 >>>>> peterz@twins:~$ renice 0 -p $$ >>>>> 1867: old priority 5, new priority 0 >>>>> >>>>> Which would suggest that we should be able to return to SCHED_OTHER >>>>> RLIMIT_NICE-20. >>>> >>>> I dont remember anything subtle there - most likely we just forgot about that spot >>>> when adding RLIMIT_RTPRIO support. >>> >>> Ah, I was arguing we should allow it regardless of RLIMIT_RTPRIO, based >>> on RLIMIT_NICE, it is after all a change to SCHED_OTHER, not >>> SCHED_FIFO/RR. >> >> So we need an OR test of RLIMIT_NICE | RLIMIT_RTPRIO ? > > Just RLIMIT_NICE I think. Agreed. > >> The reason I keep >> coming back to RTPRIO is it allows the user to change to >> SCHED_(FIFO|RR), and from there they can change to anything they want - > > Hmm,. is that so? I would think that even if you're SCHED_FIFO changing > back to SCHED_OTHER ought to make you respect RLIMIT_NICE. You are correct, no gaps here. > > That is, even if you're a SCHED_FIFO-1 task due to RLIMIT_RTPRIO, when > you switch back to SCHED_OTHER I would expect you not to be able to > switch to a lower nice than RLIMIT_NICE-20. > > Now, if this isn't actually so I think we ought to make it so. > >> so why force two steps? Perhaps the argument is to keep the meaning of >> the RLIMITs precise, and if you want to go from IDLE->OTHER you had >> better properly set RLIMIT_NICE - maybe I just convinced myself. > > Right > >> Shall I respin the patch to reflect that? > > Please. How about this: >From b66e1a5b1e61476c1af0095f16c666b94664f7f0 Mon Sep 17 00:00:00 2001 From: Darren Hart <dvhart@linux.intel.com> Date: Thu, 17 Feb 2011 15:37:07 -0800 Subject: [PATCH] sched: allow users with sufficient RLIMIT_NICE to change from SCHED_IDLE policy The current scheduler implementation returns -EPERM when trying to change from SCHED_IDLE to SCHED_OTHER or SCHED_BATCH. Since SCHED_IDLE is considered to be equivalent to a nice 20, changing to another policy should be allowed provided the RLIMIT_NICE is accounted for. This patch allows the following test-case to pass with RLIMIT_NICE=40, but still fail with RLIMIT_NICE=10 when the calling process is run from a typical shell (nice 0, or 20 in rlimit terms). int main() { int ret; struct sched_param sp; sp.sched_priority = 0; /* switch to SCHED_IDLE */ ret = sched_setscheduler(0, SCHED_IDLE, &sp); printf("setscheduler IDLE: %d\n", ret); if (ret) return ret; /* switch back to SCHED_OTHER */ ret = sched_setscheduler(0, SCHED_OTHER, &sp); printf("setscheduler OTHER: %d\n", ret); return ret; } $ ulimit -e 40 $ ./test setscheduler IDLE: 0 setscheduler OTHER: 0 $ ulimit -e 10 $ ulimit -e 10 $ ./test setscheduler IDLE: 0 setscheduler OTHER: -1 Signed-off-by: Darren Hart <dvhart@linux.intel.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Ingo Molnar <mingo@elte.hu> CC: Richard Purdie <richard.purdie@linuxfoundation.org> --- kernel/sched.c | 11 +++++++---- 1 files changed, 7 insertions(+), 4 deletions(-) diff --git a/kernel/sched.c b/kernel/sched.c index 18d38e4..9bf6284 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -4822,12 +4822,15 @@ recheck: param->sched_priority > rlim_rtprio) return -EPERM; } + /* - * Like positive nice levels, dont allow tasks to - * move out of SCHED_IDLE either: + * Treat SCHED_IDLE as nice 20. Only allow a switch to + * SCHED_NORMAL if the RLIMIT_NICE would normally permit it. */ - if (p->policy == SCHED_IDLE && policy != SCHED_IDLE) - return -EPERM; + if (p->policy == SCHED_IDLE && policy != SCHED_IDLE) { + if (!can_nice(p, TASK_NICE(p))) + return -EPERM; + } /* can't change other user's priorities */ if (!check_same_owner(p)) -- 1.7.1 -- Darren Hart Intel Open Source Technology Center Yocto Project - Linux Kernel ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] sched: allow users with rtprio rlimit to change from SCHED_IDLE policy 2011-02-23 21:28 ` Darren Hart @ 2011-02-24 11:49 ` Peter Zijlstra 2011-03-04 11:49 ` [tip:sched/core] sched: Allow users with sufficient RLIMIT_NICE " tip-bot for Darren Hart 1 sibling, 0 replies; 17+ messages in thread From: Peter Zijlstra @ 2011-02-24 11:49 UTC (permalink / raw) To: Darren Hart; +Cc: Ingo Molnar, Linux Kernel Mailing List, richard.purdie On Wed, 2011-02-23 at 13:28 -0800, Darren Hart wrote: > From: Darren Hart <dvhart@linux.intel.com> > Date: Thu, 17 Feb 2011 15:37:07 -0800 > Subject: [PATCH] sched: allow users with sufficient RLIMIT_NICE to change from SCHED_IDLE policy > > The current scheduler implementation returns -EPERM when trying to change from > SCHED_IDLE to SCHED_OTHER or SCHED_BATCH. > Since SCHED_IDLE is considered to be > equivalent to a nice 20, Well, its not quite equivalent, its actually 5 times lighter still and the preemption behaviour is slightly different as you've found ;-) > changing to another policy should be allowed provided > the RLIMIT_NICE is accounted for. > > This patch allows the following test-case to pass with RLIMIT_NICE=40, but still > fail with RLIMIT_NICE=10 when the calling process is run from a typical shell > (nice 0, or 20 in rlimit terms). > > Signed-off-by: Darren Hart <dvhart@linux.intel.com> > CC: Peter Zijlstra <peterz@infradead.org> > CC: Ingo Molnar <mingo@elte.hu> > CC: Richard Purdie <richard.purdie@linuxfoundation.org> > --- > kernel/sched.c | 11 +++++++---- > 1 files changed, 7 insertions(+), 4 deletions(-) > > diff --git a/kernel/sched.c b/kernel/sched.c > index 18d38e4..9bf6284 100644 > --- a/kernel/sched.c > +++ b/kernel/sched.c > @@ -4822,12 +4822,15 @@ recheck: > param->sched_priority > rlim_rtprio) > return -EPERM; > } > + > /* > + * Treat SCHED_IDLE as nice 20. Only allow a switch to > + * SCHED_NORMAL if the RLIMIT_NICE would normally permit it. > */ > + if (p->policy == SCHED_IDLE && policy != SCHED_IDLE) { > + if (!can_nice(p, TASK_NICE(p))) > + return -EPERM; > + } > > /* can't change other user's priorities */ > if (!check_same_owner(p)) Looks fine, thanks! ^ permalink raw reply [flat|nested] 17+ messages in thread
* [tip:sched/core] sched: Allow users with sufficient RLIMIT_NICE to change from SCHED_IDLE policy 2011-02-23 21:28 ` Darren Hart 2011-02-24 11:49 ` Peter Zijlstra @ 2011-03-04 11:49 ` tip-bot for Darren Hart 1 sibling, 0 replies; 17+ messages in thread From: tip-bot for Darren Hart @ 2011-03-04 11:49 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, hpa, mingo, a.p.zijlstra, dvhart, richard.purdie, tglx, mingo Commit-ID: c02aa73b1d18e43cfd79c2f193b225e84ca497c8 Gitweb: http://git.kernel.org/tip/c02aa73b1d18e43cfd79c2f193b225e84ca497c8 Author: Darren Hart <dvhart@linux.intel.com> AuthorDate: Thu, 17 Feb 2011 15:37:07 -0800 Committer: Ingo Molnar <mingo@elte.hu> CommitDate: Fri, 4 Mar 2011 11:14:30 +0100 sched: Allow users with sufficient RLIMIT_NICE to change from SCHED_IDLE policy The current scheduler implementation returns -EPERM when trying to change from SCHED_IDLE to SCHED_OTHER or SCHED_BATCH. Since SCHED_IDLE is considered to be a nice 20 on steroids, changing to another policy should be allowed provided the RLIMIT_NICE is accounted for. This patch allows the following test-case to pass with RLIMIT_NICE=40, but still fail with RLIMIT_NICE=10 when the calling process is run from a typical shell (nice 0, or 20 in rlimit terms). int main() { int ret; struct sched_param sp; sp.sched_priority = 0; /* switch to SCHED_IDLE */ ret = sched_setscheduler(0, SCHED_IDLE, &sp); printf("setscheduler IDLE: %d\n", ret); if (ret) return ret; /* switch back to SCHED_OTHER */ ret = sched_setscheduler(0, SCHED_OTHER, &sp); printf("setscheduler OTHER: %d\n", ret); return ret; } $ ulimit -e 40 $ ./test setscheduler IDLE: 0 setscheduler OTHER: 0 $ ulimit -e 10 $ ulimit -e 10 $ ./test setscheduler IDLE: 0 setscheduler OTHER: -1 Signed-off-by: Darren Hart <dvhart@linux.intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Richard Purdie <richard.purdie@linuxfoundation.org> LKML-Reference: <4D657BEE.4040608@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> --- kernel/sched.c | 11 +++++++---- 1 files changed, 7 insertions(+), 4 deletions(-) diff --git a/kernel/sched.c b/kernel/sched.c index 0c87126..f303070 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -4981,12 +4981,15 @@ recheck: param->sched_priority > rlim_rtprio) return -EPERM; } + /* - * Like positive nice levels, dont allow tasks to - * move out of SCHED_IDLE either: + * Treat SCHED_IDLE as nice 20. Only allow a switch to + * SCHED_NORMAL if the RLIMIT_NICE would normally permit it. */ - if (p->policy == SCHED_IDLE && policy != SCHED_IDLE) - return -EPERM; + if (p->policy == SCHED_IDLE && policy != SCHED_IDLE) { + if (!can_nice(p, TASK_NICE(p))) + return -EPERM; + } /* can't change other user's priorities */ if (!check_same_owner(p)) ^ permalink raw reply related [flat|nested] 17+ messages in thread
end of thread, other threads:[~2011-03-04 11:50 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-02-22 21:04 [PATCH 0/2] sched: SCHED_BATCH fixes Darren Hart 2011-02-22 21:04 ` [PATCH 1/2] sched: allow SCHED_BATCH to preempt SCHED_IDLE tasks Darren Hart 2011-02-23 4:20 ` Mike Galbraith 2011-02-23 5:31 ` Mike Galbraith 2011-02-23 5:33 ` Darren Hart 2011-03-04 11:49 ` [tip:sched/core] sched: Allow " tip-bot for Darren Hart 2011-02-22 21:04 ` [PATCH 2/2] sched: allow users with rtprio rlimit to change from SCHED_IDLE policy Darren Hart 2011-02-23 11:03 ` Peter Zijlstra 2011-02-23 11:13 ` Ingo Molnar 2011-02-23 11:17 ` Peter Zijlstra 2011-02-23 11:35 ` Ingo Molnar 2011-02-23 15:52 ` Darren Hart 2011-02-23 16:00 ` Peter Zijlstra 2011-02-23 16:07 ` Darren Hart 2011-02-23 21:28 ` Darren Hart 2011-02-24 11:49 ` Peter Zijlstra 2011-03-04 11:49 ` [tip:sched/core] sched: Allow users with sufficient RLIMIT_NICE " tip-bot for Darren Hart
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox