* [PATCH] sched/deadline: Always calculate end of period on sched_yield()
From: Steven Rostedt @ 2016-02-12 23:10 UTC
To: LKML
Cc: Juri Lelli, Peter Zijlstra, Ingo Molnar, Clark Williams, Daniel Bristot de Oliveira, John Kacur

I'm writing a test case for SCHED_DEADLINE, and noticed a strange
anomaly. Every so often, a deadline is missed and when I looked into
it, it happened because the sched_yield() had no effect (it didn't end
the previous period and let the start of the next runtime happen on the
end of the old period).

  deadline-2228 7...1 116.778420: sys_enter_sched_yield:
  deadline-2228 7d..3 116.778421: hrtimer_cancel: hrtimer=0xffff88011ebd79a0
  deadline-2228 7d..2 116.778422: rcu_utilization: Start context switch
  deadline-2228 7d..2 116.778423: rcu_utilization: End context switch
  deadline-2228 7d..4 116.778423: hrtimer_start: hrtimer=0xffff88011ebd79a0 function=hrtick/0x0 expires=116124420428 softexpires=116124420428
  deadline-2228 7...1 116.778425: sys_exit_sched_yield: 0x0

Schedule was never called. I added some trace_printks() and discovered
that this happens when sched_yield() is called right after a tick that
updates its current bandwidth.

When the schedule tick happens that updates the current bandwidth,
update_curr_dl() is called, where it updates curr->se.exec_start to
rq_clock_task(rq).

The rq_clock_task(rq) gets updated by update_rq_clock_task(), which is
called from various points in the scheduler.

Now, if the user task calls sched_yield() just after a bandwidth update
synced curr->se.exec_start to rq_clock_task(rq), when sched_yield()
calls into update_curr_dl() we have:

	delta_exec = rq_clock_task(rq) - curr->se.exec_start;
	if (unlikely((s64)delta_exec <= 0))
		return;

Coming in here from a sched_yield() will have delta_exec == 0 if the
sched_yield() was called after a DL tick and before another
update_rq_clock_task() is called.

This means that the task will not release its remaining runtime, and
it will start off in the current period when it expected to be in the
next period.

The fix that appears to work for me is to add a test in
update_curr_dl() to not exit if delta_exec is zero and
dl_se->dl_yielded is true.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index cd64c979d0e1..1dd180cda574 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -735,7 +735,7 @@ static void update_curr_dl(struct rq *rq)
 	 * approach need further study.
 	 */
 	delta_exec = rq_clock_task(rq) - curr->se.exec_start;
-	if (unlikely((s64)delta_exec <= 0))
+	if (unlikely((s64)delta_exec <= 0 && !dl_se->dl_yielded))
 		return;
 
 	schedstat_set(curr->se.statistics.exec_max,

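The race described in this message can be modelled outside the kernel. What follows is only a minimal user-space sketch, assuming a much simplified view of the accounting: the struct and every `toy_` name are hypothetical stand-ins, not kernel code, and the throttle test is reduced to "runtime <= 0". The `honor_yield` flag selects between the original early-return check and the check proposed in the patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few fields involved; not the kernel structs. */
struct toy_dl_task {
	uint64_t exec_start;	/* clock value at the last accounting update */
	int64_t  runtime;	/* remaining budget in ns */
	bool     yielded;	/* models dl_se->dl_yielded */
	bool     throttled;	/* models dl_se->dl_throttled */
};

/*
 * Simplified model of update_curr_dl().
 * honor_yield == false: original check, bail out whenever delta_exec <= 0.
 * honor_yield == true:  patched check, keep going if the task has yielded.
 */
static void toy_update_curr(struct toy_dl_task *t, uint64_t clock_task,
			    bool honor_yield)
{
	int64_t delta_exec;

	assert(t);
	delta_exec = (int64_t)(clock_task - t->exec_start);

	if (delta_exec <= 0 && !(honor_yield && t->yielded))
		return;

	/* A yielded task already had its runtime zeroed by the yield path. */
	t->runtime -= t->yielded ? 0 : delta_exec;
	t->exec_start = clock_task;

	if (t->runtime <= 0)
		t->throttled = true;	/* task waits for the next period */
}

/* Simplified model of the yield path as it existed before this thread. */
static void toy_yield(struct toy_dl_task *t, uint64_t clock_task,
		      bool honor_yield)
{
	assert(t);
	if (t->runtime > 0) {
		t->yielded = true;
		t->runtime = 0;
	}
	toy_update_curr(t, clock_task, honor_yield);
}
```

Calling `toy_yield()` with `clock_task` equal to `exec_start` models a yield issued right after a tick has synced the two values. With `honor_yield` false the task is left unthrottled, which corresponds to the missed period edge in the trace; with `honor_yield` true it is throttled as intended.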
* Re: [PATCH] sched/deadline: Always calculate end of period on sched_yield()
From: Juri Lelli @ 2016-02-15 10:18 UTC
To: Steven Rostedt
Cc: LKML, Juri Lelli, Peter Zijlstra, Ingo Molnar, Clark Williams, Daniel Bristot de Oliveira, John Kacur

Hi,

On 12/02/16 18:10, Steven Rostedt wrote:
> I'm writing a test case for SCHED_DEADLINE, and noticed a strange
> anomaly. Every so often, a deadline is missed and when I looked into
> it, it happened because the sched_yield() had no effect (it didn't end
> the previous period and let the start of the next runtime happen on the
> end of the old period).
>
>   deadline-2228 7...1 116.778420: sys_enter_sched_yield:
>   deadline-2228 7d..3 116.778421: hrtimer_cancel: hrtimer=0xffff88011ebd79a0
>   deadline-2228 7d..2 116.778422: rcu_utilization: Start context switch
>   deadline-2228 7d..2 116.778423: rcu_utilization: End context switch
>   deadline-2228 7d..4 116.778423: hrtimer_start: hrtimer=0xffff88011ebd79a0 function=hrtick/0x0 expires=116124420428 softexpires=116124420428
>   deadline-2228 7...1 116.778425: sys_exit_sched_yield: 0x0
>
>
> Schedule was never called. I added some trace_printks() and discovered
> that this happens when sched_yield() is called right after a tick that
> updates its current bandwidth.
>
> When the schedule tick happens that updates the current bandwidth,
> update_curr_dl() is called, where it updates curr->se.exec_start to
> rq_clock_task(rq).
>
> The rq_clock_task(rq) gets updated by update_rq_clock_task(), which is
> called from various points in the scheduler.
>
> Now, if the user task calls sched_yield() just after a bandwidth update
> synced curr->se.exec_start to rq_clock_task(rq), when sched_yield()
> calls into update_curr_dl() we have:
>
> 	delta_exec = rq_clock_task(rq) - curr->se.exec_start;
> 	if (unlikely((s64)delta_exec <= 0))
> 		return;
>
> Coming in here from a sched_yield() will have delta_exec == 0 if the
> sched_yield() was called after a DL tick and before another
> update_rq_clock_task() is called.
>
> This means that the task will not release its remaining runtime, and
> it will start off in the current period when it expected to be in the
> next period.
>
> The fix that appears to work for me is to add a test in
> update_curr_dl() to not exit if delta_exec is zero and
> dl_se->dl_yielded is true.
>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index cd64c979d0e1..1dd180cda574 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -735,7 +735,7 @@ static void update_curr_dl(struct rq *rq)
>  	 * approach need further study.
>  	 */
>  	delta_exec = rq_clock_task(rq) - curr->se.exec_start;
> -	if (unlikely((s64)delta_exec <= 0))
> +	if (unlikely((s64)delta_exec <= 0 && !dl_se->dl_yielded))
>  		return;
>

This looks good to me. Do you think we could also skip some of the
following updates/accounting in this case? Not sure we win anything by
doing that, though.

Thanks,

- Juri

* Re: [PATCH] sched/deadline: Always calculate end of period on sched_yield()
From: Daniel Bristot de Oliveira @ 2016-02-15 12:37 UTC
To: Juri Lelli, Steven Rostedt
Cc: LKML, Juri Lelli, Peter Zijlstra, Ingo Molnar, Clark Williams, John Kacur

On 02/15/2016 08:18 AM, Juri Lelli wrote:
> Do you think we could also skip some of the
> following updates/accounting in this case? Not sure we win anything by
> doing that, though.

I reviewed rostedt's patch and the updates/accounting operations that
follow it. I agree with rostedt's patch, and also agree that when
delta_exec == 0 it is a good idea to skip the "+= 0" updates and the
function calls of the accounting operations that come before the
if (dl_runtime_exceeded(...)) check.

* Re: [PATCH] sched/deadline: Always calculate end of period on sched_yield()
From: Steven Rostedt @ 2016-02-15 16:22 UTC
To: Juri Lelli
Cc: LKML, Juri Lelli, Peter Zijlstra, Ingo Molnar, Clark Williams, Daniel Bristot de Oliveira, John Kacur

On Mon, 15 Feb 2016 10:18:24 +0000 Juri Lelli <juri.lelli@arm.com> wrote:

> > Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> > ---
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index cd64c979d0e1..1dd180cda574 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -735,7 +735,7 @@ static void update_curr_dl(struct rq *rq)
> >  	 * approach need further study.
> >  	 */
> >  	delta_exec = rq_clock_task(rq) - curr->se.exec_start;
> > -	if (unlikely((s64)delta_exec <= 0))
> > +	if (unlikely((s64)delta_exec <= 0 && !dl_se->dl_yielded))
> >  		return;
> >
>
> This looks good to me. Do you think we could also skip some of the
> following updates/accounting in this case? Not sure we win anything by
> doing that, though.
>

Well, I would say we get this patch in first and think about other
updates second. This fixes one bug, might as well pull it in.

I'm now looking into a second bug. I'm getting:

  RT throttling activated

and

  DL replenish lagged to much

messages, back to back, when I'm only using 50% of the bandwidth.
Looks to be a leak in the accounting of how much is being used.

The big issue here is that these messages kill the test due to the
latency caused by performing the printk(). After the messages are
splatted out (they only print once per boot), the tests run fine again.
IOW, there seems to be no real issue of something using too much
bandwidth.

I get this with or without this current patch.

-- Steve

* Re: [PATCH] sched/deadline: Always calculate end of period on sched_yield()
From: Peter Zijlstra @ 2016-02-23 12:28 UTC
To: Steven Rostedt
Cc: LKML, Juri Lelli, Ingo Molnar, Clark Williams, Daniel Bristot de Oliveira, John Kacur

On Fri, Feb 12, 2016 at 06:10:20PM -0500, Steven Rostedt wrote:
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index cd64c979d0e1..1dd180cda574 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -735,7 +735,7 @@ static void update_curr_dl(struct rq *rq)
>  	 * approach need further study.
>  	 */
>  	delta_exec = rq_clock_task(rq) - curr->se.exec_start;
> -	if (unlikely((s64)delta_exec <= 0))
> +	if (unlikely((s64)delta_exec <= 0 && !dl_se->dl_yielded))
>  		return;
>
>  	schedstat_set(curr->se.statistics.exec_max,

Would something like this make sense instead?

It also retains the ->runtime while yielded, and would actually 'fix' a
case where, when we call yield, we would have had a negative runtime
after update_curr_dl().

The current code will 'gift' us extra runtime in that case.

---
 kernel/sched/deadline.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 57b939c81bce..c2bca80d3388 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -399,6 +399,9 @@ static void replenish_dl_entity(struct sched_dl_entity *dl_se,
 		dl_se->runtime = pi_se->dl_runtime;
 	}
 
+	if (dl_se->dl_yielded && dl_se->runtime > 0)
+		dl_se->runtime = 0;
+
 	/*
 	 * We keep moving the deadline away until we get some
 	 * available runtime for the entity. This ensures correct
@@ -735,8 +738,11 @@ static void update_curr_dl(struct rq *rq)
 	 * approach need further study.
 	 */
 	delta_exec = rq_clock_task(rq) - curr->se.exec_start;
-	if (unlikely((s64)delta_exec <= 0))
+	if (unlikely((s64)delta_exec <= 0)) {
+		if (unlikely(dl_se->dl_yielded))
+			goto throttle;
 		return;
+	}
 
 	schedstat_set(curr->se.statistics.exec_max,
 		      max(curr->se.statistics.exec_max, delta_exec));
@@ -749,8 +755,10 @@ static void update_curr_dl(struct rq *rq)
 
 	sched_rt_avg_update(rq, delta_exec);
 
-	dl_se->runtime -= dl_se->dl_yielded ? 0 : delta_exec;
-	if (dl_runtime_exceeded(dl_se)) {
+	dl_se->runtime -= delta_exec;
+
+throttle:
+	if (dl_runtime_exceeded(dl_se) || dl_se->dl_yielded) {
 		dl_se->dl_throttled = 1;
 		__dequeue_task_dl(rq, curr, 0);
 		if (unlikely(dl_se->dl_boosted || !start_dl_timer(curr)))
@@ -1002,10 +1010,8 @@ static void yield_task_dl(struct rq *rq)
 	 * it and the bandwidth timer will wake it up and will give it
 	 * new scheduling parameters (thanks to dl_yielded=1).
 	 */
-	if (p->dl.runtime > 0) {
-		rq->curr->dl.dl_yielded = 1;
-		p->dl.runtime = 0;
-	}
+	rq->curr->dl.dl_yielded = 1;
+
 	update_rq_clock(rq);
 	update_curr_dl(rq);
 	/*

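The 'gift' of extra runtime mentioned in this message can be worked through with small numbers. The following is a rough user-space sketch under stated assumptions: `struct toy_dl_entity` and all `toy_` functions are hypothetical, the refill loop is only a simplified reading of the "keep moving the deadline away" logic in replenish_dl_entity(), and the early return on delta_exec <= 0 is deliberately left out. It compares the budget a task holds after the next refill when the yield path zeroes the runtime up front (old behaviour) with the case where the last slice of execution is accounted first (the approach in the diff above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct sched_dl_entity. */
struct toy_dl_entity {
	int64_t  runtime;	/* remaining budget; negative after an overrun */
	uint64_t deadline;	/* absolute deadline */
	int64_t  dl_runtime;	/* budget granted per period, must be > 0 */
	uint64_t dl_period;	/* period length */
	bool     yielded;	/* models dl_yielded */
};

/* Push the deadline out one period at a time until the budget is positive. */
static void toy_refill(struct toy_dl_entity *e)
{
	while (e->runtime <= 0) {
		e->deadline += e->dl_period;
		e->runtime  += e->dl_runtime;
	}
	e->yielded = false;
}

/* Old behaviour: yield zeroes the budget, the last delta_exec is never charged. */
static int64_t toy_old_yield(struct toy_dl_entity e, int64_t delta_exec)
{
	assert(e.dl_runtime > 0);
	if (e.runtime > 0) {
		e.yielded = true;
		e.runtime = 0;
	}
	e.runtime -= e.yielded ? 0 : delta_exec;
	toy_refill(&e);
	return e.runtime;
}

/* Proposed behaviour: charge delta_exec first, then drop only leftover budget. */
static int64_t toy_new_yield(struct toy_dl_entity e, int64_t delta_exec)
{
	assert(e.dl_runtime > 0);
	e.yielded = true;
	e.runtime -= delta_exec;
	if (e.yielded && e.runtime > 0)
		e.runtime = 0;		/* unused budget is given up on yield */
	toy_refill(&e);
	return e.runtime;
}
```

As a worked case, take 100 units of budget left, 300 units executed since the last update, and 1000 units granted per period. The old path hides the 200-unit overrun and the task restarts with the full 1000; the new path carries the overrun forward, so the task restarts with 800. When the task yields with budget to spare (say 50 units executed), both paths end at 1000.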
* Re: [PATCH] sched/deadline: Always calculate end of period on sched_yield()
From: Steven Rostedt @ 2016-02-23 13:12 UTC
To: Peter Zijlstra
Cc: LKML, Juri Lelli, Ingo Molnar, Clark Williams, Daniel Bristot de Oliveira, John Kacur

On Tue, 23 Feb 2016 13:28:22 +0100 Peter Zijlstra <peterz@infradead.org> wrote:

> On Fri, Feb 12, 2016 at 06:10:20PM -0500, Steven Rostedt wrote:
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index cd64c979d0e1..1dd180cda574 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -735,7 +735,7 @@ static void update_curr_dl(struct rq *rq)
> >  	 * approach need further study.
> >  	 */
> >  	delta_exec = rq_clock_task(rq) - curr->se.exec_start;
> > -	if (unlikely((s64)delta_exec <= 0))
> > +	if (unlikely((s64)delta_exec <= 0 && !dl_se->dl_yielded))
> >  		return;
> >
> >  	schedstat_set(curr->se.statistics.exec_max,
>
>
> Would something like this make sense instead?
>

I'll test it and see if it works.

-- Steve

* Re: [PATCH] sched/deadline: Always calculate end of period on sched_yield()
From: Steven Rostedt @ 2016-02-23 15:04 UTC
To: Peter Zijlstra
Cc: LKML, Juri Lelli, Ingo Molnar, Clark Williams, Daniel Bristot de Oliveira, John Kacur

On Tue, 23 Feb 2016 13:28:22 +0100 Peter Zijlstra <peterz@infradead.org> wrote:

> Would something like this make sense instead?

It works perfectly.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Steven Rostedt <rostedt@goodmis.org>

Thanks!

-- Steve

>
> It also retains the ->runtime while yielded, and would actually 'fix' a
> case where, when we call yield, we would have had a negative runtime
> after update_curr_dl().
>
> The current code will 'gift' us extra runtime in that case.
>
> ---

* [tip:sched/core] sched/deadline: Always calculate end of period on sched_yield()
From: tip-bot for Peter Zijlstra @ 2016-02-29 11:14 UTC
To: linux-tip-commits
Cc: rostedt, linux-kernel, williams, juri.lelli, hpa, torvalds, tglx, jkacur, mingo, peterz, bristot

Commit-ID:  48be3a67da7413d62e5efbcf2c73a9dddf61fb96
Gitweb:     http://git.kernel.org/tip/48be3a67da7413d62e5efbcf2c73a9dddf61fb96
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Tue, 23 Feb 2016 13:28:22 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:41:51 +0100

sched/deadline: Always calculate end of period on sched_yield()

Steven noticed that occasionally a sched_yield() call would not result
in a wait for the next period edge as expected.

It turns out that when we call update_curr_dl() and end up with
delta_exec <= 0, we will bail early and fail to throttle.

Further inspection of the yield code revealed that yield_task_dl()
clearing dl.runtime is wrong too, it will not account the last bit of
runtime which could result in dl.runtime < 0, which in turn means that
replenish would gift us with too much runtime.

Fix both issues by not relying on the dl.runtime value for yield.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160223122822.GP6357@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/deadline.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 57b939c..04a569c 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -399,6 +399,9 @@ static void replenish_dl_entity(struct sched_dl_entity *dl_se,
 		dl_se->runtime = pi_se->dl_runtime;
 	}
 
+	if (dl_se->dl_yielded && dl_se->runtime > 0)
+		dl_se->runtime = 0;
+
 	/*
 	 * We keep moving the deadline away until we get some
 	 * available runtime for the entity. This ensures correct
@@ -735,8 +738,11 @@ static void update_curr_dl(struct rq *rq)
 	 * approach need further study.
 	 */
 	delta_exec = rq_clock_task(rq) - curr->se.exec_start;
-	if (unlikely((s64)delta_exec <= 0))
+	if (unlikely((s64)delta_exec <= 0)) {
+		if (unlikely(dl_se->dl_yielded))
+			goto throttle;
 		return;
+	}
 
 	schedstat_set(curr->se.statistics.exec_max,
 		      max(curr->se.statistics.exec_max, delta_exec));
@@ -749,8 +755,10 @@ static void update_curr_dl(struct rq *rq)
 
 	sched_rt_avg_update(rq, delta_exec);
 
-	dl_se->runtime -= dl_se->dl_yielded ? 0 : delta_exec;
-	if (dl_runtime_exceeded(dl_se)) {
+	dl_se->runtime -= delta_exec;
+
+throttle:
+	if (dl_runtime_exceeded(dl_se) || dl_se->dl_yielded) {
 		dl_se->dl_throttled = 1;
 		__dequeue_task_dl(rq, curr, 0);
 		if (unlikely(dl_se->dl_boosted || !start_dl_timer(curr)))
@@ -994,18 +1002,14 @@ static void dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
  */
 static void yield_task_dl(struct rq *rq)
 {
-	struct task_struct *p = rq->curr;
-
 	/*
 	 * We make the task go to sleep until its current deadline by
 	 * forcing its runtime to zero. This way, update_curr_dl() stops
 	 * it and the bandwidth timer will wake it up and will give it
 	 * new scheduling parameters (thanks to dl_yielded=1).
 	 */
-	if (p->dl.runtime > 0) {
-		rq->curr->dl.dl_yielded = 1;
-		p->dl.runtime = 0;
-	}
+	rq->curr->dl.dl_yielded = 1;
+
 	update_rq_clock(rq);
 	update_curr_dl(rq);
 	/*