* [tip:sched/core] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup [not found] <20190319130005.25492-1-pauld@redhat.com> @ 2019-04-03 8:38 ` tip-bot for Phil Auld [not found] ` <20190405141524.05DDA2186A@mail.kernel.org> 2019-04-16 15:32 ` [tip:sched/urgent] sched/fair: Limit sched_cfs_period_timer() " tip-bot for Phil Auld 1 sibling, 1 reply; 8+ messages in thread From: tip-bot for Phil Auld @ 2019-04-03 8:38 UTC (permalink / raw) To: linux-tip-commits Cc: mingo, pauld, efault, tglx, peterz, anton, bsegall, torvalds, stable, hpa, linux-kernel Commit-ID: 06ec5d30e8d57b820d44df6340dcb25010d6d0fa Gitweb: https://git.kernel.org/tip/06ec5d30e8d57b820d44df6340dcb25010d6d0fa Author: Phil Auld <pauld@redhat.com> AuthorDate: Tue, 19 Mar 2019 09:00:05 -0400 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Wed, 3 Apr 2019 09:50:23 +0200 sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup With extremely short cfs_period_us setting on a parent task group with a large number of children the for loop in sched_cfs_period_timer can run until the watchdog fires. There is no guarantee that the call to hrtimer_forward_now() will ever return 0. The large number of children can make do_sched_cfs_period_timer() take longer than the period. NMI watchdog: Watchdog detected hard LOCKUP on cpu 24 RIP: 0010:tg_nop+0x0/0x10 <IRQ> walk_tg_tree_from+0x29/0xb0 unthrottle_cfs_rq+0xe0/0x1a0 distribute_cfs_runtime+0xd3/0xf0 sched_cfs_period_timer+0xcb/0x160 ? sched_cfs_slack_timer+0xd0/0xd0 __hrtimer_run_queues+0xfb/0x270 hrtimer_interrupt+0x122/0x270 smp_apic_timer_interrupt+0x6a/0x140 apic_timer_interrupt+0xf/0x20 </IRQ> To prevent this we add protection to the loop that detects when the loop has run too many times and scales the period and quota up, proportionally, so that the timer can complete before then next period expires. This preserves the relative runtime quota while preventing the hard lockup. A warning is issued reporting this state and the new values. Signed-off-by: Phil Auld <pauld@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Ben Segall <bsegall@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20190319130005.25492-1-pauld@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> --- kernel/sched/fair.c | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 40bd1e27b1b7..d4cce633eac8 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4885,6 +4885,8 @@ static enum hrtimer_restart sched_cfs_slack_timer(struct hrtimer *timer) return HRTIMER_NORESTART; } +extern const u64 max_cfs_quota_period; + static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer) { struct cfs_bandwidth *cfs_b = @@ -4892,6 +4894,7 @@ static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer) unsigned long flags; int overrun; int idle = 0; + int count = 0; raw_spin_lock_irqsave(&cfs_b->lock, flags); for (;;) { @@ -4899,6 +4902,28 @@ static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer) if (!overrun) break; + if (++count > 3) { + u64 new, old = ktime_to_ns(cfs_b->period); + + new = (old * 147) / 128; /* ~115% */ + new = min(new, max_cfs_quota_period); + + cfs_b->period = ns_to_ktime(new); + + /* since max is 1s, this is limited to 1e9^2, which fits in u64 */ + cfs_b->quota *= new; + cfs_b->quota /= old; + + pr_warn_ratelimited( + "cfs_period_timer[cpu%d]: period too short, scaling up (new cfs_period_us %lld, cfs_quota_us = %lld)\n", + smp_processor_id(), + new/NSEC_PER_USEC, + cfs_b->quota/NSEC_PER_USEC); + + /* reset count so we don't come right back in here */ + count = 0; + } + idle = do_sched_cfs_period_timer(cfs_b, overrun, flags); } if (idle) ^ permalink raw reply related [flat|nested] 8+ messages in thread
[parent not found: <20190405141524.05DDA2186A@mail.kernel.org>]
* Re: [tip:sched/core] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup [not found] ` <20190405141524.05DDA2186A@mail.kernel.org> @ 2019-04-08 13:42 ` Phil Auld 2019-04-08 14:50 ` Sasha Levin 0 siblings, 1 reply; 8+ messages in thread From: Phil Auld @ 2019-04-08 13:42 UTC (permalink / raw) To: Sasha Levin Cc: tip-bot for Phil Auld, linux-tip-commits, mingo, efault, Anton Blanchard, Ben Segall, Linus Torvalds, Peter Zijlstra, Thomas Gleixner, stable On Fri, Apr 05, 2019 at 02:15:23PM +0000 Sasha Levin wrote: > Hi, > > [This is an automated email] > > This commit has been processed because it contains a -stable tag. > The stable tag indicates that it's relevant for the following trees: all > > The bot has tested the following trees: v5.0.6, v4.19.33, v4.14.110, v4.9.167, v4.4.178, v3.18.138. > > v5.0.6: Failed to apply! Possible dependencies: > c0ad4aa4d841 ("sched/fair: Robustify CFS-bandwidth timer locking") > > v4.19.33: Failed to apply! Possible dependencies: > c0ad4aa4d841 ("sched/fair: Robustify CFS-bandwidth timer locking") > > v4.14.110: Failed to apply! Possible dependencies: > c0ad4aa4d841 ("sched/fair: Robustify CFS-bandwidth timer locking") > This is a minor context difference. There is no actual dependency on the c0ad4aa4d841 patch. It would be easy to produce new version that could go in these trees. I'm not sure what the right action is in that case. Should I spin a new version with the different locking in the context? Also, I can't find this commit in tip. It's visible in gitweb but it's not in the tip tree as far as I can tell. > v4.9.167: Failed to apply! Possible dependencies: > 8a8c69c32778 ("sched/core: Add rq->lock wrappers") > 92509b732baf ("sched/core: Reset RQCF_ACT_SKIP before unpinning rq->lock") > c0ad4aa4d841 ("sched/fair: Robustify CFS-bandwidth timer locking") > cb42c9a3ebbb ("sched/core: Add debugging code to catch missing update_rq_clock() calls") > d1ccc66df8bf ("sched/core: Clean up comments") > d8ac897137a2 ("sched/core: Add wrappers for lockdep_(un)pin_lock()") > > v4.4.178: Failed to apply! Possible dependencies: > 2a67e741bbbc ("rcu: Create transitive rnp->lock acquisition functions") > 3e71a462dd48 ("sched/core: Move task_rq_lock() out of line") > 3ea94de15ce9 ("sched/core: Fix incorrect wait time and wait count statistics") > 46a5d164db53 ("rcu: Stop disabling interrupts in scheduler fastpaths") > 8a8c69c32778 ("sched/core: Add rq->lock wrappers") > 958c5f848e17 ("stop_machine: Change stop_one_cpu() to rely on cpu_stop_queue_work()") > bf89a304722f ("stop_machine: Avoid a sleep and wakeup in stop_one_cpu()") > c0ad4aa4d841 ("sched/fair: Robustify CFS-bandwidth timer locking") > cb2517653fcc ("sched/debug: Make schedstats a runtime tunable that is disabled by default") > d8ac897137a2 ("sched/core: Add wrappers for lockdep_(un)pin_lock()") > e7904a28f533 ("locking/lockdep, sched/core: Implement a better lock pinning scheme") > fecbf6f01fbd ("rcu: Simplify rcu_sched_qs() control flow") > > v3.18.138: Failed to apply! Possible dependencies: > 1a43a14a5bd9 ("sched: Fix schedule_tail() to disable preemption") > 1b537c7d1e58 ("sched/core: Remove check of p->sched_class") > 1d7e974cbf2f ("sched/deadline: Don't check SD_BALANCE_FORK") > 3960c8c0c789 ("sched: Make dl_task_time() use task_rq_lock()") > 3e71a462dd48 ("sched/core: Move task_rq_lock() out of line") > 4c9a4bc89a9c ("sched: Allow balance callbacks for check_class_changed()") > 5cc389bcee08 ("sched: Move code around") > 5e16bbc2fb40 ("sched: Streamline the task migration locking a little") > 67dfa1b756f2 ("sched/deadline: Implement cancel_dl_timer() to use in switched_from_dl()") > 6c1d9410f007 ("sched: Move p->nr_cpus_allowed check to select_task_rq()") > 74b8a4cb6ce3 ("sched: Clarify ordering between task_rq_lock() and move_queued_task()") > 8a8c69c32778 ("sched/core: Add rq->lock wrappers") > c0ad4aa4d841 ("sched/fair: Robustify CFS-bandwidth timer locking") > cbce1a686700 ("sched,lockdep: Employ lock pinning") > dfa50b605c2a ("sched: Make finish_task_switch() return 'struct rq *'") > f4e9d94a5bf6 ("sched/deadline: Don't balance during wakeup if wakee is pinned") > > > How should we proceed with this patch? > I haven't look that far back. This patch should be pretty self contained. I don't think it's worth pulling it back to these trees if it would require all of these patches. The risk would out-weigh the reward. Cheers, Phil > -- > Thanks, > Sasha -- ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [tip:sched/core] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup 2019-04-08 13:42 ` Phil Auld @ 2019-04-08 14:50 ` Sasha Levin 2019-04-08 17:03 ` Phil Auld 0 siblings, 1 reply; 8+ messages in thread From: Sasha Levin @ 2019-04-08 14:50 UTC (permalink / raw) To: Phil Auld Cc: tip-bot for Phil Auld, linux-tip-commits, mingo, efault, Anton Blanchard, Ben Segall, Linus Torvalds, Peter Zijlstra, Thomas Gleixner, stable On Mon, Apr 08, 2019 at 09:42:14AM -0400, Phil Auld wrote: >On Fri, Apr 05, 2019 at 02:15:23PM +0000 Sasha Levin wrote: >> Hi, >> >> [This is an automated email] >> >> This commit has been processed because it contains a -stable tag. >> The stable tag indicates that it's relevant for the following trees: all >> >> The bot has tested the following trees: v5.0.6, v4.19.33, v4.14.110, v4.9.167, v4.4.178, v3.18.138. >> >> v5.0.6: Failed to apply! Possible dependencies: >> c0ad4aa4d841 ("sched/fair: Robustify CFS-bandwidth timer locking") >> >> v4.19.33: Failed to apply! Possible dependencies: >> c0ad4aa4d841 ("sched/fair: Robustify CFS-bandwidth timer locking") >> >> v4.14.110: Failed to apply! Possible dependencies: >> c0ad4aa4d841 ("sched/fair: Robustify CFS-bandwidth timer locking") >> > >This is a minor context difference. There is no actual dependency on the >c0ad4aa4d841 patch. It would be easy to produce new version that could >go in these trees. I'm not sure what the right action is in that case. >Should I spin a new version with the different locking in the context? Please do :) The algorithm does the dependency analysis by looking at surrounding code rather than actual functional dependency. -- Thanks, Sasha ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [tip:sched/core] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup 2019-04-08 14:50 ` Sasha Levin @ 2019-04-08 17:03 ` Phil Auld 2019-04-08 17:06 ` Sasha Levin 0 siblings, 1 reply; 8+ messages in thread From: Phil Auld @ 2019-04-08 17:03 UTC (permalink / raw) To: Sasha Levin Cc: tip-bot for Phil Auld, linux-tip-commits, mingo, efault, Anton Blanchard, Ben Segall, Linus Torvalds, Peter Zijlstra, Thomas Gleixner, stable On Mon, Apr 08, 2019 at 10:50:26AM -0400 Sasha Levin wrote: > On Mon, Apr 08, 2019 at 09:42:14AM -0400, Phil Auld wrote: > > On Fri, Apr 05, 2019 at 02:15:23PM +0000 Sasha Levin wrote: > > > Hi, > > > > > > [This is an automated email] > > > > > > This commit has been processed because it contains a -stable tag. > > > The stable tag indicates that it's relevant for the following trees: all > > > > > > The bot has tested the following trees: v5.0.6, v4.19.33, v4.14.110, v4.9.167, v4.4.178, v3.18.138. > > > > > > v5.0.6: Failed to apply! Possible dependencies: > > > c0ad4aa4d841 ("sched/fair: Robustify CFS-bandwidth timer locking") > > > > > > v4.19.33: Failed to apply! Possible dependencies: > > > c0ad4aa4d841 ("sched/fair: Robustify CFS-bandwidth timer locking") > > > > > > v4.14.110: Failed to apply! Possible dependencies: > > > c0ad4aa4d841 ("sched/fair: Robustify CFS-bandwidth timer locking") > > > > > > > This is a minor context difference. There is no actual dependency on the > > c0ad4aa4d841 patch. It would be easy to produce new version that could > > go in these trees. I'm not sure what the right action is in that case. > > Should I spin a new version with the different locking in the context? > > Please do :) > Sure. I'm just not sure how to post it. It only shows up in this tip-bot email and on gitweb. It's not in tip.git and not in Linus' upstream tree. I've updated the patch at it will apply now to v5.0.6, v4.19.33, v4.14.110, v4.9.167, v4.4.178 all with increasing offsets but nothing else. v3.18.138 won't take it without more work. I'd be inclined to skip that one. > The algorithm does the dependency analysis by looking at surrounding > code rather than actual functional dependency. That's pretty cool though :) Cheers, Phil > > -- > Thanks, > Sasha -- ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [tip:sched/core] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup 2019-04-08 17:03 ` Phil Auld @ 2019-04-08 17:06 ` Sasha Levin 2019-04-08 17:31 ` Phil Auld 0 siblings, 1 reply; 8+ messages in thread From: Sasha Levin @ 2019-04-08 17:06 UTC (permalink / raw) To: Phil Auld Cc: tip-bot for Phil Auld, linux-tip-commits, mingo, efault, Anton Blanchard, Ben Segall, Linus Torvalds, Peter Zijlstra, Thomas Gleixner, stable On Mon, Apr 08, 2019 at 01:03:05PM -0400, Phil Auld wrote: >On Mon, Apr 08, 2019 at 10:50:26AM -0400 Sasha Levin wrote: >> On Mon, Apr 08, 2019 at 09:42:14AM -0400, Phil Auld wrote: >> > On Fri, Apr 05, 2019 at 02:15:23PM +0000 Sasha Levin wrote: >> > > Hi, >> > > >> > > [This is an automated email] >> > > >> > > This commit has been processed because it contains a -stable tag. >> > > The stable tag indicates that it's relevant for the following trees: all >> > > >> > > The bot has tested the following trees: v5.0.6, v4.19.33, v4.14.110, v4.9.167, v4.4.178, v3.18.138. >> > > >> > > v5.0.6: Failed to apply! Possible dependencies: >> > > c0ad4aa4d841 ("sched/fair: Robustify CFS-bandwidth timer locking") >> > > >> > > v4.19.33: Failed to apply! Possible dependencies: >> > > c0ad4aa4d841 ("sched/fair: Robustify CFS-bandwidth timer locking") >> > > >> > > v4.14.110: Failed to apply! Possible dependencies: >> > > c0ad4aa4d841 ("sched/fair: Robustify CFS-bandwidth timer locking") >> > > >> > >> > This is a minor context difference. There is no actual dependency on the >> > c0ad4aa4d841 patch. It would be easy to produce new version that could >> > go in these trees. I'm not sure what the right action is in that case. >> > Should I spin a new version with the different locking in the context? >> >> Please do :) >> > >Sure. I'm just not sure how to post it. It only shows up in this tip-bot >email and on gitweb. It's not in tip.git and not in Linus' upstream tree. > >I've updated the patch at it will apply now to v5.0.6, v4.19.33, v4.14.110, >v4.9.167, v4.4.178 all with increasing offsets but nothing else. You can either reply to this thread with the patch(es), or just send them out and annotate one way or the other that they should go to their appropriate stable trees. >v3.18.138 won't take it without more work. I'd be inclined to skip that one. No problem there. -- Thanks, Sasha ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [tip:sched/core] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup 2019-04-08 17:06 ` Sasha Levin @ 2019-04-08 17:31 ` Phil Auld 0 siblings, 0 replies; 8+ messages in thread From: Phil Auld @ 2019-04-08 17:31 UTC (permalink / raw) To: Sasha Levin Cc: tip-bot for Phil Auld, linux-tip-commits, mingo, efault, Anton Blanchard, Ben Segall, Linus Torvalds, Peter Zijlstra, Thomas Gleixner, stable On Mon, Apr 08, 2019 at 01:06:12PM -0400 Sasha Levin wrote: > On Mon, Apr 08, 2019 at 01:03:05PM -0400, Phil Auld wrote: > > On Mon, Apr 08, 2019 at 10:50:26AM -0400 Sasha Levin wrote: > > > On Mon, Apr 08, 2019 at 09:42:14AM -0400, Phil Auld wrote: > > > > On Fri, Apr 05, 2019 at 02:15:23PM +0000 Sasha Levin wrote: > > > > > Hi, > > > > > > > > > > [This is an automated email] > > > > > > > > > > This commit has been processed because it contains a -stable tag. > > > > > The stable tag indicates that it's relevant for the following trees: all > > > > > > > > > > The bot has tested the following trees: v5.0.6, v4.19.33, v4.14.110, v4.9.167, v4.4.178, v3.18.138. > > > > > > > > > > v5.0.6: Failed to apply! Possible dependencies: > > > > > c0ad4aa4d841 ("sched/fair: Robustify CFS-bandwidth timer locking") > > > > > > > > > > v4.19.33: Failed to apply! Possible dependencies: > > > > > c0ad4aa4d841 ("sched/fair: Robustify CFS-bandwidth timer locking") > > > > > > > > > > v4.14.110: Failed to apply! Possible dependencies: > > > > > c0ad4aa4d841 ("sched/fair: Robustify CFS-bandwidth timer locking") > > > > > > > > > > > > > This is a minor context difference. There is no actual dependency on the > > > > c0ad4aa4d841 patch. It would be easy to produce new version that could > > > > go in these trees. I'm not sure what the right action is in that case. > > > > Should I spin a new version with the different locking in the context? > > > > > > Please do :) > > > > > > > Sure. I'm just not sure how to post it. It only shows up in this tip-bot > > email and on gitweb. It's not in tip.git and not in Linus' upstream tree. > > > > I've updated the patch at it will apply now to v5.0.6, v4.19.33, v4.14.110, > > v4.9.167, v4.4.178 all with increasing offsets but nothing else. > > You can either reply to this thread with the patch(es), or just send > them out and annotate one way or the other that they should go to their > appropriate stable trees. > Okay, I'll do that. Am I the only one who finds it strange that the commit exists only in this email and gitweb but not the actual git tree(s)? > > v3.18.138 won't take it without more work. I'd be inclined to skip that one. > > No problem there. > > > -- > Thanks, > Sasha -- ^ permalink raw reply [flat|nested] 8+ messages in thread
* [tip:sched/urgent] sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup [not found] <20190319130005.25492-1-pauld@redhat.com> 2019-04-03 8:38 ` [tip:sched/core] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup tip-bot for Phil Auld @ 2019-04-16 15:32 ` tip-bot for Phil Auld 2019-04-16 19:26 ` Phil Auld 1 sibling, 1 reply; 8+ messages in thread From: tip-bot for Phil Auld @ 2019-04-16 15:32 UTC (permalink / raw) To: linux-tip-commits Cc: torvalds, bsegall, linux-kernel, pauld, stable, anton, tglx, peterz, hpa, mingo Commit-ID: 2e8e19226398db8265a8e675fcc0118b9e80c9e8 Gitweb: https://git.kernel.org/tip/2e8e19226398db8265a8e675fcc0118b9e80c9e8 Author: Phil Auld <pauld@redhat.com> AuthorDate: Tue, 19 Mar 2019 09:00:05 -0400 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Tue, 16 Apr 2019 16:50:05 +0200 sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup With extremely short cfs_period_us setting on a parent task group with a large number of children the for loop in sched_cfs_period_timer() can run until the watchdog fires. There is no guarantee that the call to hrtimer_forward_now() will ever return 0. The large number of children can make do_sched_cfs_period_timer() take longer than the period. NMI watchdog: Watchdog detected hard LOCKUP on cpu 24 RIP: 0010:tg_nop+0x0/0x10 <IRQ> walk_tg_tree_from+0x29/0xb0 unthrottle_cfs_rq+0xe0/0x1a0 distribute_cfs_runtime+0xd3/0xf0 sched_cfs_period_timer+0xcb/0x160 ? sched_cfs_slack_timer+0xd0/0xd0 __hrtimer_run_queues+0xfb/0x270 hrtimer_interrupt+0x122/0x270 smp_apic_timer_interrupt+0x6a/0x140 apic_timer_interrupt+0xf/0x20 </IRQ> To prevent this we add protection to the loop that detects when the loop has run too many times and scales the period and quota up, proportionally, so that the timer can complete before then next period expires. This preserves the relative runtime quota while preventing the hard lockup. A warning is issued reporting this state and the new values. Signed-off-by: Phil Auld <pauld@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Ben Segall <bsegall@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190319130005.25492-1-pauld@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> --- kernel/sched/fair.c | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 40bd1e27b1b7..a4d9e14bf138 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4885,6 +4885,8 @@ static enum hrtimer_restart sched_cfs_slack_timer(struct hrtimer *timer) return HRTIMER_NORESTART; } +extern const u64 max_cfs_quota_period; + static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer) { struct cfs_bandwidth *cfs_b = @@ -4892,6 +4894,7 @@ static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer) unsigned long flags; int overrun; int idle = 0; + int count = 0; raw_spin_lock_irqsave(&cfs_b->lock, flags); for (;;) { @@ -4899,6 +4902,28 @@ static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer) if (!overrun) break; + if (++count > 3) { + u64 new, old = ktime_to_ns(cfs_b->period); + + new = (old * 147) / 128; /* ~115% */ + new = min(new, max_cfs_quota_period); + + cfs_b->period = ns_to_ktime(new); + + /* since max is 1s, this is limited to 1e9^2, which fits in u64 */ + cfs_b->quota *= new; + cfs_b->quota = div64_u64(cfs_b->quota, old); + + pr_warn_ratelimited( + "cfs_period_timer[cpu%d]: period too short, scaling up (new cfs_period_us %lld, cfs_quota_us = %lld)\n", + smp_processor_id(), + div_u64(new, NSEC_PER_USEC), + div_u64(cfs_b->quota, NSEC_PER_USEC)); + + /* reset count so we don't come right back in here */ + count = 0; + } + idle = do_sched_cfs_period_timer(cfs_b, overrun, flags); } if (idle) ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [tip:sched/urgent] sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup 2019-04-16 15:32 ` [tip:sched/urgent] sched/fair: Limit sched_cfs_period_timer() " tip-bot for Phil Auld @ 2019-04-16 19:26 ` Phil Auld 0 siblings, 0 replies; 8+ messages in thread From: Phil Auld @ 2019-04-16 19:26 UTC (permalink / raw) To: sashal Cc: tglx, hpa, peterz, mingo, torvalds, linux-kernel, bsegall, stable, anton Hi Sasha, On Tue, Apr 16, 2019 at 08:32:09AM -0700 tip-bot for Phil Auld wrote: > Commit-ID: 2e8e19226398db8265a8e675fcc0118b9e80c9e8 > Gitweb: https://git.kernel.org/tip/2e8e19226398db8265a8e675fcc0118b9e80c9e8 > Author: Phil Auld <pauld@redhat.com> > AuthorDate: Tue, 19 Mar 2019 09:00:05 -0400 > Committer: Ingo Molnar <mingo@kernel.org> > CommitDate: Tue, 16 Apr 2019 16:50:05 +0200 > > sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup > > With extremely short cfs_period_us setting on a parent task group with a large > number of children the for loop in sched_cfs_period_timer() can run until the > watchdog fires. There is no guarantee that the call to hrtimer_forward_now() > will ever return 0. The large number of children can make > do_sched_cfs_period_timer() take longer than the period. > > NMI watchdog: Watchdog detected hard LOCKUP on cpu 24 > RIP: 0010:tg_nop+0x0/0x10 > <IRQ> > walk_tg_tree_from+0x29/0xb0 > unthrottle_cfs_rq+0xe0/0x1a0 > distribute_cfs_runtime+0xd3/0xf0 > sched_cfs_period_timer+0xcb/0x160 > ? sched_cfs_slack_timer+0xd0/0xd0 > __hrtimer_run_queues+0xfb/0x270 > hrtimer_interrupt+0x122/0x270 > smp_apic_timer_interrupt+0x6a/0x140 > apic_timer_interrupt+0xf/0x20 > </IRQ> > > To prevent this we add protection to the loop that detects when the loop has run > too many times and scales the period and quota up, proportionally, so that the timer > can complete before then next period expires. This preserves the relative runtime > quota while preventing the hard lockup. > > A warning is issued reporting this state and the new values. > > Signed-off-by: Phil Auld <pauld@redhat.com> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> > Cc: <stable@vger.kernel.org> > Cc: Anton Blanchard <anton@ozlabs.org> > Cc: Ben Segall <bsegall@google.com> > Cc: Linus Torvalds <torvalds@linux-foundation.org> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Thomas Gleixner <tglx@linutronix.de> > Link: https://lkml.kernel.org/r/20190319130005.25492-1-pauld@redhat.com > Signed-off-by: Ingo Molnar <mingo@kernel.org> > --- The above commit won't work on the stable trees. Below is an updated version that will work on v5.0.7, v4.19.34, v4.14.111, v4.9.168, and v4.4.178 with increasing offsets. I believe v3.18.138 will require more so that one is not included. There is only a minor change to context, none of actual changes in the patch are different. Thanks, Phil diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 310d0637fe4b..f0380229b6f2 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4859,12 +4859,15 @@ static enum hrtimer_restart sched_cfs_slack_timer(struct hrtimer *timer) return HRTIMER_NORESTART; } +extern const u64 max_cfs_quota_period; + static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer) { struct cfs_bandwidth *cfs_b = container_of(timer, struct cfs_bandwidth, period_timer); int overrun; int idle = 0; + int count = 0; raw_spin_lock(&cfs_b->lock); for (;;) { @@ -4872,6 +4875,28 @@ static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer) if (!overrun) break; + if (++count > 3) { + u64 new, old = ktime_to_ns(cfs_b->period); + + new = (old * 147) / 128; /* ~115% */ + new = min(new, max_cfs_quota_period); + + cfs_b->period = ns_to_ktime(new); + + /* since max is 1s, this is limited to 1e9^2, which fits in u64 */ + cfs_b->quota *= new; + cfs_b->quota = div64_u64(cfs_b->quota, old); + + pr_warn_ratelimited( + "cfs_period_timer[cpu%d]: period too short, scaling up (new cfs_period_us %lld, cfs_quota_us = %lld)\n", + smp_processor_id(), + div_u64(new, NSEC_PER_USEC), + div_u64(cfs_b->quota, NSEC_PER_USEC)); + + /* reset count so we don't come right back in here */ + count = 0; + } + idle = do_sched_cfs_period_timer(cfs_b, overrun); } if (idle) -- ^ permalink raw reply related [flat|nested] 8+ messages in thread
end of thread, other threads:[~2019-04-16 19:26 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20190319130005.25492-1-pauld@redhat.com>
2019-04-03 8:38 ` [tip:sched/core] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup tip-bot for Phil Auld
[not found] ` <20190405141524.05DDA2186A@mail.kernel.org>
2019-04-08 13:42 ` Phil Auld
2019-04-08 14:50 ` Sasha Levin
2019-04-08 17:03 ` Phil Auld
2019-04-08 17:06 ` Sasha Levin
2019-04-08 17:31 ` Phil Auld
2019-04-16 15:32 ` [tip:sched/urgent] sched/fair: Limit sched_cfs_period_timer() " tip-bot for Phil Auld
2019-04-16 19:26 ` Phil Auld
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox