From: Mike Galbraith <mgalbraith@suse.de>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Chris Mason <clm@fb.com>, Ingo Molnar <mingo@kernel.org>,
Matt Fleming <matt@codeblueprint.co.uk>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC] select_idle_sibling experiments
Date: Mon, 02 May 2016 07:35:23 +0200
Message-ID: <1462167323.4507.1.camel@suse.de>
In-Reply-To: <20160428120012.GZ3430@twins.programming.kicks-ass.net>
On Thu, 2016-04-28 at 14:00 +0200, Peter Zijlstra wrote:
> On Wed, Apr 06, 2016 at 09:27:24AM +0200, Mike Galbraith wrote:
> > sched: ratelimit nohz
> >
> > Entering nohz code on every micro-idle is too expensive to bear.
> >
> > Signed-off-by: Mike Galbraith <efault@gmx.de>
>
> > +int sched_needs_cpu(int cpu)
> > +{
> > +	if (tick_nohz_full_cpu(cpu))
> > +		return 0;
> > +
> > +	return cpu_rq(cpu)->avg_idle < sysctl_sched_migration_cost;
>
> So the only problem I have with this patch is the choice of limit. This
> isn't at all tied to the migration cost.
>
> And some people are already twiddling with the migration_cost knob to
> affect the idle_balance() behaviour -- making it much more aggressive by
> dialing it down. When you do that you also lose the effectiveness of
> this proposed usage, even though those same people would probably want
> this.
>
> Failing a spot of inspiration for a runtime limit on this; we might have
> to introduce yet another knob :/

sched: ratelimit nohz tick shutdown/restart

Tick shutdown/restart overhead can be substantial when CPUs
enter/exit the idle loop at high frequency. Ratelimit based
upon rq->avg_idle, and provide an adjustment knob.

Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
---
include/linux/sched.h | 5 +++++
include/linux/sched/sysctl.h | 4 ++++
kernel/sched/core.c | 10 ++++++++++
kernel/sysctl.c | 9 +++++++++
kernel/time/tick-sched.c | 2 +-
5 files changed, 29 insertions(+), 1 deletion(-)
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2286,6 +2286,11 @@ static inline int set_cpus_allowed_ptr(s
 #ifdef CONFIG_NO_HZ_COMMON
 void calc_load_enter_idle(void);
 void calc_load_exit_idle(void);
+#ifdef CONFIG_SMP
+extern int sched_needs_cpu(int cpu);
+#else
+static inline int sched_needs_cpu(int cpu) { return 0; }
+#endif
 #else
 static inline void calc_load_enter_idle(void) { }
 static inline void calc_load_exit_idle(void) { }
--- a/include/linux/sched/sysctl.h
+++ b/include/linux/sched/sysctl.h
@@ -19,6 +19,10 @@ extern unsigned int sysctl_sched_min_gra
 extern unsigned int sysctl_sched_wakeup_granularity;
 extern unsigned int sysctl_sched_child_runs_first;
 
+#if defined(CONFIG_NO_HZ_COMMON) && defined(CONFIG_SMP)
+extern unsigned int sysctl_sched_nohz_throttle;
+#endif
+
 enum sched_tunable_scaling {
 	SCHED_TUNABLESCALING_NONE,
 	SCHED_TUNABLESCALING_LOG,
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -577,6 +577,16 @@ static inline bool got_nohz_idle_kick(vo
 	return false;
 }
 
+unsigned int sysctl_sched_nohz_throttle = 500000UL;
+
+int sched_needs_cpu(int cpu)
+{
+	if (tick_nohz_full_cpu(cpu))
+		return 0;
+
+	return cpu_rq(cpu)->avg_idle < sysctl_sched_nohz_throttle;
+}
+
 #else /* CONFIG_NO_HZ_COMMON */
 
 static inline bool got_nohz_idle_kick(void)
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -351,6 +351,15 @@ static struct ctl_table kern_table[] = {
 		.mode = 0644,
 		.proc_handler = proc_dointvec,
 	},
+#if defined(CONFIG_NO_HZ_COMMON) && defined(CONFIG_SMP)
+	{
+		.procname = "sched_nohz_throttle_ns",
+		.data = &sysctl_sched_nohz_throttle,
+		.maxlen = sizeof(unsigned int),
+		.mode = 0644,
+		.proc_handler = proc_dointvec,
+	},
+#endif
 #ifdef CONFIG_SCHEDSTATS
 	{
 		.procname = "sched_schedstats",
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -676,7 +676,7 @@ static ktime_t tick_nohz_stop_sched_tick
 	} while (read_seqretry(&jiffies_lock, seq));
 	ts->last_jiffies = basejiff;
 
-	if (rcu_needs_cpu(basemono, &next_rcu) ||
+	if (sched_needs_cpu(cpu) || rcu_needs_cpu(basemono, &next_rcu) ||
 	    arch_needs_cpu() || irq_work_needs_cpu()) {
 		next_tick = basemono + TICK_NSEC;
 	} else {
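
To see how the throttle behaves as idle periods shrink, the check can be modelled in userspace. The following is only a rough sketch, not kernel code: struct rq_model, update_avg_model() and sched_needs_cpu_model() are hypothetical names, and the 1/8-weight running average is an assumption about how rq->avg_idle is maintained. Only the comparison against the 500000 ns default comes from the patch.

```c
#include <stdint.h>

/* Default from the patch: sysctl_sched_nohz_throttle = 500000 (ns). */
#define NOHZ_THROTTLE_DEFAULT_NS 500000u

/* Hypothetical stand-in for the one runqueue field the patch reads. */
struct rq_model {
	uint64_t avg_idle;	/* running average idle period, in ns */
};

/*
 * Fold one idle-period sample into the average with 1/8 weight.
 * Assumption: this approximates how the scheduler tracks avg_idle.
 */
static void update_avg_model(uint64_t *avg, uint64_t sample)
{
	int64_t diff = (int64_t)(sample - *avg);

	*avg += diff / 8;
}

/*
 * Mirrors the comparison in the patch: nonzero means "keep the tick",
 * because recent idle periods are shorter than the throttle and
 * stopping/restarting the tick would likely cost more than it saves.
 */
static int sched_needs_cpu_model(const struct rq_model *rq,
				 unsigned int throttle_ns)
{
	return rq->avg_idle < throttle_ns;
}
```

Starting from a 1 ms average, a run of 10 us idle periods pulls the average below the default throttle within a handful of wakeups, at which point the model reports that the tick should be kept. Raising throttle_ns makes the tick stickier; setting it to 0 disables the ratelimit in this model.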