From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: linux-kernel@vger.kernel.org
Cc: Ingo Molnar <mingo@elte.hu>,
	Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
	Mike Galbraith <efault@gmx.de>,
	Peter Zijlstra <peterz@infradead.org>
Subject: [PATCH 26/30] sched: update shares on wakeup
Date: Fri, 27 Jun 2008 13:41:35 +0200
Message-ID: <20080627115212.251661522@chello.nl>
In-Reply-To: 20080627114109.724249622@chello.nl
[-- Attachment #1: sched-throttle-update_shares.patch --]
[-- Type: text/plain, Size: 3814 bytes --]
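We found that the affine wakeup code needs rather accurate load figures
to be effective. The trouble is that updating the load figures is fairly
expensive with group scheduling. Therefore update the shares on wakeup,
but ratelimit the updates: by default at most once every 0.5ms per sched
domain, tunable through the new sched_shares_ratelimit sysctl. The
wakeup-time update can be switched off with the LB_WAKEUP_UPDATE sched
feature.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>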
---
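
Not part of the patch: a minimal userspace sketch of the ratelimit check
that update_shares() gains below, for readers who want to poke at the
arithmetic. Only the last_update field name mirrors the patch; the struct,
the function name and the explicit interval argument are hypothetical, and
the timestamps are assumed to be nanoseconds, as cpu_clock() returns.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-domain state added by the patch. */
struct ratelimit_state {
	uint64_t last_update;	/* time of the last accepted update, in ns */
};

/*
 * Return true, and record now_ns, if at least interval_ns have passed
 * since the last accepted update. The difference is taken as a signed
 * value, like the s64 in the patch, so a timestamp that is slightly
 * behind last_update reads as negative elapsed time and is rejected
 * instead of wrapping around to a huge unsigned value.
 */
static bool ratelimit_should_update(struct ratelimit_state *rs,
				    uint64_t now_ns, uint64_t interval_ns)
{
	int64_t elapsed = (int64_t)(now_ns - rs->last_update);

	if (elapsed >= (int64_t)interval_ns) {
		rs->last_update = now_ns;
		return true;
	}
	return false;
}
```

With interval_ns = 500000 (the default in this patch), a caller that asks
twice within half a millisecond does the expensive work only the first
time.
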
 include/linux/sched.h   |    3 +++
 kernel/sched.c          |   30 +++++++++++++++++++++++++++++-
 kernel/sched_features.h |    3 ++-
 kernel/sysctl.c         |    8 ++++++++
 4 files changed, 42 insertions(+), 2 deletions(-)
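
Also not part of the patch: a toy model of the domain walk that the
try_to_wake_up() hunk performs. Starting from the waking CPU's smallest
domain, it climbs towards the root and stops at the first domain whose
span also contains the CPU the task last ran on. The struct, the 64-bit
mask and the function name are assumptions for illustration; the kernel
uses struct sched_domain, a cpumask and for_each_domain().

```c
#include <stddef.h>
#include <stdint.h>

/* Toy domain: a CPU bitmask plus a link to the next larger domain. */
struct toy_domain {
	uint64_t span;			/* bit n set => CPU n is in this domain */
	struct toy_domain *parent;	/* NULL at the top of the hierarchy */
};

/*
 * Walk upwards from 'base' (the smallest domain of the waking CPU) and
 * return the first domain that also spans 'cpu'. Returns NULL if no
 * domain in the chain contains it. 'cpu' must be in the range 0..63.
 */
static struct toy_domain *lowest_common_domain(struct toy_domain *base, int cpu)
{
	struct toy_domain *sd;

	for (sd = base; sd; sd = sd->parent) {
		if (sd->span & (UINT64_C(1) << cpu))
			return sd;
	}
	return NULL;
}
```

Stopping at the first match keeps the share update confined to the
smallest part of the machine that covers both CPUs involved in the wakeup.
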
Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -778,6 +778,12 @@ late_initcall(sched_init_debug);
 const_debug unsigned int sysctl_sched_nr_migrate = 32;
 
 /*
+ * ratelimit for updating the group shares.
+ * default: 0.5ms
+ */
+const_debug unsigned int sysctl_sched_shares_ratelimit = 500000;
+
+/*
  * period over which we measure -rt task cpu usage in us.
  * default: 1s
  */
@@ -1590,7 +1596,13 @@ tg_nop(struct task_group *tg, int cpu, s
 
 static void update_shares(struct sched_domain *sd)
 {
-	walk_tg_tree(tg_nop, tg_shares_up, 0, sd);
+	u64 now = cpu_clock(raw_smp_processor_id());
+	s64 elapsed = now - sd->last_update;
+
+	if (elapsed >= (s64)(u64)sysctl_sched_shares_ratelimit) {
+		sd->last_update = now;
+		walk_tg_tree(tg_nop, tg_shares_up, 0, sd);
+	}
 }
 
 static void update_shares_locked(struct rq *rq, struct sched_domain *sd)
@@ -2199,6 +2211,22 @@ static int try_to_wake_up(struct task_st
 	if (!sched_feat(SYNC_WAKEUPS))
 		sync = 0;
 
+#ifdef CONFIG_SMP
+	if (sched_feat(LB_WAKEUP_UPDATE)) {
+		struct sched_domain *sd;
+
+		this_cpu = raw_smp_processor_id();
+		cpu = task_cpu(p);
+
+		for_each_domain(this_cpu, sd) {
+			if (cpu_isset(cpu, sd->span)) {
+				update_shares(sd);
+				break;
+			}
+		}
+	}
+#endif
+
 	smp_wmb();
 	rq = task_rq_lock(p, &flags);
 	old_state = p->state;
Index: linux-2.6/kernel/sched_features.h
===================================================================
--- linux-2.6.orig/kernel/sched_features.h
+++ linux-2.6/kernel/sched_features.h
@@ -8,4 +8,5 @@ SCHED_FEAT(SYNC_WAKEUPS, 1)
 SCHED_FEAT(HRTICK, 1)
 SCHED_FEAT(DOUBLE_TICK, 0)
 SCHED_FEAT(ASYM_GRAN, 1)
-SCHED_FEAT(LB_BIAS, 0)
\ No newline at end of file
+SCHED_FEAT(LB_BIAS, 0)
+SCHED_FEAT(LB_WAKEUP_UPDATE, 1)
Index: linux-2.6/include/linux/sched.h
===================================================================
--- linux-2.6.orig/include/linux/sched.h
+++ linux-2.6/include/linux/sched.h
@@ -783,6 +783,8 @@ struct sched_domain {
 	unsigned int balance_interval;	/* initialise to 1. units in ms. */
 	unsigned int nr_balance_failed; /* initialise to 0 */
 
+	u64 last_update;
+
 #ifdef CONFIG_SCHEDSTATS
 	/* load_balance() stats */
 	unsigned int lb_count[CPU_MAX_IDLE_TYPES];
@@ -1605,6 +1607,7 @@ extern unsigned int sysctl_sched_child_r
 extern unsigned int sysctl_sched_features;
 extern unsigned int sysctl_sched_migration_cost;
 extern unsigned int sysctl_sched_nr_migrate;
+extern unsigned int sysctl_sched_shares_ratelimit;
 
 int sched_nr_latency_handler(struct ctl_table *table, int write,
 		struct file *file, void __user *buffer, size_t *length,
Index: linux-2.6/kernel/sysctl.c
===================================================================
--- linux-2.6.orig/kernel/sysctl.c
+++ linux-2.6/kernel/sysctl.c
@@ -269,6 +269,14 @@ static struct ctl_table kern_table[] = {
 	},
 	{
 		.ctl_name	= CTL_UNNUMBERED,
+		.procname	= "sched_shares_ratelimit",
+		.data		= &sysctl_sched_shares_ratelimit,
+		.maxlen		= sizeof(unsigned int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+	},
+	{
+		.ctl_name	= CTL_UNNUMBERED,
 		.procname	= "sched_child_runs_first",
 		.data		= &sysctl_sched_child_runs_first,
 		.maxlen		= sizeof(unsigned int),
--
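
A note on the SCHED_FEAT() line added above, outside the patch proper: as
far as I can tell, kernels of this vintage expand the feature list more
than once, first into an enum of bit numbers and then into the default
bitmask that sched_feat() tests. The sketch below imitates that pattern
in plain C. It uses a list macro rather than re-including a header, and
every TOY_ and F_ name is hypothetical.

```c
/* Feature list: name and default (1 = on, 0 = off), as in the patch. */
#define TOY_FEATURES(F)			\
	F(LB_BIAS, 0)			\
	F(LB_WAKEUP_UPDATE, 1)

/* First expansion: one enum constant (bit number) per feature. */
#define F_ENUM(name, enabled) TOY_FEAT_##name,
enum { TOY_FEATURES(F_ENUM) TOY_FEAT_NR };

/* Second expansion: OR together the bits of features enabled by default. */
#define F_MASK(name, enabled) ((1U << TOY_FEAT_##name) * (enabled)) |
static unsigned int toy_features = TOY_FEATURES(F_MASK) 0;

/* Test one feature bit; evaluates to 0 or 1. */
#define toy_feat(x) (!!(toy_features & (1U << TOY_FEAT_##x)))
```

Adding a feature is then a one-line change to the list, which is why the
sched_features.h hunk is so small, and clearing the bit at run time turns
the wakeup-time update back off without a rebuild.
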