From: tip-bot for Venkatesh Pallipadi <venki@google.com>
To: linux-tip-commits@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, hpa@zytor.com, mingo@redhat.com,
a.p.zijlstra@chello.nl, tglx@linutronix.de, mingo@elte.hu,
venki@google.com
Subject: [tip:sched/core] sched: Remove irq time from available CPU power
Date: Mon, 18 Oct 2010 19:26:50 GMT
Message-ID: <tip-aa483808516ca5cacfa0e5849691f64fec25828e@git.kernel.org>
In-Reply-To: <1286237003-12406-8-git-send-email-venki@google.com>
Commit-ID: aa483808516ca5cacfa0e5849691f64fec25828e
Gitweb: http://git.kernel.org/tip/aa483808516ca5cacfa0e5849691f64fec25828e
Author: Venkatesh Pallipadi <venki@google.com>
AuthorDate: Mon, 4 Oct 2010 17:03:22 -0700
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Mon, 18 Oct 2010 20:52:27 +0200
sched: Remove irq time from available CPU power
The idea was suggested by Peter Zijlstra here:
http://marc.info/?l=linux-kernel&m=127476934517534&w=2
irq time is technically not available to the tasks running on the CPU.
This patch removes irq time from CPU power by piggybacking on
sched_rt_avg_update().

Tested this by keeping CPU X busy with a network-intensive task that spends
75% of a single CPU in irq processing (hard+soft) on a 4-way system, then
starting seven cycle soakers on the system. Without this change, there are
two tasks on each CPU. With this change, there is a single task on the
irq-busy CPU X and the remaining 7 tasks are spread among the other 3 CPUs.
Signed-off-by: Venkatesh Pallipadi <venki@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1286237003-12406-8-git-send-email-venki@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/sched.c          |   18 ++++++++++++++++++
 kernel/sched_fair.c     |    8 +++++++-
 kernel/sched_features.h |    5 +++++
 3 files changed, 30 insertions(+), 1 deletions(-)
diff --git a/kernel/sched.c b/kernel/sched.c
index 9e01b71..bff9ef5 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -519,6 +519,10 @@ struct rq {
 	u64 avg_idle;
 #endif

+#ifdef CONFIG_IRQ_TIME_ACCOUNTING
+	u64 prev_irq_time;
+#endif
+
 	/* calc_load related fields */
 	unsigned long calc_load_update;
 	long calc_load_active;
@@ -643,6 +647,7 @@ static inline struct task_group *task_group(struct task_struct *p)
 #endif /* CONFIG_CGROUP_SCHED */

 static u64 irq_time_cpu(int cpu);
+static void sched_irq_time_avg_update(struct rq *rq, u64 irq_time);

 inline void update_rq_clock(struct rq *rq)
 {
@@ -654,6 +659,8 @@ inline void update_rq_clock(struct rq *rq)
 		irq_time = irq_time_cpu(cpu);
 		if (rq->clock - irq_time > rq->clock_task)
 			rq->clock_task = rq->clock - irq_time;
+
+		sched_irq_time_avg_update(rq, irq_time);
 	}
 }

@@ -1985,6 +1992,15 @@ void account_system_vtime(struct task_struct *curr)
 	local_irq_restore(flags);
 }

+static void sched_irq_time_avg_update(struct rq *rq, u64 curr_irq_time)
+{
+	if (sched_clock_irqtime && sched_feat(NONIRQ_POWER)) {
+		u64 delta_irq = curr_irq_time - rq->prev_irq_time;
+		rq->prev_irq_time = curr_irq_time;
+		sched_rt_avg_update(rq, delta_irq);
+	}
+}
+
 #else

 static u64 irq_time_cpu(int cpu)
@@ -1992,6 +2008,8 @@ static u64 irq_time_cpu(int cpu)
 	return 0;
 }

+static void sched_irq_time_avg_update(struct rq *rq, u64 curr_irq_time) { }
+
 #endif

 #include "sched_idletask.c"
diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index c358d40..74cccfa 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -2275,7 +2275,13 @@ unsigned long scale_rt_power(int cpu)
 	u64 total, available;

 	total = sched_avg_period() + (rq->clock - rq->age_stamp);
-	available = total - rq->rt_avg;
+
+	if (unlikely(total < rq->rt_avg)) {
+		/* Ensures that power won't end up being negative */
+		available = 0;
+	} else {
+		available = total - rq->rt_avg;
+	}

 	if (unlikely((s64)total < SCHED_LOAD_SCALE))
 		total = SCHED_LOAD_SCALE;
diff --git a/kernel/sched_features.h b/kernel/sched_features.h
index 83c66e8..185f920 100644
--- a/kernel/sched_features.h
+++ b/kernel/sched_features.h
@@ -61,3 +61,8 @@ SCHED_FEAT(ASYM_EFF_LOAD, 1)
  * release the lock. Decreases scheduling overhead.
  */
 SCHED_FEAT(OWNER_SPIN, 1)
+
+/*
+ * Decrement CPU power based on irq activity
+ */
+SCHED_FEAT(NONIRQ_POWER, 1)
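
For readers who want to check the arithmetic behind the 75% irq test case,
here is a minimal userspace sketch of the two pieces the patch touches. It is
a model under stated assumptions, not the kernel code: struct rq_model and the
model_* names are hypothetical, SCHED_LOAD_SCALE is assumed to be 1 << 10, the
aging that sched_rt_avg_update() applies to rt_avg is left out, and the final
shift-and-divide step of scale_rt_power() is assumed because it lies outside
the hunk shown above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: SCHED_LOAD_SCALE is taken to be 1 << 10 here. */
#define MODEL_LOAD_SHIFT 10
#define MODEL_LOAD_SCALE ((uint64_t)1 << MODEL_LOAD_SHIFT)

/* Hypothetical stand-in for the two struct rq fields the patch uses. */
struct rq_model {
	uint64_t prev_irq_time;	/* irq time seen at the previous update */
	uint64_t rt_avg;	/* time unavailable to fair tasks (rt + irq) */
};

/*
 * Models sched_irq_time_avg_update(): charge only the irq time accrued
 * since the last call. The real sched_rt_avg_update() also ages rt_avg;
 * that decay is deliberately omitted from this sketch.
 */
static void model_irq_time_avg_update(struct rq_model *rq,
				      uint64_t curr_irq_time)
{
	uint64_t delta_irq;

	/* Assumption of this sketch: per-CPU irq time never goes backwards. */
	assert(curr_irq_time >= rq->prev_irq_time);

	delta_irq = curr_irq_time - rq->prev_irq_time;
	rq->prev_irq_time = curr_irq_time;
	rq->rt_avg += delta_irq;
}

/*
 * Models the patched part of scale_rt_power(): clamp available time at
 * zero, then scale so a fully available CPU yields MODEL_LOAD_SCALE.
 * The shift and divide are assumed; they are not part of the hunk.
 */
static uint64_t model_scale_rt_power(uint64_t total, uint64_t rt_avg)
{
	uint64_t available = (total < rt_avg) ? 0 : total - rt_avg;

	if (total < MODEL_LOAD_SCALE)
		total = MODEL_LOAD_SCALE;

	total >>= MODEL_LOAD_SHIFT;
	return available / total;
}
```

In this model, an averaging window of 1024000 time units with 768000 of them
charged as irq time (75%) gives model_scale_rt_power(1024000, 768000) == 256,
a quarter of the full-scale 1024. That is consistent with the load balancer
placing a single task on the irq-busy CPU in the test described above. An
rt_avg larger than total yields 0 instead of wrapping around.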