From: Venkatesh Pallipadi <venki@google.com>
To: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@elte.hu>, "H. Peter Anvin" <hpa@zytor.com>,
Thomas Gleixner <tglx@linutronix.de>,
Balbir Singh <balbir@linux.vnet.ibm.com>,
Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: linux-kernel@vger.kernel.org, Paul Turner <pjt@google.com>,
Eric Dumazet <eric.dumazet@gmail.com>,
Venkatesh Pallipadi <venki@google.com>
Subject: [PATCH 6/7] sched: Remove irq time from available CPU power -v3
Date: Wed, 29 Sep 2010 12:21:35 -0700
Message-ID: <1285788096-29471-7-git-send-email-venki@google.com>
In-Reply-To: <1285788096-29471-1-git-send-email-venki@google.com>
This idea was suggested by Peter Zijlstra here:
http://marc.info/?l=linux-kernel&m=127476934517534&w=2

Time spent in irq processing is not available to the tasks running on the CPU.
This patch removes irq time from CPU power by piggybacking on
sched_rt_avg_update().
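
For readers who want to see the mechanism in isolation, here is a minimal
user-space sketch of the accounting described above. It is not the kernel
code: struct rq_model and the model_*() helpers are hypothetical stand-ins,
the decay that the kernel applies to rt_avg is left out, and "total" is
passed in directly instead of being derived from sched_avg_period().

```c
#include <assert.h>
#include <stdint.h>

#define SCHED_LOAD_SHIFT 10
#define SCHED_LOAD_SCALE (1UL << SCHED_LOAD_SHIFT)

/* Hypothetical stand-in for the two struct rq fields this sketch needs. */
struct rq_model {
	uint64_t prev_irq_time;	/* cumulative irq time seen at the last update */
	uint64_t rt_avg;	/* time that was not available to fair tasks */
};

/* Charge only the irq time accrued since the previous update. */
static void model_irq_time_avg_update(struct rq_model *rq, uint64_t curr_irq_time)
{
	uint64_t delta_irq;

	/* Assumption: the cumulative irq time never goes backwards. */
	assert(curr_irq_time >= rq->prev_irq_time);

	delta_irq = curr_irq_time - rq->prev_irq_time;
	rq->prev_irq_time = curr_irq_time;
	rq->rt_avg += delta_irq;	/* stands in for sched_rt_avg_update() */
}

/* Share of "total" left for tasks, in SCHED_LOAD_SCALE units. */
static unsigned long model_scale_rt_power(const struct rq_model *rq, uint64_t total)
{
	uint64_t available;

	/* Clamp so that the result cannot wrap around to a huge value. */
	if (total < rq->rt_avg)
		available = 0;
	else
		available = total - rq->rt_avg;

	if (total < SCHED_LOAD_SCALE)
		total = SCHED_LOAD_SCALE;
	total >>= SCHED_LOAD_SHIFT;

	return (unsigned long)(available / total);
}
```

With a window of 1024000 time units of which 768000 were spent in irq
context, model_scale_rt_power() returns 256, that is, a quarter of
SCHED_LOAD_SCALE.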

Tested this by keeping CPU X busy with a network-intensive task that causes
75% of a single CPU to be spent in irq processing (hard+soft) on a 4-way
system, and then starting seven cycle soakers on the system. Without this
change, there are two tasks on each CPU. With this change, there is a single
task on the irq-busy CPU X and the remaining 7 tasks are spread among the
other 3 CPUs.
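
As a rough sanity check of the numbers above, the sketch below computes the
share of tasks each CPU would get if tasks were spread strictly in proportion
to CPU power. This is a simplification and an assumption on my part: the real
load balancer is not a plain proportional split, and ideal_share() is a
hypothetical helper, not a kernel function. CPU X is assumed to end up with a
power of 256 (25% of SCHED_LOAD_SCALE) because 75% of its time goes to irq
processing.

```c
#include <assert.h>

#define SCHED_LOAD_SCALE 1024UL

/* Ideal (fractional) number of tasks for a CPU with the given power. */
static double ideal_share(unsigned long power, unsigned long total_power,
			  unsigned int nr_tasks)
{
	assert(total_power > 0);
	return (double)nr_tasks * power / total_power;
}
```

With 8 tasks and powers of 256, 1024, 1024 and 1024, the total power is 3328.
CPU X gets 8 * 256 / 3328, which is about 0.6 tasks, and each other CPU gets
8 * 1024 / 3328, which is about 2.5 tasks. That is consistent with one task on
CPU X and seven tasks over the other three CPUs. With equal powers each CPU
gets exactly 2 tasks, which matches the behaviour without the patch.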
Signed-off-by: Venkatesh Pallipadi <venki@google.com>
---
kernel/sched.c | 18 ++++++++++++++++++
kernel/sched_fair.c | 8 +++++++-
kernel/sched_features.h | 5 +++++
3 files changed, 30 insertions(+), 1 deletions(-)
diff --git a/kernel/sched.c b/kernel/sched.c
index 771bfa9..bfbe064 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -521,6 +521,10 @@ struct rq {
u64 avg_idle;
#endif
+#ifdef CONFIG_IRQ_TIME_ACCOUNTING
+ u64 prev_irq_time;
+#endif
+
/* calc_load related fields */
unsigned long calc_load_update;
long calc_load_active;
@@ -645,6 +649,7 @@ static inline struct task_group *task_group(struct task_struct *p)
#endif /* CONFIG_CGROUP_SCHED */
static u64 irq_time_cpu(int cpu);
+static void sched_irq_time_avg_update(struct rq *rq, u64 irq_time);
inline void update_rq_clock(struct rq *rq)
{
@@ -656,6 +661,8 @@ inline void update_rq_clock(struct rq *rq)
irq_time = irq_time_cpu(cpu);
if (rq->clock - irq_time > rq->clock_task)
rq->clock_task = rq->clock - irq_time;
+
+ sched_irq_time_avg_update(rq, irq_time);
}
}
@@ -1983,6 +1990,15 @@ void account_system_vtime(struct task_struct *curr)
local_irq_restore(flags);
}
+static void sched_irq_time_avg_update(struct rq *rq, u64 curr_irq_time)
+{
+ if (sched_clock_irqtime && sched_feat(NONIRQ_POWER)) {
+ u64 delta_irq = curr_irq_time - rq->prev_irq_time;
+ rq->prev_irq_time = curr_irq_time;
+ sched_rt_avg_update(rq, delta_irq);
+ }
+}
+
#else
static u64 irq_time_cpu(int cpu)
@@ -1990,6 +2006,8 @@ static u64 irq_time_cpu(int cpu)
return 0;
}
+static void sched_irq_time_avg_update(struct rq *rq, u64 curr_irq_time) { }
+
#endif
#include "sched_idletask.c"
diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index 0baa696..6d0362a 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -2268,7 +2268,13 @@ unsigned long scale_rt_power(int cpu)
u64 total, available;
total = sched_avg_period() + (rq->clock - rq->age_stamp);
- available = total - rq->rt_avg;
+
+ if (unlikely(total < rq->rt_avg)) {
+ /* Ensures that power won't end up being negative */
+ available = 0;
+ } else {
+ available = total - rq->rt_avg;
+ }
if (unlikely((s64)total < SCHED_LOAD_SCALE))
total = SCHED_LOAD_SCALE;
diff --git a/kernel/sched_features.h b/kernel/sched_features.h
index 83c66e8..185f920 100644
--- a/kernel/sched_features.h
+++ b/kernel/sched_features.h
@@ -61,3 +61,8 @@ SCHED_FEAT(ASYM_EFF_LOAD, 1)
* release the lock. Decreases scheduling overhead.
*/
SCHED_FEAT(OWNER_SPIN, 1)
+
+/*
+ * Decrement CPU power based on irq activity
+ */
+SCHED_FEAT(NONIRQ_POWER, 1)
--
1.7.1