From: peterz@infradead.org (Peter Zijlstra)
To: linux-arm-kernel@lists.infradead.org
Subject: [BUG] 2.6.37-rc3 massive interactivity regression on ARM
Date: Fri, 10 Dec 2010 14:27:15 +0100
Message-ID: <1291987635.6803.161.camel@twins>
In-Reply-To: <1291987065.6803.151.camel@twins>
On Fri, 2010-12-10 at 14:17 +0100, Peter Zijlstra wrote:
>
> OK, so I ended up doing the same you did.. Still staring at that, 32bit
> will go very funny in the head once every so often. One possible
> solution would be to ignore the occasional abs(irq_delta) > 2 * delta.
>
> That would however result in an accounting discrepancy such that:
> clock_task + irq_time != clock
>
> Thoughts?
The brute-force solution is a seqcount, something like so:
---
Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -1786,21 +1786,63 @@ static void deactivate_task(struct rq *r
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING
 /*
- * There are no locks covering percpu hardirq/softirq time.
- * They are only modified in account_system_vtime, on corresponding CPU
- * with interrupts disabled. So, writes are safe.
+ * There are no locks covering percpu hardirq/softirq time. They are only
+ * modified in account_system_vtime, on corresponding CPU with interrupts
+ * disabled. So, writes are safe.
+ *
  * They are read and saved off onto struct rq in update_rq_clock().
- * This may result in other CPU reading this CPU's irq time and can
- * race with irq/account_system_vtime on this CPU. We would either get old
- * or new value (or semi updated value on 32 bit) with a side effect of
- * accounting a slice of irq time to wrong task when irq is in progress
- * while we read rq->clock. That is a worthy compromise in place of having
- * locks on each irq in account_system_time.
+ *
+ * This may result in other CPU reading this CPU's irq time and can race with
+ * irq/account_system_vtime on this CPU. We would either get old or new value
+ * with a side effect of accounting a slice of irq time to wrong task when irq
+ * is in progress while we read rq->clock. That is a worthy compromise in place
+ * of having locks on each irq in account_system_time.
  */
 static DEFINE_PER_CPU(u64, cpu_hardirq_time);
 static DEFINE_PER_CPU(u64, cpu_softirq_time);
-
 static DEFINE_PER_CPU(u64, irq_start_time);
+
+#ifndef CONFIG_64BIT
+static DEFINE_PER_CPU(seqcount_t, irq_time_seq);
+
+static inline void irq_time_write_begin(int cpu)
+{
+	write_seqcount_begin(&per_cpu(irq_time_seq, cpu));
+}
+
+static inline void irq_time_write_end(int cpu)
+{
+	write_seqcount_end(&per_cpu(irq_time_seq, cpu));
+}
+
+static inline u64 irq_time_read(int cpu)
+{
+	u64 irq_time;
+	unsigned seq;
+
+	do {
+		seq = read_seqcount_begin(&per_cpu(irq_time_seq, cpu));
+		irq_time = per_cpu(cpu_softirq_time, cpu) +
+			   per_cpu(cpu_hardirq_time, cpu);
+	} while (read_seqcount_retry(&per_cpu(irq_time_seq, cpu), seq));
+
+	return irq_time;
+}
+#else /* CONFIG_64BIT */
+static inline void irq_time_write_begin(int cpu)
+{
+}
+
+static inline void irq_time_write_end(int cpu)
+{
+}
+
+static inline u64 irq_time_read(int cpu)
+{
+	return per_cpu(cpu_softirq_time, cpu) + per_cpu(cpu_hardirq_time, cpu);
+}
+#endif /* CONFIG_64BIT */
+
 static int sched_clock_irqtime;
 void enable_sched_clock_irqtime(void)
@@ -1820,6 +1862,7 @@ static void __account_system_vtime(int c
 	delta = now - per_cpu(irq_start_time, cpu);
 	per_cpu(irq_start_time, cpu) = now;
+	irq_time_write_begin(cpu);
 	if (hardirq_count())
 		per_cpu(cpu_hardirq_time, cpu) += delta;
 	/*
@@ -1830,6 +1873,7 @@ static void __account_system_vtime(int c
 	 */
 	else if (in_serving_softirq() && !(current->flags & PF_KSOFTIRQD))
 		per_cpu(cpu_softirq_time, cpu) += delta;
+	irq_time_write_end(cpu);
 }
 /*
@@ -1859,14 +1903,11 @@ EXPORT_SYMBOL_GPL(account_system_vtime);
 static u64 irq_time_cpu(struct rq *rq)
 {
-	int cpu = cpu_of(rq);
 	/*
 	 * See the comment in update_rq_clock_task(), ideally we'd update
 	 * the *irq_time values using rq->clock here.
-	 *
-	 * As it stands, reading this from a remote cpu is buggy on 32bit.
 	 */
-	return per_cpu(cpu_softirq_time, cpu) + per_cpu(cpu_hardirq_time, cpu);
+	return irq_time_read(cpu_of(rq));
 }
 static void update_rq_clock_task(struct rq *rq, s64 delta)
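
For readers who want to see why a remote read of a u64 can go wrong on 32bit, and what the seqcount buys, here is a rough user-space sketch. It is not the kernel code: it uses C11 atomics instead of seqcount_t, and struct split_u64, struct irq_time and account_hardirq() are names made up for this illustration. A 64-bit counter is modelled as two 32-bit words, so a reader that runs between the two word stores sees a half-updated value; the sequence counter lets the reader notice that and retry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* A 64-bit counter stored as two 32-bit words, as a 32-bit CPU sees it. */
struct split_u64 {
	_Atomic uint32_t lo;
	_Atomic uint32_t hi;
};

/* Hypothetical stand-in for the per-cpu irq time state plus its seqcount. */
struct irq_time {
	atomic_uint seq;		/* even: stable, odd: update in progress */
	struct split_u64 hardirq;
	struct split_u64 softirq;
};

/* Unprotected load: two separate word reads, so the result can be torn. */
static inline uint64_t split_load(struct split_u64 *v)
{
	uint64_t lo = atomic_load_explicit(&v->lo, memory_order_relaxed);
	uint64_t hi = atomic_load_explicit(&v->hi, memory_order_relaxed);

	return (hi << 32) | lo;
}

static inline void split_store(struct split_u64 *v, uint64_t val)
{
	atomic_store_explicit(&v->lo, (uint32_t)val, memory_order_relaxed);
	atomic_store_explicit(&v->hi, (uint32_t)(val >> 32), memory_order_relaxed);
}

/* Single writer only (the owning CPU, interrupts disabled, in the patch). */
static inline void irq_time_write_begin(struct irq_time *t)
{
	unsigned s = atomic_load_explicit(&t->seq, memory_order_relaxed);

	atomic_store_explicit(&t->seq, s + 1, memory_order_relaxed);
	/* Order the odd sequence value before the data stores that follow. */
	atomic_thread_fence(memory_order_release);
}

static inline void irq_time_write_end(struct irq_time *t)
{
	unsigned s = atomic_load_explicit(&t->seq, memory_order_relaxed);

	assert(s & 1);	/* must be paired with irq_time_write_begin() */
	/* Release: data stores become visible before the even sequence value. */
	atomic_store_explicit(&t->seq, s + 1, memory_order_release);
}

/* Retry until both words of both counters were read with no write between. */
static inline uint64_t irq_time_read(struct irq_time *t)
{
	unsigned s0, s1;
	uint64_t sum;

	do {
		s0 = atomic_load_explicit(&t->seq, memory_order_acquire);
		sum = split_load(&t->hardirq) + split_load(&t->softirq);
		atomic_thread_fence(memory_order_acquire);
		s1 = atomic_load_explicit(&t->seq, memory_order_relaxed);
	} while ((s0 & 1) || s0 != s1);

	return sum;
}

/* Writer-side accounting, mirroring the begin/end bracket in the patch. */
static inline void account_hardirq(struct irq_time *t, uint64_t delta)
{
	irq_time_write_begin(t);
	split_store(&t->hardirq, split_load(&t->hardirq) + delta);
	irq_time_write_end(t);
}
```

Take a counter at 0xFFFFFFFF that is incremented by one: the writer stores lo = 0 and then hi = 1. A bare split_load() between those two stores returns 0, i.e. the irq time appears to jump back by 2^32 ns, which is the kind of glitch discussed above. irq_time_read() would instead see an odd sequence value and loop until the update is complete.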
WARNING: multiple messages have this Message-ID (diff)
From: Peter Zijlstra <peterz@infradead.org>
To: Venkatesh Pallipadi <venki@google.com>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>,
Mikael Pettersson <mikpe@it.uu.se>, Ingo Molnar <mingo@elte.hu>,
linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
John Stultz <johnstul@us.ibm.com>
Subject: Re: [BUG] 2.6.37-rc3 massive interactivity regression on ARM
Date: Fri, 10 Dec 2010 14:27:15 +0100
Message-ID: <1291987635.6803.161.camel@twins>
In-Reply-To: <1291987065.6803.151.camel@twins>