public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Venkatesh Pallipadi <venki@google.com>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>,
	Mikael Pettersson <mikpe@it.uu.se>, Ingo Molnar <mingo@elte.hu>,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	John Stultz <johnstul@us.ibm.com>
Subject: Re: [BUG] 2.6.37-rc3 massive interactivity regression on ARM
Date: Fri, 10 Dec 2010 14:17:45 +0100	[thread overview]
Message-ID: <1291987065.6803.151.camel@twins> (raw)
In-Reply-To: <1291975704.6803.59.camel@twins>

On Fri, 2010-12-10 at 11:08 +0100, Peter Zijlstra wrote:
> > Just to make sure, update_rq_clock() always gets called on current
> > CPU. Right? 
> 
> No, specifically not. If that were the case we wouldn't need the
> cross-cpu synced timestamp. Things like load-balancing and
> remote-wakeups need to update a remote CPUs clock.
> 
> > The pending patches I have optimizes
> > account_system_vtime() to use this_cpu_write and friends. Want to make
> > sure this change will still keep that optimization relevant.
> 
> Ah, good point, remote CPUs updating that will mess with the consistency
> of the per-cpu timestamps due to non atomic updates :/
> 
> Bugger.. making them atomics will make it even more expensive. /me goes
> ponder. 

OK, so I ended up doing the same you did.. Still staring at that, 32bit
will go very funny in the head once every so often. One possible
solution would be to ignore the occasional abs(irq_delta) > 2 * delta.

That would however result in an accounting discrepancy such that:
   clock_task + irq_time != clock

Thoughts?

---
Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -1813,11 +1813,36 @@ void disable_sched_clock_irqtime(void)
 	sched_clock_irqtime = 0;
 }
 
+static void __account_system_vtime(int cpu, u64 now)
+{
+	s64 delta;
+
+	delta = now - per_cpu(irq_start_time, cpu);
+	per_cpu(irq_start_time, cpu) = now;
+
+	if (hardirq_count())
+		per_cpu(cpu_hardirq_time, cpu) += delta;
+	/*
+	 * We do not account for softirq time from ksoftirqd here. We want to
+	 * continue accounting softirq time to ksoftirqd thread in that case,
+	 * so as not to confuse scheduler with a special task that do not
+	 * consume any time, but still wants to run.
+	 */
+	else if (in_serving_softirq() && !(current->flags & PF_KSOFTIRQD))
+		per_cpu(cpu_softirq_time, cpu) += delta;
+}
+
+/*
+ * Called before incrementing preempt_count on {soft,}irq_enter
+ * and before decrementing preempt_count on {soft,}irq_exit.
+ *
+ * @curr should at all times be current.
+ */
 void account_system_vtime(struct task_struct *curr)
 {
 	unsigned long flags;
+	u64 now;
 	int cpu;
-	u64 now, delta;
 
 	if (!sched_clock_irqtime)
 		return;
@@ -1826,26 +1851,21 @@ void account_system_vtime(struct task_st
 
 	cpu = smp_processor_id();
 	now = sched_clock_cpu(cpu);
-	delta = now - per_cpu(irq_start_time, cpu);
-	per_cpu(irq_start_time, cpu) = now;
-	/*
-	 * We do not account for softirq time from ksoftirqd here.
-	 * We want to continue accounting softirq time to ksoftirqd thread
-	 * in that case, so as not to confuse scheduler with a special task
-	 * that do not consume any time, but still wants to run.
-	 */
-	if (hardirq_count())
-		per_cpu(cpu_hardirq_time, cpu) += delta;
-	else if (in_serving_softirq() && !(curr->flags & PF_KSOFTIRQD))
-		per_cpu(cpu_softirq_time, cpu) += delta;
+	__account_system_vtime(cpu, now);
 
 	local_irq_restore(flags);
 }
 EXPORT_SYMBOL_GPL(account_system_vtime);
 
-static u64 irq_time_cpu(int cpu)
+static u64 irq_time_cpu(struct rq *rq)
 {
-	account_system_vtime(current);
+	int cpu = cpu_of(rq);
+	/*
+	 * See the comment in update_rq_clock_task(), ideally we'd update
+	 * the *irq_time values using rq->clock here.
+	 *
+	 * As it stands, reading this from a remote cpu is buggy on 32bit.
+	 */
 	return per_cpu(cpu_softirq_time, cpu) + per_cpu(cpu_hardirq_time, cpu);
 }
 
@@ -1853,9 +1873,27 @@ static void update_rq_clock_task(struct
 {
 	s64 irq_delta;
 
-	irq_delta = irq_time_cpu(cpu_of(rq)) - rq->prev_irq_time;
-	rq->prev_irq_time += irq_delta;
+	irq_delta = irq_time_cpu(rq) - rq->prev_irq_time;
+
+	/*
+	 * Since irq_time is only updated on {soft,}irq_exit, we might run into
+	 * this case when a previous update_rq_clock() happened inside a
+	 * {soft,}irq region.
+	 *
+	 * When this happens, we stop ->clock_task and only update the
+	 * prev_irq_time stamp to account for the part that fit, so that a next
+	 * update will consume the rest. This ensures ->clock_task is
+	 * monotonic.
+	 *
+	 * It does however cause some slight miss-attribution of {soft,}irq
+	 * time, a more accurate solution would be to update the irq_time using
+	 * the current rq->clock timestamp, except that would require using
+	 * atomic ops.
+	 */
+	if (irq_delta > delta)
+		irq_delta = delta;
 
+	rq->prev_irq_time += irq_delta;
 	delta -= irq_delta;
 	rq->clock_task += delta;
 


  reply	other threads:[~2010-12-10 13:18 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-27 15:16 [BUG] 2.6.37-rc3 massive interactivity regression on ARM Mikael Pettersson
2010-12-05 12:32 ` Mikael Pettersson
2010-12-05 13:17   ` Russell King - ARM Linux
2010-12-05 14:19     ` Russell King - ARM Linux
2010-12-05 16:07       ` Mikael Pettersson
2010-12-05 16:21         ` Russell King - ARM Linux
2010-12-08 12:40           ` Peter Zijlstra
2010-12-08 12:55             ` Russell King - ARM Linux
2010-12-08 14:04               ` Peter Zijlstra
2010-12-08 14:28                 ` Russell King - ARM Linux
2010-12-08 14:44                   ` Peter Zijlstra
2010-12-08 15:05                     ` Russell King - ARM Linux
2010-12-08 15:43                     ` Linus Walleij
2010-12-08 20:42                     ` john stultz
2010-12-08 23:31                   ` Venkatesh Pallipadi
2010-12-09 12:52                     ` Peter Zijlstra
2010-12-09 17:43                       ` Venkatesh Pallipadi
2010-12-09 17:55                         ` Peter Zijlstra
2010-12-09 18:11                           ` Venkatesh Pallipadi
2010-12-09 18:55                             ` Peter Zijlstra
2010-12-09 22:21                               ` Venkatesh Pallipadi
2010-12-09 23:16                                 ` Peter Zijlstra
2010-12-09 23:35                                   ` Venkatesh Pallipadi
2010-12-10 10:08                                     ` Peter Zijlstra
2010-12-10 13:17                                       ` Peter Zijlstra [this message]
2010-12-10 13:27                                         ` Peter Zijlstra
2010-12-10 13:47                                           ` Peter Zijlstra
2010-12-10 16:50                                             ` Russell King - ARM Linux
2010-12-10 16:54                                               ` Peter Zijlstra
2010-12-10 17:18                                             ` Eric Dumazet
2010-12-10 17:49                                               ` Peter Zijlstra
2010-12-10 18:14                                                 ` Eric Dumazet
2010-12-10 18:39                                                   ` Christoph Lameter
2010-12-10 18:46                                                     ` Peter Zijlstra
2010-12-10 19:51                                                       ` Christoph Lameter
2010-12-10 20:07                                                         ` Peter Zijlstra
2010-12-10 20:23                                                           ` Christoph Lameter
2010-12-10 20:32                                                             ` Peter Zijlstra
2010-12-10 20:39                                                             ` Eric Dumazet
2010-12-10 20:49                                                               ` Eric Dumazet
2010-12-10 21:09                                                                 ` Christoph Lameter
2010-12-10 21:22                                                                   ` Eric Dumazet
2010-12-10 21:45                                                                     ` Christoph Lameter
2010-12-10 17:56                                             ` Russell King - ARM Linux
2010-12-10 18:10                                               ` Peter Zijlstra
2010-12-10 18:43                                                 ` Peter Zijlstra
2010-12-10 19:17                                                 ` Russell King - ARM Linux
2010-12-10 19:37                                                   ` Peter Zijlstra
2010-12-10 19:25                                                 ` Peter Zijlstra
2010-12-13 14:33                             ` Jack Daniel
2010-12-06 21:29       ` Venkatesh Pallipadi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1291987065.6803.151.camel@twins \
    --to=peterz@infradead.org \
    --cc=johnstul@us.ibm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@arm.linux.org.uk \
    --cc=mikpe@it.uu.se \
    --cc=mingo@elte.hu \
    --cc=venki@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox