From: tglx@linutronix.de (Thomas Gleixner)
To: linux-arm-kernel@lists.infradead.org
Subject: too many timer retries happen when do local timer swtich with broadcast timer
Date: Thu, 21 Feb 2013 14:48:20 +0100 (CET) [thread overview]
Message-ID: <alpine.LFD.2.02.1302211436300.22263@ionos> (raw)
In-Reply-To: <CAB4PhKf+Z-TK4KGDZm1kYOWibg=e2Ehnhjxt_oTbFq9eqouAjA@mail.gmail.com>
Jason,
On Thu, 21 Feb 2013, Jason Liu wrote:
> 2013/2/21 Thomas Gleixner <tglx@linutronix.de>:
> > Now your explanation makes sense.
> >
> > I have no fast solution for this, but I think that I have an idea how
> > to fix it. Stay tuned.
>
> Thanks Thomas, wait for your fix. :)
find below a completely untested patch, which should address that issue.
Sigh. Stopping the cpu local timer in deep idle is known to be an
idiotic design decision for 10+ years. I'll never understand why ARM
vendors insist on implementing the same stupidity over and over.
Thanks,
tglx
Index: linux-2.6/kernel/time/tick-broadcast.c
===================================================================
--- linux-2.6.orig/kernel/time/tick-broadcast.c
+++ linux-2.6/kernel/time/tick-broadcast.c
@@ -397,6 +397,8 @@ int tick_resume_broadcast(void)
/* FIXME: use cpumask_var_t. */
static DECLARE_BITMAP(tick_broadcast_oneshot_mask, NR_CPUS);
+static DECLARE_BITMAP(tick_broadcast_pending, NR_CPUS);
+static DECLARE_BITMAP(tick_force_broadcast_mask, NR_CPUS);
/*
* Exposed for debugging: see timer_list.c
@@ -453,12 +455,24 @@ again:
/* Find all expired events */
for_each_cpu(cpu, tick_get_broadcast_oneshot_mask()) {
td = &per_cpu(tick_cpu_device, cpu);
- if (td->evtdev->next_event.tv64 <= now.tv64)
+ if (td->evtdev->next_event.tv64 <= now.tv64) {
cpumask_set_cpu(cpu, to_cpumask(tmpmask));
- else if (td->evtdev->next_event.tv64 < next_event.tv64)
+ /*
+ * Mark the remote cpu in the pending mask, so
+ * it can avoid reprogramming the cpu local
+ * timer in tick_broadcast_oneshot_control().
+ */
+ set_bit(cpu, tick_broadcast_pending);
+ } else if (td->evtdev->next_event.tv64 < next_event.tv64)
next_event.tv64 = td->evtdev->next_event.tv64;
}
+ /* Take care of enforced broadcast requests */
+ for_each_cpu(cpu, to_cpumask(tick_force_broadcast_mask)) {
+ set_bit(cpu, tmpmask);
+ clear_bit(cpu, tick_force_broadcast_mask);
+ }
+
/*
* Wakeup the cpus which have an expired event.
*/
@@ -494,6 +508,7 @@ void tick_broadcast_oneshot_control(unsi
struct clock_event_device *bc, *dev;
struct tick_device *td;
unsigned long flags;
+ ktime_t now;
int cpu;
/*
@@ -518,6 +533,8 @@ void tick_broadcast_oneshot_control(unsi
raw_spin_lock_irqsave(&tick_broadcast_lock, flags);
if (reason == CLOCK_EVT_NOTIFY_BROADCAST_ENTER) {
+ WARN_ON_ONCE(test_bit(cpu, tick_broadcast_pending));
+ WARN_ON_ONCE(test_bit(cpu, tick_force_broadcast_mask));
if (!cpumask_test_cpu(cpu, tick_get_broadcast_oneshot_mask())) {
cpumask_set_cpu(cpu, tick_get_broadcast_oneshot_mask());
clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
@@ -529,10 +546,63 @@ void tick_broadcast_oneshot_control(unsi
cpumask_clear_cpu(cpu,
tick_get_broadcast_oneshot_mask());
clockevents_set_mode(dev, CLOCK_EVT_MODE_ONESHOT);
- if (dev->next_event.tv64 != KTIME_MAX)
- tick_program_event(dev->next_event, 1);
+ if (dev->next_event.tv64 == KTIME_MAX)
+ goto out;
+ /*
+ * The cpu handling the broadcast timer marked
+ * this cpu in the broadcast pending mask and
+ * fired the broadcast IPI. So we are going to
+ * handle the expired event anyway via the
+ * broadcast IPI handler. No need to reprogram
+ * the timer with an already expired event.
+ */
+ if (test_and_clear_bit(cpu, tick_broadcast_pending))
+ goto out;
+ /*
+ * If the pending bit is not set, then we are
+ * either the CPU handling the broadcast
+ * interrupt or we got woken by something else.
+ *
+ * We are not longer in the broadcast mask, so
+ * if the cpu local expiry time is already
+ * reached, we would reprogram the cpu local
+ * timer with an already expired event.
+ *
+ * This can lead to a ping-pong when we return
+ * to idle and therefor rearm the broadcast
+ * timer before the cpu local timer was able
+ * to fire. This happens because the forced
+ * reprogramming makes sure that the event
+ * will happen in the future and depending on
+ * the min_delta setting this might be far
+ * enough out that the ping-pong starts.
+ *
+ * If the cpu local next_event has expired
+ * then we know that the broadcast timer
+ * next_event has expired as well and
+ * broadcast is about to be handled. So we
+ * avoid reprogramming and enforce that the
+ * broadcast handler, which did not run yet,
+ * will invoke the cpu local handler.
+ *
+ * We cannot call the handler directly from
+ * here, because we might be in a NOHZ phase
+ * and we did not go through the irq_enter()
+ * nohz fixups.
+ */
+ now = ktime_get();
+ if (dev->next_event.tv64 <= now.tv64) {
+ set_bit(cpu, tick_force_broadcast_mask);
+ goto out;
+ }
+ /*
+ * We got woken by something else. Reprogram
+ * the cpu local timer device.
+ */
+ tick_program_event(dev->next_event, 1);
}
}
+out:
raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
}
next prev parent reply other threads:[~2013-02-21 13:48 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-02-20 11:16 too many timer retries happen when do local timer swtich with broadcast timer Jason Liu
2013-02-20 13:33 ` Thomas Gleixner
2013-02-21 6:16 ` Jason Liu
2013-02-21 9:36 ` Thomas Gleixner
2013-02-21 10:50 ` Jason Liu
2013-02-21 13:48 ` Thomas Gleixner [this message]
2013-02-21 15:12 ` Santosh Shilimkar
2013-02-21 22:19 ` Thomas Gleixner
2013-02-22 10:07 ` Santosh Shilimkar
2013-02-22 10:24 ` Thomas Gleixner
2013-02-22 10:30 ` Santosh Shilimkar
2013-02-22 10:31 ` Lorenzo Pieralisi
2013-02-22 11:02 ` Santosh Shilimkar
2013-02-22 12:07 ` Thomas Gleixner
2013-02-22 14:48 ` Lorenzo Pieralisi
2013-02-22 15:03 ` Thomas Gleixner
2013-02-22 15:26 ` Lorenzo Pieralisi
2013-02-22 18:52 ` Thomas Gleixner
2013-02-25 6:12 ` Santosh Shilimkar
2013-02-25 6:38 ` Jason Liu
2013-02-25 13:34 ` Lorenzo Pieralisi
2013-02-22 10:28 ` Lorenzo Pieralisi
2013-02-22 10:26 ` Jason Liu
2013-02-21 10:35 ` Lorenzo Pieralisi
2013-02-21 10:49 ` Jason Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LFD.2.02.1302211436300.22263@ionos \
--to=tglx@linutronix.de \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).