public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Frederic Weisbecker <fweisbec@gmail.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: linux-kernel@vger.kernel.org, Rik van Riel <riel@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Viresh Kumar <viresh.kumar@linaro.org>
Subject: Re: kernel/timer: avoid spurious ksoftirqd wakeups
Date: Thu, 2 Apr 2015 16:59:40 +0200	[thread overview]
Message-ID: <20150402145938.GA10357@lerouge> (raw)
In-Reply-To: <20150402014455.GA25970@amt.cnet>

On Wed, Apr 01, 2015 at 10:44:55PM -0300, Marcelo Tosatti wrote:
> 
> It is only necessary to raise timer softirq
> in case there are active timers or irq work 
> to do.
> 
> Limit the ksoftirqd wakeup to those cases.
> 
> Fixes a latency spike with isolated CPUs and 
> nohz full mode.
> 
> Reported-and-tested-by: Luiz Capitulino <lcapitulino@redhat.com>
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> 
> diff --git a/include/linux/timer.h b/include/linux/timer.h
> index 8c5a197..0c065f9 100644
> --- a/include/linux/timer.h
> +++ b/include/linux/timer.h
> @@ -192,7 +192,7 @@ extern void set_timer_slack(struct timer_list *time, int slack_hz);
>   * locks the timer base and does the comparison against the given
>   * jiffie.
>   */
> -extern unsigned long get_next_timer_interrupt(unsigned long now);
> +extern unsigned long get_next_timer_interrupt(unsigned long now, bool *raise_softirq);
>  
>  /*
>   * Timer-statistics info:
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index a4c4eda..615e276 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -568,6 +568,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>  	unsigned long rcu_delta_jiffies;
>  	struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev);
>  	u64 time_delta;
> +	bool raise_softirq;
>  
>  	time_delta = timekeeping_max_deferment();
>  
> @@ -582,9 +583,11 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>  	    arch_needs_cpu() || irq_work_needs_cpu()) {
>  		next_jiffies = last_jiffies + 1;
>  		delta_jiffies = 1;
> +		raise_softirq = true;

I believe that irq_work doesn't need the softirq. It needs a tick only in order to call
irq_work_tick(). And I believe this is the same for RCU which needs a call to
rcu_check_callbacks(), but it might need something else that the softirq does
(but this is the timer softirq, not the rcu one). 

>  	} else {
>  		/* Get the next timer wheel timer */
> -		next_jiffies = get_next_timer_interrupt(last_jiffies);
> +		next_jiffies = get_next_timer_interrupt(last_jiffies,
> +							&raise_softirq);
>  		delta_jiffies = next_jiffies - last_jiffies;
>  		if (rcu_delta_jiffies < delta_jiffies) {
>  			next_jiffies = last_jiffies + rcu_delta_jiffies;
> @@ -703,7 +706,8 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>  		 */
>  		tick_do_update_jiffies64(ktime_get());
>  	}
> -	raise_softirq_irqoff(TIMER_SOFTIRQ);
> +	if (raise_softirq)
> +		raise_softirq_irqoff(TIMER_SOFTIRQ);
>  out:
>  	ts->next_jiffies = next_jiffies;
>  	ts->last_jiffies = last_jiffies;

Lets look at the things outside the pending timer list that can end up failing
to program the timer because it is in the past already:

_ timekeeping_max_deferment(): I doubt, the value is pretty high
_ scheduler_tick_max_deferment(); it's one second long, way enough to never be in
  the past by the time we program the clock
_ RCU, irq_work, arch: may be, if the last jiffies update is far enough. But apparently
  the problem is elsewhere since you keep the softirq for these and your patch solves your
  problem.
_ In case hrtimer runs in low-res mode and the next hrtimer is very close, or even in the past
  already, you may run into such issue. And hrtimer doesn't need the timer softirq, at least not
  to run the callbacks. It needs it only if hrtimer is switching to high-res mode, I think it's
  a rare event.

Now it would be nice to identify the issue we are facing here. Are you running in hrtimer low-res
mode?

  parent reply	other threads:[~2015-04-02 14:59 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-02  1:44 kernel/timer: avoid spurious ksoftirqd wakeups Marcelo Tosatti
2015-04-02 13:58 ` Rik van Riel
2015-04-02 20:53   ` Marcelo Tosatti
2015-04-02 14:59 ` Frederic Weisbecker [this message]
2015-04-02 21:08   ` Marcelo Tosatti
2015-04-06 23:34     ` Frederic Weisbecker
2015-04-06 23:51       ` Marcelo Tosatti
2015-04-07 20:17         ` Frederic Weisbecker
2015-04-07 22:29           ` Marcelo Tosatti
     [not found] <042301d06cf5$2ec7ae90$8c570bb0$@alibaba-inc.com>
2015-04-02  3:32 ` Hillf Danton
2015-04-02 20:29   ` Marcelo Tosatti

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150402145938.GA10357@lerouge \
    --to=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=riel@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox