All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Galbraith <bitbucket@online.de>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ethan Zhao <ethan.kernel@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	johlstei@codeaurora.org, Yinghai Lu <yinghai@kernel.org>,
	Jin Feng <joe.jin@oracle.com>,
	Youquan Song <youquan.song@intel.com>,
	LenBrown <len.brown@intel.com>
Subject: Re: [PATCH V3]hrtimer: Fix a performance regression by disable reprogramming in remove_hrtimer
Date: Tue, 06 Aug 2013 09:46:12 +0200	[thread overview]
Message-ID: <1375775172.5412.14.camel@marge.simpson.net> (raw)
In-Reply-To: <1375774140.5412.9.camel@marge.simpson.net>

(CCs Intel folks)

On Tue, 2013-08-06 at 09:29 +0200, Mike Galbraith wrote: 
> On Tue, 2013-07-30 at 11:35 +0200, Peter Zijlstra wrote:
> 
> > It would be good if you could do what Thomas suggested and look at which
> > timer is actually active during your workload.
> 
> Rebuilding regression test trees, some pipe-test results...
> 
> I'm missing mwait_idle() rather a lot on Q6600, and at 3.8, E5620 took a
> severe NOHZ drubbing from the menu governor. 
> 
> pipe-test, scheduling cross core
> 
> NOTE: nohz is throttled here (patchlet below), as to not eat horrible
> microidle cost, see E5620 v3.7.10-nothrottle below.
> 
> Q6600
> v3.8.13                  500.6 KHz     1.000
> v3.9.11                  422.4 KHz      .843
> v3.10.4                  420.2 KHz      .839
> v3.11-rc3-4-g36f571e     404.7 KHz      .808
> 
> Q6600 3.9 regression:
> guilty party is 69fb3676 x86 idle: remove mwait_idle() and "idle=mwait" cmdline param
> halt sucks, HTH does one activate mwait_idle_with_hints() [processor_idle()] for core2 boxen?
> 
> E5620                                            +write 0 -> /dev/cpu_dma_latency, hold open
> v3.7.10                  578.5 KHz     1.000     675.4 KHz     1.000
> v3.7.10-nothrottle       366.7 KHz      .633     395.0 KHz      .584
> v3.8.13                  468.3 KHz      .809     690.0 KHz     1.021
> v3.8.13 idle=mwait       595.1 KHz     1.028     NA
> v3.9.11                  462.0 KHz      .798     691.1 KHz     1.023
> v3.10.4                  419.4 KHz      .724     570.8 KHz      .845
> v3.11-rc3-4-g36f571e     400.1 KHz      .691     538.5 KHz      .797
> 
> E5620 3.8 regression:
> guilty party: 69a37bea cpuidle: Quickly notice prediction failure for repeat mode
> 
> 
> Q6600 (2.4 GHz core2 quad)
>      v3.11-rc3-4-g36f571e                       v3.8.13
>      7.97%  [k] reschedule_interrupt            8.63%  [k] __schedule
>      6.27%  [k] __schedule                      6.07%  [k] native_sched_clock
>      4.74%  [k] native_sched_clock              4.96%  [k] system_call
>      4.23%  [k] _raw_spin_lock_irqsave          4.30%  [k] _raw_spin_lock_irqsave
>      3.39%  [k] system_call                     4.06%  [k] resched_task
>      2.89%  [k] sched_clock_local               3.44%  [k] sched_clock_local
>      2.79%  [k] mutex_lock                      3.39%  [k] pipe_read
>      2.57%  [k] pipe_read                       3.21%  [k] mutex_lock
>      2.55%  [k] __switch_to                     2.98%  [k] read_tsc
>      2.24%  [k] read_tsc                        2.87%  [k] __switch_to
> 
> 
> E5620 (2.4 GHz Westmere quad)
>     v3.7.10                                     v3.7.10-nothrottle                       v3.7.10-nothrottle
>     8.01%  [k] __schedule                      25.80%  [k] _raw_spin_unlock_irqrestore   21.80%  [k] _raw_spin_unlock_irqrestore
>     4.49%  [k] resched_tas                      4.64%  [k] __hrtimer_start_range_ns      - _raw_spin_unlock_irqrestore
>     3.94%  [k] mutex_lock                       4.62%  [k] timerqueue_add                   + 37.94% __hrtimer_start_range_ns
>     3.44%  [k] __switch_to                      4.54%  [k] __schedule                         19.69% hrtimer_cancel
>     3.18%  [k] menu_select                      2.84%  [k] enqueue_hrtimer                       tick_nohz_restart
>     3.05%  [k] copy_user_generic_string         2.64%  [k] resched_task                          tick_nohz_idle_exit
>     3.02%  [k] task_waking_fair                 2.29%  [k] _raw_spin_lock_irqsave                cpu_idle
>     2.91%  [k] mutex_unlock                     2.28%  [k] mutex_lock                            start_secondary
>     2.82%  [k] pipe_read                        1.96%  [k] __switch_to                      + 16.05% hrtimer_start_range_ns
>     2.32%  [k] ktime_get_real                   1.73%  [k] menu_select                        15.46% hrtimer_start
>                                                                                                  tick_nohz_stop_sched_tick
>                                                                                                  __tick_nohz_idle_enter
>                                                                                                  tick_nohz_idle_enter
>                                                                                                  cpu_idle
>                                                                                                  start_secondary
>                                                                                               6.37% hrtimer_try_to_cancel
>                                                                                                  hrtimer_cancel
>                                                                                                  tick_nohz_restart
>                                                                                                  tick_nohz_idle_exit
>                                                                                                  cpu_idle
>                                                                                                  start_secondary
> 
>     v3.8.13                                    v3.8.13 idle=mwait                        v3.8.13 (throttled, but menu gov bites.. HARD)
>     23.16%  [k] _raw_spin_unlock_irqrestore    8.35%  [k] __schedule                     -  22.91%  [k] _raw_spin_unlock_irqrestore
>      4.93%  [k] __schedule                     6.49%  [k] __switch_to                       - _raw_spin_unlock_irqrestore
>      3.42%  [k] resched_task                   5.71%  [k] resched_task                         - 47.26% hrtimer_try_to_cancel
>      3.27%  [k] __switch_to                    4.64%  [k] mutex_lock                                hrtimer_cancel
>      3.05%  [k] mutex_lock                     3.48%  [k] copy_user_generic_string                  menu_hrtimer_cancel
>      2.32%  [k] copy_user_generic_string       3.15%  [k] task_waking_fair                          tick_nohz_idle_exit
>      2.30%  [k] _raw_spin_lock_irqsave         3.13%  [k] pipe_read                                 cpu_idle
>      2.15%  [k] pipe_read                      2.61%  [k] mutex_unlock                              start_secondary
>      2.15%  [k] task_waking_fair               2.54%  [k] finish_task_switch                   - 40.01% __hrtimer_start_range_ns
>      2.08%  [k] ktime_get                      2.29%  [k] _raw_spin_lock_irqsave                    hrtimer_start
>      1.87%  [k] mutex_unlock                   1.91%  [k] idle_cpu                                  menu_select
>      1.76%  [k] finish_task_switch             1.84%  [k] __wake_up_common                          cpuidle_idle_call
>                                                                                                     cpu_idle
>                                                                                                     start_secondary
> 
>     v3.9.11
>     18.67%  [k] _raw_spin_unlock_irqrestore
>      4.36%  [k] __schedule
>      3.66%  [k] __switch_to
>      3.13%  [k] mutex_lock
>      2.97%  [k] __hrtimer_start_range_ns
>      2.69%  [k] _raw_spin_lock_irqsave
>      2.38%  [k] copy_user_generic_string
>      2.34%  [k] hrtimer_reprogram.isra.32
>      2.34%  [k] task_waking_fair
>      2.25%  [k] ktime_get
>      2.14%  [k] pipe_read
>      1.98%  [k] menu_select
> 
>     v3.10.4
>     20.42%  [k] _raw_spin_unlock_irqrestore
>      4.75%  [k] __schedule
>      4.42%  [k] reschedule_interrupt  <== appears in 3.10, guilty party as yet unknown
>      3.52%  [k] __switch_to
>      3.27%  [k] resched_task
>      2.64%  [k] cpuidle_enter_state
>      2.63%  [k] _raw_spin_lock_irqsave
>      2.04%  [k] copy_user_generic_string
>      2.00%  [k] cpu_idle_loop
>      1.97%  [k] mutex_lock
>      1.90%  [k] ktime_get
>      1.75%  [k] task_waking_fair
> 
>    v3.11-rc3-4-g36f571e
>    18.96%  [k] _raw_spin_unlock_irqrestore
>     4.84%  [k] __schedule
>     4.69%  [k] reschedule_interrupt
>     3.75%  [k] __switch_to
>     2.62%  [k] _raw_spin_lock_irqsave
>     2.43%  [k] cpuidle_enter_state
>     2.28%  [k] resched_task
>     2.20%  [k] cpu_idle_loop
>     1.97%  [k] copy_user_generic_string
>     1.88%  [k] ktime_get
>     1.81%  [k] task_waking_fair
>     1.75%  [k] mutex_lock
> 
> sched: ratelimit nohz
> 
> Entering nohz code on every micro-idle is too expensive to bear.
> 
> Signed-off-by: Mike Galbraith <efault@gmx.de>
> 
> ---
>  include/linux/sched.h    |    5 +++++
>  kernel/sched/core.c      |    5 +++++
>  kernel/time/tick-sched.c |    2 +-
>  3 files changed, 11 insertions(+), 1 deletion(-)
> 
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -235,9 +235,14 @@ extern int runqueue_is_locked(int cpu);
>  extern void nohz_balance_enter_idle(int cpu);
>  extern void set_cpu_sd_state_idle(void);
>  extern int get_nohz_timer_target(void);
> +extern int sched_needs_cpu(int cpu);
>  #else
>  static inline void nohz_balance_enter_idle(int cpu) { }
>  static inline void set_cpu_sd_state_idle(void) { }
> +static inline int sched_needs_cpu(int cpu)
> +{
> +	return 0;
> +}
>  #endif
>  
>  /*
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -650,6 +650,11 @@ static inline bool got_nohz_idle_kick(vo
>  	return false;
>  }
>  
> +int sched_needs_cpu(int cpu)
> +{
> +	return  cpu_rq(cpu)->avg_idle < sysctl_sched_migration_cost;
> +}
> +
>  #else /* CONFIG_NO_HZ_COMMON */
>  
>  static inline bool got_nohz_idle_kick(void)
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -548,7 +548,7 @@ static ktime_t tick_nohz_stop_sched_tick
>  		time_delta = timekeeping_max_deferment();
>  	} while (read_seqretry(&jiffies_lock, seq));
>  
> -	if (rcu_needs_cpu(cpu, &rcu_delta_jiffies) ||
> +	if (sched_needs_cpu(cpu) || rcu_needs_cpu(cpu, &rcu_delta_jiffies) ||
>  	    arch_needs_cpu(cpu) || irq_work_needs_cpu()) {
>  		next_jiffies = last_jiffies + 1;
>  		delta_jiffies = 1;
> 
> 



  reply	other threads:[~2013-08-06  7:47 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-27 20:04 [PATCH V3]hrtimer: Fix a performance regression by disable reprogramming in remove_hrtimer ethan.kernel
2013-07-29 10:18 ` Thomas Gleixner
2013-07-29 11:57   ` Peter Zijlstra
2013-08-08  7:32     ` ethan.zhao
2013-09-05  6:36       ` Mike Galbraith
     [not found]         ` <20130905111428.GB23362@gmail.com>
     [not found]           ` <1378386697.6567.9.camel@marge.simpson.net>
     [not found]             ` <20130905133750.GA26637@gmail.com>
     [not found]               ` <1378445942.5434.31.camel@marge.simpson.net>
     [not found]                 ` <20130909122325.GX31370@twins.programming.kicks-ass.net>
     [not found]                   ` <1378730538.5586.30.camel@marge.simpson.net>
2013-09-09 13:30                     ` Peter Zijlstra
2013-09-09 13:46                       ` Peter Zijlstra
2013-09-11  8:56                         ` Peter Zijlstra
2013-09-11 10:25                           ` Mike Galbraith
2013-10-04 12:06                             ` Ethan Zhao
2013-10-07  4:41                               ` Mike Galbraith
2013-10-07  4:57                                 ` Ethan Zhao
2013-12-12 14:14                                   ` Ethan Zhao
2013-12-12 14:42                                     ` Mike Galbraith
     [not found]   ` <CABawtvP4oLuvHOS3prbbgPShXVziV_wTo7i6KCqJ9KkoVdz0ag@mail.gmail.com>
2013-07-30  9:35     ` Peter Zijlstra
2013-07-30 11:44       ` Ethan Zhao
2013-07-30 11:59         ` Peter Zijlstra
2013-08-03  6:55           ` ethan
2013-08-03  7:37           ` ethan
2013-08-06  7:29       ` Mike Galbraith
2013-08-06  7:46         ` Mike Galbraith [this message]
2013-08-08  4:31           ` ethan.zhao
2013-08-08  5:29             ` Mike Galbraith
2013-08-08  5:51               ` Mike Galbraith
2013-08-08  9:04             ` ethan.zhao
2013-08-08  9:05               ` ethan.zhao
2013-08-08 12:14               ` Mike Galbraith
2013-08-07  8:25         ` Mike Galbraith
2013-08-08  4:05           ` Mike Galbraith
2013-08-08 15:02         ` ethan.zhao
2013-08-09  6:52           ` Mike Galbraith

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1375775172.5412.14.camel@marge.simpson.net \
    --to=bitbucket@online.de \
    --cc=ethan.kernel@gmail.com \
    --cc=joe.jin@oracle.com \
    --cc=johlstei@codeaurora.org \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=yinghai@kernel.org \
    --cc=youquan.song@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.