From: Michael wang <wangyun@linux.vnet.ibm.com>
To: Alex Shi <alex.shi@linaro.org>,
	mingo@redhat.com, peterz@infradead.org, morten.rasmussen@arm.com
Cc: vincent.guittot@linaro.org, daniel.lezcano@linaro.org,
	fweisbec@gmail.com, linux@arm.linux.org.uk, tony.luck@intel.com,
	fenghua.yu@intel.com, james.hogan@imgtec.com, jason.low2@hp.com,
	viresh.kumar@linaro.org, hanjun.guo@linaro.org,
	linux-kernel@vger.kernel.org, tglx@linutronix.de,
	akpm@linux-foundation.org, arjan@linux.intel.com, pjt@google.com,
	fengguang.wu@intel.com, linaro-kernel@lists.linaro.org
Subject: Re: [PATCH v2 0/11] remove cpu_load in rq
Date: Tue, 18 Feb 2014 12:52:10 +0800
Message-ID: <5302E6FA.508@linux.vnet.ibm.com>
In-Reply-To: <1392602117-20773-1-git-send-email-alex.shi@linaro.org>

On 02/17/2014 09:55 AM, Alex Shi wrote:
> The cpu_load decays over time according to the past cpu load of the rq. The sched_avg also decays tasks' load over time. So we now have 2 kinds of decay for cpu_load, which is redundant and adds to the system load through the extra decay calculation. This patch set tries to remove the cpu_load decay.
> 
> There are 5 load_idx values used for cpu_load in sched_domain. busy_idx and idle_idx are usually nonzero, but newidle_idx, wake_idx and forkexec_idx are all zero on every arch. The first patch is a shortcut that removes the cpu_load decay; it is just a one-line change.
> 
> V2,
> 1, This version does some tuning of the load bias of the target load, to match the current code logic as closely as possible.
> 2, Goes further and removes cpu_load from the rq.
> 3, Reverts the patch 'Limit sd->*_idx range on sysctl' since it is no longer needed.
> 
> Any testing/comments are appreciated.
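
For readers following along, the kind of decay the cover letter talks about can be pictured with a small Python sketch. This is only an illustration, not the kernel code: the function name `update_cpu_load` is hypothetical, the five-entry history mirrors the 5 load_idx values mentioned above, and the weighting (index i keeps (2^i - 1)/2^i of its old value on each update) is an assumption about how such a per-index history is blended.

```python
def update_cpu_load(cpu_load, this_load):
    """Blend the current load into a per-index load history (illustrative only).

    cpu_load[0] follows the instantaneous load; higher indexes react
    more slowly because they keep a larger share of their old value.
    """
    cpu_load[0] = this_load
    for i in range(1, len(cpu_load)):
        scale = 1 << i
        new_load = this_load
        if new_load > cpu_load[i]:
            # Round up while the load is rising, so integer truncation
            # cannot leave the history stuck just below the target.
            new_load += scale - 1
        cpu_load[i] = (cpu_load[i] * (scale - 1) + new_load) >> i
    return cpu_load


history = [0] * 5                # one slot per load index (assumed: 5)
update_cpu_load(history, 1024)   # load jumps from idle to 1024
print(history)
update_cpu_load(history, 0)      # load drops back to idle
print(history)
```

With every index set to zero, only `cpu_load[0]` is consulted, which is why the slower entries can be dropped without changing what those users see.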

Tested on a 12-CPU x86 box with tip/master; ebizzy and hackbench both
work fine and show a small improvement on every test run.

ebizzy default:

BASE                                    |PATCHED

32506 records/s                         |32785 records/s                        
real 10.00 s                            |real 10.00 s                           
user 50.32 s                            |user 49.66 s                           
sys  69.46 s                            |sys  70.19 s                           
32552 records/s                         |32946 records/s                        
real 10.00 s                            |real 10.00 s                           
user 50.11 s                            |user 50.70 s                           
sys  69.68 s                            |sys  69.15 s                           
32265 records/s                         |32824 records/s                        
real 10.00 s                            |real 10.00 s                           
user 49.46 s                            |user 50.46 s                           
sys  70.28 s                            |sys  69.34 s                           
32489 records/s                         |32735 records/s                        
real 10.00 s                            |real 10.00 s                           
user 49.67 s                            |user 50.21 s                           
sys  70.12 s                            |sys  69.54 s                           
32490 records/s                         |32662 records/s                        
real 10.00 s                            |real 10.00 s                           
user 50.01 s                            |user 50.07 s                           
sys  69.79 s                            |sys  69.68 s                           
32471 records/s                         |32784 records/s                        
real 10.00 s                            |real 10.00 s                           
user 49.73 s                            |user 49.88 s                           
sys  70.07 s                            |sys  69.87 s                           
32596 records/s                         |32783 records/s                        
real 10.00 s                            |real 10.00 s                           
user 49.81 s                            |user 49.42 s                           
sys  70.00 s                            |sys  70.30 s

hackbench 10000 loops:

BASE                                    |PATCHED

Running with 48*40 (== 1920) tasks.     |Running with 48*40 (== 1920) tasks.    
Time: 30.934                            |Time: 29.965                           
Running with 48*40 (== 1920) tasks.     |Running with 48*40 (== 1920) tasks.    
Time: 31.603                            |Time: 30.410                           
Running with 48*40 (== 1920) tasks.     |Running with 48*40 (== 1920) tasks.    
Time: 31.724                            |Time: 30.627                           
Running with 48*40 (== 1920) tasks.     |Running with 48*40 (== 1920) tasks.    
Time: 31.648                            |Time: 30.596                           
Running with 48*40 (== 1920) tasks.     |Running with 48*40 (== 1920) tasks.    
Time: 31.799                            |Time: 30.763                           
Running with 48*40 (== 1920) tasks.     |Running with 48*40 (== 1920) tasks.    
Time: 31.847                            |Time: 30.532                           
Running with 48*40 (== 1920) tasks.     |Running with 48*40 (== 1920) tasks.    
Time: 31.828                            |Time: 30.871                           
Running with 24*40 (== 960) tasks.      |Running with 24*40 (== 960) tasks.     
Time: 15.768                            |Time: 15.284                           
Running with 24*40 (== 960) tasks.      |Running with 24*40 (== 960) tasks.     
Time: 15.720                            |Time: 15.228                           
Running with 24*40 (== 960) tasks.      |Running with 24*40 (== 960) tasks.     
Time: 15.819                            |Time: 15.373                           
Running with 24*40 (== 960) tasks.      |Running with 24*40 (== 960) tasks.     
Time: 15.888                            |Time: 15.184                           
Running with 24*40 (== 960) tasks.      |Running with 24*40 (== 960) tasks.     
Time: 15.660                            |Time: 15.525                           
Running with 24*40 (== 960) tasks.      |Running with 24*40 (== 960) tasks.     
Time: 15.934                            |Time: 15.337                           
Running with 24*40 (== 960) tasks.      |Running with 24*40 (== 960) tasks.     
Time: 15.669                            |Time: 15.357                           
Running with 12*40 (== 480) tasks.      |Running with 12*40 (== 480) tasks.     
Time: 7.699                             |Time: 7.458                            
Running with 12*40 (== 480) tasks.      |Running with 12*40 (== 480) tasks.     
Time: 7.693                             |Time: 7.498                            
Running with 12*40 (== 480) tasks.      |Running with 12*40 (== 480) tasks.     
Time: 7.705                             |Time: 7.439                            
Running with 12*40 (== 480) tasks.      |Running with 12*40 (== 480) tasks.     
Time: 7.664                             |Time: 7.553                            
Running with 12*40 (== 480) tasks.      |Running with 12*40 (== 480) tasks.     
Time: 7.603                             |Time: 7.470                            
Running with 12*40 (== 480) tasks.      |Running with 12*40 (== 480) tasks.     
Time: 7.651                             |Time: 7.491                            
Running with 12*40 (== 480) tasks.      |Running with 12*40 (== 480) tasks.     
Time: 7.647                             |Time: 7.535                            
Running with 6*40 (== 240) tasks.       |Running with 6*40 (== 240) tasks.      
Time: 6.054                             |Time: 5.293                            
Running with 6*40 (== 240) tasks.       |Running with 6*40 (== 240) tasks.      
Time: 5.417                             |Time: 5.701                            
Running with 6*40 (== 240) tasks.       |Running with 6*40 (== 240) tasks.      
Time: 5.287                             |Time: 5.240                            
Running with 6*40 (== 240) tasks.       |Running with 6*40 (== 240) tasks.      
Time: 5.594                             |Time: 5.571                            
Running with 6*40 (== 240) tasks.       |Running with 6*40 (== 240) tasks.      
Time: 5.347                             |Time: 6.136                            
Running with 6*40 (== 240) tasks.       |Running with 6*40 (== 240) tasks.      
Time: 5.430                             |Time: 5.323                            
Running with 6*40 (== 240) tasks.       |Running with 6*40 (== 240) tasks.      
Time: 5.691                             |Time: 5.481                            
Running with 1*40 (== 40) tasks.        |Running with 1*40 (== 40) tasks.       
Time: 1.192                             |Time: 1.140                            
Running with 1*40 (== 40) tasks.        |Running with 1*40 (== 40) tasks.       
Time: 1.190                             |Time: 1.125                            
Running with 1*40 (== 40) tasks.        |Running with 1*40 (== 40) tasks.       
Time: 1.189                             |Time: 1.013                            
Running with 1*40 (== 40) tasks.        |Running with 1*40 (== 40) tasks.       
Time: 1.163                             |Time: 1.060                            
Running with 1*40 (== 40) tasks.        |Running with 1*40 (== 40) tasks.       
Time: 1.186                             |Time: 1.131                            
Running with 1*40 (== 40) tasks.        |Running with 1*40 (== 40) tasks.       
Time: 1.175                             |Time: 1.125                            
Running with 1*40 (== 40) tasks.        |Running with 1*40 (== 40) tasks.       
Time: 1.157                             |Time: 0.998 


BTW, I got a panic while rebooting, but it should not be caused by
this patch set; I will recheck and post the report later.

Regards,
Michael Wang



INFO: rcu_sched detected stalls on CPUs/tasks: { 7} (detected by 1, t=21002 jiffies, g=6707, c=6706, q=227)
Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 7
CPU: 7 PID: 1040 Comm: bioset Not tainted 3.14.0-rc2-test+ #402
Hardware name: IBM System x3650 M3 -[794582A]-/94Y7614, BIOS -[D6E154AUS-1.13]- 09/23/2011
 0000000000000000 ffff88097f2e7bd8 ffffffff8156b38a 0000000000004f27
 ffffffff817ecb90 ffff88097f2e7c58 ffffffff81561d8d ffff88097f2e7c08
 ffffffff00000010 ffff88097f2e7c68 ffff88097f2e7c08 ffff88097f2e7c78
Call Trace:
 <NMI>  [<ffffffff8156b38a>] dump_stack+0x46/0x58
 [<ffffffff81561d8d>] panic+0xbe/0x1ce
 [<ffffffff810e6b03>] watchdog_overflow_callback+0xb3/0xc0
 [<ffffffff8111e928>] __perf_event_overflow+0x98/0x220
 [<ffffffff8111f224>] perf_event_overflow+0x14/0x20
 [<ffffffff8101eef2>] intel_pmu_handle_irq+0x1c2/0x2c0
 [<ffffffff81089af9>] ? load_balance+0xf9/0x590
 [<ffffffff81089b0d>] ? load_balance+0x10d/0x590
 [<ffffffff81562ac2>] ? printk+0x4d/0x4f
 [<ffffffff815763b4>] perf_event_nmi_handler+0x34/0x60
 [<ffffffff81575b6e>] nmi_handle+0x7e/0x140
 [<ffffffff81575d1a>] default_do_nmi+0x5a/0x250
 [<ffffffff81575fa0>] do_nmi+0x90/0xd0
 [<ffffffff815751e7>] end_repeat_nmi+0x1e/0x2e
 [<ffffffff81089340>] ? find_busiest_group+0x120/0x7e0
 [<ffffffff81089340>] ? find_busiest_group+0x120/0x7e0
 [<ffffffff81089340>] ? find_busiest_group+0x120/0x7e0
 <<EOE>>  [<ffffffff81089b7c>] load_balance+0x17c/0x590
 [<ffffffff8108a49f>] idle_balance+0x10f/0x1c0
 [<ffffffff8108a66e>] pick_next_task_fair+0x11e/0x2a0
 [<ffffffff8107ba53>] ? dequeue_task+0x73/0x90
 [<ffffffff815712b7>] __schedule+0x127/0x670
 [<ffffffff815718d9>] schedule+0x29/0x70
 [<ffffffff8104e3b5>] do_exit+0x2a5/0x470
 [<ffffffff81066c90>] ? process_scheduled_works+0x40/0x40
 [<ffffffff8106e78a>] kthread+0xba/0xe0
 [<ffffffff8106e6d0>] ? flush_kthread_worker+0xb0/0xb0
 [<ffffffff8157d0ec>] ret_from_fork+0x7c/0xb0
 [<ffffffff8106e6d0>] ? flush_kthread_worker+0xb0/0xb0
Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)


> 
> This patch set is rebased on the latest tip/master.
> The git tree for this patch set is at:
>  git@github.com:alexshi/power-scheduling.git noload
> 
> Thanks
> Alex
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 



Thread overview: 18+ messages
2014-02-17  1:55 [PATCH v2 0/11] remove cpu_load in rq Alex Shi
2014-02-17  1:55 ` [PATCH v2 01/11] sched: shortcut to remove load_idx Alex Shi
2014-02-17  1:55 ` [PATCH v2 02/11] sched: remove rq->cpu_load[load_idx] array Alex Shi
2014-02-17  1:55 ` [PATCH v2 03/11] sched: clean up cpu_load update Alex Shi
2014-02-17  1:55 ` [PATCH v2 04/11] sched: unify imbalance bias for target group Alex Shi
2014-02-17  1:55 ` [PATCH v2 05/11] sched: rewrite update_cpu_load_nohz Alex Shi
2014-02-17  1:55 ` [PATCH v2 06/11] sched: clean up source_load/target_load Alex Shi
2014-02-17  1:55 ` [PATCH v2 07/11] sched: clean up weighted_cpuload Alex Shi
2014-02-17  1:55 ` [PATCH v2 08/11] sched: remove weighted_load() Alex Shi
2014-02-17  1:55 ` [PATCH v2 09/11] sched: remove rq->cpu_load and rq->nr_load_updates Alex Shi
2014-02-17  1:55 ` [PATCH v2 10/11] sched: rename update_*_cpu_load Alex Shi
2014-02-17  1:55 ` [PATCH v2 11/11] sched: clean up task_hot function Alex Shi
2014-02-18  2:37 ` [PATCH v2 0/11] remove cpu_load in rq Alex Shi
2014-02-18  4:52 ` Michael wang [this message]
2014-02-18  6:03   ` Alex Shi
2014-02-18  6:17     ` Michael wang
     [not found] ` <20140218120522.GG19029@e103034-lin>
2014-02-18 12:28   ` Vincent Guittot
2014-02-19 10:23     ` Alex Shi
