Subject: Re: power increase issue on light load
From: Peter Zijlstra
To: "Alex,Shi"
Cc: ncrao@google.com, mingo@elte.hu, "Chen, Tim C", "Li, Shaohua", "linux-kernel@vger.kernel.org"
In-Reply-To: <1308797024.23204.95.camel@debian>
References: <1308797024.23204.95.camel@debian>
Date: Thu, 23 Jun 2011 11:02:28 +0200
Message-ID: <1308819748.1022.69.camel@twins>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2011-06-23 at 10:43 +0800, Alex,Shi wrote:
> commit c8b281161dfa4bb5d5be63fb036ce19347b88c63 causes light load
> benchmark use more than 10% system power on platform NHM-EP and laptop
> Thinkpad T410 etc. The benchmarks are specpower and bltk office.
>
> I tried to track this issue, but only find deep C sate time reduced
> much, about from 90% to 30~40%, the C0 or C1 state increase much on
> different machines.
>
> Powertop just hints RES interrupts has a bit more. but when I try "perf
> probe native_smp_send_reschedule". I didn't find much.
>
> I also checked the /proc/schedstat, just can sure the load_balance was
> called a bit more frequency. but pull_task() was called really rare.
>
>
> The following are the /proc/schedstat increased number in about 300' when do bltk-office.
> The getting command is here:
> #on a 16 LCPU system, with 3 level domain, 0,1,2, so all domain number
> is 48, the domain statistic number is 2 + 36, so fs=38,
>
> $cat /proc/schedstat > schedstat ; sleep x ; cat /proc/schedstat >>
> schedstat ; cat schedstat | grep domain | sed '49 i \\n' | awk -v fs=38
> 'BEGIN { RS=""; FS=" " } { if ( NR ==1) for (i=0; i { value1[i]=$i ; } ; if ( NR ==2) for (i=0; i $i } } END {ORS=" "; for (i=0;i ll=""; print value2[i] - value1[i] ll }; print "\n" }'

/proc/schedstat is already a massive pain to interpret and then you go
and mangle things even more and expect me to try and understand that
crap?

I don't think so, life is too short.

> BTW, the imbalance increasing is due to the SCALE increase about 1024.
> Any ideas of this?

What happens if you try something like the below. Increased imbalance
might lead to more load-balance action, which might lead to more task
migration/waking up of cpus etc.

If the below makes any difference, Nikhil's changes have a funny that
needs to be caught.

---
 include/linux/sched.h |    6 ------
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index a837b20..84121d6 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -808,15 +808,9 @@ enum cpu_idle_type {
  * when BITS_PER_LONG <= 32 are pretty high and the returns do not justify the
  * increased costs.
  */
-#if BITS_PER_LONG > 32
-# define SCHED_LOAD_RESOLUTION	10
-# define scale_load(w)		((w) << SCHED_LOAD_RESOLUTION)
-# define scale_load_down(w)	((w) >> SCHED_LOAD_RESOLUTION)
-#else
 # define SCHED_LOAD_RESOLUTION	0
 # define scale_load(w)		(w)
 # define scale_load_down(w)	(w)
-#endif
 
 #define SCHED_LOAD_SHIFT	(10 + SCHED_LOAD_RESOLUTION)
 #define SCHED_LOAD_SCALE	(1L << SCHED_LOAD_SHIFT)
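
For anyone following along, the arithmetic behind the "SCALE increase
about 1024" remark can be sketched in userspace. This is only an
illustration, not kernel code: scale_load_at(), sched_load_scale_at() and
scale_ratio() are hypothetical helper names that take the resolution as a
parameter, so the two variants of the macros in the diff (resolution 10
versus 0) can be compared side by side. It also assumes the usual nice-0
task weight of 1024.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for scale_load(w): shift the weight up by the
 * given resolution. The kernel macro uses the compile-time constant
 * SCHED_LOAD_RESOLUTION; here it is a parameter for comparison only.
 */
static long scale_load_at(long w, int resolution)
{
	/* Keep the shift small enough to fit a 32-bit long in this sketch. */
	assert(resolution >= 0 && resolution <= 10);
	return w << resolution;
}

/* Hypothetical stand-in for SCHED_LOAD_SCALE: 1L << (10 + resolution). */
static long sched_load_scale_at(int resolution)
{
	assert(resolution >= 0 && resolution <= 10);
	return 1L << (10 + resolution);
}

/*
 * Factor by which load values (and so any imbalance computed from them)
 * grow when going from lo_res to hi_res.
 */
static long scale_ratio(int hi_res, int lo_res)
{
	return sched_load_scale_at(hi_res) / sched_load_scale_at(lo_res);
}
```

With a nice-0 weight of 1024, scale_load_at(1024, 10) gives 1048576 while
scale_load_at(1024, 0) gives 1024, and scale_ratio(10, 0) is 1024. That is
the factor the patch above removes on 64-bit, so if it changes the power
numbers, the scaled load values are the place to look.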