* [rcu] 7c66b15f870: +1.8% turbostat.Pkg_W
From: Fengguang Wu @ 2014-06-26 1:33 UTC
To: lkp
Hi Paul,
We noticed increased power consumption in our internal merge commit.
It's not direct evidence, but we hope you can spot some
obvious clues. :)
git://internal_merge_and_test_tree devel-hourly-2014062315
commit 7c66b15f8704703f6861faa246387342f7a05108 ("Merge 'rcu/rcu_cond_resched.2014.06.20c' into devel-hourly-2014062315")
Merge sequence is:
7c66b15 Merge 'rcu/rcu_cond_resched.2014.06.20c' into devel-hourly-2014062315
db47b74 Merge 'ipvs-next/master' into devel-hourly-2014062315
ca8d737 Merge 'regulator/topic/ab8500' into devel-hourly-2014062315
aee9255 Merge 'renesas/devel' into devel-hourly-2014062315
af18816 Merge 'vireshk/tick/lowres-go-tickless' into devel-hourly-2014062315
3cf94bc1 Merge 'vireshk/tick/ONESHOT-STOPPED' into devel-hourly-2014062315
94ae897 Merge 'asoc/topic/tlv320aic32x4' into devel-hourly-2014062315
28ef1ff Merge 'tianyu/dep_support' into devel-hourly-2014062315
f04bc00 Merge 'robclark/msm-next' into devel-hourly-2014062315
03f1b1e Merge 'net/master' into devel-hourly-2014062315
a1a9b33 Merge 'spi/fix/qup' into devel-hourly-2014062315
45bf5db Merge 'renesas/next' into devel-hourly-2014062315
fb20fce Merge 'asoc/for-next' into devel-hourly-2014062315
a385be7 Merge 'asoc/topic/samsung' into devel-hourly-2014062315
1a9f804 0day base guard for 'devel-hourly-2014062315'
a497c3b Linux 3.16-rc2
test case: brickland3/vm-scalability/300s-anon-rx-seq-mt
db47b74b78e1623 7c66b15f8704703f6861faa24
--------------- -------------------------
0.73 ~ 7% -13.4% 0.64 ~ 6% TOTAL turbostat.%c1
1499 ~ 5% +11.6% 1672 ~ 2% TOTAL slabinfo.sock_inode_cache.num_objs
1499 ~ 5% +11.6% 1672 ~ 2% TOTAL slabinfo.sock_inode_cache.active_objs
4.36 ~ 1% -8.1% 4.01 ~ 1% TOTAL turbostat.%c6
11782 ~ 1% +8.0% 12726 ~ 2% TOTAL time.involuntary_context_switches
2848 ~ 0% +4.8% 2983 ~ 0% TOTAL time.user_time
24.26 ~ 0% +4.4% 25.33 ~ 0% TOTAL time.elapsed_time
402 ~ 0% +2.0% 410 ~ 0% TOTAL turbostat.Cor_W
469 ~ 0% +1.8% 477 ~ 0% TOTAL turbostat.Pkg_W
c8bb7487275a9b7 7c66b15f8704703f6861faa24
--------------- -------------------------
2.047e+08 ~ 0% -5.2% 1.942e+08 ~ 0% TOTAL vm-scalability.throughput
58629 ~11% +25.1% 73348 ~13% TOTAL numa-numastat.node3.local_node
58670 ~11% +25.1% 73387 ~13% TOTAL numa-numastat.node3.numa_hit
41 ~ 6% -12.6% 36 ~ 1% TOTAL numa-numastat.node2.other_node
493777 ~ 2% -12.7% 431077 ~ 3% TOTAL proc-vmstat.pgfault
386476 ~ 1% -10.1% 347456 ~ 2% TOTAL proc-vmstat.pgalloc_normal
366616 ~ 2% -9.6% 331587 ~ 2% TOTAL proc-vmstat.numa_hit
366489 ~ 2% -9.6% 331475 ~ 2% TOTAL proc-vmstat.numa_local
11970 ~ 2% +6.3% 12726 ~ 2% TOTAL time.involuntary_context_switches
2829 ~ 0% +5.5% 2983 ~ 0% TOTAL time.user_time
24.02 ~ 0% +5.5% 25.33 ~ 0% TOTAL time.elapsed_time
402 ~ 0% +2.0% 410 ~ 0% TOTAL turbostat.Cor_W
469 ~ 0% +1.7% 477 ~ 0% TOTAL turbostat.Pkg_W
Legend:
~XX% - stddev percent
[+-]XX% - change percent
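As a rough illustration of how the two legend columns could be derived from raw samples, here is a minimal Python sketch. The helper names (stddev_percent, change_percent) and the use of the population standard deviation are assumptions made for illustration; they are not taken from the actual LKP scripts. The tables appear to be computed from unrounded means, so recomputing from the rounded values shown can differ slightly (469 -> 477 W works out to about +1.7%).

```python
from statistics import mean, pstdev


def stddev_percent(samples):
    """Standard deviation of the samples, as a percentage of their mean.

    Assumption: population stddev; the real tool may use a different estimator.
    """
    return 100.0 * pstdev(samples) / mean(samples)


def change_percent(base_mean, new_mean):
    """Relative change from the base commit to the tested commit, in percent."""
    return 100.0 * (new_mean - base_mean) / base_mean


if __name__ == "__main__":
    # Rounded Pkg_W means taken from the tables above.
    print(f"{change_percent(469, 477):+.1f}%")
```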
[Plot: time.user_time per sample. Bisect-good samples (*) lie at roughly 2820-2860; bisect-bad samples (O) lie at roughly 2960-2980.]
[Plot: time.elapsed_time per sample. Bisect-good samples (*) lie at roughly 24-24.5; bisect-bad samples (O) lie at roughly 25-25.5, with one at 28.5.]
[Plot: vm-scalability.throughput per sample. Bisect-good samples (*) lie at roughly 2.03e+08-2.06e+08; bisect-bad samples (O) lie at roughly 1.94e+08-1.96e+08.]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Fengguang
* Re: [rcu] 7c66b15f870: +1.8% turbostat.Pkg_W
From: Paul E. McKenney @ 2014-06-26 2:10 UTC
To: lkp
On Thu, Jun 26, 2014 at 09:33:08AM +0800, Fengguang Wu wrote:
> Hi Paul,
>
> We noticed increased power consumption in our internal merge commit.
> It's not direct evidence, but we hope you can spot some
> obvious clues. :)
Hello, Fengguang,
This particular branch is obsolete, and has been replaced by commit
4a81e8328d37 (Reduce overhead of cond_resched() checks for RCU).
Just out of curiosity, how do you measure the power consumption?
Thanx, Paul
* Re: [rcu] 7c66b15f870: +1.8% turbostat.Pkg_W
From: Fengguang Wu @ 2014-06-26 2:18 UTC
To: lkp
On Wed, Jun 25, 2014 at 07:10:23PM -0700, Paul E. McKenney wrote:
> On Thu, Jun 26, 2014 at 09:33:08AM +0800, Fengguang Wu wrote:
> > Hi Paul,
> >
> > We noticed increased power consumption in our internal merge commit.
> > It's not direct evidence, but we hope you can spot some
> > obvious clues. :)
>
> Hello, Fengguang,
>
> This particular branch is obsolete, and has been replaced by commit
> 4a81e8328d37 (Reduce overhead of cond_resched() checks for RCU).
Ah great, we happen to have more direct numbers for that patch. :)
> Just out of curiosity, how do you measure the power consumption?
It's read by the turbostat tool, from CPU MSRs:
commit 889facbee3e67dbc8eb29d8ee7fd66d33a647bfc
Author: Len Brown <len.brown@intel.com>
AuthorDate: Thu Nov 8 00:48:57 2012 -0500
Commit: Len Brown <len.brown@intel.com>
CommitDate: Fri Nov 30 01:09:44 2012 -0500

    tools/power turbostat: v3.0: monitor Watts and Temperature

    Show power in Watts and temperature in Celsius
    when hardware support is present.

    Intel's Sandy Bridge and Ivy Bridge processor generations support RAPL
    (Run-Time-Average-Power-Limiting). Per the Intel SDM
    (Intel® 64 and IA-32 Architectures Software Developer Manual)
    RAPL provides hardware energy counters and power control MSRs
    (Model Specific Registers). RAPL MSRs are designed primarily
    as a method to implement power capping. However, they are useful
    for monitoring system power whether or not power capping is used.

    In addition, Turbostat now shows temperature from DTS
    (Digital Thermal Sensor) and PTM (Package Thermal Monitor) hardware,
    if present.

    As before, turbostat reads MSRs, and never writes MSRs.

    New columns are present in turbostat output:

    The Pkg_W column shows Watts for each package (socket) in the system.
    On multi-socket systems, the system summary on the 1st row shows the sum
    for all sockets together.

    The Cor_W column shows Watts due to processor cores.
    Note that Cor_W is included in Pkg_W.

    The optional GFX_W column shows Watts due to the graphics "un-core".
    Note that GFX_W is included in Pkg_W.

    The optional RAM_W column on server processors shows Watts due to DRAM DIMMs.
    As DRAM DIMMs are outside the processor package, RAM_W is not included in Pkg_W.

    The optional PKG_% and RAM_% columns on server processors show the % of time
    in the measurement interval that RAPL power limiting is in effect on the
    package and on DRAM.

    Note that the RAPL energy counters have some limitations.

    First, hardware updates the counters about once every milli-second.
    This is fine for typical turbostat measurement intervals > 1 sec.
    However, when turbostat is used to measure events that approach
    1ms, the counters are less useful.

    Second, the 32-bit energy counters are subject to wrapping.
    For example, a counter incrementing 15 micro-Joule units
    on a 130 Watt TDP server processor could (in theory)
    roll over in about 9 minutes. Turbostat detects and handles
    up to 1 counter overflow per measurement interval.
    But when the measurement interval exceeds the guaranteed
    counter range, we can't detect if more than 1 overflow occurred.
    So in this case turbostat indicates that the results are
    in question by replacing the fractional part of the Watts
    in the output with "**":

    Pkg_W Cor_W GFX_W
    3** 0** 0**

    Third, the RAPL counters are energy (Joule) counters -- they sum up
    weighted events in the package to estimate energy consumed. They are
    not analog power (Watt) meters. In practice, they tend to under-count
    because they don't cover every possible use of energy in the package.
    The accuracy of the RAPL counters will vary between product generations,
    and between SKUs in the same product generation, and with temperature.

    turbostat's -v (verbose) option now displays more power and thermal configuration
    information -- as shown on the turbostat.8 manual page.
    For example, it now displays the Package and DRAM Thermal Design Power (TDP):

    cpu0: MSR_PKG_POWER_INFO: 0x2f064001980410 (130 W TDP, RAPL 51 - 200 W, 0.045898 sec.)
    cpu0: MSR_DRAM_POWER_INFO,: 0x28025800780118 (35 W TDP, RAPL 15 - 75 W, 0.039062 sec.)
    cpu8: MSR_PKG_POWER_INFO: 0x2f064001980410 (130 W TDP, RAPL 51 - 200 W, 0.045898 sec.)
    cpu8: MSR_DRAM_POWER_INFO,: 0x28025800780118 (35 W TDP, RAPL 15 - 75 W, 0.039062 sec.)

    Signed-off-by: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.8 | 103 ++++--
 tools/power/x86/turbostat/turbostat.c | 643 ++++++++++++++++++++++++++++++++--
 2 files changed, 690 insertions(+), 56 deletions(-)
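The wrap-around limitation described in the commit message can be illustrated with a small Python sketch. This is not turbostat's code: the function names are hypothetical, and the 15 micro-Joule unit and 130 W figures are simply the example values quoted above. It estimates how long a 32-bit energy counter takes to roll over, and shows one way to turn two raw counter readings into average Watts while tolerating a single wrap.

```python
COUNTER_BITS = 32  # width of the RAPL energy counter, per the commit message


def rollover_seconds(energy_unit_joules, watts, bits=COUNTER_BITS):
    """Time for a counter of the given width to wrap at a constant power draw."""
    return (2 ** bits) * energy_unit_joules / watts


def energy_delta(prev_raw, curr_raw, bits=COUNTER_BITS):
    """Counter ticks elapsed between two reads.

    Modular subtraction gives the right answer if the counter wrapped at most
    once; two or more wraps in one interval cannot be detected this way.
    """
    return (curr_raw - prev_raw) % (1 << bits)


def average_watts(prev_raw, curr_raw, energy_unit_joules, interval_seconds):
    """Average power over the interval: Joules consumed divided by seconds."""
    joules = energy_delta(prev_raw, curr_raw) * energy_unit_joules
    return joules / interval_seconds


if __name__ == "__main__":
    seconds = rollover_seconds(15e-6, 130)
    print(f"rollover after about {seconds / 60:.1f} minutes")
```

With these example inputs the rollover time comes out a little over eight minutes, in line with the "about 9 minutes" estimate in the commit message.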
* Re: [rcu] 7c66b15f870: +1.8% turbostat.Pkg_W
From: Paul E. McKenney @ 2014-06-26 4:00 UTC
To: lkp
On Thu, Jun 26, 2014 at 10:18:09AM +0800, Fengguang Wu wrote:
> On Wed, Jun 25, 2014 at 07:10:23PM -0700, Paul E. McKenney wrote:
> > On Thu, Jun 26, 2014 at 09:33:08AM +0800, Fengguang Wu wrote:
> > > Hi Paul,
> > >
> > > We noticed increased power consumption in our internal merge commit.
> > > It's not direct evidence, but we hope you can spot some
> > > obvious clues. :)
> >
> > Hello, Fengguang,
> >
> > This particular branch is obsolete, and has been replaced by commit
> > 4a81e8328d37 (Reduce overhead of cond_resched() checks for RCU).
>
> Ah great, we happen to have more direct numbers for that patch. :)
>
> > Just out of curiosity, how do you measure the power consumption?
>
> It's read by the turbostat tool, from CPU MSRs:
OK, thank you for the info. I would trust a physical power-line
sensor more, but then again, I am picky. ;-)
Thanx, Paul