* Re: [patch v7 0/21] sched: power aware scheduling
  [not found] <1365040862-8390-1-git-send-email-alex.shi@intel.com>
@ 2013-04-11 21:02 ` Len Brown
  2013-04-12  8:46   ` Alex Shi
  0 siblings, 1 reply; 30+ messages in thread
From: Len Brown @ 2013-04-11 21:02 UTC (permalink / raw)
To: Alex Shi
Cc: mingo, peterz, tglx, akpm, arjan, bp, pjt, namhyung, efault,
    morten.rasmussen, vincent.guittot, gregkh, preeti, viresh.kumar,
    linux-kernel, len.brown, rafael.j.wysocki, jkosina, clark.williams,
    tony.luck, keescook, mgorman, riel, Linux PM list

On 04/03/2013 10:00 PM, Alex Shi wrote:

> As mentioned in the power aware scheduling proposal, power aware
> scheduling has 2 assumptions:
> 1, race to idle is helpful for power saving
> 2, fewer active sched groups will reduce cpu power consumption

linux-pm@vger.kernel.org should be cc: on Linux proposals that affect power.

> Since the patch can pack tasks into fewer groups perfectly, I just show
> some performance/power testing data here:
> =========================================
> $for ((i = 0; i < x; i++)) ; do while true; do :; done & done
>
> On my SNB laptop with 4 cores * HT: the data is avg Watts
>		powersaving	performance
> x = 8		72.9482		72.6702
> x = 4		61.2737		66.7649
> x = 2		44.8491		59.0679
> x = 1		43.225		43.0638
>
> on SNB EP machine with 2 sockets * 8 cores * HT:
>		powersaving	performance
> x = 32	393.062		395.134
> x = 16	277.438		376.152
> x = 8		209.33		272.398
> x = 4		199		238.309
> x = 2		175.245		210.739
> x = 1		174.264		173.603

The numbers above say nothing about performance, and thus don't tell
us much.  In particular, they don't tell us if reducing power by
hacking the scheduler is more or less efficient than using the
existing techniques that are already shipping, such as controlling
P-states.

> the task number keeps varying in this benchmark, 'make -j <x> vmlinux'
> on my SNB EP 2 sockets machine with 8 cores * HT:
>		powersaving		performance
> x = 2		189.416 /228 23		193.355 /209 24

Energy = Power * Time
189.416*228 = 43186.848 Joules for powersaving to retire the workload
193.355*209 = 40411.195 Joules for performance to retire the workload.

So the net effect of the 'powersaving' mode here is:
1. 228/209 = 9% performance degradation
2. 43186.848/40411.195 = 6.9% more energy to retire the workload.

These numbers suggest that this patch series simultaneously
has a negative impact on performance and energy required
to retire the workload.  Why do it?

> x = 4		215.728 /132 35		219.69 /122 37

ditto here.
8% increase in time.
6% increase in energy.

> x = 8		244.31 /75 54		252.709 /68 58

ditto here.
10% increase in time.
6% increase in energy.

> x = 16	299.915 /43 77		259.127 /58 66

Are you sure that powersave mode ran in 43 seconds
when performance mode ran in 58 seconds?

If that is true, then somewhere in this patch series
you have a _significant_ performance benefit
on this workload under these conditions!

Interestingly, powersave mode also ran at
15% higher power than performance mode.
Maybe "powersave" isn't quite the right name for it:-)

> x = 32	341.221 /35 83		323.418 /38 81

Why does this patch series have a performance impact (8%)
at x = 32?  All the processors are always busy, no?

> data explanation: 189.416 /228 23
>	189.416: average Watts during compilation
>	228: seconds (compile time)
>	23: scaled performance/watts = 1000000 / seconds / watts
> The performance value of kbuild is better at threads 16/32; that's due
> to the lazy power balance reducing context switches, and the CPU having
> more boost chance under powersaving balance.

25% is a huge difference in performance.
Can you get a performance benefit in that scenario
without having a negative performance impact
in the other scenarios?  In particular,
an 8% hit to the fully utilized case is a deal killer.

The x=16 performance change here suggests there is value
someplace in this patch series to increase performance.
However, the case that these scheduling changes are
a benefit from an energy efficiency point of view
is yet to be made.

thanks,
-Len Brown
Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 30+ messages in thread
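For reference, Len's energy arithmetic and the quoted "scaled
performance/watts" metric can be reproduced with a minimal standalone
sketch; the numbers come from the x = 2 kbuild row above, and the
helper names are illustrative, not from the patch set:

#include <stdio.h>

/* Energy (J) = average power (W) * time to retire the workload (s) */
static double energy_joules(double avg_watts, double seconds)
{
	return avg_watts * seconds;
}

/* The quoted metric: scaled performance/watts = 1000000 / seconds / watts */
static double scaled_perf_per_watt(double avg_watts, double seconds)
{
	return 1000000.0 / seconds / avg_watts;
}

int main(void)
{
	/* x = 2 kbuild row: powersaving vs performance policy */
	double e_save = energy_joules(189.416, 228);	/* 43186.848 J */
	double e_perf = energy_joules(193.355, 209);	/* 40411.195 J */

	/* 23.15 and 24.75; the table truncates these to 23 and 24 */
	printf("powersaving: %.3f J, perf/W %.2f\n",
	       e_save, scaled_perf_per_watt(189.416, 228));
	printf("performance: %.3f J, perf/W %.2f\n",
	       e_perf, scaled_perf_per_watt(193.355, 209));
	printf("powersaving energy overhead: %.1f%%\n",
	       (e_save / e_perf - 1) * 100);	/* ~6.9% */
	return 0;
}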
* Re: [patch v7 0/21] sched: power aware scheduling
  2013-04-11 21:02 ` [patch v7 0/21] sched: power aware scheduling Len Brown
@ 2013-04-12  8:46   ` Alex Shi
  2013-04-12 16:23     ` Borislav Petkov
  0 siblings, 1 reply; 30+ messages in thread
From: Alex Shi @ 2013-04-12  8:46 UTC (permalink / raw)
To: Len Brown
Cc: mingo, peterz, tglx, akpm, arjan, bp, pjt, namhyung, efault,
    morten.rasmussen, vincent.guittot, gregkh, preeti, viresh.kumar,
    linux-kernel, len.brown, rafael.j.wysocki, jkosina, clark.williams,
    tony.luck, keescook, mgorman, riel, Linux PM list

On 04/12/2013 05:02 AM, Len Brown wrote:
>> x = 16	299.915 /43 77		259.127 /58 66
> Are you sure that powersave mode ran in 43 seconds
> when performance mode ran in 58 seconds?

Thanks a lot for the comments, Len! I will do more testing with your
fspin tool. :)

Powersaving uses less time when threads = 16 or 32. The main
contribution comes from the CPU frequency boost. I disabled cpufreq
boost and found the compile time becomes similar between powersaving
and performance at 32 threads, while powersaving is slower at 16
threads. Fewer context switches from the lazy power balance should
also help.

>
> If that is true, then somewhere in this patch series
> you have a _significant_ performance benefit
> on this workload under these conditions!
>
> Interestingly, powersave mode also ran at
> 15% higher power than performance mode.
> Maybe "powersave" isn't quite the right name for it:-)

What other name would you suggest? :)

>
>> x = 32	341.221 /35 83		323.418 /38 81
> Why does this patch series have a performance impact (8%)
> at x = 32?  All the processors are always busy, no?

No, not all processors are always busy in 'make -j <x> vmlinux'.
So compile time also benefits from boost and fewer context switches.
The performance policy doesn't introduce any impact; nothing is added
on the performance policy side.

>
>> data explanation: 189.416 /228 23
>>	189.416: average Watts during compilation
>>	228: seconds (compile time)
>>	23: scaled performance/watts = 1000000 / seconds / watts
>> The performance value of kbuild is better at threads 16/32; that's due
>> to the lazy power balance reducing context switches, and the CPU having
>> more boost chance under powersaving balance.
> 25% is a huge difference in performance.
> Can you get a performance benefit in that scenario
> without having a negative performance impact
> in the other scenarios?  In particular,

I will try packing tasks based on cpu capacity, not cpu weight.

> an 8% hit to the fully utilized case is a deal killer.

That is an 8% gain for powersaving, not an 8% loss for the
performance policy. :)

>
> The x=16 performance change here suggests there is value
> someplace in this patch series to increase performance.
> However, the case that these scheduling changes are
> a benefit from an energy efficiency point of view
> is yet to be made.

-- 
Thanks
    Alex

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling 2013-04-12 8:46 ` Alex Shi @ 2013-04-12 16:23 ` Borislav Petkov 2013-04-12 16:48 ` Mike Galbraith 2013-04-14 1:28 ` Alex Shi 0 siblings, 2 replies; 30+ messages in thread From: Borislav Petkov @ 2013-04-12 16:23 UTC (permalink / raw) To: Alex Shi Cc: Len Brown, mingo, peterz, tglx, akpm, arjan, pjt, namhyung, efault, morten.rasmussen, vincent.guittot, gregkh, preeti, viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina, clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list On Fri, Apr 12, 2013 at 04:46:50PM +0800, Alex Shi wrote: > Thanks a lot for comments, Len! AFAICT, you kinda forgot to answer his most important question: > These numbers suggest that this patch series simultaneously > has a negative impact on performance and energy required > to retire the workload. Why do it? -- Regards/Gruss, Boris. Sent from a fat crate under my desk. Formatting is fine. -- ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling 2013-04-12 16:23 ` Borislav Petkov @ 2013-04-12 16:48 ` Mike Galbraith 2013-04-12 17:12 ` Borislav Petkov 2013-04-17 21:53 ` Len Brown 2013-04-14 1:28 ` Alex Shi 1 sibling, 2 replies; 30+ messages in thread From: Mike Galbraith @ 2013-04-12 16:48 UTC (permalink / raw) To: Borislav Petkov Cc: Alex Shi, Len Brown, mingo, peterz, tglx, akpm, arjan, pjt, namhyung, morten.rasmussen, vincent.guittot, gregkh, preeti, viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina, clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list On Fri, 2013-04-12 at 18:23 +0200, Borislav Petkov wrote: > On Fri, Apr 12, 2013 at 04:46:50PM +0800, Alex Shi wrote: > > Thanks a lot for comments, Len! > > AFAICT, you kinda forgot to answer his most important question: > > > These numbers suggest that this patch series simultaneously > > has a negative impact on performance and energy required > > to retire the workload. Why do it? Hm. When I tested AIM7 compute on a NUMA box, there was a marked throughput increase at the low to moderate load end of the test spectrum IIRC. Fully repeatable. There were also other benefits unrelated to power, ie mitigation of the evil face of select_idle_sibling(). I rather liked what I saw during ~big box test-drive. (just saying there are other aspects besides joules in there) -Mike ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling 2013-04-12 16:48 ` Mike Galbraith @ 2013-04-12 17:12 ` Borislav Petkov 2013-04-14 1:36 ` Alex Shi 2013-04-17 21:53 ` Len Brown 1 sibling, 1 reply; 30+ messages in thread From: Borislav Petkov @ 2013-04-12 17:12 UTC (permalink / raw) To: Mike Galbraith Cc: Alex Shi, Len Brown, mingo, peterz, tglx, akpm, arjan, pjt, namhyung, morten.rasmussen, vincent.guittot, gregkh, preeti, viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina, clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list On Fri, Apr 12, 2013 at 06:48:31PM +0200, Mike Galbraith wrote: > (just saying there are other aspects besides joules in there) Yeah, but we don't allow any regressions in sched*, do we? Can we pick only the good cherries? :-) -- Regards/Gruss, Boris. Sent from a fat crate under my desk. Formatting is fine. -- ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling
  2013-04-12 17:12 ` Borislav Petkov
@ 2013-04-14  1:36   ` Alex Shi
  0 siblings, 0 replies; 30+ messages in thread
From: Alex Shi @ 2013-04-14  1:36 UTC (permalink / raw)
To: Borislav Petkov
Cc: Mike Galbraith, Len Brown, mingo, peterz, tglx, akpm, arjan, pjt,
    namhyung, morten.rasmussen, vincent.guittot, gregkh, preeti,
    viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina,
    clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list

On 04/13/2013 01:12 AM, Borislav Petkov wrote:
> On Fri, Apr 12, 2013 at 06:48:31PM +0200, Mike Galbraith wrote:
>> (just saying there are other aspects besides joules in there)
>
> Yeah, but we don't allow any regressions in sched*, do we? Can we pick
> only the good cherries? :-)
>

Thanks for all the discussion on this thread. :)

I think we can bear a small power-efficiency loss when we want power
saving.

For the second question: the performance increase comes from the CPU
boost feature. As the hardware defines it, if some cores in a CPU
socket are idle, the other cores have more chance to boost to a higher
frequency. The task packing tries to pack tasks so that more cores are
left idle.

The difficulty in merging this feature into the current performance
policy is that the current balance policy tries to give each task as
many CPU resources as possible, which directly conflicts with the
condition needed for CPU boost.

-- 
Thanks
    Alex

^ permalink raw reply	[flat|nested] 30+ messages in thread
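A minimal sketch of the packing idea described above, assuming a
simplified model in which each sched group reports a utilization
estimate and a capacity-derived threshold; the struct and function
names are illustrative, not the patch set's actual API:

#include <stddef.h>

/* Illustrative only: one sched group's utilization estimate and the
 * packing threshold derived from its compute capacity. */
struct group_stats {
	unsigned int util;	/* summed utilization of the group's runqueues */
	unsigned int threshold;	/* capacity the group can offer fair tasks */
};

/*
 * Powersaving placement: put a waking task into the first group that
 * still has room below its threshold, so the remaining groups stay
 * fully idle -- their cores can sleep, and busy cores elsewhere get
 * more turbo headroom.  Return -1 to say "no room anywhere, let the
 * performance-oriented balancer take over".
 */
static int pick_packing_group(const struct group_stats *groups, size_t n,
			      unsigned int task_util)
{
	for (size_t i = 0; i < n; i++)
		if (groups[i].util + task_util <= groups[i].threshold)
			return (int)i;
	return -1;
}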
* Re: [patch v7 0/21] sched: power aware scheduling 2013-04-12 16:48 ` Mike Galbraith 2013-04-12 17:12 ` Borislav Petkov @ 2013-04-17 21:53 ` Len Brown 2013-04-18 1:51 ` Mike Galbraith 2013-04-26 15:11 ` Mike Galbraith 1 sibling, 2 replies; 30+ messages in thread From: Len Brown @ 2013-04-17 21:53 UTC (permalink / raw) To: Mike Galbraith Cc: Borislav Petkov, Alex Shi, mingo, peterz, tglx, akpm, arjan, pjt, namhyung, morten.rasmussen, vincent.guittot, gregkh, preeti, viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina, clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list On 04/12/2013 12:48 PM, Mike Galbraith wrote: > On Fri, 2013-04-12 at 18:23 +0200, Borislav Petkov wrote: >> On Fri, Apr 12, 2013 at 04:46:50PM +0800, Alex Shi wrote: >>> Thanks a lot for comments, Len! >> >> AFAICT, you kinda forgot to answer his most important question: >> >>> These numbers suggest that this patch series simultaneously >>> has a negative impact on performance and energy required >>> to retire the workload. Why do it? > > Hm. When I tested AIM7 compute on a NUMA box, there was a marked > throughput increase at the low to moderate load end of the test spectrum > IIRC. Fully repeatable. There were also other benefits unrelated to > power, ie mitigation of the evil face of select_idle_sibling(). I > rather liked what I saw during ~big box test-drive. > > (just saying there are other aspects besides joules in there) Mike, Can you re-run your AIM7 measurement with turbo-mode and HT-mode disabled, and then independently re-enable them? If you still see the performance benefit, then that proves that the scheduler hacks are not about tricking into turbo mode, but something else. If the performance gains *are* about interactions with turbo-mode, then perhaps what we should really be doing here is making the scheduler explicitly turbo-aware? Of course, that begs the question of how the scheduler should be aware of cpufreq in general... thanks, Len Brown, Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling 2013-04-17 21:53 ` Len Brown @ 2013-04-18 1:51 ` Mike Galbraith 2013-04-26 15:11 ` Mike Galbraith 1 sibling, 0 replies; 30+ messages in thread From: Mike Galbraith @ 2013-04-18 1:51 UTC (permalink / raw) To: Len Brown Cc: Borislav Petkov, Alex Shi, mingo, peterz, tglx, akpm, arjan, pjt, namhyung, morten.rasmussen, vincent.guittot, gregkh, preeti, viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina, clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list On Wed, 2013-04-17 at 17:53 -0400, Len Brown wrote: > On 04/12/2013 12:48 PM, Mike Galbraith wrote: > > On Fri, 2013-04-12 at 18:23 +0200, Borislav Petkov wrote: > >> On Fri, Apr 12, 2013 at 04:46:50PM +0800, Alex Shi wrote: > >>> Thanks a lot for comments, Len! > >> > >> AFAICT, you kinda forgot to answer his most important question: > >> > >>> These numbers suggest that this patch series simultaneously > >>> has a negative impact on performance and energy required > >>> to retire the workload. Why do it? > > > > Hm. When I tested AIM7 compute on a NUMA box, there was a marked > > throughput increase at the low to moderate load end of the test spectrum > > IIRC. Fully repeatable. There were also other benefits unrelated to > > power, ie mitigation of the evil face of select_idle_sibling(). I > > rather liked what I saw during ~big box test-drive. > > > > (just saying there are other aspects besides joules in there) > > Mike, > > Can you re-run your AIM7 measurement with turbo-mode and HT-mode disabled, > and then independently re-enable them? Unfortunately no, because I don't have remote access to buttons. > If you still see the performance benefit, then that proves > that the scheduler hacks are not about tricking into > turbo mode, but something else. Yeah, turbo playing a role in that makes lots of sense. Someone else will have to test that though. It was 100% repeatable, so should be easy to verify. -Mike ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling 2013-04-17 21:53 ` Len Brown 2013-04-18 1:51 ` Mike Galbraith @ 2013-04-26 15:11 ` Mike Galbraith 2013-04-30 5:16 ` Mike Galbraith 1 sibling, 1 reply; 30+ messages in thread From: Mike Galbraith @ 2013-04-26 15:11 UTC (permalink / raw) To: Len Brown Cc: Borislav Petkov, Alex Shi, mingo, peterz, tglx, akpm, arjan, pjt, namhyung, morten.rasmussen, vincent.guittot, gregkh, preeti, viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina, clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list On Wed, 2013-04-17 at 17:53 -0400, Len Brown wrote: > On 04/12/2013 12:48 PM, Mike Galbraith wrote: > > On Fri, 2013-04-12 at 18:23 +0200, Borislav Petkov wrote: > >> On Fri, Apr 12, 2013 at 04:46:50PM +0800, Alex Shi wrote: > >>> Thanks a lot for comments, Len! > >> > >> AFAICT, you kinda forgot to answer his most important question: > >> > >>> These numbers suggest that this patch series simultaneously > >>> has a negative impact on performance and energy required > >>> to retire the workload. Why do it? > > > > Hm. When I tested AIM7 compute on a NUMA box, there was a marked > > throughput increase at the low to moderate load end of the test spectrum > > IIRC. Fully repeatable. There were also other benefits unrelated to > > power, ie mitigation of the evil face of select_idle_sibling(). I > > rather liked what I saw during ~big box test-drive. > > > > (just saying there are other aspects besides joules in there) > > Mike, > > Can you re-run your AIM7 measurement with turbo-mode and HT-mode disabled, > and then independently re-enable them? > > If you still see the performance benefit, then that proves > that the scheduler hacks are not about tricking into > turbo mode, but something else. I did that today, neither turbo nor HT affected the performance gain. I used the same box and patch set as tested before (v4), but plugged into linus HEAD. "powersaving" AIM7 numbers are ~identical to those I posted before, "performance" is lower at the low end of AIM7 test spectrum, but as before, delta goes away once the load becomes hefty. -Mike ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling 2013-04-26 15:11 ` Mike Galbraith @ 2013-04-30 5:16 ` Mike Galbraith 2013-04-30 8:30 ` Mike Galbraith 0 siblings, 1 reply; 30+ messages in thread From: Mike Galbraith @ 2013-04-30 5:16 UTC (permalink / raw) To: Len Brown Cc: Borislav Petkov, Alex Shi, mingo, peterz, tglx, akpm, arjan, pjt, namhyung, morten.rasmussen, vincent.guittot, gregkh, preeti, viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina, clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list On Fri, 2013-04-26 at 17:11 +0200, Mike Galbraith wrote: > On Wed, 2013-04-17 at 17:53 -0400, Len Brown wrote: > > On 04/12/2013 12:48 PM, Mike Galbraith wrote: > > > On Fri, 2013-04-12 at 18:23 +0200, Borislav Petkov wrote: > > >> On Fri, Apr 12, 2013 at 04:46:50PM +0800, Alex Shi wrote: > > >>> Thanks a lot for comments, Len! > > >> > > >> AFAICT, you kinda forgot to answer his most important question: > > >> > > >>> These numbers suggest that this patch series simultaneously > > >>> has a negative impact on performance and energy required > > >>> to retire the workload. Why do it? > > > > > > Hm. When I tested AIM7 compute on a NUMA box, there was a marked > > > throughput increase at the low to moderate load end of the test spectrum > > > IIRC. Fully repeatable. There were also other benefits unrelated to > > > power, ie mitigation of the evil face of select_idle_sibling(). I > > > rather liked what I saw during ~big box test-drive. > > > > > > (just saying there are other aspects besides joules in there) > > > > Mike, > > > > Can you re-run your AIM7 measurement with turbo-mode and HT-mode disabled, > > and then independently re-enable them? > > > > If you still see the performance benefit, then that proves > > that the scheduler hacks are not about tricking into > > turbo mode, but something else. > > I did that today, neither turbo nor HT affected the performance gain. I > used the same box and patch set as tested before (v4), but plugged into > linus HEAD. "powersaving" AIM7 numbers are ~identical to those I posted > before, "performance" is lower at the low end of AIM7 test spectrum, but > as before, delta goes away once the load becomes hefty. Well now, that's not exactly what I expected to see for AIM7 compute. Filesystem is munching cycles otherwise used for compute when load is spread across the whole box vs consolidated. 
performance

 PerfTop: 35 irqs/sec  kernel:94.3%  exact: 0.0% [1000Hz cycles],  (all, 80 CPUs)
----------------------------------------------------------------------------------

 samples  pcnt  function                        DSO
 _______  _____ ______________________________  ________________________________________

 9367.00  15.5% jbd2_journal_put_journal_head   /lib/modules/3.9.0-default/build/vmlinux
 7658.00  12.7% jbd2_journal_add_journal_head   /lib/modules/3.9.0-default/build/vmlinux
 7042.00  11.7% jbd2_journal_grab_journal_head  /lib/modules/3.9.0-default/build/vmlinux
 4433.00   7.4% sieve                           /abuild/mike/aim7/multitask
 3248.00   5.4% jbd_lock_bh_state               /lib/modules/3.9.0-default/build/vmlinux
 3034.00   5.0% do_get_write_access             /lib/modules/3.9.0-default/build/vmlinux
 2058.00   3.4% mul_double                      /abuild/mike/aim7/multitask
 2038.00   3.4% add_double                      /abuild/mike/aim7/multitask
 1365.00   2.3% native_write_msr_safe           /lib/modules/3.9.0-default/build/vmlinux
 1333.00   2.2% __find_get_block                /lib/modules/3.9.0-default/build/vmlinux
 1213.00   2.0% add_long                        /abuild/mike/aim7/multitask
 1208.00   2.0% add_int                         /abuild/mike/aim7/multitask
 1084.00   1.8% __wait_on_bit_lock              /lib/modules/3.9.0-default/build/vmlinux
 1065.00   1.8% div_double                      /abuild/mike/aim7/multitask
  901.00   1.5% intel_idle                      /lib/modules/3.9.0-default/build/vmlinux
  812.00   1.3% _raw_spin_lock_irqsave          /lib/modules/3.9.0-default/build/vmlinux
  559.00   0.9% jbd2_journal_dirty_metadata     /lib/modules/3.9.0-default/build/vmlinux
  464.00   0.8% copy_user_generic_string        /lib/modules/3.9.0-default/build/vmlinux
  455.00   0.8% div_int                         /abuild/mike/aim7/multitask
  430.00   0.7% string_rtns_1                   /abuild/mike/aim7/multitask
  419.00   0.7% strncat                         /lib64/libc-2.11.3.so
  412.00   0.7% wake_bit_function               /lib/modules/3.9.0-default/build/vmlinux
  347.00   0.6% jbd2_journal_cancel_revoke      /lib/modules/3.9.0-default/build/vmlinux
  346.00   0.6% ext4_mark_iloc_dirty            /lib/modules/3.9.0-default/build/vmlinux
  306.00   0.5% __brelse                        /lib/modules/3.9.0-default/build/vmlinux

powersaving

 PerfTop: 59 irqs/sec  kernel:78.0%  exact: 0.0% [1000Hz cycles],  (all, 80 CPUs)
----------------------------------------------------------------------------------

 samples  pcnt  function                        DSO
 _______  _____ ______________________________  ________________________________________

 6383.00  22.5% sieve                           /abuild/mike/aim7/multitask
 2380.00   8.4% mul_double                      /abuild/mike/aim7/multitask
 2375.00   8.4% add_double                      /abuild/mike/aim7/multitask
 1678.00   5.9% add_long                        /abuild/mike/aim7/multitask
 1633.00   5.8% add_int                         /abuild/mike/aim7/multitask
 1338.00   4.7% div_double                      /abuild/mike/aim7/multitask
  770.00   2.7% strncat                         /lib64/libc-2.11.3.so
  698.00   2.5% string_rtns_1                   /abuild/mike/aim7/multitask
  678.00   2.4% copy_user_generic_string        /lib/modules/3.9.0-default/build/vmlinux
  569.00   2.0% div_int                         /abuild/mike/aim7/multitask
  329.00   1.2% jbd2_journal_put_journal_head   /lib/modules/3.9.0-default/build/vmlinux
  306.00   1.1% array_rtns                      /abuild/mike/aim7/multitask
  298.00   1.1% do_get_write_access             /lib/modules/3.9.0-default/build/vmlinux
  270.00   1.0% jbd2_journal_add_journal_head   /lib/modules/3.9.0-default/build/vmlinux
  258.00   0.9% _int_malloc                     /lib64/libc-2.11.3.so
  251.00   0.9% __find_get_block                /lib/modules/3.9.0-default/build/vmlinux
  236.00   0.8% __memset                        /lib/modules/3.9.0-default/build/vmlinux
  224.00   0.8% jbd2_journal_grab_journal_head  /lib/modules/3.9.0-default/build/vmlinux
  221.00   0.8% intel_idle                      /lib/modules/3.9.0-default/build/vmlinux
  161.00   0.6% jbd_lock_bh_state               /lib/modules/3.9.0-default/build/vmlinux
  161.00   0.6% start_this_handle               /lib/modules/3.9.0-default/build/vmlinux
  153.00   0.5% __GI_memset                     /lib64/libc-2.11.3.so
  147.00   0.5% ext4_do_update_inode            /lib/modules/3.9.0-default/build/vmlinux
  135.00   0.5% jbd2_journal_stop               /lib/modules/3.9.0-default/build/vmlinux
  123.00   0.4% jbd2_journal_dirty_metadata     /lib/modules/3.9.0-default/build/vmlinux

performance
procs -----------memory---------- ---swap-- -----io---- -system-- -----cpu------
 r  b  swpd     free   buff  cache  si so bi   bo    in     cs us sy  id wa st
14  7  0  47716456 255124 674808   0  0  0    0   6183  93733  1  3  95  1  0
 0  0  0  47791912 255152 602068   0  0  0 2671  14526  49606  2  2  94  1  0
 1  0  0  47794384 255152 603796   0  0  0    0     68    111  0  0 100  0  0
 8  6  0  47672340 255156 730040   0  0  0    0  36249 103961  2  8  86  4  0
 0  0  0  47793976 255216 604616   0  0  0 2686   5322   6379  2  1  97  0  0
 0  0  0  47799128 255216 603108   0  0  0    0     62    106  0  0 100  0  0
 3  0  0  47795972 255300 603136   0  0  0 2626  39115 146228  3  5  88  3  0
 0  0  0  47797176 255300 603284   0  0  0   43    128    216  0  0 100  0  0
 0  0  0  47803244 255300 602580   0  0  0    0     78    124  0  0 100  0  0
 0  0  0  47789120 255336 603940   0  0  0 2676  14085  85798  3  3  92  1  0

powersaving
 0  0  0  47820780 255516 590292   0  0  0   31     81    126  0  0 100  0  0
 0  0  0  47823712 255516 589376   0  0  0    0    107    190  0  0 100  0  0
 0  0  0  47826608 255516 588060   0  0  0    0     76    130  0  0 100  0  0
 0  0  0  47811260 255632 602080   0  0  0 2678    106    200  0  0 100  0  0
 0  0  0  47812548 255632 601892   0  0  0    0     69    110  0  0 100  0  0
 0  0  0  47808284 255680 604400   0  0  0 2668   1588   3451  4  2  94  0  0
 0  0  0  47810300 255680 603624   0  0  0    0     77    124  0  0 100  0  0
20  3  0  47760764 255720 643744   0  0  1    0    948   2817  2  1  97  0  0
 0  0  0  47817828 255756 602400   0  0  1 2703    984    797  2  0  98  0  0
 0  0  0  47819548 255756 602532   0  0  0    0     93    158  0  0 100  0  0
 1  0  0  47819312 255792 603080   0  0  0 2661   1774   3348  4  2  94  0  0
 0  0  0  47821912 255800 602608   0  0  0    2     66    107  0  0 100  0  0

Invisible ink is pretty expensive stuff.

-Mike

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling 2013-04-30 5:16 ` Mike Galbraith @ 2013-04-30 8:30 ` Mike Galbraith 2013-04-30 8:41 ` Ingo Molnar 0 siblings, 1 reply; 30+ messages in thread From: Mike Galbraith @ 2013-04-30 8:30 UTC (permalink / raw) To: Len Brown Cc: Borislav Petkov, Alex Shi, mingo, peterz, tglx, akpm, arjan, pjt, namhyung, morten.rasmussen, vincent.guittot, gregkh, preeti, viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina, clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list On Tue, 2013-04-30 at 07:16 +0200, Mike Galbraith wrote: > Well now, that's not exactly what I expected to see for AIM7 compute. > Filesystem is munching cycles otherwise used for compute when load is > spread across the whole box vs consolidated. So AIM7 compute performance delta boils down to: powersaving stacks tasks, so they pat single bit of spinning rust sequentially/gently. -Mike ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling 2013-04-30 8:30 ` Mike Galbraith @ 2013-04-30 8:41 ` Ingo Molnar 2013-04-30 9:35 ` Mike Galbraith 0 siblings, 1 reply; 30+ messages in thread From: Ingo Molnar @ 2013-04-30 8:41 UTC (permalink / raw) To: Mike Galbraith Cc: Len Brown, Borislav Petkov, Alex Shi, mingo, peterz, tglx, akpm, arjan, pjt, namhyung, morten.rasmussen, vincent.guittot, gregkh, preeti, viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina, clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list * Mike Galbraith <bitbucket@online.de> wrote: > On Tue, 2013-04-30 at 07:16 +0200, Mike Galbraith wrote: > > > Well now, that's not exactly what I expected to see for AIM7 compute. > > Filesystem is munching cycles otherwise used for compute when load is > > spread across the whole box vs consolidated. > > So AIM7 compute performance delta boils down to: powersaving stacks > tasks, so they pat single bit of spinning rust sequentially/gently. So AIM7 with real block IO improved, due to sequentiality. Does it improve if AIM7 works on an SSD, or into ramdisk? Which are the workloads where 'powersaving' mode hurts workload performance measurably? Thanks, Ingo ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling
  2013-04-30  8:41 ` Ingo Molnar
@ 2013-04-30  9:35   ` Mike Galbraith
  2013-04-30  9:49     ` Mike Galbraith
  0 siblings, 1 reply; 30+ messages in thread
From: Mike Galbraith @ 2013-04-30  9:35 UTC (permalink / raw)
To: Ingo Molnar
Cc: Len Brown, Borislav Petkov, Alex Shi, mingo, peterz, tglx, akpm,
    arjan, pjt, namhyung, morten.rasmussen, vincent.guittot, gregkh,
    preeti, viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki,
    jkosina, clark.williams, tony.luck, keescook, mgorman, riel,
    Linux PM list

On Tue, 2013-04-30 at 10:41 +0200, Ingo Molnar wrote:
> * Mike Galbraith <bitbucket@online.de> wrote:
>
>> On Tue, 2013-04-30 at 07:16 +0200, Mike Galbraith wrote:
>>
>>> Well now, that's not exactly what I expected to see for AIM7 compute.
>>> Filesystem is munching cycles otherwise used for compute when load is
>>> spread across the whole box vs consolidated.
>>
>> So AIM7 compute performance delta boils down to: powersaving stacks
>> tasks, so they pat single bit of spinning rust sequentially/gently.
>
> So AIM7 with real block IO improved, due to sequentiality. Does it improve
> if AIM7 works on an SSD, or into ramdisk?

Seriously doubt it, but I suppose I can try tmpfs.

performance
Tasks   jobs/min  jti  jobs/min/task    real    cpu
   20   11170.51   99      558.5253   10.85   15.19   Tue Apr 30 11:21:46 2013
   20   11078.61   99      553.9305   10.94   15.59   Tue Apr 30 11:21:57 2013
   20   11191.14   99      559.5568   10.83   15.29   Tue Apr 30 11:22:08 2013

powersaving
Tasks   jobs/min  jti  jobs/min/task    real    cpu
   20   10978.26   99      548.9130   11.04   19.25   Tue Apr 30 11:22:38 2013
   20   10988.21   99      549.4107   11.03   18.71   Tue Apr 30 11:22:49 2013
   20   11008.17   99      550.4087   11.01   18.85   Tue Apr 30 11:23:00 2013

Nope.

> Which are the workloads where 'powersaving' mode hurts workload
> performance measurably?

Well, it'll lose throughput any time there's parallel execution
potential but it's serialized instead.. using average will inevitably
stack tasks sometimes, but that's its goal.  Hackbench shows it.

performance
monteverdi:/abuild/mike/aim7/:[0]# hackbench -l 1000
Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks)
Each sender will pass 1000 messages of 100 bytes
Time: 0.487
monteverdi:/abuild/mike/aim7/:[0]# hackbench -l 1000
Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks)
Each sender will pass 1000 messages of 100 bytes
Time: 0.487
monteverdi:/abuild/mike/aim7/:[0]# hackbench -l 1000
Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks)
Each sender will pass 1000 messages of 100 bytes
Time: 0.497

powersaving
monteverdi:/abuild/mike/aim7/:[0]# hackbench -l 1000
Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks)
Each sender will pass 1000 messages of 100 bytes
Time: 0.702
monteverdi:/abuild/mike/aim7/:[0]# hackbench -l 1000
Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks)
Each sender will pass 1000 messages of 100 bytes
Time: 0.679
monteverdi:/abuild/mike/aim7/:[0]# hackbench -l 1000
Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks)
Each sender will pass 1000 messages of 100 bytes
Time: 1.137

-Mike

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling 2013-04-30 9:35 ` Mike Galbraith @ 2013-04-30 9:49 ` Mike Galbraith 2013-04-30 9:56 ` Mike Galbraith 0 siblings, 1 reply; 30+ messages in thread From: Mike Galbraith @ 2013-04-30 9:49 UTC (permalink / raw) To: Ingo Molnar Cc: Len Brown, Borislav Petkov, Alex Shi, mingo, peterz, tglx, akpm, arjan, pjt, namhyung, morten.rasmussen, vincent.guittot, gregkh, preeti, viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina, clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list On Tue, 2013-04-30 at 11:35 +0200, Mike Galbraith wrote: > On Tue, 2013-04-30 at 10:41 +0200, Ingo Molnar wrote: > > Which are the workloads where 'powersaving' mode hurts workload > > performance measurably? > > Well, it'll lose throughput any time there's parallel execution > potential but it's serialized instead.. using average will inevitably > stack tasks sometimes, but that's its goal. Hackbench shows it. (but that consolidation can be a winner too, and I bet a nickle it would be for a socket sized pgbench run) ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling 2013-04-30 9:49 ` Mike Galbraith @ 2013-04-30 9:56 ` Mike Galbraith 2013-05-17 8:06 ` Preeti U Murthy 0 siblings, 1 reply; 30+ messages in thread From: Mike Galbraith @ 2013-04-30 9:56 UTC (permalink / raw) To: Ingo Molnar Cc: Len Brown, Borislav Petkov, Alex Shi, mingo, peterz, tglx, akpm, arjan, pjt, namhyung, morten.rasmussen, vincent.guittot, gregkh, preeti, viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina, clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list On Tue, 2013-04-30 at 11:49 +0200, Mike Galbraith wrote: > On Tue, 2013-04-30 at 11:35 +0200, Mike Galbraith wrote: > > On Tue, 2013-04-30 at 10:41 +0200, Ingo Molnar wrote: > > > > Which are the workloads where 'powersaving' mode hurts workload > > > performance measurably? > > > > Well, it'll lose throughput any time there's parallel execution > > potential but it's serialized instead.. using average will inevitably > > stack tasks sometimes, but that's its goal. Hackbench shows it. > > (but that consolidation can be a winner too, and I bet a nickle it would > be for a socket sized pgbench run) (belay that, was thinking of keeping all tasks on a single node, but it'll likely stack the whole thing on a CPU or two, if so, it'll hurt) ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling
  2013-04-30  9:56 ` Mike Galbraith
@ 2013-05-17  8:06   ` Preeti U Murthy
  2013-05-20  1:01     ` Alex Shi
  0 siblings, 1 reply; 30+ messages in thread
From: Preeti U Murthy @ 2013-05-17  8:06 UTC (permalink / raw)
To: Mike Galbraith
Cc: Ingo Molnar, Len Brown, Borislav Petkov, Alex Shi, mingo, peterz,
    tglx, akpm, arjan, pjt, namhyung, morten.rasmussen,
    vincent.guittot, gregkh, viresh.kumar, linux-kernel, len.brown,
    rafael.j.wysocki, jkosina, clark.williams, tony.luck, keescook,
    mgorman, riel, Linux PM list

On 04/30/2013 03:26 PM, Mike Galbraith wrote:
> On Tue, 2013-04-30 at 11:49 +0200, Mike Galbraith wrote:
>> On Tue, 2013-04-30 at 11:35 +0200, Mike Galbraith wrote:
>>> On Tue, 2013-04-30 at 10:41 +0200, Ingo Molnar wrote:
>>
>>>> Which are the workloads where 'powersaving' mode hurts workload
>>>> performance measurably?

I ran ebizzy on a 2 socket, 16 core, SMT 4 Power machine.
The power efficiency drops significantly with the powersaving policy of
this patch, compared to the power efficiency of the scheduler without
this patch.

The below parameters are measured relative to the default scheduler
behaviour.

A: Drop in power efficiency with the patch+powersaving policy
B: Drop in performance with the patch+powersaving policy
C: Decrease in power consumption with the patch+powersaving policy

NumThreads	A	B	C
-----------------------------------------
2		33%	36%	4%
4		31%	33%	3%
8		28%	30%	3%
16		31%	33%	4%

Each of the above runs is for 30s.

On investigating socket utilization, I found that only 1 socket was
being used during all the above threaded runs. As can be guessed, this
is due to the group_weight being considered for the threshold metric.
This stacks up tasks on a core, and further on a socket, thus
throttling them, as observed by Mike below.

I therefore think we must switch to group_capacity as the metric for
the threshold, and use only (rq->utils * nr_running) for the
group_utils calculation during non-bursty wakeup scenarios.
This way we are comparing the right quantities: the utilization of the
runqueue by the fair tasks, against the cpu capacity available to them
after the rt tasks have consumed their share.

After I made the above modification, all of the above three parameters
came out nearly null. However, I am still observing the load balancing
of the scheduler with the patch and powersaving policy enabled: it
behaves very close to the default scheduler (spreading tasks across
sockets). That also explains why there is no performance drop or gain
with the patch+powersaving policy enabled. I will look into this
observation and report back.

>>>
>>> Well, it'll lose throughput any time there's parallel execution
>>> potential but it's serialized instead.. using average will inevitably
>>> stack tasks sometimes, but that's its goal.  Hackbench shows it.
>>
>> (but that consolidation can be a winner too, and I bet a nickle it would
>> be for a socket sized pgbench run)
>
> (belay that, was thinking of keeping all tasks on a single node, but
> it'll likely stack the whole thing on a CPU or two, if so, it'll hurt)

At this point, I would like to raise one issue.
*Is the goal of the power-aware scheduler to improve the power
efficiency of the scheduler, or to accept a compromise on power
efficiency in exchange for a definite decrease in power consumption,
since it is the user who has decided to prioritise lower power
consumption over performance?*

Regards
Preeti U Murthy

^ permalink raw reply	[flat|nested] 30+ messages in thread
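A minimal sketch of the comparison Preeti proposes above, assuming
per-runqueue utilization statistics and a cpu_power value that already
has rt time deducted; the names are illustrative, not the patch set's
actual fields:

/* Illustrative per-runqueue statistics: utilization and capacity are
 * scaled so that 1024 == one fully busy CPU. */
struct rq_stats {
	unsigned int util;		/* average utilization by fair tasks */
	unsigned int nr_running;	/* fair tasks on the runqueue */
	unsigned int cpu_power;		/* capacity left after rt consumption */
};

/*
 * Threshold check along the lines Preeti suggests: compare the
 * fair-task demand (util * nr_running) against the group's capacity
 * (the sum of cpu_power), rather than against group_weight, which
 * ignores both rt pressure and SMT capacity.
 */
static int group_can_pack(const struct rq_stats *rqs, int n,
			  unsigned int waking_util)
{
	unsigned long utils = waking_util, capacity = 0;
	int i;

	for (i = 0; i < n; i++) {
		utils += (unsigned long)rqs[i].util * rqs[i].nr_running;
		capacity += rqs[i].cpu_power;
	}
	return utils <= capacity;
}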
* Re: [patch v7 0/21] sched: power aware scheduling
  2013-05-17  8:06 ` Preeti U Murthy
@ 2013-05-20  1:01   ` Alex Shi
  2013-05-20  2:30     ` Preeti U Murthy
  0 siblings, 1 reply; 30+ messages in thread
From: Alex Shi @ 2013-05-20  1:01 UTC (permalink / raw)
To: Preeti U Murthy
Cc: Mike Galbraith, Ingo Molnar, Len Brown, Borislav Petkov, mingo,
    peterz, tglx, akpm, arjan, pjt, namhyung, morten.rasmussen,
    vincent.guittot, gregkh, viresh.kumar, linux-kernel, len.brown,
    rafael.j.wysocki, jkosina, clark.williams, tony.luck, keescook,
    mgorman, riel, Linux PM list

>>>>> Which are the workloads where 'powersaving' mode hurts workload
>>>>> performance measurably?
>
> I ran ebizzy on a 2 socket, 16 core, SMT 4 Power machine.

Is this a 2 * 16 * 4 LCPUs PowerPC machine?

> The power efficiency drops significantly with the powersaving policy of
> this patch, compared to the power efficiency of the scheduler without
> this patch.
>
> The below parameters are measured relative to the default scheduler
> behaviour.
>
> A: Drop in power efficiency with the patch+powersaving policy
> B: Drop in performance with the patch+powersaving policy
> C: Decrease in power consumption with the patch+powersaving policy
>
> NumThreads	A	B	C
> -----------------------------------------
> 2		33%	36%	4%
> 4		31%	33%	3%
> 8		28%	30%	3%
> 16		31%	33%	4%
>
> Each of the above runs is for 30s.
>
> On investigating socket utilization, I found that only 1 socket was
> being used during all the above threaded runs. As can be guessed, this
> is due to the group_weight being considered for the threshold metric.
> This stacks up tasks on a core, and further on a socket, thus
> throttling them, as observed by Mike below.
>
> I therefore think we must switch to group_capacity as the metric for
> the threshold, and use only (rq->utils * nr_running) for the
> group_utils calculation during non-bursty wakeup scenarios.
> This way we are comparing the right quantities: the utilization of the
> runqueue by the fair tasks, against the cpu capacity available to them
> after the rt tasks have consumed their share.
>
> After I made the above modification, all of the above three parameters
> came out nearly null. However, I am still observing the load balancing
> of the scheduler with the patch and powersaving policy enabled: it
> behaves very close to the default scheduler (spreading tasks across
> sockets). That also explains why there is no performance drop or gain
> with the patch+powersaving policy enabled. I will look into this
> observation and report back.

Thanks a lot for the great testing!
It seems one task per SMT CPU isn't power efficient, and I got a
similar result last week. I tested with fspin (it does endless
calculation; it is in the linux-next tree): when I bound one task per
SMT CPU, the power efficiency really dropped at almost every thread
count, but when binding one task per core, it had better power
efficiency at all thread counts.
Besides moving tasks depending on group_capacity, another choice is to
balance tasks according to cpu_power. I have done that conversion in
code, but it needs to go through an internal open-source process
before I can publish it.

>
>>>>
>>>> Well, it'll lose throughput any time there's parallel execution
>>>> potential but it's serialized instead.. using average will inevitably
>>>> stack tasks sometimes, but that's its goal.  Hackbench shows it.
>>>
>>> (but that consolidation can be a winner too, and I bet a nickle it would
>>> be for a socket sized pgbench run)
>>
>> (belay that, was thinking of keeping all tasks on a single node, but
>> it'll likely stack the whole thing on a CPU or two, if so, it'll hurt)
>
> At this point, I would like to raise one issue.
> *Is the goal of the power-aware scheduler to improve the power
> efficiency of the scheduler, or to accept a compromise on power
> efficiency in exchange for a definite decrease in power consumption,
> since it is the user who has decided to prioritise lower power
> consumption over performance?*
>

It could be one reason for this feature, but I would like it to have
better efficiency, like packing tasks according to cpu_power, not the
current group_weight.

>
> Regards
> Preeti U Murthy
>

-- 
Thanks
    Alex

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling
  2013-05-20  1:01 ` Alex Shi
@ 2013-05-20  2:30   ` Preeti U Murthy
  0 siblings, 0 replies; 30+ messages in thread
From: Preeti U Murthy @ 2013-05-20  2:30 UTC (permalink / raw)
To: Alex Shi
Cc: Mike Galbraith, Ingo Molnar, Len Brown, Borislav Petkov, mingo,
    peterz, tglx, akpm, arjan, pjt, namhyung, morten.rasmussen,
    vincent.guittot, gregkh, viresh.kumar, linux-kernel, len.brown,
    rafael.j.wysocki, jkosina, clark.williams, tony.luck, keescook,
    mgorman, riel, Linux PM list

Hi Alex,

On 05/20/2013 06:31 AM, Alex Shi wrote:
>
>>>>>> Which are the workloads where 'powersaving' mode hurts workload
>>>>>> performance measurably?
>>
>> I ran ebizzy on a 2 socket, 16 core, SMT 4 Power machine.
>
> Is this a 2 * 16 * 4 LCPUs PowerPC machine?

This is a 2 * 8 * 4 LCPUs PowerPC machine.

>> The power efficiency drops significantly with the powersaving policy of
>> this patch, compared to the power efficiency of the scheduler without
>> this patch.
>>
>> The below parameters are measured relative to the default scheduler
>> behaviour.
>>
>> A: Drop in power efficiency with the patch+powersaving policy
>> B: Drop in performance with the patch+powersaving policy
>> C: Decrease in power consumption with the patch+powersaving policy
>>
>> NumThreads	A	B	C
>> -----------------------------------------
>> 2		33%	36%	4%
>> 4		31%	33%	3%
>> 8		28%	30%	3%
>> 16		31%	33%	4%
>>
>> Each of the above runs is for 30s.
>>
>> On investigating socket utilization, I found that only 1 socket was
>> being used during all the above threaded runs. As can be guessed, this
>> is due to the group_weight being considered for the threshold metric.
>> This stacks up tasks on a core, and further on a socket, thus
>> throttling them, as observed by Mike below.
>>
>> I therefore think we must switch to group_capacity as the metric for
>> the threshold, and use only (rq->utils * nr_running) for the
>> group_utils calculation during non-bursty wakeup scenarios.
>> This way we are comparing the right quantities: the utilization of the
>> runqueue by the fair tasks, against the cpu capacity available to them
>> after the rt tasks have consumed their share.
>>
>> After I made the above modification, all of the above three parameters
>> came out nearly null. However, I am still observing the load balancing
>> of the scheduler with the patch and powersaving policy enabled: it
>> behaves very close to the default scheduler (spreading tasks across
>> sockets). That also explains why there is no performance drop or gain
>> with the patch+powersaving policy enabled. I will look into this
>> observation and report back.
>
> Thanks a lot for the great testing!
> It seems one task per SMT CPU isn't power efficient, and I got a
> similar result last week. I tested with fspin (it does endless
> calculation; it is in the linux-next tree): when I bound one task per
> SMT CPU, the power efficiency really dropped at almost every thread
> count, but when binding one task per core, it had better power
> efficiency at all thread counts.
> Besides moving tasks depending on group_capacity, another choice is to
> balance tasks according to cpu_power. I have done that conversion in
> code, but it needs to go through an internal open-source process
> before I can publish it.

What do you mean by *another* choice of balancing tasks according to
cpu_power? group_capacity is based on cpu_power. Also, your balance
policy in v6 was doing the same, right? It was rightly comparing
rq->utils * nr_running against cpu_power. Why not simply switch to
that code for power-policy load balancing?
>>>>> Well, it'll lose throughput any time there's parallel execution
>>>>> potential but it's serialized instead.. using average will inevitably
>>>>> stack tasks sometimes, but that's its goal.  Hackbench shows it.
>>>>
>>>> (but that consolidation can be a winner too, and I bet a nickle it would
>>>> be for a socket sized pgbench run)
>>>
>>> (belay that, was thinking of keeping all tasks on a single node, but
>>> it'll likely stack the whole thing on a CPU or two, if so, it'll hurt)
>>
>> At this point, I would like to raise one issue.
>> *Is the goal of the power-aware scheduler to improve the power
>> efficiency of the scheduler, or to accept a compromise on power
>> efficiency in exchange for a definite decrease in power consumption,
>> since it is the user who has decided to prioritise lower power
>> consumption over performance?*
>
> It could be one reason for this feature, but I would like it to have
> better efficiency, like packing tasks according to cpu_power, not the
> current group_weight.

Yes, we could try the patch using group_capacity and observe the
results for power efficiency, before we decide to compromise on power
efficiency for a decrease in power consumption.

Regards
Preeti U Murthy

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling
  2013-04-12 16:23 ` Borislav Petkov
  2013-04-12 16:48   ` Mike Galbraith
@ 2013-04-14  1:28   ` Alex Shi
  2013-04-14  5:10     ` Alex Shi
  2013-04-14 15:59     ` Borislav Petkov
  1 sibling, 2 replies; 30+ messages in thread
From: Alex Shi @ 2013-04-14  1:28 UTC (permalink / raw)
To: Borislav Petkov
Cc: Len Brown, mingo, peterz, tglx, akpm, arjan, pjt, namhyung,
    efault, morten.rasmussen, vincent.guittot, gregkh, preeti,
    viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina,
    clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list

On 04/13/2013 12:23 AM, Borislav Petkov wrote:
> On Fri, Apr 12, 2013 at 04:46:50PM +0800, Alex Shi wrote:
>> Thanks a lot for the comments, Len!
> AFAICT, you kinda forgot to answer his most important question:
>
>> These numbers suggest that this patch series simultaneously
>> has a negative impact on performance and energy required
>> to retire the workload.  Why do it?

Even if in some scenarios the total energy costs more, at least the
average watts dropped in those scenarios. Len said he has a low
p-state which can work there, but that is different. I sent some data
to another email thread to show the difference:

The following is the result of running the kbuild test twice under 3
conditions on the SNB EP box. The middle column is the lowest-p-state
test result; we can see it has the lowest power consumption, but also
the lowest performance/watt value.
At least for the kbuild benchmark, the powersaving policy is the best
compromise between power saving and power efficiency. Furthermore, due
to the CPU boost feature, it has better performance in some scenarios.

	powersaving + ondemand	userspace + fixed 1.2GHz	performance + ondemand
x = 8	231.318 /75 57		165.063 /166 36			253.552 /63 62
x = 16	280.357 /49 72		174.408 /106 54			296.776 /41 82
x = 32	325.206 /34 90		178.675 /90 62			314.153 /37 86

x = 8	233.623 /74 57		164.507 /168 36			254.775 /65 60
x = 16	272.54 /38 96		174.364 /106 54			297.731 /42 79
x = 32	320.758 /34 91		177.917 /91 61			317.875 /35 89
x = 64	326.837 /33 92		179.037 /90 62			320.615 /36 86

-- 
Thanks
    Alex

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling
  2013-04-14  1:28 ` Alex Shi
@ 2013-04-14  5:10   ` Alex Shi
  0 siblings, 0 replies; 30+ messages in thread
From: Alex Shi @ 2013-04-14  5:10 UTC (permalink / raw)
To: Borislav Petkov
Cc: Len Brown, mingo, peterz, tglx, akpm, arjan, pjt, namhyung,
    efault, morten.rasmussen, vincent.guittot, gregkh, preeti,
    viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina,
    clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list

On 04/14/2013 09:28 AM, Alex Shi wrote:
>>>> These numbers suggest that this patch series simultaneously
>>>> has a negative impact on performance and energy required
>>>> to retire the workload.  Why do it?
>
> Even if in some scenarios the total energy costs more, at least the
> average watts dropped in those scenarios. Len said he has a low
> p-state which can work there, but that is different. I sent some data
> to another email thread to show the difference:
>
> The following is the result of running the kbuild test twice under 3
> conditions on the SNB EP box. The middle column is the lowest-p-state
> test result; we can see it has the lowest power consumption, but also
> the lowest performance/watt value.
> At least for the kbuild benchmark, the powersaving policy is the best
> compromise between power saving and power efficiency. Furthermore, due
> to the CPU boost feature, it has better performance in some scenarios.

BTW, another benefit of powersaving is that the powersaving policy is
very flexible with respect to system load: when the number of tasks in
a sched domain goes beyond the LCPU number, it switches to
performance-oriented balancing. That gives similar performance when
the system is busy.

-- 
Thanks
    Alex

^ permalink raw reply	[flat|nested] 30+ messages in thread
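A minimal sketch of the fallback Alex describes, assuming the policy
is re-derived from a sched domain's task count versus its logical CPU
count; the names are illustrative, not the patch set's actual code:

/* Illustrative: the powersaving policy only applies while the sched
 * domain is under-utilised; once there are more tasks than logical
 * CPUs, balancing falls back to the performance policy. */
enum balance_policy { BALANCE_POWERSAVING, BALANCE_PERFORMANCE };

static enum balance_policy effective_policy(unsigned int nr_tasks,
					    unsigned int nr_lcpus)
{
	return nr_tasks > nr_lcpus ? BALANCE_PERFORMANCE
				   : BALANCE_POWERSAVING;
}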
* Re: [patch v7 0/21] sched: power aware scheduling
  2013-04-14  1:28 ` Alex Shi
  2013-04-14  5:10   ` Alex Shi
@ 2013-04-14 15:59   ` Borislav Petkov
  2013-04-15  6:04     ` Alex Shi
  1 sibling, 1 reply; 30+ messages in thread
From: Borislav Petkov @ 2013-04-14 15:59 UTC (permalink / raw)
To: Alex Shi
Cc: Len Brown, mingo, peterz, tglx, akpm, arjan, pjt, namhyung,
    efault, morten.rasmussen, vincent.guittot, gregkh, preeti,
    viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina,
    clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list

On Sun, Apr 14, 2013 at 09:28:50AM +0800, Alex Shi wrote:
> Even if in some scenarios the total energy costs more, at least the
> average watts dropped in those scenarios.

Ok, what's wrong with x = 32 then? So basically, if you're looking at
avg watts, you don't want to have more than 16 threads, otherwise
powersaving sucks on that particular uarch and platform. Can you say
that for all platforms out there?

Also, I've added in the columns below the Energy = Power * Time thing.
And the funny thing is, exactly where avg watts is better in
powersaving, the energy to retire the workload is worse. And the other
way around. Basically, avg watts vs retire energy is reciprocal.
Great :-\.

> Len said he has a low p-state which can work there, but that is
> different. I sent some data to another email thread to show the
> difference:
>
> The following is the result of running the kbuild test twice under 3
> conditions on the SNB EP box. The middle column is the lowest-p-state
> test result; we can see it has the lowest power consumption, but also
> the lowest performance/watt value.
> At least for the kbuild benchmark, the powersaving policy is the best
> compromise between power saving and power efficiency. Furthermore, due
> to the CPU boost feature, it has better performance in some scenarios.
>
>	powersaving + ondemand	userspace + fixed 1.2GHz	performance + ondemand
> x = 8	231.318 /75 57		165.063 /166 36			253.552 /63 62
> x = 16	280.357 /49 72		174.408 /106 54			296.776 /41 82
> x = 32	325.206 /34 90		178.675 /90 62			314.153 /37 86
>
> x = 8	233.623 /74 57		164.507 /168 36			254.775 /65 60
> x = 16	272.54 /38 96		174.364 /106 54			297.731 /42 79
> x = 32	320.758 /34 91		177.917 /91 61			317.875 /35 89
> x = 64	326.837 /33 92		179.037 /90 62			320.615 /36 86

Energy = Power * Time, in Joules, per policy:

powersaving	fixed 1.2GHz	performance
17348.850	27400.458	15973.776
13737.493	18487.248	12167.816
11057.004	16080.750	11623.661

17288.102	27637.176	16560.375
10356.52	18482.584	12504.702
10905.772	16190.447	11125.625
10785.621	16113.330	11542.140

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling
  2013-04-14 15:59 ` Borislav Petkov
@ 2013-04-15  6:04   ` Alex Shi
  2013-04-15  6:16     ` Alex Shi
  0 siblings, 1 reply; 30+ messages in thread
From: Alex Shi @ 2013-04-15  6:04 UTC (permalink / raw)
To: Borislav Petkov
Cc: Len Brown, mingo, peterz, tglx, akpm, arjan, pjt, namhyung,
    efault, morten.rasmussen, vincent.guittot, gregkh, preeti,
    viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina,
    clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list

On 04/14/2013 11:59 PM, Borislav Petkov wrote:
> On Sun, Apr 14, 2013 at 09:28:50AM +0800, Alex Shi wrote:
>> Even if in some scenarios the total energy costs more, at least the
>> average watts dropped in those scenarios.
>
> Ok, what's wrong with x = 32 then? So basically, if you're looking at
> avg watts, you don't want to have more than 16 threads, otherwise
> powersaving sucks on that particular uarch and platform. Can you say
> that for all platforms out there?

The CPU frequency boost makes the average watts higher with x = 32,
but it also gives higher power efficiency. We can disable cpufreq
boost if we want lower power consumption at all times, but to my
understanding, power efficiency is the better way to save power.
As for other platforms, I would be glad to see anyone test this and
send me the results...

>
> Also, I've added in the columns below the Energy = Power * Time thing.

Thanks. BTW, the third number in each column is 'performance/watt';
it shows the same information from the other side. :)

>
> And the funny thing is, exactly where avg watts is better in
> powersaving, the energy to retire the workload is worse. And the other
> way around. Basically, avg watts vs retire energy is reciprocal.
> Great :-\.
>
>> Len said he has a low p-state which can work there, but that is
>> different. I sent some data to another email thread to show the
>> difference:
>>
>> The following is the result of running the kbuild test twice under 3
>> conditions on the SNB EP box. The middle column is the lowest-p-state
>> test result; we can see it has the lowest power consumption, but also
>> the lowest performance/watt value.
>> At least for the kbuild benchmark, the powersaving policy is the best
>> compromise between power saving and power efficiency. Furthermore, due
>> to the CPU boost feature, it has better performance in some scenarios.
>>
>>	powersaving + ondemand	userspace + fixed 1.2GHz	performance + ondemand
>> x = 8	231.318 /75 57		165.063 /166 36			253.552 /63 62
>> x = 16	280.357 /49 72		174.408 /106 54			296.776 /41 82
>> x = 32	325.206 /34 90		178.675 /90 62			314.153 /37 86
>>
>> x = 8	233.623 /74 57		164.507 /168 36			254.775 /65 60
>> x = 16	272.54 /38 96		174.364 /106 54			297.731 /42 79
>> x = 32	320.758 /34 91		177.917 /91 61			317.875 /35 89
>> x = 64	326.837 /33 92		179.037 /90 62			320.615 /36 86
>
> Energy = Power * Time, in Joules, per policy:
>
> powersaving	fixed 1.2GHz	performance
> 17348.850	27400.458	15973.776
> 13737.493	18487.248	12167.816
> 11057.004	16080.750	11623.661
>
> 17288.102	27637.176	16560.375
> 10356.52	18482.584	12504.702
> 10905.772	16190.447	11125.625
> 10785.621	16113.330	11542.140

-- 
Thanks
    Alex

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling
  2013-04-15  6:04 ` Alex Shi
@ 2013-04-15  6:16   ` Alex Shi
  2013-04-15  9:52     ` Borislav Petkov
  0 siblings, 1 reply; 30+ messages in thread
From: Alex Shi @ 2013-04-15  6:16 UTC (permalink / raw)
To: Borislav Petkov
Cc: Len Brown, mingo, peterz, tglx, akpm, arjan, pjt, namhyung,
    efault, morten.rasmussen, vincent.guittot, gregkh, preeti,
    viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina,
    clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list

On 04/15/2013 02:04 PM, Alex Shi wrote:
> On 04/14/2013 11:59 PM, Borislav Petkov wrote:
>> On Sun, Apr 14, 2013 at 09:28:50AM +0800, Alex Shi wrote:
>>> Even if in some scenarios the total energy costs more, at least the
>>> average watts dropped in those scenarios.
>>
>> Ok, what's wrong with x = 32 then? So basically, if you're looking at
>> avg watts, you don't want to have more than 16 threads, otherwise
>> powersaving sucks on that particular uarch and platform. Can you say
>> that for all platforms out there?
>
> The CPU frequency boost makes the average watts higher with x = 32,
> but it also gives higher power efficiency. We can disable cpufreq
> boost if we want lower power consumption at all times, but to my
> understanding, power efficiency is the better way to save power.

BTW, the lowest p-state, no frequency boost, plus this powersaving
policy will give the lowest power consumption.

And I need to say again: the powersaving policy only has an effect
when the system is under-utilised. When the system gets busy, it has
no effect; the performance-oriented policy takes over the balancing
behaviour.

-- 
Thanks
    Alex

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling
  2013-04-15  6:16 ` Alex Shi
@ 2013-04-15  9:52   ` Borislav Petkov
  2013-04-15 13:50     ` Alex Shi
  0 siblings, 1 reply; 30+ messages in thread
From: Borislav Petkov @ 2013-04-15  9:52 UTC (permalink / raw)
  To: Alex Shi
  Cc: Len Brown, mingo, peterz, tglx, akpm, arjan, pjt, namhyung,
      efault, morten.rasmussen, vincent.guittot, gregkh, preeti,
      viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina,
      clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list

On Mon, Apr 15, 2013 at 02:16:55PM +0800, Alex Shi wrote:
> And I need to say it again: the powersaving policy only takes effect
> when the system is under-utilised. When the system gets busy, it has
> no effect; the performance-oriented policy takes over the balancing
> behaviour.

And AFAIU your patches, you do this automatically, right? In which
case, an underutilized system will have switched to powersaving
balancing and will use *more* energy to retire the workload. Correct?

--
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling
  2013-04-15  9:52 ` Borislav Petkov
@ 2013-04-15 13:50   ` Alex Shi
  2013-04-15 23:12     ` Borislav Petkov
  0 siblings, 1 reply; 30+ messages in thread
From: Alex Shi @ 2013-04-15 13:50 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Len Brown, mingo, peterz, tglx, akpm, arjan, pjt, namhyung,
      efault, morten.rasmussen, vincent.guittot, gregkh, preeti,
      viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina,
      clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list

On 04/15/2013 05:52 PM, Borislav Petkov wrote:
> On Mon, Apr 15, 2013 at 02:16:55PM +0800, Alex Shi wrote:
>> And I need to say it again: the powersaving policy only takes effect
>> when the system is under-utilised. When the system gets busy, it has
>> no effect; the performance-oriented policy takes over the balancing
>> behaviour.
>
> And AFAIU your patches, you do this automatically, right?

Yes.

> In which case, an underutilized system will have switched to
> powersaving balancing and will use *more* energy to retire the
> workload. Correct?

Considering fairness and the total number of threads, powersaving costs
quite similar energy on the kbuild benchmark, and is sometimes even
better (columns as above: powersaving, fixed 1.2GHz, performance; in
Joules):

   17348.850      27400.458      15973.776
   13737.493      18487.248      12167.816
   11057.004      16080.750      11623.661

   17288.102      27637.176      16560.375
   10356.52       18482.584      12504.702
   10905.772      16190.447      11125.625
   10785.621      16113.330      11542.140

--
Thanks
    Alex

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling
  2013-04-15 13:50 ` Alex Shi
@ 2013-04-15 23:12   ` Borislav Petkov
  2013-04-16  0:22     ` Alex Shi
  0 siblings, 1 reply; 30+ messages in thread
From: Borislav Petkov @ 2013-04-15 23:12 UTC (permalink / raw)
  To: Alex Shi
  Cc: Len Brown, mingo, peterz, tglx, akpm, arjan, pjt, namhyung,
      efault, morten.rasmussen, vincent.guittot, gregkh, preeti,
      viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina,
      clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list

On Mon, Apr 15, 2013 at 09:50:22PM +0800, Alex Shi wrote:
> Considering fairness and the total number of threads, powersaving
> costs quite similar energy on the kbuild benchmark, and is sometimes
> even better:
>
>    17348.850      27400.458      15973.776
>    13737.493      18487.248      12167.816

Yeah, but those lines don't look good: powersaving needs more energy
than performance.

And what is even crazier is the fixed 1.2 GHz case. I'd guess in the
normal case those cores run at triple that freq., i.e. somewhere around
3-4 GHz. And yet, 1.2 GHz eats almost *double* the energy of
performance and powersaving.

So for the x=8 and maybe even the x=16 case we're basically better off
with performance.

Or could it be that the power measurements are not really that
accurate, and those numbers above are not really correct?

Hmm.

--
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: [patch v7 0/21] sched: power aware scheduling
  2013-04-15 23:12 ` Borislav Petkov
@ 2013-04-16  0:22   ` Alex Shi
  2013-04-16 10:24     ` Borislav Petkov
  0 siblings, 1 reply; 30+ messages in thread
From: Alex Shi @ 2013-04-16  0:22 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Len Brown, mingo, peterz, tglx, akpm, arjan, pjt, namhyung,
      efault, morten.rasmussen, vincent.guittot, gregkh, preeti,
      viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina,
      clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list

On 04/16/2013 07:12 AM, Borislav Petkov wrote:
> On Mon, Apr 15, 2013 at 09:50:22PM +0800, Alex Shi wrote:
>> Considering fairness and the total number of threads, powersaving
>> costs quite similar energy on the kbuild benchmark, and is sometimes
>> even better:
>>
>>    17348.850      27400.458      15973.776
>>    13737.493      18487.248      12167.816
>
> Yeah, but those lines don't look good: powersaving needs more energy
> than performance.
>
> And what is even crazier is the fixed 1.2 GHz case. I'd guess in the
> normal case those cores run at triple that freq., i.e. somewhere
> around 3-4 GHz. And yet, 1.2 GHz eats almost *double* the energy of
> performance and powersaving.

Yes, the max frequency is 2.7 GHz, plus boost.

> So for the x=8 and maybe even the x=16 case we're basically better off
> with performance.
>
> Or could it be that the power measurements are not really that
> accurate, and those numbers above are not really correct?

The testing has a little variation, but the power data is quite
accurate. I may change the packing to go by CPU capacity instead of the
current CPU weight; that should give a better power-efficiency value.

--
Thanks
    Alex

^ permalink raw reply	[flat|nested] 30+ messages in thread
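To illustrate the difference between the two packing criteria (a sketch
with made-up types; the actual patch series works on the scheduler's
load-balance statistics): a group's weight is its raw number of CPUs,
while its capacity reflects how many full-CPU equivalents it can really
deliver once SMT sharing, RT task time and the like are accounted for,
so packing by capacity stops filling a group earlier.

/* Illustrative only; not the data structures from the patch series. */
struct group_stats {
	unsigned int nr_running;  /* tasks currently packed into the group */
	unsigned int weight;      /* raw number of CPUs in the group       */
	unsigned int capacity;    /* full-CPU equivalents actually usable  */
};

/* Can this group accept one more task under the powersaving policy? */
static int group_has_room(const struct group_stats *gs,
			  int pack_by_capacity)
{
	unsigned int limit = pack_by_capacity ? gs->capacity : gs->weight;

	return gs->nr_running < limit;
}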
* Re: [patch v7 0/21] sched: power aware scheduling
  2013-04-16  0:22 ` Alex Shi
@ 2013-04-16 10:24   ` Borislav Petkov
  2013-04-17  1:18     ` Alex Shi
  0 siblings, 1 reply; 30+ messages in thread
From: Borislav Petkov @ 2013-04-16 10:24 UTC (permalink / raw)
  To: Alex Shi
  Cc: Len Brown, mingo, peterz, tglx, akpm, arjan, pjt, namhyung,
      efault, morten.rasmussen, vincent.guittot, gregkh, preeti,
      viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina,
      clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list

On Tue, Apr 16, 2013 at 08:22:19AM +0800, Alex Shi wrote:
> The testing has a little variation, but the power data is quite
> accurate. I may change the packing to go by CPU capacity instead of
> the current CPU weight; that should give a better power-efficiency
> value.

Yeah, this probably needs careful measuring, and by "this" I mean how
to place N tasks where N is less than the number of cores in the
system.

One strategy I can imagine is migrating them all together onto a single
physical socket (maybe even overcommitting it), so that you can flush
the caches of the cores on the other sockets, power those sockets down,
and avoid coherence traffic waking them up. My supposition here is that
putting whole unused sockets into a deep sleep state could save a lot
of power.

Or not, who knows. Only empirical measurements will show us what
actually happens.

Thanks.

--
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 30+ messages in thread
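That consolidation strategy can already be approximated from user space
with CPU affinity, e.g. confining a workload to the cores of one socket
so the other sockets can drop into deep package sleep states. A sketch
(the CPU range is an assumption; the socket-to-CPU mapping is machine
specific, see /sys/devices/system/cpu/cpu*/topology/physical_package_id):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
	cpu_set_t set;
	int cpu;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <command> [args...]\n", argv[0]);
		return 1;
	}

	/* Assume CPUs 0-7 are the cores of socket 0 on this machine. */
	CPU_ZERO(&set);
	for (cpu = 0; cpu < 8; cpu++)
		CPU_SET(cpu, &set);

	if (sched_setaffinity(0, sizeof(set), &set)) {
		perror("sched_setaffinity");
		return 1;
	}

	/* Run the workload confined to socket 0, e.g. "make -j8". */
	execvp(argv[1], &argv[1]);
	perror("execvp");
	return 1;
}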
* Re: [patch v7 0/21] sched: power aware scheduling
  2013-04-16 10:24 ` Borislav Petkov
@ 2013-04-17  1:18   ` Alex Shi
  2013-04-17  7:38     ` Borislav Petkov
  0 siblings, 1 reply; 30+ messages in thread
From: Alex Shi @ 2013-04-17  1:18 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Len Brown, mingo, peterz, tglx, akpm, arjan, pjt, namhyung,
      efault, morten.rasmussen, vincent.guittot, gregkh, preeti,
      viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina,
      clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list

On 04/16/2013 06:24 PM, Borislav Petkov wrote:
> Yeah, this probably needs careful measuring, and by "this" I mean how
> to place N tasks where N is less than the number of cores in the
> system.
>
> One strategy I can imagine is migrating them all together onto a
> single physical socket (maybe even overcommitting it), so that you
> can flush the caches of the cores on the other sockets, power those
> sockets down, and avoid coherence traffic waking them up. My
> supposition here is that putting whole unused sockets into a deep
> sleep state could save a lot of power.

Sure. Currently, even if the whole socket goes to sleep, when memory on
that node is still being accessed the CPU socket still spends some
power on the 'uncore' part. So the further step is to reduce remote
memory accesses to save more power, which is also what NUMA balancing
wants to do.

And the step after that is to detect whether the socket is cache
intensive, i.e. whether there is much cache thrashing on the node.

In theory, there is still a lot of tuning space. :)

> Or not, who knows. Only empirical measurements will show us what
> actually happens.

Sure. :)

--
Thanks
    Alex

^ permalink raw reply	[flat|nested] 30+ messages in thread
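The memory half of that can be exercised from user space today via
libnuma, e.g. pulling a process's pages over to the node its tasks were
packed onto so the other node's DRAM links and uncore can idle. A
sketch (node numbers and the helper name are assumptions for the
example; check numa_available() first and link with -lnuma):

#include <numa.h>
#include <stdio.h>

/*
 * Illustrative sketch: after packing a process's tasks onto node 0,
 * move its pages off node 1 too, so node 1 no longer has to serve
 * remote memory accesses.
 */
int pull_memory_to_node0(int pid)
{
	struct bitmask *from = numa_allocate_nodemask();
	struct bitmask *to = numa_allocate_nodemask();
	int ret;

	numa_bitmask_setbit(from, 1);	/* source: node 1 */
	numa_bitmask_setbit(to, 0);	/* target: node 0 */

	/* Returns the number of pages it could not move, -1 on error. */
	ret = numa_migrate_pages(pid, from, to);
	if (ret < 0)
		perror("numa_migrate_pages");

	numa_free_nodemask(from);
	numa_free_nodemask(to);
	return ret;
}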
* Re: [patch v7 0/21] sched: power aware scheduling
  2013-04-17  1:18 ` Alex Shi
@ 2013-04-17  7:38   ` Borislav Petkov
  0 siblings, 0 replies; 30+ messages in thread
From: Borislav Petkov @ 2013-04-17  7:38 UTC (permalink / raw)
  To: Alex Shi
  Cc: Len Brown, mingo, peterz, tglx, akpm, arjan, pjt, namhyung,
      efault, morten.rasmussen, vincent.guittot, gregkh, preeti,
      viresh.kumar, linux-kernel, len.brown, rafael.j.wysocki, jkosina,
      clark.williams, tony.luck, keescook, mgorman, riel, Linux PM list

On Wed, Apr 17, 2013 at 09:18:28AM +0800, Alex Shi wrote:
> Sure. Currently, even if the whole socket goes to sleep, when memory
> on that node is still being accessed the CPU socket still spends some
> power on the 'uncore' part. So the further step is to reduce remote
> memory accesses to save more power, which is also what NUMA balancing
> wants to do.

Yeah, if you also mean that you need to further migrate the threads'
memory away from the node, so that it doesn't need to serve memory
accesses from other sockets, then that should probably help save even
more power. You would probably still need to serve probes from the L3,
but your DRAM links will be powered down and such.

> And the step after that is to detect whether the socket is cache
> intensive, i.e. whether there is much cache thrashing on the node.

Yeah, that would probably be harder to determine: is the cache
thrashing (and I think you mean L3 here) worse than migrating tasks to
other nodes and keeping them powered on, just because my current node
is not supposed to thrash the L3? Hmm.

--
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 30+ messages in thread
end of thread, other threads:[~2013-05-20  2:32 UTC | newest]

Thread overview: 30+ messages
[not found] <1365040862-8390-1-git-send-email-alex.shi@intel.com>
2013-04-11 21:02 ` [patch v7 0/21] sched: power aware scheduling Len Brown
2013-04-12  8:46   ` Alex Shi
2013-04-12 16:23     ` Borislav Petkov
2013-04-12 16:48       ` Mike Galbraith
2013-04-12 17:12         ` Borislav Petkov
2013-04-14  1:36           ` Alex Shi
2013-04-17 21:53             ` Len Brown
2013-04-18  1:51               ` Mike Galbraith
2013-04-26 15:11                 ` Mike Galbraith
2013-04-30  5:16                   ` Mike Galbraith
2013-04-30  8:30                     ` Mike Galbraith
2013-04-30  8:41                       ` Ingo Molnar
2013-04-30  9:35                         ` Mike Galbraith
2013-04-30  9:49                           ` Mike Galbraith
2013-04-30  9:56                             ` Mike Galbraith
2013-05-17  8:06                       ` Preeti U Murthy
2013-05-20  1:01                         ` Alex Shi
2013-05-20  2:30                           ` Preeti U Murthy
2013-04-14  1:28           ` Alex Shi
2013-04-14  5:10             ` Alex Shi
2013-04-14 15:59               ` Borislav Petkov
2013-04-15  6:04                 ` Alex Shi
2013-04-15  6:16                   ` Alex Shi
2013-04-15  9:52                     ` Borislav Petkov
2013-04-15 13:50                       ` Alex Shi
2013-04-15 23:12                         ` Borislav Petkov
2013-04-16  0:22                           ` Alex Shi
2013-04-16 10:24                             ` Borislav Petkov
2013-04-17  1:18                               ` Alex Shi
2013-04-17  7:38                                 ` Borislav Petkov