[RFD PATCH 0/4] cpu: Bulk CPU Hotplug support.

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support.
@ 2009-06-16  5:38 Gautham R Shenoy
  2009-06-16  5:38 ` [RFD PATCH 1/4] powerpc: cpu: Reduce the polling interval in __cpu_up() Gautham R Shenoy
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Gautham R Shenoy @ 2009-06-16  5:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Balbir Singh, Rusty Russel, Paul E McKenney,
	Nathan Lynch, Ingo Molnar, Venkatesh Pallipadi, Andrew Morton,
	Vaidyanathan Srinivasan, Dipankar Sarma, Shoahua Li


Hi,

(NOTE: This is an RFD. Patches are not for inclusion)

The current CPU-Hotplug infrastructure enables us to hotplug one CPU at any
given time. However, with newer machines which have multiple-cores and
multi-threads, it makes much sense to change the unit of hotplug to a core or
a package. We might want to evacuate a core or a package to reduce the avg
power/to manage the temperature of the system/to dynamically provision
cores/packages to a running system. But performing a series of CPU-Hotplug
is relatively slower.

Currently on a ppc64 box with 16 CPUs, the time taken for
a individual cpu-hotplug operation is as follows.

	# time echo 0 > /sys/devices/system/cpu/cpu2/online
	real    0m0.025s
	user    0m0.000s
	sys     0m0.002s

	# time echo 1 > /sys/devices/system/cpu/cpu2/online
	real    0m0.021s
	user    0m0.000s
	sys     0m0.000s

(The online time used to be ~200ms. It has been reduced after applying patch 1
of the series which reduces the polling interval from 200ms to 1ms.)

Of the this, the time taken for sending the notifications and performing
the actual cpu-hotplug operation (detailed profile is appended at the end of
the text) is:

	12.645925 ms on the offline path.
	21.019581 ms on the online path.

(The 10ms discrepancy that we observe in the total time taken for cpu-offline
Vs the time accounted for notifiers and cpu-hotplug operation is because of a
synchronize_sched() performed after clearing the active_cpu_mask.)

So, of the accounted time, a major chunk of time is consumed by
cpuset_track_online_cpus() while handling CPU_DEAD and CPU_ONLINE
notifications.

	11.320205 ms: cpuset_track_online_cpus      : CPU_DEAD
	12.767882 ms: cpuset_track_online_cpus      : CPU_ONLINE

cpuset_trace_online_cpus() among other things performs the task of rebuilding
the sched_domains for every online CPU in the system.

The operations performed within the cpuset_track_online_cpus()
depends only on the cpu_online_map and not on the CPU which has been
hotplugged. The other notifiers which behave similarly are
 - ratelimit_handler(),
 - vmstat_cpuup_callback()
 - vmscan: cpu_callback()

Thus if we bunch up multiple cpu-offlines/onlines, we can reduce the overall
time taken by optimizing notifiers such as these, so that they can
perform the necessary functions only once, after the completion of the
CPU-Hotplug operation. This would cut down the CPU hotplug time substantially.

The whole approach would require the Cpu-Hotplug notifiers to work on
cpumask_t instead of cpu. A similar proposal has been once proposed before by
Shaohua Li (http://lkml.org/lkml/2006/5/8/18)

In this patch series, we extend the existing cpu online/offline
interface to enable the user to offline/online a bunch of CPUs
at the same time.

The proposed interface to do so are the sysfs file:
	/sys/devices/system/cpu/online
	/sys/devices/system/cpu/online

The usage is:
	echo 4,6,7 > /sys/devices/system/cpu/offline
	echo 5 > /sys/devices/system/cpu/offline
	echo 4-7 > /sys/devices/system/cpu/online

As of now, this patch series does no optimizations to the CPU-Hotplug core
but serially hotplugs the CPUs in the list provided by the user.

The interface provided in this patch series has been tested on a
16-way ppc64 box.


Still TODO:
- Enhance the subsystem notifiers to work on a cpumask_var_t instead of a cpu
  id.

- Optimize the subsystem notifiers to reduce the time consumed while
  handling CPU_[DOWN_PREPARE/DEAD/UP_PREPARE/ONLINE] events for the
  cpumask_var_t.

- Define the Rollback Semantics for the notifiers which fail to handle
  a CPU_* event correctly.

- Send the kobject-events for the corresponding device entries of each of the
  CPUs present in the list to maintain ABI compatibility.

Any feedback is much appreciated

---

Gautham R Shenoy (4):
      cpu: measure time taken by subsystem notifiers during cpu-hotplug
      cpu: Define new functions cpu_down_mask and cpu_up_mask
      cpu: sysfs interface for hotplugging bunch of CPUs.
      powerpc: cpu: Reduce the polling interval in __cpu_up()


 arch/powerpc/kernel/smp.c      |    5 +-
 drivers/base/cpu.c             |   76 ++++++++++++++++++++++++++++--
 include/linux/cpu.h            |    2 +
 include/trace/notifier_trace.h |   32 ++++++++++++
 kernel/cpu.c                   |  103 ++++++++++++++++++++++++++++++----------
 kernel/notifier.c              |   23 +++++++--
 6 files changed, 203 insertions(+), 38 deletions(-)
 create mode 100644 include/trace/notifier_trace.h

--
Thanks and Regards
gautham


****************** Cpu-Hotplug profile ********************************

=============================================================================
statistics for CPU_DOWN_PREPARE
=============================================================================
      379 ns: buffer_cpu_notify             : CPU_DOWN_PREPARE
      457 ns: topology_cpu_callback         : CPU_DOWN_PREPARE
      504 ns: flow_cache_cpu                : CPU_DOWN_PREPARE
      517 ns: cpu_callback                  : CPU_DOWN_PREPARE
      533 ns: hotplug_cfd                   : CPU_DOWN_PREPARE
      546 ns: dev_cpu_callback              : CPU_DOWN_PREPARE
      547 ns: timer_cpu_notify              : CPU_DOWN_PREPARE
      562 ns: page_alloc_cpu_notify         : CPU_DOWN_PREPARE
      564 ns: cpuset_track_online_cpus      : CPU_DOWN_PREPARE
      594 ns: blk_cpu_notify                : CPU_DOWN_PREPARE
      623 ns: hotplug_hrtick                : CPU_DOWN_PREPARE
      623 ns: radix_tree_callback           : CPU_DOWN_PREPARE
      715 ns: remote_softirq_cpu_notify     : CPU_DOWN_PREPARE
      777 ns: rb_cpu_notify                 : CPU_DOWN_PREPARE
      777 ns: sysfs_cpu_notify              : CPU_DOWN_PREPARE
      807 ns: rcu_cpu_notify                : CPU_DOWN_PREPARE
      820 ns: ratelimit_handler             : CPU_DOWN_PREPARE
      822 ns: pageset_cpuup_callback        : CPU_DOWN_PREPARE
      898 ns: cpu_callback                  : CPU_DOWN_PREPARE
      898 ns: relay_hotcpu_callback         : CPU_DOWN_PREPARE
      929 ns: hrtimer_cpu_notify            : CPU_DOWN_PREPARE
      930 ns: cpu_callback                  : CPU_DOWN_PREPARE
     1096 ns: cpu_numa_callback             : CPU_DOWN_PREPARE
     1096 ns: percpu_counter_hotcpu_callback: CPU_DOWN_PREPARE
     1111 ns: slab_cpuup_callback           : CPU_DOWN_PREPARE
     1139 ns: update_runtime                : CPU_DOWN_PREPARE
     1143 ns: rcu_barrier_cpu_hotplug       : CPU_DOWN_PREPARE
     2725 ns: workqueue_cpu_callback        : CPU_DOWN_PREPARE
     2852 ns: migration_call                : CPU_DOWN_PREPARE
     4497 ns: vmstat_cpuup_callback         : CPU_DOWN_PREPARE
=========================================================================
Total time for CPU_DOWN_PREPARE = .030481000 ms
=========================================================================
=============================================================================
statistics for CPU_DYING
=============================================================================
      349 ns: cpu_callback                  : CPU_DYING
      349 ns: hotplug_hrtick                : CPU_DYING
      349 ns: remote_softirq_cpu_notify     : CPU_DYING
      351 ns: timer_cpu_notify              : CPU_DYING
      363 ns: vmstat_cpuup_callback         : CPU_DYING
      364 ns: rb_cpu_notify                 : CPU_DYING
      365 ns: blk_cpu_notify                : CPU_DYING
      365 ns: cpu_callback                  : CPU_DYING
      365 ns: cpu_numa_callback             : CPU_DYING
      365 ns: cpuset_track_online_cpus      : CPU_DYING
      365 ns: dev_cpu_callback              : CPU_DYING
      365 ns: hotplug_cfd                   : CPU_DYING
      365 ns: page_alloc_cpu_notify         : CPU_DYING
      365 ns: radix_tree_callback           : CPU_DYING
      365 ns: relay_hotcpu_callback         : CPU_DYING
      365 ns: topology_cpu_callback         : CPU_DYING
      365 ns: update_runtime                : CPU_DYING
      366 ns: pageset_cpuup_callback        : CPU_DYING
      367 ns: sysfs_cpu_notify              : CPU_DYING
      378 ns: flow_cache_cpu                : CPU_DYING
      380 ns: rcu_cpu_notify                : CPU_DYING
      381 ns: buffer_cpu_notify             : CPU_DYING
      381 ns: cpu_callback                  : CPU_DYING
      383 ns: slab_cpuup_callback           : CPU_DYING
      455 ns: ratelimit_handler             : CPU_DYING
      502 ns: workqueue_cpu_callback        : CPU_DYING
      699 ns: percpu_counter_hotcpu_callback: CPU_DYING
     1370 ns: rcu_barrier_cpu_hotplug       : CPU_DYING
     1583 ns: migration_call                : CPU_DYING
     2971 ns: hrtimer_cpu_notify            : CPU_DYING
=========================================================================
Total time for CPU_DYING = .016356000 ms
=========================================================================
=============================================================================
statistics for CPU_DOWN_CANCELED
=============================================================================
=========================================================================
Total time for CPU_DOWN_CANCELED = 0 ms
=========================================================================
=============================================================================
statistics for __stop_machine
=============================================================================
   556214 ns: __stop_machine                :
=========================================================================
Total time for __stop_machine = .556214000 ms
=========================================================================
=============================================================================
statistics for CPU_DEAD
=============================================================================
      352 ns: update_runtime                : CPU_DEAD
      363 ns: rb_cpu_notify                 : CPU_DEAD
      364 ns: relay_hotcpu_callback         : CPU_DEAD
      367 ns: hotplug_cfd                   : CPU_DEAD
      396 ns: cpu_callback                  : CPU_DEAD
      411 ns: hotplug_hrtick                : CPU_DEAD
      426 ns: rcu_barrier_cpu_hotplug       : CPU_DEAD
      489 ns: remote_softirq_cpu_notify     : CPU_DEAD
      517 ns: ratelimit_handler             : CPU_DEAD
      533 ns: workqueue_cpu_callback        : CPU_DEAD
      626 ns: dev_cpu_callback              : CPU_DEAD
      867 ns: cpu_numa_callback             : CPU_DEAD
     1430 ns: rcu_cpu_notify                : CPU_DEAD
     1827 ns: blk_cpu_notify                : CPU_DEAD
     1933 ns: buffer_cpu_notify             : CPU_DEAD
     2194 ns: pageset_cpuup_callback        : CPU_DEAD
     2613 ns: vmstat_cpuup_callback         : CPU_DEAD
     2902 ns: radix_tree_callback           : CPU_DEAD
     4373 ns: hrtimer_cpu_notify            : CPU_DEAD
     5799 ns: timer_cpu_notify              : CPU_DEAD
     9468 ns: flow_cache_cpu                : CPU_DEAD
    12579 ns: cpu_callback                  : CPU_DEAD
    13855 ns: cpu_callback                  : CPU_DEAD
    25095 ns: topology_cpu_callback         : CPU_DEAD
    29020 ns: page_alloc_cpu_notify         : CPU_DEAD
    66894 ns: percpu_counter_hotcpu_callback: CPU_DEAD
   118473 ns: slab_cpuup_callback           : CPU_DEAD
   153415 ns: sysfs_cpu_notify              : CPU_DEAD
   159933 ns: migration_call                : CPU_DEAD
 11320205 ns: cpuset_track_online_cpus      : CPU_DEAD
=========================================================================
Total time for CPU_DEAD = 11.937719000 ms
=========================================================================
=============================================================================
statistics for CPU_POST_DEAD
=============================================================================
      332 ns: remote_softirq_cpu_notify     : CPU_POST_DEAD
      334 ns: hotplug_hrtick                : CPU_POST_DEAD
      334 ns: hrtimer_cpu_notify            : CPU_POST_DEAD
      334 ns: radix_tree_callback           : CPU_POST_DEAD
      334 ns: relay_hotcpu_callback         : CPU_POST_DEAD
      334 ns: topology_cpu_callback         : CPU_POST_DEAD
      334 ns: update_runtime                : CPU_POST_DEAD
      335 ns: buffer_cpu_notify             : CPU_POST_DEAD
      348 ns: pageset_cpuup_callback        : CPU_POST_DEAD
      348 ns: slab_cpuup_callback           : CPU_POST_DEAD
      349 ns: rcu_barrier_cpu_hotplug       : CPU_POST_DEAD
      350 ns: cpu_callback                  : CPU_POST_DEAD
      350 ns: flow_cache_cpu                : CPU_POST_DEAD
      350 ns: rb_cpu_notify                 : CPU_POST_DEAD
      350 ns: sysfs_cpu_notify              : CPU_POST_DEAD
      350 ns: timer_cpu_notify              : CPU_POST_DEAD
      351 ns: page_alloc_cpu_notify         : CPU_POST_DEAD
      352 ns: cpuset_track_online_cpus      : CPU_POST_DEAD
      365 ns: hotplug_cfd                   : CPU_POST_DEAD
      365 ns: vmstat_cpuup_callback         : CPU_POST_DEAD
      366 ns: cpu_callback                  : CPU_POST_DEAD
      367 ns: cpu_numa_callback             : CPU_POST_DEAD
      368 ns: cpu_callback                  : CPU_POST_DEAD
      395 ns: blk_cpu_notify                : CPU_POST_DEAD
      396 ns: rcu_cpu_notify                : CPU_POST_DEAD
      397 ns: dev_cpu_callback              : CPU_POST_DEAD
      442 ns: migration_call                : CPU_POST_DEAD
      563 ns: percpu_counter_hotcpu_callback: CPU_POST_DEAD
      778 ns: ratelimit_handler             : CPU_POST_DEAD
    94184 ns: workqueue_cpu_callback        : CPU_POST_DEAD
=========================================================================
Total time for CPU_POST_DEAD = .105155000 ms
=========================================================================
=============================================================================
statistics for CPU_UP_PREPARE
=============================================================================
      334 ns: hotplug_hrtick                : CPU_UP_PREPARE
      336 ns: update_runtime                : CPU_UP_PREPARE
      350 ns: flow_cache_cpu                : CPU_UP_PREPARE
      350 ns: radix_tree_callback           : CPU_UP_PREPARE
      365 ns: cpuset_track_online_cpus      : CPU_UP_PREPARE
      365 ns: page_alloc_cpu_notify         : CPU_UP_PREPARE
      365 ns: sysfs_cpu_notify              : CPU_UP_PREPARE
      367 ns: hrtimer_cpu_notify            : CPU_UP_PREPARE
      381 ns: buffer_cpu_notify             : CPU_UP_PREPARE
      381 ns: rb_cpu_notify                 : CPU_UP_PREPARE
      383 ns: cpu_callback                  : CPU_UP_PREPARE
      410 ns: rcu_barrier_cpu_hotplug       : CPU_UP_PREPARE
      413 ns: remote_softirq_cpu_notify     : CPU_UP_PREPARE
      426 ns: blk_cpu_notify                : CPU_UP_PREPARE
      475 ns: vmstat_cpuup_callback         : CPU_UP_PREPARE
      518 ns: hotplug_cfd                   : CPU_UP_PREPARE
      594 ns: percpu_counter_hotcpu_callback: CPU_UP_PREPARE
      731 ns: ratelimit_handler             : CPU_UP_PREPARE
      805 ns: relay_hotcpu_callback         : CPU_UP_PREPARE
     1007 ns: dev_cpu_callback              : CPU_UP_PREPARE
     1690 ns: rcu_cpu_notify                : CPU_UP_PREPARE
     1875 ns: timer_cpu_notify              : CPU_UP_PREPARE
     2083 ns: pageset_cpuup_callback        : CPU_UP_PREPARE
     5016 ns: cpu_numa_callback             : CPU_UP_PREPARE
     6944 ns: topology_cpu_callback         : CPU_UP_PREPARE
     7064 ns: slab_cpuup_callback           : CPU_UP_PREPARE
    20964 ns: cpu_callback                  : CPU_UP_PREPARE
    36301 ns: cpu_callback                  : CPU_UP_PREPARE
    38337 ns: migration_call                : CPU_UP_PREPARE
   139963 ns: workqueue_cpu_callback        : CPU_UP_PREPARE
=========================================================================
Total time for CPU_UP_PREPARE = .269593000 ms
=========================================================================
=============================================================================
statistics for CPU_UP_CANCELED
=============================================================================
=========================================================================
Total time for CPU_UP_CANCELED = 0 ms
=========================================================================
=============================================================================
statistics for __cpu_up
=============================================================================
  7881152 ns: __cpu_up                      :
=========================================================================
Total time for __cpu_up = 7.881152000 ms
=========================================================================
=============================================================================
statistics for CPU_STARTING
=============================================================================
      318 ns: cpu_callback                  : CPU_STARTING
      334 ns: hotplug_cfd                   : CPU_STARTING
      334 ns: hotplug_hrtick                : CPU_STARTING
      334 ns: hrtimer_cpu_notify            : CPU_STARTING
      336 ns: remote_softirq_cpu_notify     : CPU_STARTING
      336 ns: topology_cpu_callback         : CPU_STARTING
      348 ns: cpu_callback                  : CPU_STARTING
      348 ns: flow_cache_cpu                : CPU_STARTING
      349 ns: cpu_callback                  : CPU_STARTING
      349 ns: update_runtime                : CPU_STARTING
      350 ns: dev_cpu_callback              : CPU_STARTING
      350 ns: rb_cpu_notify                 : CPU_STARTING
      351 ns: sysfs_cpu_notify              : CPU_STARTING
      352 ns: cpuset_track_online_cpus      : CPU_STARTING
      365 ns: vmstat_cpuup_callback         : CPU_STARTING
      381 ns: blk_cpu_notify                : CPU_STARTING
      393 ns: page_alloc_cpu_notify         : CPU_STARTING
      395 ns: timer_cpu_notify              : CPU_STARTING
      396 ns: relay_hotcpu_callback         : CPU_STARTING
      396 ns: slab_cpuup_callback           : CPU_STARTING
      397 ns: cpu_numa_callback             : CPU_STARTING
      397 ns: pageset_cpuup_callback        : CPU_STARTING
      397 ns: radix_tree_callback           : CPU_STARTING
      410 ns: buffer_cpu_notify             : CPU_STARTING
      410 ns: rcu_cpu_notify                : CPU_STARTING
      412 ns: rcu_barrier_cpu_hotplug       : CPU_STARTING
      426 ns: percpu_counter_hotcpu_callback: CPU_STARTING
      549 ns: ratelimit_handler             : CPU_STARTING
      549 ns: workqueue_cpu_callback        : CPU_STARTING
      592 ns: migration_call                : CPU_STARTING
=========================================================================
Total time for CPU_STARTING = .011654000 ms
=========================================================================
=============================================================================
statistics for CPU_ONLINE
=============================================================================
      334 ns: hotplug_cfd                   : CPU_ONLINE
      334 ns: relay_hotcpu_callback         : CPU_ONLINE
      334 ns: remote_softirq_cpu_notify     : CPU_ONLINE
      335 ns: hrtimer_cpu_notify            : CPU_ONLINE
      349 ns: topology_cpu_callback         : CPU_ONLINE
      352 ns: flow_cache_cpu                : CPU_ONLINE
      352 ns: slab_cpuup_callback           : CPU_ONLINE
      365 ns: dev_cpu_callback              : CPU_ONLINE
      365 ns: rb_cpu_notify                 : CPU_ONLINE
      379 ns: pageset_cpuup_callback        : CPU_ONLINE
      381 ns: page_alloc_cpu_notify         : CPU_ONLINE
      381 ns: rcu_cpu_notify                : CPU_ONLINE
      381 ns: timer_cpu_notify              : CPU_ONLINE
      395 ns: hotplug_hrtick                : CPU_ONLINE
      410 ns: blk_cpu_notify                : CPU_ONLINE
      426 ns: rcu_barrier_cpu_hotplug       : CPU_ONLINE
      455 ns: cpu_numa_callback             : CPU_ONLINE
      459 ns: radix_tree_callback           : CPU_ONLINE
      473 ns: buffer_cpu_notify             : CPU_ONLINE
      504 ns: ratelimit_handler             : CPU_ONLINE
      639 ns: percpu_counter_hotcpu_callback: CPU_ONLINE
      791 ns: update_runtime                : CPU_ONLINE
     1052 ns: cpu_callback                  : CPU_ONLINE
     1282 ns: cpu_callback                  : CPU_ONLINE
     1845 ns: cpu_callback                  : CPU_ONLINE
     2502 ns: vmstat_cpuup_callback         : CPU_ONLINE
     4332 ns: migration_call                : CPU_ONLINE
    14505 ns: workqueue_cpu_callback        : CPU_ONLINE
    54588 ns: sysfs_cpu_notify              : CPU_ONLINE
 12767882 ns: cpuset_track_online_cpus      : CPU_ONLINE
=========================================================================
Total time for CPU_ONLINE = 12.857182000 ms
=========================================================================


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFD PATCH 1/4] powerpc: cpu: Reduce the polling interval in __cpu_up()
  2009-06-16  5:38 [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support Gautham R Shenoy
@ 2009-06-16  5:38 ` Gautham R Shenoy
  2009-06-16 16:06   ` Nathan Lynch
  2009-06-16  5:38 ` [RFD PATCH 2/4] cpu: sysfs interface for hotplugging bunch of CPUs Gautham R Shenoy
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Gautham R Shenoy @ 2009-06-16  5:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Balbir Singh, Rusty Russel, Paul E McKenney,
	Nathan Lynch, Ingo Molnar, Venkatesh Pallipadi, Andrew Morton,
	Vaidyanathan Srinivasan, Dipankar Sarma, Shoahua Li

The cpu online operation on a powerpc today takes order of 200-220ms. Of
this time, approximately 200ms is taken up by __cpu_up(). This is because
we poll every 200ms to check if the new cpu has notified it's presence
through the cpu_callin_map. We poll every 200ms until the new cpu sets
the value in cpu_callin_map or 5 seconds elapse, whichever comes earlier.

However, the time taken by the new processor to indicate it's presence has
found to be less than a millisecond. Keeping this in mind, reduce the
polling interval from 200ms to 1ms while retaining the 5 second timeout.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
---
 arch/powerpc/kernel/smp.c |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 65484b2..00c13a1 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -411,9 +411,8 @@ int __cpuinit __cpu_up(unsigned int cpu)
 		 * CPUs can take much longer to come up in the
 		 * hotplug case.  Wait five seconds.
 		 */
-		for (c = 25; c && !cpu_callin_map[cpu]; c--) {
-			msleep(200);
-		}
+		for (c = 5000; c && !cpu_callin_map[cpu]; c--)
+			msleep(1);
 #endif
 
 	if (!cpu_callin_map[cpu]) {


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RFD PATCH 1/4] powerpc: cpu: Reduce the polling interval in __cpu_up()
  2009-06-16  5:38 ` [RFD PATCH 1/4] powerpc: cpu: Reduce the polling interval in __cpu_up() Gautham R Shenoy
@ 2009-06-16 16:06   ` Nathan Lynch
  2009-06-16 16:37     ` Gautham R Shenoy
  0 siblings, 1 reply; 21+ messages in thread
From: Nathan Lynch @ 2009-06-16 16:06 UTC (permalink / raw)
  To: Gautham R Shenoy
  Cc: linux-kernel, Peter Zijlstra, Balbir Singh, Rusty Russel,
	Paul E McKenney, Ingo Molnar, Venkatesh Pallipadi, Andrew Morton,
	Vaidyanathan Srinivasan, Dipankar Sarma, Shoahua Li

Please cc linuxppc-dev if you want the powerpc maintainer to pick this
up.

Gautham R Shenoy <ego@in.ibm.com> writes:
> The cpu online operation on a powerpc today takes order of 200-220ms. Of
> this time, approximately 200ms is taken up by __cpu_up(). This is because
> we poll every 200ms to check if the new cpu has notified it's presence
> through the cpu_callin_map. We poll every 200ms until the new cpu sets
> the value in cpu_callin_map or 5 seconds elapse, whichever comes earlier.
>
> However, the time taken by the new processor to indicate it's presence has
> found to be less than a millisecond

Only with your particular configuration (which is not identified).  It
can take much longer than 1ms on others.

> Keeping this in mind, reduce the
> polling interval from 200ms to 1ms while retaining the 5 second
> timeout.

Ack on the patch, but the changelog needs work.  I assume your
observations are from a pseries system -- please state this in the
changelog ("powerpc" is too broad), along with the processor model and
whether the LPAR's processors were configured in dedicated or shared
mode.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFD PATCH 1/4] powerpc: cpu: Reduce the polling interval in __cpu_up()
  2009-06-16 16:06   ` Nathan Lynch
@ 2009-06-16 16:37     ` Gautham R Shenoy
  0 siblings, 0 replies; 21+ messages in thread
From: Gautham R Shenoy @ 2009-06-16 16:37 UTC (permalink / raw)
  To: Nathan Lynch
  Cc: linux-kernel, Peter Zijlstra, Balbir Singh, Rusty Russel,
	Paul E McKenney, Ingo Molnar, Venkatesh Pallipadi, Andrew Morton,
	Vaidyanathan Srinivasan, Dipankar Sarma, Shoahua Li

On Tue, Jun 16, 2009 at 11:06:45AM -0500, Nathan Lynch wrote:
> Please cc linuxppc-dev if you want the powerpc maintainer to pick this
> up.

Will do it. I still need to test this patch across the different
configurations. I posted it here just so that we get a rough idea
regarding what we're looking at.

Thanks for taking a look at this one!
> 
> Gautham R Shenoy <ego@in.ibm.com> writes:
> > The cpu online operation on a powerpc today takes order of 200-220ms. Of
> > this time, approximately 200ms is taken up by __cpu_up(). This is because
> > we poll every 200ms to check if the new cpu has notified it's presence
> > through the cpu_callin_map. We poll every 200ms until the new cpu sets
> > the value in cpu_callin_map or 5 seconds elapse, whichever comes earlier.
> >
> > However, the time taken by the new processor to indicate it's presence has
> > found to be less than a millisecond
> 
> Only with your particular configuration (which is not identified).  It
> can take much longer than 1ms on others.
> 
> > Keeping this in mind, reduce the
> > polling interval from 200ms to 1ms while retaining the 5 second
> > timeout.
> 
> Ack on the patch, but the changelog needs work.  I assume your
> observations are from a pseries system -- please state this in the
> changelog ("powerpc" is too broad), along with the processor model and
> whether the LPAR's processors were configured in dedicated or shared
> mode.

Will send these details with the patch separately Ccing linux-ppcdev list.

-- 
Thanks and Regards
gautham

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFD PATCH 2/4] cpu: sysfs interface for hotplugging bunch of CPUs.
  2009-06-16  5:38 [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support Gautham R Shenoy
  2009-06-16  5:38 ` [RFD PATCH 1/4] powerpc: cpu: Reduce the polling interval in __cpu_up() Gautham R Shenoy
@ 2009-06-16  5:38 ` Gautham R Shenoy
  2009-06-16 16:22   ` Nathan Lynch
  2009-06-16  5:38 ` [RFD PATCH 3/4] cpu: Define new functions cpu_down_mask and cpu_up_mask Gautham R Shenoy
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Gautham R Shenoy @ 2009-06-16  5:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Balbir Singh, Rusty Russel, Paul E McKenney,
	Nathan Lynch, Ingo Molnar, Venkatesh Pallipadi, Andrew Morton,
	Vaidyanathan Srinivasan, Dipankar Sarma, Shoahua Li

The user can currently view the online and offline CPUs in the system through
the sysfs files named "online" and "offline" respectively which
are present in the directory /sys/devices/system/cpu/. These files currently
have 0444 permissions.

For the purpose of evacuation of a bunch of CPUs, we propose to extend this
same interface and make it 0644, by which the user can use the same interface
to bring a bunch of CPUs online or take a bunch of CPUs offline.

To do this, the user is required to echo the cpu-list which is expected to be
hotplugged.

Eg:

echo 2,3 > /sys/devices/system/cpu/offline #Offlines CPUs 2 and 3
echo 4 > /sys/devices/sytem/cpu/offline    #Offlines CPU 4
echo 2-4 > /sys/devices/system/cpu/online  #Onlines CPU 2,3,4

This patch changes the permissions of these sysfs files from 0444 to 0644.
It provides a dummy store function, which currently parses the input
provided by the user and copies them to another debug cpumask structure,
which can be accessed using the sysfs interfaces:
	/sys/devices/system/cpu/debug_offline
and
	/sys/devices/system/cpu/debug_online

Thus on performing a
	echo 2,3 > /sys/devices/system/cpu/offline
the operation
	cat /sys/devices/system/cpu/debug_offline
should yield
	2-3
as the result.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
---
 drivers/base/cpu.c |   72 ++++++++++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 67 insertions(+), 5 deletions(-)

diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index e62a4cc..7a15e7b 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -116,18 +116,54 @@ static ssize_t print_cpus_map(char *buf, const struct cpumask *map)
 	return n;
 }
 
-#define	print_cpus_func(type) \
+#define show_cpus_func(type)						\
 static ssize_t print_cpus_##type(struct sysdev_class *class, char *buf)	\
 {									\
 	return print_cpus_map(buf, cpu_##type##_mask);			\
-}									\
-static struct sysdev_class_attribute attr_##type##_map = 		\
+}
+
+#define print_cpus_func(type)						\
+show_cpus_func(type);							\
+static struct sysdev_class_attribute attr_##type##_map =		\
 	_SYSDEV_CLASS_ATTR(type, 0444, print_cpus_##type, NULL)
 
-print_cpus_func(online);
 print_cpus_func(possible);
 print_cpus_func(present);
 
+static struct cpumask debug_offline_mask_data;
+static struct cpumask debug_online_mask_data;
+static struct cpumask *cpu_debug_offline_mask = &debug_offline_mask_data;
+static struct cpumask *cpu_debug_online_mask = &debug_online_mask_data;
+print_cpus_func(debug_offline);
+print_cpus_func(debug_online);
+
+show_cpus_func(online);
+static ssize_t store_cpus_online(struct sysdev_class *dev_class,
+					const char *buf, size_t count)
+{
+	ssize_t ret = count;
+	cpumask_var_t store_cpus_online_mask;
+
+	if (!alloc_cpumask_var(&store_cpus_online_mask, GFP_KERNEL))
+		return count;
+
+	ret = cpulist_parse(buf, store_cpus_online_mask);
+
+	if (ret < 0)
+		goto out;
+
+	cpumask_copy(cpu_debug_online_mask, store_cpus_online_mask);
+
+out:
+	free_cpumask_var(store_cpus_online_mask);
+	if (ret >= 0)
+		ret = count;
+	return ret;
+}
+static struct sysdev_class_attribute attr_online_map =
+	_SYSDEV_CLASS_ATTR(online, 0644, print_cpus_online,
+						store_cpus_online);
+
 /*
  * Print values for NR_CPUS and offlined cpus
  */
@@ -168,7 +204,31 @@ static ssize_t print_cpus_offline(struct sysdev_class *class, char *buf)
 	n += snprintf(&buf[n], len - n, "\n");
 	return n;
 }
-static SYSDEV_CLASS_ATTR(offline, 0444, print_cpus_offline, NULL);
+
+static ssize_t store_cpus_offline(struct sysdev_class *dev_class,
+					const char *buf, size_t count)
+{
+	ssize_t ret = count;
+	cpumask_var_t store_cpus_offline_mask;
+
+	if (!alloc_cpumask_var(&store_cpus_offline_mask, GFP_KERNEL))
+		return count;
+
+	ret = cpulist_parse(buf, store_cpus_offline_mask);
+
+	if (ret < 0)
+		goto out;
+
+	cpumask_copy(cpu_debug_offline_mask, store_cpus_offline_mask);
+
+out:
+	free_cpumask_var(store_cpus_offline_mask);
+	if (ret >= 0)
+		ret = count;
+	return ret;
+}
+static SYSDEV_CLASS_ATTR(offline, 0644, print_cpus_offline,
+						store_cpus_offline);
 
 static struct sysdev_class_attribute *cpu_state_attr[] = {
 	&attr_online_map,
@@ -176,6 +236,8 @@ static struct sysdev_class_attribute *cpu_state_attr[] = {
 	&attr_present_map,
 	&attr_kernel_max,
 	&attr_offline,
+	&attr_debug_online_map,
+	&attr_debug_offline_map,
 };
 
 static int cpu_states_init(void)


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RFD PATCH 2/4] cpu: sysfs interface for hotplugging bunch of CPUs.
  2009-06-16  5:38 ` [RFD PATCH 2/4] cpu: sysfs interface for hotplugging bunch of CPUs Gautham R Shenoy
@ 2009-06-16 16:22   ` Nathan Lynch
  2009-06-16 16:33     ` Gautham R Shenoy
  0 siblings, 1 reply; 21+ messages in thread
From: Nathan Lynch @ 2009-06-16 16:22 UTC (permalink / raw)
  To: Gautham R Shenoy
  Cc: linux-kernel, Peter Zijlstra, Balbir Singh, Rusty Russel,
	Paul E McKenney, Ingo Molnar, Venkatesh Pallipadi, Andrew Morton,
	Vaidyanathan Srinivasan, Dipankar Sarma, Shoahua Li

Gautham R Shenoy <ego@in.ibm.com> writes:
> echo 2,3 > /sys/devices/system/cpu/offline #Offlines CPUs 2 and 3
> echo 4 > /sys/devices/sytem/cpu/offline    #Offlines CPU 4
> echo 2-4 > /sys/devices/system/cpu/online  #Onlines CPU 2,3,4
>
> This patch changes the permissions of these sysfs files from 0444 to 0644.
> It provides a dummy store function, which currently parses the input
> provided by the user and copies them to another debug cpumask structure,
> which can be accessed using the sysfs interfaces:
> 	/sys/devices/system/cpu/debug_offline
> and
> 	/sys/devices/system/cpu/debug_online
>
> Thus on performing a
> 	echo 2,3 > /sys/devices/system/cpu/offline
> the operation
> 	cat /sys/devices/system/cpu/debug_offline
> should yield
> 	2-3
> as the result.

These debug_(on|off)line attributes aren't intended to be in the final
result, are they?  They don't seem useful beyond the development phase
of this feature...

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFD PATCH 2/4] cpu: sysfs interface for hotplugging bunch of CPUs.
  2009-06-16 16:22   ` Nathan Lynch
@ 2009-06-16 16:33     ` Gautham R Shenoy
  0 siblings, 0 replies; 21+ messages in thread
From: Gautham R Shenoy @ 2009-06-16 16:33 UTC (permalink / raw)
  To: Nathan Lynch
  Cc: linux-kernel, Peter Zijlstra, Balbir Singh, Rusty Russel,
	Paul E McKenney, Ingo Molnar, Venkatesh Pallipadi, Andrew Morton,
	Vaidyanathan Srinivasan, Dipankar Sarma, Shoahua Li

On Tue, Jun 16, 2009 at 11:22:56AM -0500, Nathan Lynch wrote:
> Gautham R Shenoy <ego@in.ibm.com> writes:
> > echo 2,3 > /sys/devices/system/cpu/offline #Offlines CPUs 2 and 3
> > echo 4 > /sys/devices/sytem/cpu/offline    #Offlines CPU 4
> > echo 2-4 > /sys/devices/system/cpu/online  #Onlines CPU 2,3,4
> >
> > This patch changes the permissions of these sysfs files from 0444 to 0644.
> > It provides a dummy store function, which currently parses the input
> > provided by the user and copies them to another debug cpumask structure,
> > which can be accessed using the sysfs interfaces:
> > 	/sys/devices/system/cpu/debug_offline
> > and
> > 	/sys/devices/system/cpu/debug_online
> >
> > Thus on performing a
> > 	echo 2,3 > /sys/devices/system/cpu/offline
> > the operation
> > 	cat /sys/devices/system/cpu/debug_offline
> > should yield
> > 	2-3
> > as the result.
> 
> These debug_(on|off)line attributes aren't intended to be in the final
> result, are they?  They don't seem useful beyond the development phase
> of this feature...

No, they aren't intended to be in the final result.

-- 
Thanks and Regards
gautham

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFD PATCH 3/4] cpu: Define new functions cpu_down_mask and cpu_up_mask
  2009-06-16  5:38 [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support Gautham R Shenoy
  2009-06-16  5:38 ` [RFD PATCH 1/4] powerpc: cpu: Reduce the polling interval in __cpu_up() Gautham R Shenoy
  2009-06-16  5:38 ` [RFD PATCH 2/4] cpu: sysfs interface for hotplugging bunch of CPUs Gautham R Shenoy
@ 2009-06-16  5:38 ` Gautham R Shenoy
  2009-06-16  5:38 ` [RFD PATCH 4/4] cpu: measure time taken by subsystem notifiers during cpu-hotplug Gautham R Shenoy
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 21+ messages in thread
From: Gautham R Shenoy @ 2009-06-16  5:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Balbir Singh, Rusty Russel, Paul E McKenney,
	Nathan Lynch, Ingo Molnar, Venkatesh Pallipadi, Andrew Morton,
	Vaidyanathan Srinivasan, Dipankar Sarma, Shoahua Li

Currently cpu-hotplug operation is carried out on a single processor at any
given time. We create two functions which will enable us to offline/online
multiple CPUs in a single go.

These functions are:
	int cpu_down_mask(cpumask_var_t cpus_to_offline);
	int cpu_up_mask(cpumask_var_t cpus_to_online);

In this patch, these functions serially invoke the  _cpu_down() and _cpu_up()
functions for each of the CPUs in the cpumask.

The idea is to make the CPU-hotplug notifiers work on cpumasks so that they
can optimize for hotplugging multiple CPUs.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
---
 drivers/base/cpu.c  |    4 ++
 include/linux/cpu.h |    2 +
 kernel/cpu.c        |   92 +++++++++++++++++++++++++++++++++++++--------------
 3 files changed, 73 insertions(+), 25 deletions(-)

diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 7a15e7b..1a382da 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -154,6 +154,8 @@ static ssize_t store_cpus_online(struct sysdev_class *dev_class,
 
 	cpumask_copy(cpu_debug_online_mask, store_cpus_online_mask);
 
+	ret = cpu_up_mask(store_cpus_online_mask);
+
 out:
 	free_cpumask_var(store_cpus_online_mask);
 	if (ret >= 0)
@@ -221,6 +223,8 @@ static ssize_t store_cpus_offline(struct sysdev_class *dev_class,
 
 	cpumask_copy(cpu_debug_offline_mask, store_cpus_offline_mask);
 
+	ret = cpu_down_mask(store_cpus_offline_mask);
+
 out:
 	free_cpumask_var(store_cpus_offline_mask);
 	if (ret >= 0)
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 2643d84..4769ff6 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -68,6 +68,7 @@ static inline void unregister_cpu_notifier(struct notifier_block *nb)
 #endif
 
 int cpu_up(unsigned int cpu);
+int cpu_up_mask(const cpumask_var_t cpus_to_online);
 void notify_cpu_starting(unsigned int cpu);
 extern void cpu_hotplug_init(void);
 extern void cpu_maps_update_begin(void);
@@ -112,6 +113,7 @@ extern void put_online_cpus(void);
 #define register_hotcpu_notifier(nb)	register_cpu_notifier(nb)
 #define unregister_hotcpu_notifier(nb)	unregister_cpu_notifier(nb)
 int cpu_down(unsigned int cpu);
+int cpu_down_mask(const cpumask_var_t cpus_to_offline);
 
 #else		/* CONFIG_HOTPLUG_CPU */
 
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 395b697..2b5d4e0 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -267,9 +267,10 @@ out_release:
 	return err;
 }
 
-int __ref cpu_down(unsigned int cpu)
+int __ref cpu_down_mask(const cpumask_var_t cpus_to_offline)
 {
 	int err;
+	unsigned int cpu;
 
 	err = stop_machine_create();
 	if (err)
@@ -281,28 +282,48 @@ int __ref cpu_down(unsigned int cpu)
 		goto out;
 	}
 
-	set_cpu_active(cpu, false);
+	for_each_cpu(cpu, cpus_to_offline) {
+		set_cpu_active(cpu, false);
 
-	/*
-	 * Make sure the all cpus did the reschedule and are not
-	 * using stale version of the cpu_active_mask.
-	 * This is not strictly necessary becuase stop_machine()
-	 * that we run down the line already provides the required
-	 * synchronization. But it's really a side effect and we do not
-	 * want to depend on the innards of the stop_machine here.
-	 */
-	synchronize_sched();
+		/*
+		 * Make sure the all cpus did the reschedule and are not
+		 * using stale version of the cpu_active_mask.
+		 * This is not strictly necessary becuase stop_machine()
+		 * that we run down the line already provides the required
+		 * synchronization. But it's really a side effect and we do not
+		 * want to depend on the innards of the stop_machine here.
+		 */
+		synchronize_sched();
 
-	err = _cpu_down(cpu, 0);
+		err = _cpu_down(cpu, 0);
 
-	if (cpu_online(cpu))
-		set_cpu_active(cpu, true);
+		if (cpu_online(cpu))
+			set_cpu_active(cpu, true);
+	}
 
 out:
 	cpu_maps_update_done();
 	stop_machine_destroy();
 	return err;
 }
+
+int __ref cpu_down(unsigned int cpu)
+{
+	int err;
+	cpumask_var_t cpus_to_offline;
+
+	if (!alloc_cpumask_var(&cpus_to_offline, GFP_KERNEL))
+		return -ENOMEM;
+
+	cpumask_clear(cpus_to_offline);
+	cpumask_set_cpu(cpu, cpus_to_offline);
+
+	err = cpu_down_mask(cpus_to_offline);
+
+	free_cpumask_var(cpus_to_offline);
+
+	return err;
+}
 EXPORT_SYMBOL(cpu_down);
 #endif /*CONFIG_HOTPLUG_CPU*/
 
@@ -347,33 +368,54 @@ out_notify:
 	return ret;
 }
 
-int __cpuinit cpu_up(unsigned int cpu)
+int __cpuinit cpu_up_mask(const cpumask_var_t cpus_to_online)
 {
 	int err = 0;
-	if (!cpu_possible(cpu)) {
-		printk(KERN_ERR "can't online cpu %d because it is not "
-			"configured as may-hotadd at boot time\n", cpu);
+	unsigned int cpu;
+
+	cpu_maps_update_begin();
+	for_each_cpu(cpu, cpus_to_online) {
+		if (!cpu_possible(cpu)) {
+			printk(KERN_ERR "can't online cpu %d because it is not"
+			" configured as may-hotadd at boot time\n", cpu);
 #if defined(CONFIG_IA64) || defined(CONFIG_X86_64)
-		printk(KERN_ERR "please check additional_cpus= boot "
-				"parameter\n");
+			printk(KERN_ERR "please check additional_cpus= boot "
+					"parameter\n");
 #endif
-		return -EINVAL;
+			err = -EINVAL;
+			goto out;
+		}
 	}
 
-	cpu_maps_update_begin();
-
 	if (cpu_hotplug_disabled) {
 		err = -EBUSY;
 		goto out;
 	}
-
-	err = _cpu_up(cpu, 0);
+	for_each_cpu(cpu, cpus_to_online)
+		err = _cpu_up(cpu, 0);
 
 out:
 	cpu_maps_update_done();
 	return err;
 }
 
+int __cpuinit cpu_up(unsigned int cpu)
+{
+	int err = 0;
+	cpumask_var_t cpus_to_online;
+
+	if (!alloc_cpumask_var(&cpus_to_online, GFP_KERNEL))
+		return -ENOMEM;
+
+	cpumask_clear(cpus_to_online);
+	cpumask_set_cpu(cpu, cpus_to_online);
+
+	err = cpu_up_mask(cpus_to_online);
+
+	free_cpumask_var(cpus_to_online);
+
+	return err;
+}
 #ifdef CONFIG_PM_SLEEP_SMP
 static cpumask_var_t frozen_cpus;
 


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFD PATCH 4/4] cpu: measure time taken by subsystem notifiers during cpu-hotplug
  2009-06-16  5:38 [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support Gautham R Shenoy
                   ` (2 preceding siblings ...)
  2009-06-16  5:38 ` [RFD PATCH 3/4] cpu: Define new functions cpu_down_mask and cpu_up_mask Gautham R Shenoy
@ 2009-06-16  5:38 ` Gautham R Shenoy
  2009-06-16  6:23 ` [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support Andrew Morton
  2009-06-17 13:50 ` Suresh Siddha
  5 siblings, 0 replies; 21+ messages in thread
From: Gautham R Shenoy @ 2009-06-16  5:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Balbir Singh, Rusty Russel, Paul E McKenney,
	Nathan Lynch, Ingo Molnar, Venkatesh Pallipadi, Andrew Morton,
	Vaidyanathan Srinivasan, Dipankar Sarma, Shoahua Li

Place tracepoints at appropriate places to profile the time consumed by the
notifiers and the core-cpu-hotplug operations.

Change the notifier chain api to pass private data which can be used for
filtering out the trace results.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
---
 include/trace/notifier_trace.h |   32 ++++++++++++++++++++++++++++++++
 kernel/cpu.c                   |   11 +++++++++++
 kernel/notifier.c              |   23 ++++++++++++++++++-----
 3 files changed, 61 insertions(+), 5 deletions(-)
 create mode 100644 include/trace/notifier_trace.h

diff --git a/include/trace/notifier_trace.h b/include/trace/notifier_trace.h
new file mode 100644
index 0000000..1591a40
--- /dev/null
+++ b/include/trace/notifier_trace.h
@@ -0,0 +1,32 @@
+#ifndef _HOTPLUG_CPU_TRACE_H_
+#define _HOTPLUG_CPU_TRACE_H_
+
+#include <linux/tracepoint.h>
+#include <linux/notifier.h>
+
+DECLARE_TRACE(hotplug_notifier_event_start,
+		TP_PROTO(void *notifier_call, unsigned int val,
+						void *chain_head),
+		TP_ARGS(notifier_call, val, chain_head));
+
+DECLARE_TRACE(hotplug_notifier_event_stop,
+		TP_PROTO(void *notifier_call, unsigned int val,
+						void *chain_head),
+		TP_ARGS(notifier_call, val, chain_head));
+
+DECLARE_TRACE(stop_machine_event_start,
+	TP_PROTO(void),
+	TP_ARGS());
+
+DECLARE_TRACE(stop_machine_event_stop,
+	TP_PROTO(void),
+	TP_ARGS());
+
+DECLARE_TRACE(cpu_up_event_start,
+	TP_PROTO(void),
+	TP_ARGS());
+
+DECLARE_TRACE(cpu_up_event_stop,
+	TP_PROTO(void),
+	TP_ARGS());
+#endif
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 2b5d4e0..256a3e4 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -14,6 +14,7 @@
 #include <linux/kthread.h>
 #include <linux/stop_machine.h>
 #include <linux/mutex.h>
+#include <trace/notifier_trace.h>
 
 #ifdef CONFIG_SMP
 /* Serializes the updates to cpu_online_mask, cpu_present_mask */
@@ -190,6 +191,9 @@ static int __ref take_cpu_down(void *_param)
 	return 0;
 }
 
+DEFINE_TRACE(stop_machine_event_start);
+DEFINE_TRACE(stop_machine_event_stop);
+
 /* Requires cpu_add_remove_lock to be held */
 static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
 {
@@ -229,7 +233,9 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
 	set_cpus_allowed_ptr(current,
 			     cpumask_of(cpumask_any_but(cpu_online_mask, cpu)));
 
+	trace_stop_machine_event_start();
 	err = __stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu));
+	trace_stop_machine_event_stop();
 	if (err) {
 		/* CPU didn't die: tell everyone.  Can't complain. */
 		if (raw_notifier_call_chain(&cpu_chain, CPU_DOWN_FAILED | mod,
@@ -327,6 +333,9 @@ int __ref cpu_down(unsigned int cpu)
 EXPORT_SYMBOL(cpu_down);
 #endif /*CONFIG_HOTPLUG_CPU*/
 
+DEFINE_TRACE(cpu_up_event_start);
+DEFINE_TRACE(cpu_up_event_stop);
+
 /* Requires cpu_add_remove_lock to be held */
 static int __cpuinit _cpu_up(unsigned int cpu, int tasks_frozen)
 {
@@ -349,7 +358,9 @@ static int __cpuinit _cpu_up(unsigned int cpu, int tasks_frozen)
 	}
 
 	/* Arch-specific enabling code. */
+	trace_cpu_up_event_start();
 	ret = __cpu_up(cpu);
+	trace_cpu_up_event_stop();
 	if (ret != 0)
 		goto out_notify;
 	BUG_ON(!cpu_online(cpu));
diff --git a/kernel/notifier.c b/kernel/notifier.c
index 61d5aa5..5729035 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -5,6 +5,7 @@
 #include <linux/rcupdate.h>
 #include <linux/vmalloc.h>
 #include <linux/reboot.h>
+#include <trace/notifier_trace.h>
 
 /*
  *	Notifier list for kernel code which wants to be called
@@ -59,6 +60,9 @@ static int notifier_chain_unregister(struct notifier_block **nl,
 	return -ENOENT;
 }
 
+DEFINE_TRACE(hotplug_notifier_event_start);
+DEFINE_TRACE(hotplug_notifier_event_stop);
+
 /**
  * notifier_call_chain - Informs the registered notifiers about an event.
  *	@nl:		Pointer to head of the blocking notifier chain
@@ -68,12 +72,16 @@ static int notifier_chain_unregister(struct notifier_block **nl,
  *			value of this parameter is -1.
  *	@nr_calls:	Records the number of notifications sent. Don't care
  *			value of this field is NULL.
+ *	@chain_head:	Pointer to the head of the notifier chain. We cast it as
+ *			void * to allow different kinds of notifier chains to
+ *			pass the value of their chain heads.
  *	@returns:	notifier_call_chain returns the value returned by the
  *			last notifier function called.
  */
 static int __kprobes notifier_call_chain(struct notifier_block **nl,
 					unsigned long val, void *v,
-					int nr_to_call,	int *nr_calls)
+					int nr_to_call,	int *nr_calls,
+					void *chain_head)
 {
 	int ret = NOTIFY_DONE;
 	struct notifier_block *nb, *next_nb;
@@ -90,7 +98,11 @@ static int __kprobes notifier_call_chain(struct notifier_block **nl,
 			continue;
 		}
 #endif
+		trace_hotplug_notifier_event_start((void *)(nb->notifier_call),
+						val, (void *)chain_head);
 		ret = nb->notifier_call(nb, val, v);
+		trace_hotplug_notifier_event_stop((void *)(nb->notifier_call),
+						val, (void *) chain_head);
 
 		if (nr_calls)
 			(*nr_calls)++;
@@ -179,7 +191,7 @@ int __kprobes __atomic_notifier_call_chain(struct atomic_notifier_head *nh,
 	int ret;
 
 	rcu_read_lock();
-	ret = notifier_call_chain(&nh->head, val, v, nr_to_call, nr_calls);
+	ret = notifier_call_chain(&nh->head, val, v, nr_to_call, nr_calls, nh);
 	rcu_read_unlock();
 	return ret;
 }
@@ -312,7 +324,7 @@ int __blocking_notifier_call_chain(struct blocking_notifier_head *nh,
 	if (rcu_dereference(nh->head)) {
 		down_read(&nh->rwsem);
 		ret = notifier_call_chain(&nh->head, val, v, nr_to_call,
-					nr_calls);
+					nr_calls, nh);
 		up_read(&nh->rwsem);
 	}
 	return ret;
@@ -388,7 +400,8 @@ int __raw_notifier_call_chain(struct raw_notifier_head *nh,
 			      unsigned long val, void *v,
 			      int nr_to_call, int *nr_calls)
 {
-	return notifier_call_chain(&nh->head, val, v, nr_to_call, nr_calls);
+	return notifier_call_chain(&nh->head, val, v, nr_to_call, nr_calls,
+									nh);
 }
 EXPORT_SYMBOL_GPL(__raw_notifier_call_chain);
 
@@ -491,7 +504,7 @@ int __srcu_notifier_call_chain(struct srcu_notifier_head *nh,
 	int idx;
 
 	idx = srcu_read_lock(&nh->srcu);
-	ret = notifier_call_chain(&nh->head, val, v, nr_to_call, nr_calls);
+	ret = notifier_call_chain(&nh->head, val, v, nr_to_call, nr_calls, nh);
 	srcu_read_unlock(&nh->srcu, idx);
 	return ret;
 }


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support.
  2009-06-16  5:38 [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support Gautham R Shenoy
                   ` (3 preceding siblings ...)
  2009-06-16  5:38 ` [RFD PATCH 4/4] cpu: measure time taken by subsystem notifiers during cpu-hotplug Gautham R Shenoy
@ 2009-06-16  6:23 ` Andrew Morton
  2009-06-16  8:07   ` Vaidyanathan Srinivasan
  2009-06-17 13:50 ` Suresh Siddha
  5 siblings, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2009-06-16  6:23 UTC (permalink / raw)
  To: Gautham R Shenoy
  Cc: linux-kernel, Peter Zijlstra, Balbir Singh, Rusty Russel,
	Paul E McKenney, Nathan Lynch, Ingo Molnar, Venkatesh Pallipadi,
	Vaidyanathan Srinivasan, Dipankar Sarma, Shoahua Li

On Tue, 16 Jun 2009 11:08:39 +0530 Gautham R Shenoy <ego@in.ibm.com> wrote:

> Currently on a ppc64 box with 16 CPUs, the time taken for
> a individual cpu-hotplug operation is as follows.
> 
> 	# time echo 0 > /sys/devices/system/cpu/cpu2/online
> 	real    0m0.025s
> 	user    0m0.000s
> 	sys     0m0.002s
> 
> 	# time echo 1 > /sys/devices/system/cpu/cpu2/online
> 	real    0m0.021s
> 	user    0m0.000s
> 	sys     0m0.000s

Surprised.  Do people really online and offline CPUs frequently enough
for this to be a problem?


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support.
  2009-06-16  6:23 ` [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support Andrew Morton
@ 2009-06-16  8:07   ` Vaidyanathan Srinivasan
  2009-06-16 21:00     ` Paul E. McKenney
  2009-06-17  7:32     ` Peter Zijlstra
  0 siblings, 2 replies; 21+ messages in thread
From: Vaidyanathan Srinivasan @ 2009-06-16  8:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Gautham R Shenoy, linux-kernel, Peter Zijlstra, Balbir Singh,
	Rusty Russel, Paul E McKenney, Nathan Lynch, Ingo Molnar,
	Venkatesh Pallipadi, Dipankar Sarma, Shoahua Li

* Andrew Morton <akpm@linux-foundation.org> [2009-06-15 23:23:18]:

> On Tue, 16 Jun 2009 11:08:39 +0530 Gautham R Shenoy <ego@in.ibm.com> wrote:
> 
> > Currently on a ppc64 box with 16 CPUs, the time taken for
> > a individual cpu-hotplug operation is as follows.
> > 
> > 	# time echo 0 > /sys/devices/system/cpu/cpu2/online
> > 	real    0m0.025s
> > 	user    0m0.000s
> > 	sys     0m0.002s
> > 
> > 	# time echo 1 > /sys/devices/system/cpu/cpu2/online
> > 	real    0m0.021s
> > 	user    0m0.000s
> > 	sys     0m0.000s
> 
> Surprised.  Do people really online and offline CPUs frequently enough
> for this to be a problem?

Certainly not for hardware faults or hardware replacement, but
cpu-hotplug interface is useful for changing system configuration to
meet different objectives like

* Reduce system capacity to reduce average power and reduce heat

* Increasing number of cores and threads in a CPU package is leading
  to multiple cpu offline/online operations for any perceivable effect

* Dynamically change CPU configurations in virtualized environments

Ref:

[1] Saving power by cpu evacuation sched_max_capacity_pct=n
    http://lkml.org/lkml/2009/5/13/173

[2] Make offline cpus to go to deepest idle state using
    http://lkml.org/lkml/2009/5/22/431

[3] cpuset: add new API to change cpuset top group's cpus
    http://lkml.org/lkml/2009/5/19/54        

For getting stuff off a certain CPU, cpu-hotplug framework seems to do
the right thing.  Identifying bottlenecks in the framework can
significantly help other use cases.

--Vaidy


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support.
  2009-06-16  8:07   ` Vaidyanathan Srinivasan
@ 2009-06-16 21:00     ` Paul E. McKenney
  2009-06-24 15:02       ` Pavel Machek
  2009-06-17  7:32     ` Peter Zijlstra
  1 sibling, 1 reply; 21+ messages in thread
From: Paul E. McKenney @ 2009-06-16 21:00 UTC (permalink / raw)
  To: Vaidyanathan Srinivasan
  Cc: Andrew Morton, Gautham R Shenoy, linux-kernel, Peter Zijlstra,
	Balbir Singh, Rusty Russel, Nathan Lynch, Ingo Molnar,
	Venkatesh Pallipadi, Dipankar Sarma, Shoahua Li

On Tue, Jun 16, 2009 at 01:37:15PM +0530, Vaidyanathan Srinivasan wrote:
> * Andrew Morton <akpm@linux-foundation.org> [2009-06-15 23:23:18]:
> 
> > On Tue, 16 Jun 2009 11:08:39 +0530 Gautham R Shenoy <ego@in.ibm.com> wrote:
> > 
> > > Currently on a ppc64 box with 16 CPUs, the time taken for
> > > a individual cpu-hotplug operation is as follows.
> > > 
> > > 	# time echo 0 > /sys/devices/system/cpu/cpu2/online
> > > 	real    0m0.025s
> > > 	user    0m0.000s
> > > 	sys     0m0.002s
> > > 
> > > 	# time echo 1 > /sys/devices/system/cpu/cpu2/online
> > > 	real    0m0.021s
> > > 	user    0m0.000s
> > > 	sys     0m0.000s
> > 
> > Surprised.  Do people really online and offline CPUs frequently enough
> > for this to be a problem?
> 
> Certainly not for hardware faults or hardware replacement, but
> cpu-hotplug interface is useful for changing system configuration to
> meet different objectives like
> 
> * Reduce system capacity to reduce average power and reduce heat
> 
> * Increasing number of cores and threads in a CPU package is leading
>   to multiple cpu offline/online operations for any perceivable effect
> 
> * Dynamically change CPU configurations in virtualized environments

Perhaps also reducing boot-up time?  If I am correctly interpreting the
above numbers, an eight-CPU system would be consuming 175 milliseconds
bringing up the seven non-boot CPUs.  Reducing this by 150 milliseconds
might be of interest to some people.  ;-)

							Thanx, Paul

> Ref:
> 
> [1] Saving power by cpu evacuation sched_max_capacity_pct=n
>     http://lkml.org/lkml/2009/5/13/173
> 
> [2] Make offline cpus to go to deepest idle state using
>     http://lkml.org/lkml/2009/5/22/431
> 
> [3] cpuset: add new API to change cpuset top group's cpus
>     http://lkml.org/lkml/2009/5/19/54        
> 
> For getting stuff off a certain CPU, cpu-hotplug framework seems to do
> the right thing.  Identifying bottlenecks in the framework can
> significantly help other use cases.
> 
> --Vaidy
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support.
  2009-06-16 21:00     ` Paul E. McKenney
@ 2009-06-24 15:02       ` Pavel Machek
  0 siblings, 0 replies; 21+ messages in thread
From: Pavel Machek @ 2009-06-24 15:02 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Vaidyanathan Srinivasan, Andrew Morton, Gautham R Shenoy,
	linux-kernel, Peter Zijlstra, Balbir Singh, Rusty Russel,
	Nathan Lynch, Ingo Molnar, Venkatesh Pallipadi, Dipankar Sarma,
	Shoahua Li

On Tue 2009-06-16 14:00:59, Paul E. McKenney wrote:
> On Tue, Jun 16, 2009 at 01:37:15PM +0530, Vaidyanathan Srinivasan wrote:
> > * Andrew Morton <akpm@linux-foundation.org> [2009-06-15 23:23:18]:
> > 
> > > On Tue, 16 Jun 2009 11:08:39 +0530 Gautham R Shenoy <ego@in.ibm.com> wrote:
> > > 
> > > > Currently on a ppc64 box with 16 CPUs, the time taken for
> > > > a individual cpu-hotplug operation is as follows.
> > > > 
> > > > 	# time echo 0 > /sys/devices/system/cpu/cpu2/online
> > > > 	real    0m0.025s
> > > > 	user    0m0.000s
> > > > 	sys     0m0.002s
> > > > 
> > > > 	# time echo 1 > /sys/devices/system/cpu/cpu2/online
> > > > 	real    0m0.021s
> > > > 	user    0m0.000s
> > > > 	sys     0m0.000s
> > > 
> > > Surprised.  Do people really online and offline CPUs frequently enough
> > > for this to be a problem?
> > 
> > Certainly not for hardware faults or hardware replacement, but
> > cpu-hotplug interface is useful for changing system configuration to
> > meet different objectives like
> > 
> > * Reduce system capacity to reduce average power and reduce heat
> > 
> > * Increasing number of cores and threads in a CPU package is leading
> >   to multiple cpu offline/online operations for any perceivable effect
> > 
> > * Dynamically change CPU configurations in virtualized environments
> 
> Perhaps also reducing boot-up time?  If I am correctly interpreting the
> above numbers, an eight-CPU system would be consuming 175 milliseconds
> bringing up the seven non-boot CPUs.  Reducing this by 150 milliseconds
> might be of interest to some people.  ;-)

...also it should save 300msec from s2ram cycle. Actually maybe
suspend code should be modified first,  as it can demonstrate the
changes without kernel<-> user interface changing?

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support.
  2009-06-16  8:07   ` Vaidyanathan Srinivasan
  2009-06-16 21:00     ` Paul E. McKenney
@ 2009-06-17  7:32     ` Peter Zijlstra
  2009-06-17  7:40       ` Balbir Singh
  2009-06-17 14:38       ` Paul E. McKenney
  1 sibling, 2 replies; 21+ messages in thread
From: Peter Zijlstra @ 2009-06-17  7:32 UTC (permalink / raw)
  To: svaidy
  Cc: Andrew Morton, Gautham R Shenoy, linux-kernel, Balbir Singh,
	Rusty Russel, Paul E McKenney, Nathan Lynch, Ingo Molnar,
	Venkatesh Pallipadi, Dipankar Sarma, Shoahua Li

On Tue, 2009-06-16 at 13:37 +0530, Vaidyanathan Srinivasan wrote:
> * Andrew Morton <akpm@linux-foundation.org> [2009-06-15 23:23:18]:
> 
> > On Tue, 16 Jun 2009 11:08:39 +0530 Gautham R Shenoy <ego@in.ibm.com> wrote:
> > 
> > > Currently on a ppc64 box with 16 CPUs, the time taken for
> > > a individual cpu-hotplug operation is as follows.
> > > 
> > >     # time echo 0 > /sys/devices/system/cpu/cpu2/online
> > >     real    0m0.025s
> > >     user    0m0.000s
> > >     sys     0m0.002s
> > > 
> > >     # time echo 1 > /sys/devices/system/cpu/cpu2/online
> > >     real    0m0.021s
> > >     user    0m0.000s
> > >     sys     0m0.000s
> > 
> > Surprised.  Do people really online and offline CPUs frequently enough
> > for this to be a problem?
> 
> Certainly not for hardware faults or hardware replacement, but
> cpu-hotplug interface is useful for changing system configuration to
> meet different objectives like
> 
> * Reduce system capacity to reduce average power and reduce heat
> 
> * Increasing number of cores and threads in a CPU package is leading
>   to multiple cpu offline/online operations for any perceivable effect
> 
> * Dynamically change CPU configurations in virtualized environments

I tend to agree with Andrew, if any of those things are done frequent
enough that the hotplug performance matter you're doing something mighty
odd.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support.
  2009-06-17  7:32     ` Peter Zijlstra
@ 2009-06-17  7:40       ` Balbir Singh
  2009-06-17 14:38       ` Paul E. McKenney
  1 sibling, 0 replies; 21+ messages in thread
From: Balbir Singh @ 2009-06-17  7:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: svaidy, Andrew Morton, Gautham R Shenoy, linux-kernel,
	Rusty Russel, Paul E McKenney, Nathan Lynch, Ingo Molnar,
	Venkatesh Pallipadi, Dipankar Sarma, Shoahua Li

* Peter Zijlstra <a.p.zijlstra@chello.nl> [2009-06-17 09:32:57]:

> On Tue, 2009-06-16 at 13:37 +0530, Vaidyanathan Srinivasan wrote:
> > * Andrew Morton <akpm@linux-foundation.org> [2009-06-15 23:23:18]:
> > 
> > > On Tue, 16 Jun 2009 11:08:39 +0530 Gautham R Shenoy <ego@in.ibm.com> wrote:
> > > 
> > > > Currently on a ppc64 box with 16 CPUs, the time taken for
> > > > a individual cpu-hotplug operation is as follows.
> > > > 
> > > >     # time echo 0 > /sys/devices/system/cpu/cpu2/online
> > > >     real    0m0.025s
> > > >     user    0m0.000s
> > > >     sys     0m0.002s
> > > > 
> > > >     # time echo 1 > /sys/devices/system/cpu/cpu2/online
> > > >     real    0m0.021s
> > > >     user    0m0.000s
> > > >     sys     0m0.000s
> > > 
> > > Surprised.  Do people really online and offline CPUs frequently enough
> > > for this to be a problem?
> > 
> > Certainly not for hardware faults or hardware replacement, but
> > cpu-hotplug interface is useful for changing system configuration to
> > meet different objectives like
> > 
> > * Reduce system capacity to reduce average power and reduce heat
> > 
> > * Increasing number of cores and threads in a CPU package is leading
> >   to multiple cpu offline/online operations for any perceivable effect
> > 
> > * Dynamically change CPU configurations in virtualized environments
> 
> I tend to agree with Andrew, if any of those things are done frequent
> enough that the hotplug performance matter you're doing something mighty
> odd.
>

Peter, what Vaidy mentioned are very useful cases, to add to that

Consider for example the need to turn of all threads belonging to a
package or the system. I can basically give out the cpu ids of all
threads and hotplug them out at once depending on the workload. In
effect turning off hyper-threading on the package.

Doing it all together provides benefits of (not complete, but
better control) over rollback apart from the speed benefit.

The benefit mentioned by Paul of speed up is very useful as well on
large systems and on the boot up time of virtual machines as well.
 

-- 
	Balbir

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support.
  2009-06-17  7:32     ` Peter Zijlstra
  2009-06-17  7:40       ` Balbir Singh
@ 2009-06-17 14:38       ` Paul E. McKenney
  2009-06-17 15:07         ` Ingo Molnar
  1 sibling, 1 reply; 21+ messages in thread
From: Paul E. McKenney @ 2009-06-17 14:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: svaidy, Andrew Morton, Gautham R Shenoy, linux-kernel,
	Balbir Singh, Rusty Russel, Nathan Lynch, Ingo Molnar,
	Venkatesh Pallipadi, Dipankar Sarma, Shoahua Li

On Wed, Jun 17, 2009 at 09:32:57AM +0200, Peter Zijlstra wrote:
> On Tue, 2009-06-16 at 13:37 +0530, Vaidyanathan Srinivasan wrote:
> > * Andrew Morton <akpm@linux-foundation.org> [2009-06-15 23:23:18]:
> > 
> > > On Tue, 16 Jun 2009 11:08:39 +0530 Gautham R Shenoy <ego@in.ibm.com> wrote:
> > > 
> > > > Currently on a ppc64 box with 16 CPUs, the time taken for
> > > > a individual cpu-hotplug operation is as follows.
> > > > 
> > > >     # time echo 0 > /sys/devices/system/cpu/cpu2/online
> > > >     real    0m0.025s
> > > >     user    0m0.000s
> > > >     sys     0m0.002s
> > > > 
> > > >     # time echo 1 > /sys/devices/system/cpu/cpu2/online
> > > >     real    0m0.021s
> > > >     user    0m0.000s
> > > >     sys     0m0.000s
> > > 
> > > Surprised.  Do people really online and offline CPUs frequently enough
> > > for this to be a problem?
> > 
> > Certainly not for hardware faults or hardware replacement, but
> > cpu-hotplug interface is useful for changing system configuration to
> > meet different objectives like
> > 
> > * Reduce system capacity to reduce average power and reduce heat
> > 
> > * Increasing number of cores and threads in a CPU package is leading
> >   to multiple cpu offline/online operations for any perceivable effect
> > 
> > * Dynamically change CPU configurations in virtualized environments
> 
> I tend to agree with Andrew, if any of those things are done frequent
> enough that the hotplug performance matter you're doing something mighty
> odd.

Boot speedup?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support.
  2009-06-17 14:38       ` Paul E. McKenney
@ 2009-06-17 15:07         ` Ingo Molnar
  2009-06-17 20:26           ` Peter Zijlstra
  0 siblings, 1 reply; 21+ messages in thread
From: Ingo Molnar @ 2009-06-17 15:07 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Peter Zijlstra, svaidy, Andrew Morton, Gautham R Shenoy,
	linux-kernel, Balbir Singh, Rusty Russel, Nathan Lynch,
	Venkatesh Pallipadi, Dipankar Sarma, Shoahua Li


* Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> On Wed, Jun 17, 2009 at 09:32:57AM +0200, Peter Zijlstra wrote:
> > On Tue, 2009-06-16 at 13:37 +0530, Vaidyanathan Srinivasan wrote:
> > > * Andrew Morton <akpm@linux-foundation.org> [2009-06-15 23:23:18]:
> > > 
> > > > On Tue, 16 Jun 2009 11:08:39 +0530 Gautham R Shenoy <ego@in.ibm.com> wrote:
> > > > 
> > > > > Currently on a ppc64 box with 16 CPUs, the time taken for
> > > > > a individual cpu-hotplug operation is as follows.
> > > > > 
> > > > >     # time echo 0 > /sys/devices/system/cpu/cpu2/online
> > > > >     real    0m0.025s
> > > > >     user    0m0.000s
> > > > >     sys     0m0.002s
> > > > > 
> > > > >     # time echo 1 > /sys/devices/system/cpu/cpu2/online
> > > > >     real    0m0.021s
> > > > >     user    0m0.000s
> > > > >     sys     0m0.000s
> > > > 
> > > > Surprised.  Do people really online and offline CPUs frequently enough
> > > > for this to be a problem?
> > > 
> > > Certainly not for hardware faults or hardware replacement, but
> > > cpu-hotplug interface is useful for changing system configuration to
> > > meet different objectives like
> > > 
> > > * Reduce system capacity to reduce average power and reduce heat
> > > 
> > > * Increasing number of cores and threads in a CPU package is leading
> > >   to multiple cpu offline/online operations for any perceivable effect
> > > 
> > > * Dynamically change CPU configurations in virtualized environments
> > 
> > I tend to agree with Andrew, if any of those things are done 
> > frequent enough that the hotplug performance matter you're doing 
> > something mighty odd.
> 
> Boot speedup?

Also, if it brings more attention (and more stability and more 
bugfixes) to CPU hotplug that's only good.

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support.
  2009-06-17 15:07         ` Ingo Molnar
@ 2009-06-17 20:26           ` Peter Zijlstra
  2009-06-20 15:35             ` Ingo Molnar
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Zijlstra @ 2009-06-17 20:26 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Paul E. McKenney, svaidy, Andrew Morton, Gautham R Shenoy,
	linux-kernel, Balbir Singh, Rusty Russel, Nathan Lynch,
	Venkatesh Pallipadi, Dipankar Sarma, Shoahua Li

On Wed, 2009-06-17 at 17:07 +0200, Ingo Molnar wrote:
> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Wed, Jun 17, 2009 at 09:32:57AM +0200, Peter Zijlstra wrote:
> > > On Tue, 2009-06-16 at 13:37 +0530, Vaidyanathan Srinivasan wrote:
> > > > * Andrew Morton <akpm@linux-foundation.org> [2009-06-15 23:23:18]:
> > > > 
> > > > > On Tue, 16 Jun 2009 11:08:39 +0530 Gautham R Shenoy <ego@in.ibm.com> wrote:
> > > > > 
> > > > > > Currently on a ppc64 box with 16 CPUs, the time taken for
> > > > > > a individual cpu-hotplug operation is as follows.
> > > > > > 
> > > > > >     # time echo 0 > /sys/devices/system/cpu/cpu2/online
> > > > > >     real    0m0.025s
> > > > > >     user    0m0.000s
> > > > > >     sys     0m0.002s
> > > > > > 
> > > > > >     # time echo 1 > /sys/devices/system/cpu/cpu2/online
> > > > > >     real    0m0.021s
> > > > > >     user    0m0.000s
> > > > > >     sys     0m0.000s
> > > > > 
> > > > > Surprised.  Do people really online and offline CPUs frequently enough
> > > > > for this to be a problem?
> > > > 
> > > > Certainly not for hardware faults or hardware replacement, but
> > > > cpu-hotplug interface is useful for changing system configuration to
> > > > meet different objectives like
> > > > 
> > > > * Reduce system capacity to reduce average power and reduce heat
> > > > 
> > > > * Increasing number of cores and threads in a CPU package is leading
> > > >   to multiple cpu offline/online operations for any perceivable effect
> > > > 
> > > > * Dynamically change CPU configurations in virtualized environments
> > > 
> > > I tend to agree with Andrew, if any of those things are done 
> > > frequent enough that the hotplug performance matter you're doing 
> > > something mighty odd.
> > 
> > Boot speedup?
> 
> Also, if it brings more attention (and more stability and more 
> bugfixes) to CPU hotplug that's only good.

Sure, but do we need the extra complexity?

I mean, sure bootup speed might be nice, but any of the scenarios given
should simply not require cpu hotplug actions of a frequent enough
nature that any performance matters.

If you want to switch off all SMT siblings you don't do that 50 times a
second, you do that once per bootup or something.

Furthermore we already established that cpu hotlpug is not the proper
interface for thermal management, and dynamically changing virtualized
muck isn't something you do at 100Hz either.

So what worries me is the justification for this work. It might be good
and nice, but if the reasons are wrong it still worries me.

So again, why? -- the bootup thing is the only sane answer so far.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support.
  2009-06-17 20:26           ` Peter Zijlstra
@ 2009-06-20 15:35             ` Ingo Molnar
  2009-06-22  6:08               ` Nathan Lynch
  0 siblings, 1 reply; 21+ messages in thread
From: Ingo Molnar @ 2009-06-20 15:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Paul E. McKenney, svaidy, Andrew Morton, Gautham R Shenoy,
	linux-kernel, Balbir Singh, Rusty Russel, Nathan Lynch,
	Venkatesh Pallipadi, Dipankar Sarma, Shoahua Li


* Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

> On Wed, 2009-06-17 at 17:07 +0200, Ingo Molnar wrote:
> > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > 
> > > On Wed, Jun 17, 2009 at 09:32:57AM +0200, Peter Zijlstra wrote:
> > > > On Tue, 2009-06-16 at 13:37 +0530, Vaidyanathan Srinivasan wrote:
> > > > > * Andrew Morton <akpm@linux-foundation.org> [2009-06-15 23:23:18]:
> > > > > 
> > > > > > On Tue, 16 Jun 2009 11:08:39 +0530 Gautham R Shenoy <ego@in.ibm.com> wrote:
> > > > > > 
> > > > > > > Currently on a ppc64 box with 16 CPUs, the time taken for
> > > > > > > a individual cpu-hotplug operation is as follows.
> > > > > > > 
> > > > > > >     # time echo 0 > /sys/devices/system/cpu/cpu2/online
> > > > > > >     real    0m0.025s
> > > > > > >     user    0m0.000s
> > > > > > >     sys     0m0.002s
> > > > > > > 
> > > > > > >     # time echo 1 > /sys/devices/system/cpu/cpu2/online
> > > > > > >     real    0m0.021s
> > > > > > >     user    0m0.000s
> > > > > > >     sys     0m0.000s
> > > > > > 
> > > > > > Surprised.  Do people really online and offline CPUs frequently enough
> > > > > > for this to be a problem?
> > > > > 
> > > > > Certainly not for hardware faults or hardware replacement, but
> > > > > cpu-hotplug interface is useful for changing system configuration to
> > > > > meet different objectives like
> > > > > 
> > > > > * Reduce system capacity to reduce average power and reduce heat
> > > > > 
> > > > > * Increasing number of cores and threads in a CPU package is leading
> > > > >   to multiple cpu offline/online operations for any perceivable effect
> > > > > 
> > > > > * Dynamically change CPU configurations in virtualized environments
> > > > 
> > > > I tend to agree with Andrew, if any of those things are done 
> > > > frequent enough that the hotplug performance matter you're doing 
> > > > something mighty odd.
> > > 
> > > Boot speedup?
> > 
> > Also, if it brings more attention (and more stability and more 
> > bugfixes) to CPU hotplug that's only good.
> 
> Sure, but do we need the extra complexity?
> 
> I mean, sure bootup speed might be nice, but any of the scenarios 
> given should simply not require cpu hotplug actions of a frequent 
> enough nature that any performance matters.

Well, the fact that the patches exist show that there's people 
caring about the speedup here. The speedup itself is non-trivial.

If the patches are technically correct, and if any existing 
uncleanlinesses in the affected code are fixed first (please list 
any TODO items in the CPU hotplug code you might know about), then 
there's no reason not to pursue these patches - unless the 
complexity increase is so huge that it makes the patches technically 
wrong.

The diffstat doesnt look _that_ awful IMO - 50 lines of code and i 
suspect the patches come with a promise to properly handle all prior 
and later bugs in this area? :)

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support.
  2009-06-20 15:35             ` Ingo Molnar
@ 2009-06-22  6:08               ` Nathan Lynch
  0 siblings, 0 replies; 21+ messages in thread
From: Nathan Lynch @ 2009-06-22  6:08 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Paul E. McKenney, svaidy, Andrew Morton,
	Gautham R Shenoy, linux-kernel, Balbir Singh, Rusty Russel,
	Venkatesh Pallipadi, Dipankar Sarma, Shoahua Li

Ingo Molnar <mingo@elte.hu> writes:

> * Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
>
>> On Wed, 2009-06-17 at 17:07 +0200, Ingo Molnar wrote:
>> > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
>> > 
>> > > On Wed, Jun 17, 2009 at 09:32:57AM +0200, Peter Zijlstra wrote:
>> > > > On Tue, 2009-06-16 at 13:37 +0530, Vaidyanathan Srinivasan wrote:
>> > > > > * Andrew Morton <akpm@linux-foundation.org> [2009-06-15 23:23:18]:
>> > > > > 
>> > > > > > On Tue, 16 Jun 2009 11:08:39 +0530 Gautham R Shenoy <ego@in.ibm.com> wrote:
>> > > > > > 
>> > > > > > > Currently on a ppc64 box with 16 CPUs, the time taken for
>> > > > > > > a individual cpu-hotplug operation is as follows.
>> > > > > > > 
>> > > > > > >     # time echo 0 > /sys/devices/system/cpu/cpu2/online
>> > > > > > >     real    0m0.025s
>> > > > > > >     user    0m0.000s
>> > > > > > >     sys     0m0.002s
>> > > > > > > 
>> > > > > > >     # time echo 1 > /sys/devices/system/cpu/cpu2/online
>> > > > > > >     real    0m0.021s
>> > > > > > >     user    0m0.000s
>> > > > > > >     sys     0m0.000s
>> > > > > > 
>> > > > > > Surprised.  Do people really online and offline CPUs frequently enough
>> > > > > > for this to be a problem?
>> > > > > 
>> > > > > Certainly not for hardware faults or hardware replacement, but
>> > > > > cpu-hotplug interface is useful for changing system configuration to
>> > > > > meet different objectives like
>> > > > > 
>> > > > > * Reduce system capacity to reduce average power and reduce heat
>> > > > > 
>> > > > > * Increasing number of cores and threads in a CPU package is leading
>> > > > >   to multiple cpu offline/online operations for any perceivable effect
>> > > > > 
>> > > > > * Dynamically change CPU configurations in virtualized environments
>> > > > 
>> > > > I tend to agree with Andrew, if any of those things are done 
>> > > > frequent enough that the hotplug performance matter you're doing 
>> > > > something mighty odd.
>> > > 
>> > > Boot speedup?
>> > 
>> > Also, if it brings more attention (and more stability and more 
>> > bugfixes) to CPU hotplug that's only good.
>> 
>> Sure, but do we need the extra complexity?
>> 
>> I mean, sure bootup speed might be nice, but any of the scenarios 
>> given should simply not require cpu hotplug actions of a frequent 
>> enough nature that any performance matters.
>
> Well, the fact that the patches exist show that there's people 
> caring about the speedup here. The speedup itself is non-trivial.

If I correctly understand the behavior of the patch set as posted, there
is no speedup beyond eliminating the overhead of multiple writes to
/sys/devices/system/cpu/cpu*/online.  To obtain non-trivial speedups via
bulk hotplug, one or both of the following items from the TODO list need
to be completed:

- Enhance the subsystem notifiers to work on a cpumask_var_t instead of a cpu
  id.

- Optimize the subsystem notifiers to reduce the time consumed while
  handling CPU_[DOWN_PREPARE/DEAD/UP_PREPARE/ONLINE] events for the
  cpumask_var_t.

Right?

(The powerpc-specific patch at the beginning of the series improves
hot-online time for a single cpu in some circumstances and is basically
unrelated to the aim of the patch set -- it should go upstream through
the powerpc tree independently.)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support.
  2009-06-16  5:38 [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support Gautham R Shenoy
                   ` (4 preceding siblings ...)
  2009-06-16  6:23 ` [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support Andrew Morton
@ 2009-06-17 13:50 ` Suresh Siddha
  5 siblings, 0 replies; 21+ messages in thread
From: Suresh Siddha @ 2009-06-17 13:50 UTC (permalink / raw)
  To: Gautham R Shenoy
  Cc: linux-kernel@vger.kernel.org, Peter Zijlstra, Balbir Singh,
	Rusty Russel, Paul E McKenney, Nathan Lynch, Ingo Molnar,
	Pallipadi, Venkatesh, Andrew Morton, Vaidyanathan Srinivasan,
	Dipankar Sarma, Shoahua Li

On Mon, 2009-06-15 at 22:38 -0700, Gautham R Shenoy wrote:
> So, of the accounted time, a major chunk of time is consumed by
> cpuset_track_online_cpus() while handling CPU_DEAD and CPU_ONLINE
> notifications.
> 
>         11.320205 ms: cpuset_track_online_cpus      : CPU_DEAD
>         12.767882 ms: cpuset_track_online_cpus      : CPU_ONLINE
> 
> cpuset_trace_online_cpus() among other things performs the task of rebuilding
> the sched_domains for every online CPU in the system.

Are the above numbers with CONFIG_SCHED_DEBUG turned on/off?

thanks,
suresh


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2009-06-27 11:27 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-06-16  5:38 [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support Gautham R Shenoy
2009-06-16  5:38 ` [RFD PATCH 1/4] powerpc: cpu: Reduce the polling interval in __cpu_up() Gautham R Shenoy
2009-06-16 16:06   ` Nathan Lynch
2009-06-16 16:37     ` Gautham R Shenoy
2009-06-16  5:38 ` [RFD PATCH 2/4] cpu: sysfs interface for hotplugging bunch of CPUs Gautham R Shenoy
2009-06-16 16:22   ` Nathan Lynch
2009-06-16 16:33     ` Gautham R Shenoy
2009-06-16  5:38 ` [RFD PATCH 3/4] cpu: Define new functions cpu_down_mask and cpu_up_mask Gautham R Shenoy
2009-06-16  5:38 ` [RFD PATCH 4/4] cpu: measure time taken by subsystem notifiers during cpu-hotplug Gautham R Shenoy
2009-06-16  6:23 ` [RFD PATCH 0/4] cpu: Bulk CPU Hotplug support Andrew Morton
2009-06-16  8:07   ` Vaidyanathan Srinivasan
2009-06-16 21:00     ` Paul E. McKenney
2009-06-24 15:02       ` Pavel Machek
2009-06-17  7:32     ` Peter Zijlstra
2009-06-17  7:40       ` Balbir Singh
2009-06-17 14:38       ` Paul E. McKenney
2009-06-17 15:07         ` Ingo Molnar
2009-06-17 20:26           ` Peter Zijlstra
2009-06-20 15:35             ` Ingo Molnar
2009-06-22  6:08               ` Nathan Lynch
2009-06-17 13:50 ` Suresh Siddha

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox