public inbox for linux-kernel@vger.kernel.org
* [RFC PATCH 0/4] Refcount Based Cpu-Hotplug Revisit.
@ 2007-10-16 10:33 Gautham R Shenoy
  2007-10-16 10:34 ` [RFC PATCH 1/4] Refcount Based Cpu-Hotplug Implementation Gautham R Shenoy
                   ` (4 more replies)
  0 siblings, 5 replies; 29+ messages in thread
From: Gautham R Shenoy @ 2007-10-16 10:33 UTC
  To: Linus Torvalds, Andrew Morton
  Cc: linux-kernel, Srivatsa Vaddagiri, Rusty Russell, Dipankar Sarma,
	Oleg Nesterov, Ingo Molnar, Paul E McKenney

Hi, 
This patch series revisits the topic of the
cpu-hotplug locking model.

Prior to this attempt, there were several different suggestions on 
how it should be implemented. 

The ones that were posted before were

a) Refcount + Waitqueue model:
   Here, threads that want to prevent a cpu-hotplug operation while
   they are in a cpu-hotplug critical section bump up the
   reference count on the global online cpu state.
   The thread which wants to perform a cpu-hotplug operation
   blocks until the reference count on the global online state drops
   to zero. Any threads which want to enter the cpu-hotplug critical
   section during an ongoing cpu-hotplug operation are blocked using
   a waitqueue.

   The advantage of this model was that it is along the lines of
   the well-known get/put model, except that it allows both readers
   and writers to sleep.

   The disadvantage, as Andrew pointed out, was that there exist
   a whole bunch of lock_cpu_hotplug() calls whose existence is
   undocumented, and an approach like this will not improve that situation.
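
   To make the blocking rules above concrete, here is a minimal userspace
   sketch using pthreads. It is not the code from patch 1/4: the mutex and
   condition variable merely stand in for the kernel's lock and waitqueue,
   and the writer-side names cpu_hotplug_begin()/cpu_hotplug_done() as well
   as the variables refcount and hotplug_active are assumptions made for
   illustration only.

```c
/* Userspace sketch of the refcount + waitqueue model (illustrative only). */
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
/* Condition variable standing in for the kernel waitqueue. */
static pthread_cond_t hotplug_wq = PTHREAD_COND_INITIALIZER;
static int refcount;        /* readers inside a cpu-hotplug critical section */
static int hotplug_active;  /* nonzero while a hotplug operation is running */

/* Reader: sleep while a hotplug operation is ongoing, then take a reference. */
static void get_online_cpus(void)
{
	pthread_mutex_lock(&hotplug_lock);
	while (hotplug_active)
		pthread_cond_wait(&hotplug_wq, &hotplug_lock);
	refcount++;
	pthread_mutex_unlock(&hotplug_lock);
}

/* Reader: drop the reference; the last reader wakes a waiting writer. */
static void put_online_cpus(void)
{
	pthread_mutex_lock(&hotplug_lock);
	assert(refcount > 0);
	if (--refcount == 0)
		pthread_cond_broadcast(&hotplug_wq);
	pthread_mutex_unlock(&hotplug_lock);
}

/* Writer (hypothetical name): block until the refcount drops to zero. */
static void cpu_hotplug_begin(void)
{
	pthread_mutex_lock(&hotplug_lock);
	while (hotplug_active || refcount > 0)
		pthread_cond_wait(&hotplug_wq, &hotplug_lock);
	hotplug_active = 1;
	pthread_mutex_unlock(&hotplug_lock);
}

/* Writer (hypothetical name): finish and wake any blocked readers. */
static void cpu_hotplug_done(void)
{
	pthread_mutex_lock(&hotplug_lock);
	hotplug_active = 0;
	pthread_cond_broadcast(&hotplug_wq);
	pthread_mutex_unlock(&hotplug_lock);
}
```

   In this sketch a reader may nest get_online_cpus() calls, since the
   writer only sets hotplug_active once the refcount has reached zero. It
   makes no attempt at fairness: a steady stream of readers can keep the
   writer waiting.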

b) Per Subsystem cpu-hotplug locks: Each subsystem which has cpu-hotplug
   critical data uses a lock to protect that data. Such a subsystem
   needs to subscribe to the cpu-hotplug notifications, especially the
   CPU_LOCK_ACQUIRE and CPU_LOCK_RELEASE events, which are sent before
   and after a cpu-hotplug operation respectively. The subsystem lock is
   taken when handling CPU_LOCK_ACQUIRE and released when handling
   CPU_LOCK_RELEASE.
   
   The advantage this model offered was that the lock was associated with
   the data, which made it easy to understand the purpose of the locking.

   The disadvantage was that a cpu-hotplug aware function could
   not be called from a cpu-hotplug callback path, since we would have
   acquired the subsystem lock during CPU_LOCK_ACQUIRE and attempting
   to reacquire it would result in a deadlock.
   The case which exposed this limitation was the implementation of
   synchronize_sched in preemptible rcu.
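
   The self-deadlock described above can be sketched in userspace as
   follows. Everything here is illustrative: subsys_lock,
   subsys_cpu_callback() and subsys_hotplug_aware_op() are hypothetical
   names, the enum merely mirrors the two event names mentioned above, and
   pthread_mutex_trylock() is used only so that the sketch reports EBUSY
   where a real blocking lock would hang.

```c
/* Userspace sketch of the per-subsystem lock limitation (illustrative only). */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Mirrors the two notifier events named in the text. */
enum hotplug_event { CPU_LOCK_ACQUIRE, CPU_LOCK_RELEASE };

/* Hypothetical subsystem lock protecting cpu-hotplug critical data. */
static pthread_mutex_t subsys_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical notifier callback: hold the lock across the hotplug operation. */
static void subsys_cpu_callback(enum hotplug_event ev)
{
	int err;

	if (ev == CPU_LOCK_ACQUIRE)
		err = pthread_mutex_lock(&subsys_lock);
	else
		err = pthread_mutex_unlock(&subsys_lock);
	assert(err == 0);
}

/*
 * Hypothetical cpu-hotplug aware function. Returns 0 on success, or EBUSY
 * if the subsystem lock is already held. A blocking lock here would
 * deadlock when called from the callback path after CPU_LOCK_ACQUIRE.
 */
static int subsys_hotplug_aware_op(void)
{
	int err = pthread_mutex_trylock(&subsys_lock);

	if (err)
		return err;
	/* ... access the cpu-hotplug critical data here ... */
	pthread_mutex_unlock(&subsys_lock);
	return 0;
}
```

   Calling subsys_hotplug_aware_op() between the two events, on the same
   thread that handled CPU_LOCK_ACQUIRE, is the situation that a blocking
   lock cannot survive.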

c) Freezer based cpu-hotplug: 
   The idea here was to freeze the system using the process freezer,
   the technique already used for suspend/hibernate, before
   performing a cpu-hotplug operation. This would ensure that none of
   the kernel threads are accessing any of the cpu-hotplug critical
   data, because they are frozen at well-known points.

   This would have helped to remove all kinds of locks, because if a
   thread is accessing cpu-hotplug critical data, the system is not
   frozen, and hence there can be no cpu-hotplug operation
   until the thread either voluntarily calls try_to_freeze
   or returns out of the kernel.

   The disadvantage of this approach was that any kind of dependency
   between threads might cause the freezer to fail. For example, thread A
   is waiting for thread B to accomplish something, but thread B is
   already frozen, leading to a freeze failure. There could be other
   subtle races which might be difficult to track down.

Some time in May 2007, Linus suggested using the refcount model, and
this patch series simplifies and reimplements the Refcount + waitqueue
model, based on the discussions in the past and inputs from
Vatsa and Rusty.

Patch 1/4: Implements the core refcount + waitqueue model.
Patch 2/4: Replaces all the lock_cpu_hotplug/unlock_cpu_hotplug instances
	   with get_online_cpus()/put_online_cpus()
Patch 3/4: Replaces the subsystem mutexes (we do have three of them now,
           in sched.c, slab.c and workqueue.c) with
	   get_online_cpus()/put_online_cpus()
Patch 4/4: Eliminates the CPU_DEAD and CPU_UP_CANCELLED event handling
 	   from workqueue.c

The patch series has survived an overnight test with kernbench on i386
and has been tested with Paul McKenney's latest preemptible rcu code.

Awaiting thy feedback!

Thanks and Regards
gautham.
-- 
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"

Thread overview: 29+ messages
2007-10-16 10:33 [RFC PATCH 0/4] Refcount Based Cpu-Hotplug Revisit Gautham R Shenoy
2007-10-16 10:34 ` [RFC PATCH 1/4] Refcount Based Cpu-Hotplug Implementation Gautham R Shenoy
2007-10-17  0:47   ` Rusty Russell
2007-10-17  5:37     ` Gautham R Shenoy
2007-10-17  6:29       ` Rusty Russell
2007-10-18  6:29         ` Gautham R Shenoy
2007-10-21 12:47       ` Oleg Nesterov
2007-10-17 10:53   ` Paul Jackson
2007-10-17 11:27     ` Paul Jackson
2007-10-17 11:50       ` Gautham R Shenoy
2007-10-17 12:04         ` Paul Jackson
2007-10-16 10:35 ` [RFC PATCH 2/4] Rename lock_cpu_hotplug to get_online_cpus Gautham R Shenoy
2007-10-17 16:13   ` Nathan Lynch
2007-10-18  7:57     ` Gautham R Shenoy
2007-10-18  8:22       ` Nathan Lynch
2007-10-18  8:59         ` Gautham R Shenoy
2007-10-18 17:30           ` Nathan Lynch
2007-10-19  5:04             ` Gautham R Shenoy
2007-10-22  0:43               ` Nathan Lynch
2007-10-22  4:51                 ` Gautham R Shenoy
2007-10-16 10:36 ` [RFC PATCH 3/4] Replace per-subsystem mutexes with get_online_cpus Gautham R Shenoy
2007-10-21 11:39   ` Oleg Nesterov
2007-10-22  4:58     ` Gautham R Shenoy
2007-10-16 10:37 ` [RFC PATCH 4/4] Remove CPU_DEAD/CPU_UP_CANCELLED handling from workqueue.c Gautham R Shenoy
2007-10-17 11:57   ` Oleg Nesterov
2007-10-16 17:20 ` [RFC PATCH 0/4] Refcount Based Cpu-Hotplug Revisit Linus Torvalds
2007-10-17  2:11   ` Dipankar Sarma
2007-10-17  2:23     ` Linus Torvalds
2007-10-17  4:17       ` Gautham R Shenoy
