* [PATCH 3/2] fix flush_workqueue() vs CPU_DEAD race
@ 2006-12-30 16:10 Oleg Nesterov
From: Oleg Nesterov @ 2006-12-30 16:10 UTC (permalink / raw)
To: Andrew Morton
Cc: Ingo Molnar, David Howells, Christoph Hellwig, Gautham R Shenoy,
linux-kernel
"[PATCH 1/2] reimplement flush_workqueue()" fixed one race, where a CPU goes down
while flush_cpu_workqueue() is working on it. But there is another problem: a CPU
can die before flush_workqueue() has a chance to call flush_cpu_workqueue() on it.
In that case pending work_structs can migrate to a CPU which was already checked,
so we should redo the "for_each_online_cpu(cpu)" loop.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
--- mm-6.20-rc2/kernel/workqueue.c~3_race 2006-12-29 18:37:31.000000000 +0300
+++ mm-6.20-rc2/kernel/workqueue.c 2006-12-30 18:09:07.000000000 +0300
@@ -65,6 +65,7 @@ struct workqueue_struct {
/* All the per-cpu workqueues on the system, for hotplug cpu to add/remove
threads to each one as cpus come/go. */
+static long hotplug_sequence __read_mostly;
static DEFINE_MUTEX(workqueue_mutex);
static LIST_HEAD(workqueues);
@@ -454,10 +455,16 @@ void fastcall flush_workqueue(struct wor
/* Always use first cpu's area. */
flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, singlethread_cpu));
} else {
+ long sequence;
int cpu;
+again:
+ sequence = hotplug_sequence;
for_each_online_cpu(cpu)
flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, cpu));
+
+ if (unlikely(sequence != hotplug_sequence))
+ goto again;
}
mutex_unlock(&workqueue_mutex);
}
@@ -874,6 +881,7 @@ static int __devinit workqueue_cpu_callb
cleanup_workqueue_thread(wq, hotcpu);
list_for_each_entry(wq, &workqueues, list)
take_over_work(wq, hotcpu);
+ hotplug_sequence++;
break;
case CPU_LOCK_RELEASE:
* Re: [PATCH 3/2] fix flush_workqueue() vs CPU_DEAD race
From: Andrew Morton @ 2007-01-03 0:27 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Ingo Molnar, David Howells, Christoph Hellwig, Gautham R Shenoy,
linux-kernel
On Sat, 30 Dec 2006 19:10:31 +0300
Oleg Nesterov <oleg@tv-sign.ru> wrote:
> "[PATCH 1/2] reimplement flush_workqueue()" fixed one race, where a CPU goes down
> while flush_cpu_workqueue() is working on it. But there is another problem: a CPU
> can die before flush_workqueue() has a chance to call flush_cpu_workqueue() on it.
> In that case pending work_structs can migrate to a CPU which was already checked,
> so we should redo the "for_each_online_cpu(cpu)" loop.
>
I have a mental note that these:
extend-notifier_call_chain-to-count-nr_calls-made.patch
extend-notifier_call_chain-to-count-nr_calls-made-fixes.patch
extend-notifier_call_chain-to-count-nr_calls-made-fixes-2.patch
define-and-use-new-eventscpu_lock_acquire-and-cpu_lock_release.patch
define-and-use-new-eventscpu_lock_acquire-and-cpu_lock_release-fix.patch
eliminate-lock_cpu_hotplug-in-kernel-schedc.patch
eliminate-lock_cpu_hotplug-in-kernel-schedc-fix.patch
handle-cpu_lock_acquire-and-cpu_lock_release-in-workqueue_cpu_callback.patch
should be scrapped. But really I forget what their status is. Gautham,
can you please remind us where we're at?
* Re: [PATCH 3/2] fix flush_workqueue() vs CPU_DEAD race
From: Gautham R Shenoy @ 2007-01-03 14:04 UTC (permalink / raw)
To: Andrew Morton
Cc: Oleg Nesterov, Ingo Molnar, David Howells, Christoph Hellwig,
Gautham R Shenoy, linux-kernel, dipankar, vatsa
Hi Andrew,
Sorry, I have yet to check out Venki's and Oleg's patches as I
just returned from vacation.
On Tue, Jan 02, 2007 at 04:27:27PM -0800, Andrew Morton wrote:
>
> I have a mental note that these:
>
> extend-notifier_call_chain-to-count-nr_calls-made.patch
> extend-notifier_call_chain-to-count-nr_calls-made-fixes.patch
> extend-notifier_call_chain-to-count-nr_calls-made-fixes-2.patch
These patches are needed because they allow us to send out the "failed"
notifications to only those subsystems that received the "prepare"
notifications earlier.
> define-and-use-new-eventscpu_lock_acquire-and-cpu_lock_release.patch
> define-and-use-new-eventscpu_lock_acquire-and-cpu_lock_release-fix.patch
These were posted in order to have a common place where the subsystems
could lock their per-subsystem hotplug mutexes/semaphores from within the
CPU-hotplug callback function. Hence they are needed, IMO.
> eliminate-lock_cpu_hotplug-in-kernel-schedc.patch
> eliminate-lock_cpu_hotplug-in-kernel-schedc-fix.patch
These patches define and use a mutex to handle cpu-hotplug and eliminate
the use of lock_cpu_hotplug in sched.c. Hence they are still needed.
> handle-cpu_lock_acquire-and-cpu_lock_release-in-workqueue_cpu_callback.patch
Again, this one ensures that workqueue_mutex is taken/released on
CPU_LOCK_ACQUIRE/CPU_LOCK_RELEASE events in the CPU-hotplug callback
function. So this one is required, unless it conflicts with what Oleg
has posted. Will check that out tonight.
>
> should be scrapped. But really I forget what their status is. Gautham,
> can you please remind us where we're at?
>
If all goes fine (w.r.t. cpufreq and workqueue), eliminating
lock_cpu_hotplug from kernel/*.c should be relatively easy. <fingers crossed>
Thanks and Regards
gautham.
--
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"
* Re: [PATCH 3/2] fix flush_workqueue() vs CPU_DEAD race
From: Gautham R Shenoy @ 2007-01-03 15:17 UTC (permalink / raw)
To: Gautham R Shenoy
Cc: Andrew Morton, Oleg Nesterov, Ingo Molnar, David Howells,
Christoph Hellwig, linux-kernel, dipankar, vatsa
On Wed, Jan 03, 2007 at 07:34:59PM +0530, Gautham R Shenoy wrote:
>
> > handle-cpu_lock_acquire-and-cpu_lock_release-in-workqueue_cpu_callback.patch
>
> Again, this one ensures that workqueue_mutex is taken/released on
> CPU_LOCK_ACQUIRE/CPU_LOCK_RELEASE events in the CPU-hotplug callback
> function. So this one is required, unless it conflicts with what Oleg
> has posted. Will check that out tonight.
We would still need this patch, as it complements what Oleg has
posted.
Thanks and Regards
gautham.
--
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"
* Re: [PATCH 3/2] fix flush_workqueue() vs CPU_DEAD race
From: Oleg Nesterov @ 2007-01-03 17:26 UTC (permalink / raw)
To: Gautham R Shenoy
Cc: Andrew Morton, Ingo Molnar, David Howells, Christoph Hellwig,
linux-kernel, dipankar, vatsa
On 01/03, Gautham R Shenoy wrote:
>
> On Wed, Jan 03, 2007 at 07:34:59PM +0530, Gautham R Shenoy wrote:
> >
> > > handle-cpu_lock_acquire-and-cpu_lock_release-in-workqueue_cpu_callback.patch
> >
> > Again, this one ensures that workqueue_mutex is taken/released on
> > CPU_LOCK_ACQUIRE/CPU_LOCK_RELEASE events in the CPU-hotplug callback
> > function. So this one is required, unless it conflicts with what Oleg
> > has posted. Will check that out tonight.
>
> We would still be needing this patch as it's complementing what Oleg has
> posted.
I thought that these patches don't depend on each other: flush_work/flush_workqueue
don't care where cpu-hotplug takes workqueue_mutex, in the CPU_LOCK_ACQUIRE or the
CPU_UP_PREPARE case (or CPU_DEAD/CPU_LOCK_RELEASE for the unlock).
Could you clarify? Just curious.
Oleg.
* Re: [PATCH 3/2] fix flush_workqueue() vs CPU_DEAD race
From: Gautham R Shenoy @ 2007-01-04 4:30 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Gautham R Shenoy, Andrew Morton, Ingo Molnar, David Howells,
Christoph Hellwig, linux-kernel, dipankar, vatsa
On Wed, Jan 03, 2007 at 08:26:57PM +0300, Oleg Nesterov wrote:
>
> I thought that these patches don't depend on each other: flush_work/flush_workqueue
> don't care where cpu-hotplug takes workqueue_mutex, in the CPU_LOCK_ACQUIRE or the
> CPU_UP_PREPARE case (or CPU_DEAD/CPU_LOCK_RELEASE for the unlock).
>
> Could you clarify? Just curious.
You are right. They don't depend on each other.
The intention behind introducing CPU_LOCK_ACQUIRE and CPU_LOCK_RELEASE
was to have a standard place where the subsystems could acquire/release
the "cpu hotplug protection" mutex in the cpu-hotplug callback function.
The same can be achieved by acquiring these mutexes in
CPU_UP_PREPARE/CPU_DOWN_PREPARE etc.
This is true for every subsystem that is cpu-hotplug aware.
> Oleg.
>
Thanks and Regards
gautham.
--
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"