public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/3] workqueues: change cancel_work_sync() to clear work->data
@ 2010-02-24 20:20 Oleg Nesterov
  2010-02-25  3:13 ` Tejun Heo
  2010-02-25  9:53 ` Dmitry Torokhov
  0 siblings, 2 replies; 3+ messages in thread
From: Oleg Nesterov @ 2010-02-24 20:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Dmitry Torokhov, Samu Onkalo, Tejun Heo, linux-kernel

In short: change cancel_work_sync(work) to mark this work as "never
queued" upon return.

When cancel_work_sync(work) succeeds, we know that this work can't be
queued or running, and since we own WORK_STRUCT_PENDING nobody can change
the bits in work->data under us. This means we can also clear the "cwq"
part along with _PENDING bit lockless before return, unless the work is
queued nobody can assume get_wq_data() is stable even under cwq->lock.

This change can speedup the subsequent cancel/flush requests, and as
Dmitry pointed out this simplifies the usage of work_struct's which
can be queued on different workqueues. Consider this pseudo code from
the input subsystem:

	struct workqueue_struct *WQ;
	struct work_struct *WORK;

	for (;;) {
		WQ = create_workqueue();
		...
		if (condition())
			queue_work(WQ, WORK);
		...
		cancel_work_sync(WORK);
		destroy_workqueue(WQ);
	}

If condition() returns T and then F, cancel_work_sync() will crash the
kernel because WORK->data still points to the already destroyed workqueue.
With this patch the code like above becomes correct.

Suggested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 kernel/workqueue.c |   12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

--- wq/kernel/workqueue.c~1_CANCEL_CLEAR_WQ	2010-02-24 20:43:32.000000000 +0100
+++ wq/kernel/workqueue.c	2010-02-24 20:55:53.000000000 +0100
@@ -229,6 +229,16 @@ static inline void set_wq_data(struct wo
 	atomic_long_set(&work->data, new);
 }
 
+/*
+ * Clear WORK_STRUCT_PENDING and the workqueue on which it was queued.
+ */
+static inline void clear_wq_data(struct work_struct *work)
+{
+	unsigned long flags = *work_data_bits(work) &
+				(1UL << WORK_STRUCT_STATIC);
+	atomic_long_set(&work->data, flags);
+}
+
 static inline
 struct cpu_workqueue_struct *get_wq_data(struct work_struct *work)
 {
@@ -671,7 +681,7 @@ static int __cancel_work_timer(struct wo
 		wait_on_work(work);
 	} while (unlikely(ret < 0));
 
-	work_clear_pending(work);
+	clear_wq_data(work);
 	return ret;
 }
 


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH 1/3] workqueues: change cancel_work_sync() to clear work->data
  2010-02-24 20:20 [PATCH 1/3] workqueues: change cancel_work_sync() to clear work->data Oleg Nesterov
@ 2010-02-25  3:13 ` Tejun Heo
  2010-02-25  9:53 ` Dmitry Torokhov
  1 sibling, 0 replies; 3+ messages in thread
From: Tejun Heo @ 2010-02-25  3:13 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Andrew Morton, Dmitry Torokhov, Samu Onkalo, linux-kernel

Hello,

On 02/25/2010 05:20 AM, Oleg Nesterov wrote:
> In short: change cancel_work_sync(work) to mark this work as "never
> queued" upon return.
> 
> When cancel_work_sync(work) succeeds, we know that this work can't be
> queued or running, and since we own WORK_STRUCT_PENDING nobody can change
> the bits in work->data under us. This means we can also clear the "cwq"
> part along with _PENDING bit lockless before return, unless the work is
> queued nobody can assume get_wq_data() is stable even under cwq->lock.
> 
> This change can speedup the subsequent cancel/flush requests, and as
> Dmitry pointed out this simplifies the usage of work_struct's which
> can be queued on different workqueues. Consider this pseudo code from
> the input subsystem:
> 
> 	struct workqueue_struct *WQ;
> 	struct work_struct *WORK;
> 
> 	for (;;) {
> 		WQ = create_workqueue();
> 		...
> 		if (condition())
> 			queue_work(WQ, WORK);
> 		...
> 		cancel_work_sync(WORK);
> 		destroy_workqueue(WQ);
> 	}
> 
> If condition() returns T and then F, cancel_work_sync() will crash the
> kernel because WORK->data still points to the already destroyed workqueue.
> With this patch the code like above becomes correct.
> 
> Suggested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Yeap, generally looks good to me and I've been doing similar things to
implement non-reentrant workqueues (records the last cpu a work was on
in work->dataa after the work is dispatched to a worker thread).  I'll
add this to the series.

Thank you.

-- 
tejun

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH 1/3] workqueues: change cancel_work_sync() to clear work->data
  2010-02-24 20:20 [PATCH 1/3] workqueues: change cancel_work_sync() to clear work->data Oleg Nesterov
  2010-02-25  3:13 ` Tejun Heo
@ 2010-02-25  9:53 ` Dmitry Torokhov
  1 sibling, 0 replies; 3+ messages in thread
From: Dmitry Torokhov @ 2010-02-25  9:53 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Andrew Morton, Samu Onkalo, Tejun Heo, linux-kernel

On Wed, Feb 24, 2010 at 09:20:31PM +0100, Oleg Nesterov wrote:
> In short: change cancel_work_sync(work) to mark this work as "never
> queued" upon return.
> 
> When cancel_work_sync(work) succeeds, we know that this work can't be
> queued or running, and since we own WORK_STRUCT_PENDING nobody can change
> the bits in work->data under us. This means we can also clear the "cwq"
> part along with _PENDING bit lockless before return, unless the work is
> queued nobody can assume get_wq_data() is stable even under cwq->lock.
> 
> This change can speedup the subsequent cancel/flush requests, and as
> Dmitry pointed out this simplifies the usage of work_struct's which
> can be queued on different workqueues. Consider this pseudo code from
> the input subsystem:
> 
> 	struct workqueue_struct *WQ;
> 	struct work_struct *WORK;
> 
> 	for (;;) {
> 		WQ = create_workqueue();
> 		...
> 		if (condition())
> 			queue_work(WQ, WORK);
> 		...
> 		cancel_work_sync(WORK);
> 		destroy_workqueue(WQ);
> 	}
> 
> If condition() returns T and then F, cancel_work_sync() will crash the
> kernel because WORK->data still points to the already destroyed workqueue.
> With this patch the code like above becomes correct.
> 

Very nice, thank you for making the change.

> Suggested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
> 
>  kernel/workqueue.c |   12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> --- wq/kernel/workqueue.c~1_CANCEL_CLEAR_WQ	2010-02-24 20:43:32.000000000 +0100
> +++ wq/kernel/workqueue.c	2010-02-24 20:55:53.000000000 +0100
> @@ -229,6 +229,16 @@ static inline void set_wq_data(struct wo
>  	atomic_long_set(&work->data, new);
>  }
>  
> +/*
> + * Clear WORK_STRUCT_PENDING and the workqueue on which it was queued.
> + */
> +static inline void clear_wq_data(struct work_struct *work)
> +{
> +	unsigned long flags = *work_data_bits(work) &
> +				(1UL << WORK_STRUCT_STATIC);
> +	atomic_long_set(&work->data, flags);
> +}
> +
>  static inline
>  struct cpu_workqueue_struct *get_wq_data(struct work_struct *work)
>  {
> @@ -671,7 +681,7 @@ static int __cancel_work_timer(struct wo
>  		wait_on_work(work);
>  	} while (unlikely(ret < 0));
>  
> -	work_clear_pending(work);
> +	clear_wq_data(work);
>  	return ret;
>  }
>  
> 

-- 
Dmitry

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2010-02-25  9:53 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-02-24 20:20 [PATCH 1/3] workqueues: change cancel_work_sync() to clear work->data Oleg Nesterov
2010-02-25  3:13 ` Tejun Heo
2010-02-25  9:53 ` Dmitry Torokhov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox