stable.vger.kernel.org archive mirror
* [PATCH] workqueue: ensure @task is valid across kthread_stop()
@ 2014-02-15 14:02 Lai Jiangshan
  2014-02-18 21:37 ` [PATCH wq/for-3.14-fixes] " Tejun Heo
  0 siblings, 1 reply; 4+ messages in thread
From: Lai Jiangshan @ 2014-02-15 14:02 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, Lai Jiangshan, stable

Once a worker is marked WORKER_DIE, it may exit very quickly, so
kthread_stop() can end up accessing a stale task struct or stack.
To avoid this, use get_task_struct() to keep @task valid.

See the comments in kthread_create_on_node() and kthread_stop().
Note: the comments in kthread_create_on_node() do not describe
a use case like this one, but it is a valid way to use
kthread_stop().

CC: stable@vger.kernel.org
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 82ef9f3..783d5f2 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1856,9 +1856,12 @@ static void destroy_worker(struct worker *worker)
 
 	idr_remove(&pool->worker_idr, worker->id);
 
+	/* Ensure @worker->task stays valid across kthread_stop() */
+	get_task_struct(worker->task);
 	spin_unlock_irq(&pool->lock);
 
 	kthread_stop(worker->task);
+	put_task_struct(worker->task);
 	kfree(worker);
 
 	spin_lock_irq(&pool->lock);
-- 
1.7.7.6



* [PATCH wq/for-3.14-fixes] workqueue: ensure @task is valid across kthread_stop()
  2014-02-15 14:02 [PATCH] workqueue: ensure @task is valid across kthread_stop() Lai Jiangshan
@ 2014-02-18 21:37 ` Tejun Heo
  2014-02-19  3:39   ` Lai Jiangshan
  0 siblings, 1 reply; 4+ messages in thread
From: Tejun Heo @ 2014-02-18 21:37 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel, stable

Hello, Lai.

I massaged the patch a bit and applied it to wq/for-3.14-fixes.

Thanks.
-------- 8< --------
From 5bdfff96c69a4d5ab9c49e60abf9e070ecd2acbb Mon Sep 17 00:00:00 2001
From: Lai Jiangshan <laijs@cn.fujitsu.com>
Date: Sat, 15 Feb 2014 22:02:28 +0800

When a kworker should die, the kworker is notified through WORKER_DIE
flag instead of kthread_should_stop().  This, IIRC, is primarily to
keep the test synchronized inside worker_pool lock.  WORKER_DIE is
first set while holding pool->lock, the lock is dropped and
kthread_stop() is called.

Unfortunately, this means that there's a slight chance that the target
kworker sees WORKER_DIE and exits before kthread_stop() finishes, so
the target task may be freed before or during kthread_stop().

Fix it by pinning the target task before setting WORKER_DIE and
putting it after kthread_stop() is done.

tj: Improved patch description and comment.  Moved pinning above
    WORKER_DIE to better signify what it's protecting.

CC: stable@vger.kernel.org
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/workqueue.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 82ef9f3..193e977 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1851,6 +1851,12 @@ static void destroy_worker(struct worker *worker)
 	if (worker->flags & WORKER_IDLE)
 		pool->nr_idle--;
 
+	/*
+	 * Once WORKER_DIE is set, the kworker may destroy itself at any
+	 * point.  Pin to ensure the task stays until we're done with it.
+	 */
+	get_task_struct(worker->task);
+
 	list_del_init(&worker->entry);
 	worker->flags |= WORKER_DIE;
 
@@ -1859,6 +1865,7 @@ static void destroy_worker(struct worker *worker)
 	spin_unlock_irq(&pool->lock);
 
 	kthread_stop(worker->task);
+	put_task_struct(worker->task);
 	kfree(worker);
 
 	spin_lock_irq(&pool->lock);
-- 
1.8.5.3



* Re: [PATCH wq/for-3.14-fixes] workqueue: ensure @task is valid across kthread_stop()
  2014-02-18 21:37 ` [PATCH wq/for-3.14-fixes] " Tejun Heo
@ 2014-02-19  3:39   ` Lai Jiangshan
  2014-02-20  0:13     ` Tejun Heo
  0 siblings, 1 reply; 4+ messages in thread
From: Lai Jiangshan @ 2014-02-19  3:39 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, stable

On 02/19/2014 05:37 AM, Tejun Heo wrote:
> Hello, Lai.
> 
> I massaged the patch a bit and applied it to wq/for-3.14-fixes.
> 
> Thanks.
> -------- 8< --------
> From 5bdfff96c69a4d5ab9c49e60abf9e070ecd2acbb Mon Sep 17 00:00:00 2001
> From: Lai Jiangshan <laijs@cn.fujitsu.com>
> Date: Sat, 15 Feb 2014 22:02:28 +0800
> 
> When a kworker should die, the kworker is notified through WORKER_DIE
> flag instead of kthread_should_stop().  This, IIRC, is primarily to
> keep the test synchronized inside worker_pool lock.  WORKER_DIE is
> first set while holding pool->lock, the lock is dropped and
> kthread_stop() is called.
> 
> Unfortunately, this means that there's a slight chance that the target
> kworker may see WORKER_DIE before kthread_stop() finishes and exits
> and frees the target task before or during kthread_stop().
> 
> Fix it by pinning the target task before setting WORKER_DIE and
> putting it after kthread_stop() is done.
> 
> tj: Improved patch description and comment.  Moved pinning above
>     WORKER_DIE to better signify what it's protecting.
> 
> CC: stable@vger.kernel.org

I don't think anyone has hit this bug. Do we still need the stable tag?

(Jason's bug report drove me to review the workqueue code more carefully,
and I found this possible bug, but I think it is unrelated to
Jason's bug report.)

> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> ---
>  kernel/workqueue.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 82ef9f3..193e977 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1851,6 +1851,12 @@ static void destroy_worker(struct worker *worker)
>  	if (worker->flags & WORKER_IDLE)
>  		pool->nr_idle--;
>  
> +	/*
> +	 * Once WORKER_DIE is set, the kworker may destroy itself at any
> +	 * point.  Pin to ensure the task stays until we're done with it.
> +	 */
> +	get_task_struct(worker->task);
> +
>  	list_del_init(&worker->entry);
>  	worker->flags |= WORKER_DIE;
>  
> @@ -1859,6 +1865,7 @@ static void destroy_worker(struct worker *worker)
>  	spin_unlock_irq(&pool->lock);
>  
>  	kthread_stop(worker->task);
> +	put_task_struct(worker->task);
>  	kfree(worker);
>  
>  	spin_lock_irq(&pool->lock);



* Re: [PATCH wq/for-3.14-fixes] workqueue: ensure @task is valid across kthread_stop()
  2014-02-19  3:39   ` Lai Jiangshan
@ 2014-02-20  0:13     ` Tejun Heo
  0 siblings, 0 replies; 4+ messages in thread
From: Tejun Heo @ 2014-02-20  0:13 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel, stable

Hello, Lai.

On Wed, Feb 19, 2014 at 11:39:42AM +0800, Lai Jiangshan wrote:
> > CC: stable@vger.kernel.org
> 
> I don't think anyone has hit this bug. Do we still need the stable tag?
> 
> (Jason's bug report drove me to review the workqueue code more carefully,
> and I found this possible bug, but I think it is unrelated to
> Jason's bug report.)

Hmm.... the issue won't happen frequently but it can happen and the
change is on the safer side.  I think it'd be better to cc stable.

Thanks.

-- 
tejun
