public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 2.6.17-rc6-mm2] Fix exit race in per-task-delay-accounting
@ 2006-06-21  5:59 Balbir Singh
  2006-06-21  6:45 ` Balbir Singh
  0 siblings, 1 reply; 3+ messages in thread
From: Balbir Singh @ 2006-06-21  5:59 UTC (permalink / raw)
  To: akpm; +Cc: nagar@watson.ibm.com, Balbir Singh, jlan@engr.sgi.com,
	linux-kernel

Hi, Andrew,

This patch fixes a race in per-task-delay-accounting. This race
was reported by Jay Lan. I tested the patch using
cerebrus test control system for eight hours with getdelays running on
the side (for both push and pull of delay statistics).

It fixed the problem that Jay Lan saw.

Here's an explanation of the race condition

Consider tasks of the same thread group exiting, lets call them T1 and T2


T1                          T2

CPU0                        CPU1
=====                       =====

do_exit()
...                        do_exit()
taskstats_exit_send()                ...
                           taskstats_exit_send()
                           	fill_tgid()
                                delayacct_add_tsk()
delayacct_tsk_exit()            ....
                                __delayacct_add_tsk()


While T1 is yet to be removed from the thread group. T1->delays is set to NULL
between delayacct_add_tsk() and __delayacct_add_tsk() call.

When T2 looks for threads in the thread group, it finds T1 and tries to
collect stats for it. When we get to the spin_lock() in __delayacct_add_tsk(),
we find T1->delays is a NULL pointer.

Signed-off-by: Balbir Singh <balbir@in.ibm.com>
---

 include/linux/taskstats_kern.h |    1 +
 kernel/delayacct.c             |    5 ++++-
 kernel/taskstats.c             |    5 ++++-
 3 files changed, 9 insertions(+), 2 deletions(-)

diff -puN kernel/taskstats.c~per-task-delay-accounting-fix-exit-race kernel/taskstats.c
--- linux-2.6.17-rc6/kernel/taskstats.c~per-task-delay-accounting-fix-exit-race	2006-06-20 13:49:29.000000000 +0530
+++ linux-2.6.17-rc6-balbir/kernel/taskstats.c	2006-06-20 13:49:29.000000000 +0530
@@ -25,7 +25,7 @@
 static DEFINE_PER_CPU(__u32, taskstats_seqnum) = { 0 };
 static int family_registered = 0;
 kmem_cache_t *taskstats_cache;
-static DEFINE_MUTEX(taskstats_exit_mutex);
+DEFINE_MUTEX(taskstats_exit_mutex);
 
 static struct genl_family family = {
 	.id		= GENL_ID_GENERATE,
@@ -193,6 +193,7 @@ static int taskstats_send_stats(struct s
 	if (rc < 0)
 		return rc;
 
+	mutex_lock(&taskstats_exit_mutex);
 	if (info->attrs[TASKSTATS_CMD_ATTR_PID]) {
 		u32 pid = nla_get_u32(info->attrs[TASKSTATS_CMD_ATTR_PID]);
 		rc = fill_pid(pid, NULL, &stats);
@@ -218,6 +219,7 @@ static int taskstats_send_stats(struct s
 		goto err;
 	}
 
+	mutex_unlock(&taskstats_exit_mutex);
 	nla_nest_end(rep_skb, na);
 
 	return send_reply(rep_skb, info->snd_pid, TASKSTATS_MSG_UNICAST);
@@ -226,6 +228,7 @@ nla_put_failure:
 	return genlmsg_cancel(rep_skb, reply);
 err:
 	nlmsg_free(rep_skb);
+	mutex_unlock(&taskstats_exit_mutex);
 	return rc;
 }
 
diff -puN kernel/delayacct.c~per-task-delay-accounting-fix-exit-race kernel/delayacct.c
--- linux-2.6.17-rc6/kernel/delayacct.c~per-task-delay-accounting-fix-exit-race	2006-06-20 13:49:29.000000000 +0530
+++ linux-2.6.17-rc6-balbir/kernel/delayacct.c	2006-06-20 13:49:29.000000000 +0530
@@ -48,8 +48,11 @@ void __delayacct_tsk_init(struct task_st
 
 void __delayacct_tsk_exit(struct task_struct *tsk)
 {
-	kmem_cache_free(delayacct_cache, tsk->delays);
+	struct task_delay_info *delays = tsk->delays;
+	mutex_lock(&taskstats_exit_mutex);
 	tsk->delays = NULL;
+	mutex_unlock(&taskstats_exit_mutex);
+	kmem_cache_free(delayacct_cache, delays);
 }
 
 /*
diff -puN include/linux/taskstats_kern.h~per-task-delay-accounting-fix-exit-race include/linux/taskstats_kern.h
--- linux-2.6.17-rc6/include/linux/taskstats_kern.h~per-task-delay-accounting-fix-exit-race	2006-06-20 13:49:29.000000000 +0530
+++ linux-2.6.17-rc6-balbir/include/linux/taskstats_kern.h	2006-06-20 13:49:29.000000000 +0530
@@ -17,6 +17,7 @@ enum {
 
 #ifdef CONFIG_TASKSTATS
 extern kmem_cache_t *taskstats_cache;
+extern struct mutex taskstats_exit_mutex;
 
 static inline void taskstats_exit_alloc(struct taskstats **ptidstats,
 					struct taskstats **ptgidstats)
_

-- 

	Balbir Singh,
	Linux Technology Center,
	IBM Software Labs

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH 2.6.17-rc6-mm2] Fix exit race in per-task-delay-accounting
  2006-06-21  5:59 [PATCH 2.6.17-rc6-mm2] Fix exit race in per-task-delay-accounting Balbir Singh
@ 2006-06-21  6:45 ` Balbir Singh
  2006-06-21 14:14   ` Shailabh Nagar
  0 siblings, 1 reply; 3+ messages in thread
From: Balbir Singh @ 2006-06-21  6:45 UTC (permalink / raw)
  To: akpm; +Cc: Balbir Singh, Shailabh Nagar, Jay Lan, linux-kernel

Balbir Singh wrote:
> Hi, Andrew,
> 
> This patch fixes a race in per-task-delay-accounting. This race
> was reported by Jay Lan. I tested the patch using
> cerebrus test control system for eight hours with getdelays running on
> the side (for both push and pull of delay statistics).
> 
> It fixed the problem that Jay Lan saw.
> 
> Here's an explanation of the race condition
> 
> Consider tasks of the same thread group exiting, lets call them T1 and T2
> 
> 
> T1                          T2
> 
> CPU0                        CPU1
> =====                       =====
> 
> do_exit()
> ...                        do_exit()
> taskstats_exit_send()                ...
>                            taskstats_exit_send()
>                            	fill_tgid()
>                                 delayacct_add_tsk()
> delayacct_tsk_exit()            ....
>                                 __delayacct_add_tsk()
> 
> 
> While T1 is yet to be removed from the thread group. T1->delays is set to NULL
> between delayacct_add_tsk() and __delayacct_add_tsk() call.
> 
> When T2 looks for threads in the thread group, it finds T1 and tries to
> collect stats for it. When we get to the spin_lock() in __delayacct_add_tsk(),
> we find T1->delays is a NULL pointer.
> 
> Signed-off-by: Balbir Singh <balbir@in.ibm.com>
> ---
> 
>  include/linux/taskstats_kern.h |    1 +
>  kernel/delayacct.c             |    5 ++++-
>  kernel/taskstats.c             |    5 ++++-
>  3 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff -puN kernel/taskstats.c~per-task-delay-accounting-fix-exit-race kernel/taskstats.c
> --- linux-2.6.17-rc6/kernel/taskstats.c~per-task-delay-accounting-fix-exit-race	2006-06-20 13:49:29.000000000 +0530
> +++ linux-2.6.17-rc6-balbir/kernel/taskstats.c	2006-06-20 13:49:29.000000000 +0530
> @@ -25,7 +25,7 @@
>  static DEFINE_PER_CPU(__u32, taskstats_seqnum) = { 0 };
>  static int family_registered = 0;
>  kmem_cache_t *taskstats_cache;
> -static DEFINE_MUTEX(taskstats_exit_mutex);
> +DEFINE_MUTEX(taskstats_exit_mutex);
>  
>  static struct genl_family family = {
>  	.id		= GENL_ID_GENERATE,
> @@ -193,6 +193,7 @@ static int taskstats_send_stats(struct s
>  	if (rc < 0)
>  		return rc;
>  
> +	mutex_lock(&taskstats_exit_mutex);
>  	if (info->attrs[TASKSTATS_CMD_ATTR_PID]) {
>  		u32 pid = nla_get_u32(info->attrs[TASKSTATS_CMD_ATTR_PID]);
>  		rc = fill_pid(pid, NULL, &stats);
> @@ -218,6 +219,7 @@ static int taskstats_send_stats(struct s
>  		goto err;
>  	}
>  
> +	mutex_unlock(&taskstats_exit_mutex);
>  	nla_nest_end(rep_skb, na);
>  
>  	return send_reply(rep_skb, info->snd_pid, TASKSTATS_MSG_UNICAST);
> @@ -226,6 +228,7 @@ nla_put_failure:
>  	return genlmsg_cancel(rep_skb, reply);
>  err:
>  	nlmsg_free(rep_skb);
> +	mutex_unlock(&taskstats_exit_mutex);
>  	return rc;
>  }
>  
> diff -puN kernel/delayacct.c~per-task-delay-accounting-fix-exit-race kernel/delayacct.c
> --- linux-2.6.17-rc6/kernel/delayacct.c~per-task-delay-accounting-fix-exit-race	2006-06-20 13:49:29.000000000 +0530
> +++ linux-2.6.17-rc6-balbir/kernel/delayacct.c	2006-06-20 13:49:29.000000000 +0530
> @@ -48,8 +48,11 @@ void __delayacct_tsk_init(struct task_st
>  
>  void __delayacct_tsk_exit(struct task_struct *tsk)
>  {
> -	kmem_cache_free(delayacct_cache, tsk->delays);
> +	struct task_delay_info *delays = tsk->delays;
> +	mutex_lock(&taskstats_exit_mutex);
>  	tsk->delays = NULL;
> +	mutex_unlock(&taskstats_exit_mutex);
> +	kmem_cache_free(delayacct_cache, delays);
>  }
>  
>  /*
> diff -puN include/linux/taskstats_kern.h~per-task-delay-accounting-fix-exit-race include/linux/taskstats_kern.h
> --- linux-2.6.17-rc6/include/linux/taskstats_kern.h~per-task-delay-accounting-fix-exit-race	2006-06-20 13:49:29.000000000 +0530
> +++ linux-2.6.17-rc6-balbir/include/linux/taskstats_kern.h	2006-06-20 13:49:29.000000000 +0530
> @@ -17,6 +17,7 @@ enum {
>  
>  #ifdef CONFIG_TASKSTATS
>  extern kmem_cache_t *taskstats_cache;
> +extern struct mutex taskstats_exit_mutex;
>  
>  static inline void taskstats_exit_alloc(struct taskstats **ptidstats,
>  					struct taskstats **ptgidstats)
> _
> 

The Cc's got messed up (a really odd name lookup happened at the mail
server).

	Sorry
	Balbir Singh,
	Linux Technology Center,
	IBM Software Labs

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH 2.6.17-rc6-mm2] Fix exit race in per-task-delay-accounting
  2006-06-21  6:45 ` Balbir Singh
@ 2006-06-21 14:14   ` Shailabh Nagar
  0 siblings, 0 replies; 3+ messages in thread
From: Shailabh Nagar @ 2006-06-21 14:14 UTC (permalink / raw)
  To: balbir; +Cc: akpm, Jay Lan, linux-kernel

Balbir Singh wrote:
> Balbir Singh wrote:
> 
>> Hi, Andrew,
>>
>> This patch fixes a race in per-task-delay-accounting. This race
>> was reported by Jay Lan. I tested the patch using
>> cerebrus test control system for eight hours with getdelays running on
>> the side (for both push and pull of delay statistics).
>>
>> It fixed the problem that Jay Lan saw.

Balbir, Andrew,

I'd recommend holding off on including the patch for a bit
since the solution proposed uses a
a) global and b) taskstats related
mutex to solve a race that is in the delay accounting code.

The locking required to resolve this race is for the exiting task
(whose tsk->delays is being set to null) and that too only for
accessing its delay accounting field.

Using a global mutex that is also being used to serialize exits
may be unnecessarily
- increasing contention
- introducing dependency between taskstats (generic) and delay accounting
layers.

So if there is an inexpensive way of achieving this locking using
something localized and specific to delay accounting (the most obvious
way seems to be a tsk->delay_lock but that has the unpleasant aspect of adding
yet another field to struct task) we should explore that first.

--Shailabh

>>
>> Signed-off-by: Balbir Singh <balbir@in.ibm.com>
>> ---
>>
>>  include/linux/taskstats_kern.h |    1 +
>>  kernel/delayacct.c             |    5 ++++-
>>  kernel/taskstats.c             |    5 ++++-
>>  3 files changed, 9 insertions(+), 2 deletions(-)
>>
>> diff -puN kernel/taskstats.c~per-task-delay-accounting-fix-exit-race
>> kernel/taskstats.c
>> ---

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2006-06-21 14:15 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-06-21  5:59 [PATCH 2.6.17-rc6-mm2] Fix exit race in per-task-delay-accounting Balbir Singh
2006-06-21  6:45 ` Balbir Singh
2006-06-21 14:14   ` Shailabh Nagar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox