public inbox for linux-kernel@vger.kernel.org
* [PATCH] avoid race condition in pick_next_task_fair in kernel/sched_fair.c
@ 2010-06-29  7:10 shenghui
From: shenghui @ 2010-06-29  7:10 UTC
  To: kernel-janitors, linux-kernel, mingo, peterz, Greg KH

Hi,

      I walked through some code in kernel/sched_fair.c in version
2.6.35-rc3 and found the following potential failure:

static struct task_struct *pick_next_task_fair(struct rq *rq)
{
...
	if (!cfs_rq->nr_running)
		return NULL;

	do {
		se = pick_next_entity(cfs_rq);
		set_next_entity(cfs_rq, se);
		cfs_rq = group_cfs_rq(se);
	} while (cfs_rq);
...
}

/*
 * The dequeue_task method is called before nr_running is
 * decreased. We remove the task from the rbtree and
 * update the fair scheduling stats:
 */
static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
{
...
		dequeue_entity(cfs_rq, se, flags);
...
}

static void
dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
{
...
	if (se != cfs_rq->curr)
		__dequeue_entity(cfs_rq, se);
	account_entity_dequeue(cfs_rq, se);
	update_min_vruntime(cfs_rq);
...
}

dequeue_task_fair() dequeues a task by calling dequeue_entity().
dequeue_entity() calls __dequeue_entity() first, and only then calls
account_entity_dequeue() to decrement cfs_rq->nr_running.
So when a task is dequeued, cfs_rq->nr_running changes (e.g. from 1
to 0) only after the entity has already left the rbtree.
In pick_next_task_fair(), the nr_running check can therefore pass
before nr_running is set to 0, and the following do-while loop is
executed:

	do {
		se = pick_next_entity(cfs_rq);
		set_next_entity(cfs_rq, se);
		cfs_rq = group_cfs_rq(se);
	} while (cfs_rq);

se may be NULL here, and set_next_entity() may then dereference a
NULL pointer, which can lead to an Oops.

I think a lock on the metadata could fix this issue, but adding lock
support would mean changing plenty of code. I think the easiest way
is simply to decrement nr_running before dequeuing the task.

Following is my patch, please check it.


From 4fe38deac173c7777cd02096950e979749170873 Mon Sep 17 00:00:00 2001
From: Wang Sheng-Hui <crosslonelyover@gmail.com>
Date: Tue, 29 Jun 2010 14:49:05 +0800
Subject: [PATCH] avoid race condition in pick_next_task_fair in
 kernel/sched_fair.c


Signed-off-by: Wang Sheng-Hui <crosslonelyover@gmail.com>
---
 kernel/sched_fair.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index eed35ed..93073ff 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -823,9 +823,9 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)

 	clear_buddies(cfs_rq, se);

+	account_entity_dequeue(cfs_rq, se);
 	if (se != cfs_rq->curr)
 		__dequeue_entity(cfs_rq, se);
-	account_entity_dequeue(cfs_rq, se);
 	update_min_vruntime(cfs_rq);

 	/*
@@ -1059,7 +1059,7 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 }

 /*
- * The dequeue_task method is called before nr_running is
+ * The dequeue_task method is called after nr_running is
  * decreased. We remove the task from the rbtree and
  * update the fair scheduling stats:
  */
-- 
1.6.3.3



-- 


Thanks and Best Regards,
shenghui

Thread overview: 44+ messages
2010-06-29  7:10 [PATCH] avoid race condition in pick_next_task_fair in kernel/sched_fair.c shenghui
2010-06-29 10:43 ` Peter Zijlstra
2010-06-29 11:24   ` shenghui
2010-06-29 11:35     ` Peter Zijlstra
2010-06-29 12:44       ` shenghui
2010-12-19  2:03   ` Miklos Vajna
2010-12-22  0:22     ` Miklos Vajna
2010-12-22  8:29       ` Peter Zijlstra
2010-12-22  8:41         ` Peter Zijlstra
2010-12-22  8:41         ` Mike Galbraith
2010-12-22  9:07           ` Peter Zijlstra
2010-12-22 13:31             ` Miklos Vajna
2010-12-22 14:00               ` Peter Zijlstra
2010-12-22 14:11                 ` Peter Zijlstra
2010-12-22 15:14                   ` Miklos Vajna
2010-12-22 15:25                     ` Peter Zijlstra
2010-12-22 17:08                     ` Peter Zijlstra
2010-12-22 17:16                       ` Ingo Molnar
2010-12-22 17:25                         ` Peter Zijlstra
2010-12-22 20:36                       ` Peter Zijlstra
2010-12-23  2:08                         ` Yong Zhang
2010-12-23 12:12                           ` Peter Zijlstra
2010-12-23 12:33                             ` Peter Zijlstra
2010-12-23 18:24                               ` Peter Zijlstra
     [not found]                                 ` <1293132304.6798.6.camel@marge.simson.net>
     [not found]                                   ` <1293132862.25981.22.camel@laptop>
     [not found]                                     ` <1293187425.7138.2.camel@marge.simson.net>
     [not found]                                       ` <1293188091.25981.200.camel@laptop>
     [not found]                                         ` <1293192999.18035.4.camel@marge.simson.net>
2010-12-24 15:59                                           ` [PATCH] sched, cgroup: Use exit hook to avoid use-after-free crash Peter Zijlstra
2010-12-24 16:40                                             ` Miklos Vajna
2010-12-24 16:48                                             ` Mike Galbraith
2010-12-24 17:07                                               ` Peter Zijlstra
2010-12-24 17:24                                                 ` Mike Galbraith
2010-12-25 17:55                                             ` Balbir Singh
2010-12-25 20:59                                             ` Paul Menage
2011-01-03  7:06                                               ` Peter Zijlstra
2010-12-29 15:25                                             ` Ingo Molnar
2010-12-29 23:07                                               ` Miklos Vajna
2010-12-31 10:04                                                 ` Mike Galbraith
2010-12-31 10:46                                                   ` Miklos Vajna
2010-12-31  8:32                                               ` [PATCH] " Mike Galbraith
2011-01-03  8:21                                                 ` Peter Zijlstra
2011-01-04 14:19                                                 ` [tip:sched/core] sched, autogroup: Fix reference leak tip-bot for Mike Galbraith
2011-01-04 14:57                                                   ` Oleg Nesterov
2011-01-04 19:06                                                     ` Mike Galbraith
2011-01-19 19:04                                             ` [tip:sched/urgent] sched, cgroup: Use exit hook to avoid use-after-free crash tip-bot for Peter Zijlstra
2010-12-22 21:11                       ` [PATCH] avoid race condition in pick_next_task_fair in kernel/sched_fair.c Miklos Vajna
2010-12-22 23:39                         ` Miklos Vajna
